IDEA2IMG

Updated 400 days ago
  • ID: 52854809/1
Idea2Img applies LMMs functioning in different roles to refine the T2I prompts. Specifically, they will (1) generate and revise text prompts for the T2I model, (2) select the best draft images, and (3) provide feedback on the errors and revision directions. Idea2Img is enhanced with a memory module that stores all prompt exploration histories, including previous draft images, text prompts, and feedback... We introduce "Idea to Image", a system that enables multimodal iterative self-refinement with GPT-4V(ision) for automatic image design and generation. Humans can quickly identify the characteristics of different text-to-image (T2I) models via iterative explorations. This enables them to efficiently convert their high-level generation ideas into effective T2I prompts that can produce good images. We investigate if systems based on large multimodal models (LMMs) can develop analogous multimodal self-refinement abilities that enable exploring unknown models or environments via self-..
  • 0
  • 0
Interest Score
2
HIT Score
0.00
Domain
idea2img.github.io

Actual
idea2img.github.io

IP
185.199.108.153, 185.199.109.153, 185.199.110.153, 185.199.111.153

Status
OK

Category
Company
0 comments Add a comment