I want to generate an image for a section in an article. The section is about 500 words long and has three paragraphs.
Should I use the whole section's text as the prompt, or create a roughly 10-word abstract of the section instead?
I have tried both, and the whole section seems to work better. But sometimes the whole section generates images with a lot of objects, which is not desired. So maybe too much information confuses the AI model?
If the description is created with GPT at around 100 words it tends to work well; when the prompt is too short or too long, content gets added or cut, causing the generated image not to match what was desired.
You could have the LLM generate a separate “image description” (based on text from article) after you generate the article then send that to DALLE3…
For shorter prompts, it’s more likely DALLE3 will add to or change the prompt. With longer prompts, there’s usually less rewriting. However, like you said, you want it to be relevant, so have the LLM generate a second, image-focused description after the article…
Do you mean I should first send the following prompt to the gpt-4 model:
Please generate an image description for the following text: "Include All Section Texts".
Then use the response as the prompt to the DALLE3 model?
You could do it in two calls, but I’d try one.
Add something like: "At the end of the article, include the heading ‘Article Image:’ and three to five sentences that can be used to generate an appropriate image for the article."
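If you go that single-call route, you then need to split the generated article from the trailing image description before sending the description to DALLE3. A minimal sketch (the helper name is hypothetical, and the heading string is whatever you instructed the model to emit):

```python
def split_article_and_image_prompt(text: str, heading: str = "Article Image:"):
    """Return (article_body, image_prompt); image_prompt is "" if the heading is missing."""
    # Split on the LAST occurrence of the heading, in case the phrase
    # happens to appear somewhere in the article body itself.
    body, sep, tail = text.rpartition(heading)
    if not sep:  # heading never appeared -- fall back to the full text
        return text.strip(), ""
    return body.strip(), tail.strip()
```

The body goes into your article as-is, and the tail becomes the DALLE3 prompt.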
Or you could just send the entire article a second time to a smaller model (maybe 3.5-turbo) and have it do a proper image description. More work/cost, but the results should be a lot more consistent for you.
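For the two-call route, here’s a rough sketch of how the flow could look with the OpenAI Python SDK (the prompt wording and helper names are my own assumptions, and swap in whichever chat model you prefer):

```python
def build_description_messages(article_text: str) -> list:
    """Chat messages asking the LLM for a short, focused image description."""
    return [{
        "role": "user",
        "content": (
            "Write three to five sentences describing a single image that "
            "would suit the following article. Focus on one scene, not a "
            "list of objects.\n\n" + article_text
        ),
    }]

def generate_article_image(article_text: str) -> str:
    # Requires `pip install openai` and OPENAI_API_KEY in the environment.
    from openai import OpenAI
    client = OpenAI()

    # Call 1: condense the article into an image description
    # (a cheaper model like gpt-3.5-turbo keeps the cost of the extra call down).
    chat = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=build_description_messages(article_text),
    )
    description = chat.choices[0].message.content

    # Call 2: pass only the short description to DALL-E 3.
    image = client.images.generate(model="dall-e-3", prompt=description, n=1)
    return image.data[0].url
```

Asking for "one scene, not a list of objects" in the description prompt is one way to push back against the cluttered, many-object images you mentioned.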