DALL-E Image generation from story Maintaining the scenes consistency

msusman97 · May 17, 2024, 4:43am

Hi,

We are developing an app that allows users to input a story, such as “Wonderland – Down the Rabbit Hole.” Our application will first split the whole story into chunks based on the scenes, potentially using some OpenAI API for this chunking.

Then, we want to use the DALL-E API to generate images for each scene prompt .

Currently, we are manually chunking the story and generating an image for each sentence, representing individual scenes. We want OpenAI to handle the chunking for us. I believe using prompt engineering could be a good choice for this. What would be a suitable prompt for this? Any other suggestions would be appreciated as well.
The images generated from individual independent prompts are not coherent and consistent according to the theme of the story. For instance, the character (who should remain the same throughout the story) appears completely different in each scene in terms of costume and physical appearance. Additionally, the background environment changes with every prompt. We need consistency throughout the story maintaining the context.

How can we achieve this consistency?

Thanks for your time and consideration.

anon22939549 · May 17, 2024, 4:57am

It’s not really possible to create consistent characters or scenery with Dall-E 3.

You’ll probably want to wait for the image capabilities of gpt-4o to be released.

davi.miyake · June 3, 2024, 6:45pm

So the demo presented in the GPT 4o release that shows character consistency was not yet deployed to final users? That is why I am struggling with image consistency, even being a GPT plus user. Do you know when this will be available?

anon22939549 · June 3, 2024, 7:24pm

That is correct.

The demos from the Spring Update event show the new model’s native image generating ability, which has not yet been enabled for users. The model is sending messages to the Dall-E 3 model to handle image generating duties right now.

The image generating capability of gpt-4o is currently being “red-teamed” (tested for safety and alignment). There is no set time-table for this process, all we’ve heard is that it would be weeks to months. With that said, they could be all set to release it only to discover something at the last minute and need to push it back again, that’s why they don’t provide launch dates in advance.

Topic		Replies	Views
DALLE -3 Pixar art style Alternative Prompting api , dalle3	3	4295	May 3, 2024
Chatgpt like prompting in API? (Dall-e 3) API gpt-4 , chatgpt , api , dalle3	15	6176	July 10, 2024
How to use image-to-image generation with DALL·E 3 via OpenAI API? API chatgpt , api , dalle2 , dalle3	3	2695	April 2, 2025
Picture-to-picture with GPT-4o and DALL·E API does not match ChatGPT API gpt-4	2	279	July 29, 2025
Image generation ID doesnt work for consistency Community dalle3	10	3553	May 27, 2024

DALL-E Image generation from story Maintaining the scenes consistency

Related topics