DALL-E Image generation from story Maintaining the scenes consistency

Hi,

We are developing an app that allows users to input a story, such as “Wonderland – Down the Rabbit Hole.” Our application will first split the whole story into chunks based on the scenes, potentially using some OpenAI API for this chunking.

Then, we want to use the DALL-E API to generate images for each scene prompt .

  1. Currently, we are manually chunking the story and generating an image for each sentence, representing individual scenes. We want OpenAI to handle the chunking for us. I believe using prompt engineering could be a good choice for this. What would be a suitable prompt for this? Any other suggestions would be appreciated as well.

  2. The images generated from individual independent prompts are not coherent and consistent according to the theme of the story. For instance, the character (who should remain the same throughout the story) appears completely different in each scene in terms of costume and physical appearance. Additionally, the background environment changes with every prompt. We need consistency throughout the story maintaining the context.

How can we achieve this consistency?

Thanks for your time and consideration.

1 Like

It’s not really possible to create consistent characters or scenery with Dall-E 3.

You’ll probably want to wait for the image capabilities of gpt-4o to be released.

So the demo presented in the GPT 4o release that shows character consistency was not yet deployed to final users? That is why I am struggling with image consistency, even being a GPT plus user. Do you know when this will be available?

That is correct.

The demos from the Spring Update event show the new model’s native image generating ability, which has not yet been enabled for users. The model is sending messages to the Dall-E 3 model to handle image generating duties right now.

The image generating capability of gpt-4o is currently being “red-teamed” (tested for safety and alignment). There is no set time-table for this process, all we’ve heard is that it would be weeks to months. With that said, they could be all set to release it only to discover something at the last minute and need to push it back again, that’s why they don’t provide launch dates in advance.

1 Like