Concur. I was able to get close, but there are obvious differences if you look at them all together. The tech is coming along, though. I bet DALLE4 will have it… we can hope. Until then, I’m gonna keep tinkering with prompts.
Actually there is a way to do this, but I think you’ll need to combine it with Midjourney. Here is a YouTube video on the process:
DALLE3 definitely has its limitations when it comes to image generation and photo manipulation, but Midjourney is good for what you’re looking to do. You can sketch or draft your characters in DALLE3, then upload the image to Midjourney to create the consistent character you’re looking for.
You can request the seed of an image and use it to recreate similar images with different variables.
For example, DALLE3 creates a superhero, you request the seed from DALLE3, and then you can use that seed to recreate similar images or characters:
{
  "prompts": [
    "Illustration set against the backdrop of Santa Monica Beach under a starry sky. The beautiful girl, with a clear face and no glasses, looks on with a mix of concern and awe. Beside her, the superhero is in a powerful stance, fiercely tearing away his tuxedo to reveal his superhero cape and uniform underneath. The scene captures the moment of transformation, with the fabric of the tuxedo rippling in the air and the emblem on his superhero uniform shining brightly."
  ],
  "seeds": [4202394079],
  "size": "1792x1024"
}
But if you start a new chat and use the same seed, it gets pretty wonky… so I recommend completing the character or goal, with the consistency you’re looking for, in one chat. I’m sure there is a way to use that same seed a month later to update the character, but you may need to tinker with it at that point.
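A minimal sketch of the seed trick, assuming the JSON layout shown above is what you hand back to ChatGPT in the same chat. The helper name `build_request` and the two variation prompts are made up for illustration; this only builds the payload text and does not call any real API.

```python
import json


def build_request(prompt, seed, size="1792x1024"):
    """Return a payload in the same shape as the JSON posted in this thread."""
    return {"prompts": [prompt], "seeds": [seed], "size": size}


SEED = 4202394079  # the seed reported for the superhero image above

# Hypothetical follow-up prompts that keep the character but change the scene.
variations = [
    "The same superhero from the beach scene, now flying over the pier at night.",
    "The same superhero from the beach scene, standing on a rooftop at sunrise.",
]

# One payload per prompt, all sharing the same seed.
payloads = [build_request(prompt, SEED) for prompt in variations]

for payload in payloads:
    print(json.dumps(payload, indent=2))
```

Keeping the seed in one constant makes it harder to change it by accident between prompts, which matters since results drift once the seed or the chat changes.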
Hey!
Just saw this long post on X trying to do this exact thing. Haven’t tried it myself yet, but try it out and see if it works.
Thread by @chaseleantj (Chase Lean)
A grid prompt always works.
Create a “four panel comic about a fluffy bear” and the bear will be the same inside those four panels.
You can iterate on that image by using the gen_id of the image and referring to it via the referenced_image_ids parameter.
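A rough sketch of what such a follow-up request could look like. Only the names gen_id and referenced_image_ids come from the post above; the rest of the field layout is an assumption borrowed from the seed JSON earlier in this thread, and `build_followup` is a made-up helper, so treat it as a template for what to ask ChatGPT rather than a documented API.

```python
import json


def build_followup(prompt, gen_id, size="1024x1024"):
    """Build a payload that points back at an earlier image (assumed layout)."""
    if not gen_id:
        raise ValueError("gen_id is required to reference an earlier image")
    return {
        "prompts": [prompt],
        "referenced_image_ids": [gen_id],  # parameter name taken from the post
        "size": size,
    }


payload = build_followup(
    "Same fluffy bear as before, four new panels: the bear goes fishing.",
    gen_id="PLACEHOLDER_GEN_ID",  # replace with the gen_id ChatGPT reports
)
print(json.dumps(payload, indent=2))
```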
There is a great article on Medium titled “99% Character Consistency with DALL-E 3”.
Can’t post a link here, but I’m sure Google will find it for you.
Is it this?: 99% Character Consistency with DALL-E 3
Yup. I’m working on expanding the prompts now to see if I can stretch it to 24 scenes.
I’ve found that sometimes DALL-E will produce these grids of images if you ask for two variations on an image.
For example, in this case, I asked DALL-E to “Give me two variations on that image.” I expected two images; instead, the output was a single image divided in two (below).
A grid of images is really interesting, and also sparked an idea:
Ask it to create a grid with 16 panels and then upscale them with another AI (for now). Should get you 16 high-resolution images.
Bing DALLE3:
a grid that consists of 16 panels, each showing a teddy bear from different angles, in the style of Roald Dahl
Nero AI Upscaler:
Of course, not perfect since it leaves some artifacts, but now you have 16 images in the same style. The question is if all 16 descriptions will fit into one prompt…
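To cut such a grid into separate panels before upscaling, something like the sketch below could work. It assumes the panels are evenly spaced, which a generated grid will only roughly be, and `panel_boxes` is a made-up helper name.

```python
def panel_boxes(width, height, rows, cols):
    """Return (left, upper, right, lower) crop boxes, row by row."""
    boxes = []
    for r in range(rows):
        for c in range(cols):
            # Integer division keeps neighbouring boxes edge to edge.
            left = c * width // cols
            upper = r * height // rows
            right = (c + 1) * width // cols
            lower = (r + 1) * height // rows
            boxes.append((left, upper, right, lower))
    return boxes


boxes = panel_boxes(1024, 1024, rows=4, cols=4)
print(len(boxes))   # 16
print(boxes[0])     # (0, 0, 256, 256)
print(boxes[-1])    # (768, 768, 1024, 1024)
```

Each tuple is in the order Pillow's `Image.crop` expects, so the panels can be cropped out one by one and then sent to the upscaler. Expect to nudge the boxes by hand where the gutters are uneven.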
With another upscale algorithm you can get even clearer results;
I always use the open source, stand-alone Upscayl; it runs even on a low-end tablet (no GPU power needed).
Upscale that upscale!
The image above is an upscale to 1024 x 1024, but you can even make it bigger with multiple passes
(just a detail, I don’t want to blow up this server - the output was like 15.680 x 15.232 pixels).
Workflow
You lose some grain / details / accents, but if you prepare the image “to be upscaled” beforehand (make it a bit more graphical / slightly vector-like), then upscaling afterwards is not a big deal.
If you want to print it (magazine, not glossy - so 300dpi), the original image can be more than a square meter in size.
The preview in that image is its left eye @25%.
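The “more than a square meter” figure can be checked with a quick calculation. This sketch reads 15.680 x 15.232 as 15680 x 15232 pixels (dot as thousands separator, which is an assumption) and uses the 300dpi mentioned above; `print_size_cm` is a made-up helper name.

```python
CM_PER_INCH = 2.54


def print_size_cm(width_px, height_px, dpi=300):
    """Convert pixel dimensions to printed size in centimetres at a given dpi."""
    return (width_px / dpi * CM_PER_INCH, height_px / dpi * CM_PER_INCH)


w_cm, h_cm = print_size_cm(15680, 15232)
area_m2 = (w_cm / 100) * (h_cm / 100)
print(f"{w_cm:.1f} cm x {h_cm:.1f} cm, {area_m2:.2f} m^2")
# prints: 132.8 cm x 129.0 cm, 1.71 m^2
```

So at 300dpi the print would be roughly 1.3 m on each side, about 1.7 square meters.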
Upscale that upscale!
In my opinion this leaves too many artifacts. It’s big, sure, but very AI-ish. Looks like the teddy bear is tearing apart.
It’s just a quick example to help out folks who don’t know about upscaling / vector / pixelated art.
It’s an old and well-known trick to divide an image into x-parts for getting consistent characters.
With upscaling you can separate them into stand-alone panels.
And in print all those fine / grainy details are lost anyhow (source: I have used old skool (analogue) printing techniques for about 35 years now). Most plotters (even digital) can’t handle that fine rasterization.
Sorry, I didn’t get it. Where should I put all of this information?
As described, it goes in the “custom instructions” of ChatGPT Plus. Click your user name to reveal the menu.
I have the answer to your frustrations, friend. There are three ways to do it.
- By using the generation ID and seed. I usually just use the generation ID.
- By using custom instructions.
- By using a homemade or an already made GPT.
I do apologize, I don’t have time right now to explain, but when I’m free I can.