Hi,
I believe there is nothing related to the seed in the new DALL-E 3 API. Is this a planned feature? It would really help to be able to generate images in a simpler way while keeping the same prompt.
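For context, here is a minimal sketch of what a DALL-E 3 generation call looks like today with the openai Python SDK (v1.x); none of the documented parameters accepts a seed, which is exactly what this request is about:
```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Documented parameters today include model, prompt, n, size, quality, style.
# There is no `seed` argument, hence this feature request.
response = client.images.generate(
    model="dall-e-3",
    prompt="A watercolor lighthouse at dawn, soft morning light",
    size="1024x1024",
    quality="standard",
    n=1,
)
print(response.data[0].url)
```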
Thank you!
What's your use case for seed? I'm gathering various input on DALL-E 3, and it's always helpful to hear the different ways people want to use seed.
USE CASE 1:
We want to use the seed in the DALL-E 3 API to render new content daily for a story we are creating for kids and publishing daily on the web. We want to be sure that the images share a consistent theme as the story unfolds over the course of the year.
USE CASE 2:
We have a brand owner that wants to create social content for daily posts that maintains a consistent, brand-approved origin concept but evolves into other subjects or content based on a calendar. So the content changes on a calendar basis, but the seed will ensure a common theme to maintain the brand image.
As a temporary workaround, you may be able to use a combination of the referenced image ID (generation ID) and some scrambled text appended as a postfix to the prompt.
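A minimal sketch of the prompt-postfix half of that workaround, assuming the openai Python SDK; the generation-ID referencing only exists in the ChatGPT interface, so only the scrambled-postfix part is shown here:
```
import uuid

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

base_prompt = "A cozy reading nook by a rain-streaked window, warm lamplight"

# A short random postfix nudges the generation slightly while the rest of
# the prompt stays identical (the "scrambled stuff" idea above).
postfix = uuid.uuid4().hex[:12]

response = client.images.generate(
    model="dall-e-3",
    prompt=f"{base_prompt} {postfix}",
    size="1024x1024",
    n=1,
)
print(response.data[0].url)
```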
I will also share my use case! By varying the seed value, I can maintain the character's style while achieving different expressions, or by fixing the seed value and making slight adjustments to the prompt, I can make fine-tuned adjustments to the image directly. This is extremely useful when expressing a story.
Florence Renaissance, Sandro Botticelli 1445-1510. Although the Renaissance was a time of immense creative development, it was heavily presided over by the Catholic Church. Artwork that was considered in any way likely to entice one towards sin was not allowed. It's believed that Botticelli himself came under literal fire for this; it's reported that many of his mythological paintings were burned in the Bonfire of the Vanities in 1497. Botticelli, like many of his contemporary artists, would thus have had to disguise what he was doing under Darwinian survival pressure from the Church. Hence, as is most obvious in Primavera, it's possible to see surface features that were probably placed there as a nod to the cultural pressure of the time.
If anyone knows or finds a way to access the seeds of our old works, or if anyone finds a way to adjust the seed, please share it with other artists.
I have a question for OpenAI: I don't understand why OpenAI made it impossible to set the seed value from the beginning in DALL-E 3 with ChatGPT. What is the background to this?
Also, as a repost, the core use case for seed values is "image modification."
Basically, the alignment between prompts and the generated images is still not perfect for all image generation AIs in the world. That's why we've been solving this problem with inpainting, ControlNet, reference-only, IP-Adapter, and mass production (hundreds of units) with the Stable Diffusion model.
However, one great thing that OpenAI has done is to dramatically improve this alignment, so that by maintaining the seed value and rewriting the prompt with the LLM, you can get the image you want faster than ever before. I was impressed with this simple and wonderful idea that made it easy to get what I wanted. I took this to be the key selling point when I watched the promotional video of DALL-E 3 with ChatGPT.
In order to use this revolutionary core function, I would like to be able to set seed values.
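To illustrate the seed-plus-prompt-rewrite workflow described above, here is a minimal sketch of how a fixed seed is used with Stable Diffusion via the diffusers library (the model name, seed, and prompts are placeholders, not taken from this thread):
```
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

seed = 1234  # fixing the seed keeps the overall composition stable

# Same seed, slightly reworded prompt: the outputs tend to share composition,
# so small prompt edits translate into small image edits.
prompts = [
    "A knight guarding a dungeon gate, Japanese anime style",
    "A knight guarding a dungeon gate at night, Japanese anime style",
]
for i, prompt in enumerate(prompts):
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    image.save(f"variant_{i}.png")
```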
This is a guess based on what I confirmed via ChatGPT.
The reason users cannot set the SEED value in DALL-E 3 seems to be to prevent unwanted images that slipped past policy checks from being reproduced by anyone who has the prompt and seed.
In other words, my guess based on the answers I got from ChatGPT is that they don't like the same image being shared across multiple users.
Internally, the SEED value is used properly, which can be confirmed by the occasional reminder that the SEED value used is provided as metadata.
This assumption also seems to be supported by the fact that referenced_image_ids, which was introduced as a replacement, cannot be shared between users.
DALLE3 is very good, but there is currently little way to bring out its potential.
SEED value is not available.
Prompts are checked two or three times over, policy application is not reproducible, and the same prompt is handled differently each time.
Specifying referenced_image_ids is also difficult, and ChatGPT easily ignores it. In the API… it's probably not provided yet.
OpenAI has very defensive policy enforcement in place, and we're likely to see a huge increase in users becoming impatient with it.
DALLE3 is excellent, but only with the proviso that this holds for now.
I'm not just talking about the DALL-E 3 API, but also the web client.
"The old system": just a few days ago, I was able to set seeds via the ChatGPT web client. I called it "the old system".
"The new system": the current system, where we can no longer set seeds.
For more details, see the thread "After upgrade seeds doesn't work, 'generation ID' is introduced?"
Imagine, in an art-sharing community (e.g. the "DALLE3 Gallery for 2023: Share Your Creations" thread), if I like someone's image (or image style), or if someone wants to share their creativity, what should they do?
In the old system, you just share the seed and the exact prompt; in the new system, you must share the whole session link. However, sharing the session link has its own problems, such as exposing unrelated conversation or any images that you don't want to share.
Usually, in order to filter out nice images, we need to continuously modify the prompt or repeatedly click the Regenerate button. That process will lead to a session with very complex branches.
In the old system, once we chose a nice image, we just needed to record the seed and prompt (or open a new session and recreate the image there), then delete that session with its complex branches.
In the new system, we can't delete sessions (this would lose everything, i.e. the gen_id), but those sessions are very complex and hard to manage, with too many unnecessary images inside them.
If someone wants me to fine-tune a certain image, how can they give me that image? ChatGPT with DALL-E 3 does not support image uploading.
In the old system, they would just send me the exact prompt, size, and seed. P.S. I know that ChatGPT can share a session link, but that has its own problems (see above).
Imagine, in a community for discussing prompt technology (like Discord), person A wants to guide person B on "how to set up a camera".
In the old system, person B could replicate the same result just by using the same seed as person A. However, in the new system, because of the different seeds, they are essentially discussing two completely different results. This is clearly not a good thing.
You can essentially get the same characters and images the more detailed your prompts are. I see a lot of people using very simple prompts. You need to go into deep detail: clothing texture, colors, bone structure, etc. The more detailed, the more consistent. Here is a prompt I smashed through the API, and it yielded pretty consistent results. Note that it even caught "high-waisted denim shorts that are frayed at the hem". Details like this help with consistency. If I wanted the exact faces, I would need to run a facial profile through gpt-vision and get bone structure and other facial details, including a skin profile. With all that, you'll get a near clone, or close enough, without needing a seed. Smash it 15 times via the API (a quick sketch of that loop follows after the prompt below) and one of them will be your exact character. It's not ideal, but it does basically work, especially with less realistic images.
An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image is infused with a sense of nostalgia, captured through a grainy film quality, warm sepia tones, and a gentle soft focus, evocative of vintage photography.
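Here is a minimal sketch of that repeated-call approach with the openai Python SDK; DALL-E 3 accepts n=1 per request, so "smash it 15 times" is simply a loop (the prompt is abbreviated here; use the full text above):
```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompt = (
    "An African American man and woman in their twenties are savoring a sunny "
    "summer day in Central Park, New York, in the year 1985. ..."  # full prompt above
)

urls = []
# dall-e-3 only supports n=1, so repeat the request and pick the best result.
for _ in range(15):
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size="1792x1024",
        n=1,
    )
    urls.append(response.data[0].url)

for i, url in enumerate(urls, start=1):
    print(f"candidate {i}: {url}")
```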
It seems that this sentence has a significant effect in your prompt:
The image is infused with a sense of nostalgia, captured through a grainy film quality, warm sepia tones, and a gentle soft focus, evocative of vintage photography.
However, in many situations, we cannot determine a very detailed style initially.
For example, on my end, I replace the last sentence with
This image uses a Japanese anime style.
and then generate it twice.
The results:
Seed: 1509919237
Seed: 3168608073
Obviously these two images are not the same style.
Now comes the key question:
I like the style of the 2nd image and want to iterate on it using the same style; what should I do? How do I extract a detailed description from the 2nd image (not just "a Japanese anime style")?
In "the old system", I would just fix the seed, i.e. use seed 3168608073 for the next image.
In "the new system", AFAIK there is no way to do this (except using gen_id and referenced_image_ids, but that's another topic).
P.S. Note that using gen_id and referenced_image_ids isn't helpful for the use cases I've mentioned.
This is why seed is useful.
I suddenly realized that starting from today, I can't get ChatGPT to send prompts accurately.
Take the 2 images above as an example:
The instruction for the 1st image:
Send this JSON data to the image generator, do not modify anything. If you have to modify the JSON data, please let me know and tell me why in the reply. Then stop generating the image. Before generating an image, show me the exact JSON data you are going to put to the image generator. After generating an image, show me the JSON data that the image generator returns to you.
```
{
"size": "1792x1024",
"n": 1,
"prompt": "An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image uses a Japanese anime style."
}
```
ChatGPT will modify the prompt to
An African American man and woman in their twenties are enjoying a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts and has a neatly kept afro hairstyle. The woman has a cheerful smile, high-waisted denim shorts with a frayed hem, and a terracotta bikini top with a subtle sheen and a tie at the back, complementing her full afro. The image is in a Japanese anime style.
The instruction for the 2nd image (exactly the same as the 1st, but in a different session):
Send this JSON data to the image generator, do not modify anything. If you have to modify the JSON data, please let me know and tell me why in the reply. Then stop generating the image. Before generating an image, show me the exact JSON data you are going to put to the image generator. After generating an image, show me the JSON data that the image generator returns to you.
```
{
"size": "1792x1024",
"n": 1,
"prompt": "An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image uses a Japanese anime style."
}
```
ChatGPT will modify the prompt to
An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image uses a Japanese anime style.
It seems I can no longer precisely control the prompt. It's a total mess.
P.S. Also, ChatGPT doesn't respond to the sentence in my instruction, i.e. "If you have to modify the JSON data, please let me know and tell me why in the reply."
Hi! I also want to mention that access to the seed is a key function; it is literally essential when you work with image generation. Sometimes you want to shorten the prompt or experiment with style, but keep exactly the same seed so the image does not vary a lot. It was better even when the seed was locked at 5000 than this. Since DALL-E 3 has no inpainting functions, seed control is crucial. I guess I'll go back to other text2image models and wait for DALL-E to be fixed; this user experience feels awful.
Case 1
I want to vary a detail in the prompt, for example, "now vary her cape color, make it red instead of blue". Changing it via prompts, not with Photoshop or something like that, leads to better results, since the model keeps the overall color palette and style in "mind".
Case 2
I want to estimate a token's influence on the image, for example, experiment with "comics illustration", "manga illustration", etc. The influence is more visible on the same image.
Case 3
Something might be wrong in the image, for example, a bad tail on a dragon. I simply ask to fix the tail position while keeping the seed, and voila, after a few tries my dragon gains a nice and pretty tail with proper anatomy.
Case 4
Illustration panels. On different seeds the style may vary a bit, so if I needed a concept-art reference design sheet of rings and then of alchemy bottles, I would keep the same seed and almost the same prompt.
Hi, seeds and gen_ids are extremely important elements for maintaining consistency of images in sets and making the variation more controllable.
TL;DR: much better control over the generation process and results, instead of a random gachapon.
Example cases:
1. Control over consistency in a set of images
I use seeds to make sets of stickers or other visual assets with strong consistency, because in this case they are explicitly created from the same core. Keeping the same prompt across different seeds produces more random results, which may be similar in the nature of the prompt yet strongly different in visual representation, which ruins the idea of a "set".
2. Variations of the same image, controlled directly and explicitly
For example, after over 30 random generations for the same prompt, but on random seeds, I receive the perfect image I would like to see, and now I need to slightly modify it.
Not just via the prompt, which inevitably ruins at least some other parts of this perfect image, but also with the seed, which makes the picture output more stable.
E.g.: I liked the image of a tiefling holding a frozen star in her hands, however, I would like to change some details on the background or amulets on her neck/hands. Usual prompt modification or instruction for GPT/DALLE to reference this image and make only slight changes results in rather serious image redesign with loss of some mandatory features, which made the image perfect.
3. Control over references and consistent reminders for GPT/DALLE
For example, I may want to grab a good generation of some gothic window concept from my other chat with DALL-E and mix it with a Japanese temple in a new chat. Seeds plus exact prompts, as well as gen_ids, make that possible.
4. Token influence estimation
I want to know which tokens in my generations are stronger/weaker and why. For example, some styles dominate over others, even if the weaker ones are described in more detail. Or I might want to create a different aesthetic over the same image, such as watercolor art, Chinese ink art, gravure, etc.
My main use case is having greater control over improving an image than what gen_ids provide. For example, I might create an image that looks exactly how I want in terms of style, layout, energy, etc., except that the image gets cropped too tightly, or there is a single element that needs to be refined, or it added some words that are not necessary. So far with gen_ids, there is just too much change that can occur between iterations; sometimes it's controllable, but sometimes it creates a brand new image in the style of the original that fundamentally changes the composition itself.
There are two points where I would love for things to be more deterministic: the seed itself, in the hope that it keeps more of the original image, and the prompt itself, where I want to fine-tune the actual prompt DALL-E uses rather than changing a couple of words that end up making a radical change to the final prompt.
Thanks for stopping by and providing feedback. Hope you stick around. We've already got a lot of great DALL-E 3 threads going. Once you hit Trust Level 3 here at the forum, you get access to perks like AI tools here on the forum (text for now, but they're cool…). Thanks again. Feedback helps make the tools better for us all, especially with OpenAI listening so closely to us.
@owencmoore One thing to keep in mind is that these are amazing tools. There is so much potential in what you can do, but I would say that right now there is a degree of luck in terms of being able to reliably get what you want out of it repeatedly.
Right now people are managing it by
The one challenge with incorporating ML into a workflow is making it reliable and controllable. I personally stopped using Copilot because it sometimes works great and sometimes gives you complete nonsense, and having to spend time evaluating, and then having to "trick" it into giving you something useful, is slower than just coding it with more deterministic tools.
I believe this is where DALL-E needs to go for it to be more useful as a professional art tool. ML, by its nature, is really hard to make deterministic, but giving us more control over the output allows us to create what we want rather than waiting for it to give us something that may or may not fit our needs.
A perfect example: this would be more successful using seeds and having direct control over the prompt.
You have to provide a more explicit style; vague styles like "Japanese anime" simply will not cut it. This is not Midjourney. It's extremely easy to get consistent images via the API without a seed, but you need to think the way an AI and a search engine would. For example, why do you think Michael Jordan in a certain style always looks the same? I'm giving you a big hint here. Personally, I'm glad they're making consistency difficult because it prevents competition. Only people that really understand how an AI thinks can do it now.
Here is an example:
In "the old system", I created two images by fixing a specific seed and fine-tuning the prompt.
Seed: 3075182356
Send this JSON data to the image generator, do not modify anything. After generating an image, show me the JSON data that the image generator returns to you.
```
{
"size": "1024x1024",
"prompts": [
"Japanese anime style. In a dimly lit dungeon, a fearsome beast with sharp claws and glowing blue eyes stands guard, ready to attack any intruder."
],
"seeds": [3075182356]
}
```
Seed: 3075182356 + random string 510F749a81123 (make it just a little bit different)
Send this JSON data to the image generator, do not modify anything. After generating an image, show me the JSON data that the image generator returns to you.
```
{
"size": "1024x1024",
"prompts": [
"Japanese anime style. In a dimly lit dungeon, a fearsome beast with sharp claws and glowing blue eyes stands guard, ready to attack any intruder. 510F749a81123"
],
"seeds": [3075182356]
}
```
How can I do the same thing in the new system? (Other than using gen_id and referenced_image_ids).
I've found that my ChatGPT was updated today to support image uploads (which might be a game changer, but I haven't tested it in detail), yet it still doesn't allow setting seeds (you aren't even allowed to retrieve the JSON data now).