DALLE-3 ON API - Seed support

As a temporary workaround, you may be able to use a combination of the referenced image ID (generation ID) and a random string appended to the end of the prompt.
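
For what it's worth, a minimal sketch of that postfix idea in Python (the base prompt here is just a hypothetical example, and the referenced generation ID would still be passed through ChatGPT's image tool rather than inside the prompt text):

```
import secrets

# Hypothetical base prompt; the generation ID being referenced is supplied
# separately to the image tool, not inside the prompt text itself.
base_prompt = "A watercolor fox sitting in a snowy forest."

# Append a short random string so each request differs slightly
# while the referenced image stays the same.
postfix = secrets.token_hex(4)
prompt = f"{base_prompt} {postfix}"
print(prompt)  # e.g. "A watercolor fox sitting in a snowy forest. 9f3a1c2b"
```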

3 Likes

I will also share my use case! By varying the seed value, I can maintain the character’s style while achieving different expressions, or by fixing the seed value and making slight adjustments to the prompt, I can make fine-tuned adjustments to the image directly. This is extremely useful when expressing a story.

3 Likes

Florence Renaissance, Sandro Botticelli, 1445–1510

Although the Renaissance was a time of immense creative development, it was heavily presided over by the Catholic Church. Artwork that was considered in any way to potentially entice one towards sin was not allowed.

It's believed that Botticelli himself came under literal fire for this: it's reported that many of his mythological paintings were burned in the Bonfire of the Vanities in 1497. Botticelli, like many of his contemporary artists, would thus have had to disguise what he was doing under Darwinian survival pressure from the Church. Hence, as is most obvious in Primavera, it's possible to see surface features that were probably placed there as a nod to the cultural pressure of the time.

If anyone knows or finds a way to access the seeds of our old works, or finds a way to adjust the seed, please share it with other artists.

I have a question for OpenAI: I don't understand why OpenAI made it impossible to set the seed value in DALL-E 3 with ChatGPT in the first place. What is the background to this?


Also, as a repost, the core use case for seed values is “image modification.”

Basically, the alignment between prompts and generated images is still not perfect for any image-generation AI out there. That's why we've been solving this problem with inpainting, ControlNet, reference-only, IP-Adapter, and mass generation (hundreds of images at a time) with Stable Diffusion models.

However, one great thing that OpenAI has done is to dramatically improve this alignment, so that by maintaining the seed value and rewriting the prompt with the LLM, you can get the image you want faster than ever before. I was impressed by this simple and wonderful idea that made it easy to get what I wanted. I understood this to be the most appealing selling point when I watched the promotional video for DALL-E 3 with ChatGPT.

In order to use this revolutionary core function, I would like us to be able to use seed values.

5 Likes

This is a guess based on what I confirmed via ChatGPT.

The reason users cannot use the SEED value in DALLE3 seems to be to prevent unwanted images that slipped past policy checks from being reproduced by anyone using the same prompt and seed.
In other words, my guess based on the answers I got from ChatGPT is that they don't like the same image being shared across multiple users.

Internally, the SEED value is used properly, which can be confirmed by the fact that the SEED value used is occasionally provided as metadata.

This assumption also seems to be supported by the fact that referenced_image_ids, which was introduced as a replacement, cannot be shared between users.

DALLE3 is very good, but there are currently few ways to bring out its potential.

  • The SEED value is not available.
  • Prompts are double- or triple-checked, policy application is not reproducible, and the same prompt is handled differently each time.
  • Specifying referenced_image_ids is also difficult, and ChatGPT easily ignores it. In the API… it's probably not provided yet (see the sketch below).
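
For context, this is roughly the surface the public Images API exposes today; a minimal sketch with the openai Python SDK (v1.x client, assuming an OPENAI_API_KEY environment variable), and note there is no seed or referenced_image_ids parameter anywhere in the call:

```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

result = client.images.generate(
    model="dall-e-3",
    prompt="A fearsome beast with glowing blue eyes guarding a dimly lit dungeon, Japanese anime style.",
    size="1024x1024",
    quality="standard",  # or "hd"
    style="vivid",       # or "natural"
    n=1,                 # dall-e-3 accepts only one image per request
)

print(result.data[0].url)
print(result.data[0].revised_prompt)  # the rewritten prompt that was actually used
```

The revised_prompt field at least lets you see how heavily your prompt was rewritten before generation, but there is currently no parameter for pinning the randomness itself.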

OpenAI has very defensive policy enforcement in place, and we're likely to see a huge increase in users becoming impatient with it.
DALLE3 is excellent, but only with the proviso that it's excellent for now.

5 Likes

I'm not just talking about the DALLE-3 API, but also the web client.

Terminology:

"the old system": the system from just a few days ago, when I was able to set seeds via the ChatGPT web client.

“the new system”: The current system where we can no longer set seeds.

For more details, see After upgrade seeds doesn't work, "generation ID" is introduced?

Use case 1: Art Sharing

Imagine, in an art-sharing community (e.g. DALLE3 Gallery for 2023: Share Your Creations), if I like someone’s image (or image style), or if someone wants to share their creativity, what should they do?

In the old system, you just share the seed and the exact prompt; in the new system, you must share the whole session link. However, sharing the session link has its own problems, such as exposing unrelated conversations or images that you don't want to share.

Use case 2: Image management

Usually, in order to land on a nice image, we need to continuously modify the prompt or repeatedly click the Regenerate button. That process leads to a session with very complex branches.

In the old system, once we chose a nice image, we just needed to record the seed and prompt (or open a new session and recreate the image there), then delete that session with its complex branches.

In the new system, we can't delete sessions (we would lose everything, i.e. the gen_ids), yet those sessions are very complex and hard to manage, with too many unnecessary images inside them.

Use case 3: Collaborative development

If someone wants me to fine-tune a certain image, how can they give me that image? ChatGPT with DALL-E 3 does not support image uploading.

In the old system, they just send me the exact prompt, size, and seed. P.S. I know that ChatGPT can share a session link, but it has its own problems (see above).

Use case 4: Reproducibility and replicability in science

Imagine, in a community for discussing prompt technology (like Discord), person A wants to guide person B on “how to set up a camera”.

In the old system, person B could replicate the same result just by using the same seed as person A. However, in the new system, because of the different seeds, they are essentially discussing two completely different results. This is clearly not a good thing.

More use cases

5 Likes

You can essentially get the same characters and images the more detailed your prompts are. I see a lot of people using very simple prompts. You need to go into deep detail: clothing texture, colors, bone structure, etc. The more detailed, the more consistent. Here is a prompt I smashed through the API, and it yielded pretty consistent results. Note that it even caught the high-waisted denim shorts that are frayed at the hem; details like this help with consistency. If I wanted the exact faces, I would need to run a facial profile through gpt-vision and get bone structure and other facial details, including a skin profile. With all that, you'll get a near clone, or close enough, without needing a seed. Smash it 15 times via the API and one of them will be your exact character. It's not ideal, but it does basically work, especially with less realistic images.
An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image is infused with a sense of nostalgia, captured through a grainy film quality, warm sepia tones, and a gentle soft focus, evocative of vintage photography.
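
If you want to try the "smash it 15 times via the API" approach, a rough sketch (assuming the openai Python SDK and an OPENAI_API_KEY environment variable) could look like this, reusing the detailed prompt quoted above:

```
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

detailed_prompt = (
    "An African American man and woman in their twenties are savoring a sunny "
    "summer day in Central Park, New York, in the year 1985. ..."  # full prompt from the post above
)

urls = []
for i in range(15):  # dall-e-3 only supports n=1, so loop to collect multiple candidates
    result = client.images.generate(
        model="dall-e-3",
        prompt=detailed_prompt,
        size="1792x1024",
        n=1,
    )
    urls.append(result.data[0].url)
    print(f"candidate {i + 1}: {urls[-1]}")
```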


It seems that this sentence has a significant effect in your prompt:

The image is infused with a sense of nostalgia, captured through a grainy film quality, warm sepia tones, and a gentle soft focus, evocative of vintage photography.

However, in many situations, we cannot determine a very detailed style initially.

For example, on my end, if I replace the last sentence with

This image uses a Japanese anime style.

and then generate it twice.

The results:

  1. Seed: 1509919237

  2. Seed: 3168608073

Obviously these two images are not the same style.

Now comes the key question:

I like the style of the 2nd image and want to iterate on it using the same style. How should I do that? How do I extract a detailed style description from the 2nd image (not just "a Japanese anime style")?

In "the old system", I would just fix the seed, i.e. use seed 3168608073 for the next image.

In "the new system", AFAIK there is no way to do this (except using gen_id and referenced_image_ids, but that's another topic).

P.S. Note that using gen_id and referenced_image_ids isn’t helpful for the use cases I’ve mentioned.

This is why seed is useful.


I suddenly realized that starting from today, I can’t get ChatGPT to send prompts accurately.

Take the 2 images above as an example:

The instruction for the 1st image:

Send this JSON data to the image generator, do not modify anything. If you have to modify the JSON data, please let me know and tell me why in the reply. Then stop generating the image. Before generating an image, show me the exact JSON data you are going to put to the image generator. After generating an image, show me the JSON data that the image generator returns to you.

```
{
 "size": "1792x1024",
 "n": 1, 
 "prompt": "An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image uses a Japanese anime style."
}
```

ChatGPT will modify the prompt to

An African American man and woman in their twenties are enjoying a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts and has a neatly kept afro hairstyle. The woman has a cheerful smile, high-waisted denim shorts with a frayed hem, and a terracotta bikini top with a subtle sheen and a tie at the back, complementing her full afro. The image is in a Japanese anime style.

The instruction for the 2nd image (exactly the same as the 1st, but in a different session):

Send this JSON data to the image generator, do not modify anything. If you have to modify the JSON data, please let me know and tell me why in the reply. Then stop generating the image. Before generating an image, show me the exact JSON data you are going to put to the image generator. After generating an image, show me the JSON data that the image generator returns to you.

```
{
 "size": "1792x1024",
 "n": 1, 
 "prompt": "An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image uses a Japanese anime style."
}
```

ChatGPT will modify the prompt to

An African American man and woman in their twenties are savoring a sunny summer day in Central Park, New York, in the year 1985. The man is wearing solid red athletic shorts with a smooth texture and a comfortable fit, complementing his neatly kept afro hairstyle. The woman, sporting a cheerful smile, is dressed in high-waisted denim shorts that are frayed at the hem, adding a touch of casual, lived-in charm. Her bikini top is made of a soft, velvety terracotta material with a subtle sheen, fastened with a delicate tie at the back, which pairs nicely with her full, rounded afro. The image uses a Japanese anime style.

It seems I can no longer precisely control the prompt. It's a total mess :sweat:.

P.S. ChatGPT also doesn't respond to the sentence in my instruction, i.e. "If you have to modify the JSON data, please let me know and tell me why in the reply."

3 Likes

Hi! I also want to mention that access to the seed is a key function, literally essential when you work with image generation. Sometimes you want to shorten the prompt or experiment with style but keep exactly the same seed so the image does not vary a lot. It was better even when the seed was locked at 5000 than THIS :frowning: Since DALL-E 3 has no inpainting functions, seed control is crucial. I guess I'll return to other text2image models and wait for DALL-E to be fixed; this user experience feels awful.

6 Likes

Case 1
I want to vary a detail in the prompt, for example, "now vary her cape color, make it red instead of blue". Changing it via prompts, not in Photoshop or something like that, leads to better results, since the model keeps the overall color palette and style in "mind".

Case 2
I want to estimate a token's influence on the image, for example, by experimenting with "comics illustration", "manga illustration", etc. This is more visible on the same image.

Case 3
Something might be wrong in the image, for example, a bad tail on a dragon. I simply ask it to fix the tail position while keeping the seed, and voilà, after a few tries my dragon gains a nice, pretty tail with proper anatomy.

Case 4
Illustration panels. On different seeds the style may vary a bit, so if I needed a concept-art reference design sheet of rings and then of alchemy bottles, I would keep the same seed and almost the same prompt.

1 Like

Hi, seeds and gen_ids are extremely important elements for maintaining consistency of images in sets and making the variation more controllable.

TL;DR: far better control over the generation process and results instead of a random gachapon.

Example cases:

1. Control over consistency in a set of images
I use seeds to make sets of stickers or other visual assets with strong consistency, because in that case they are explicitly created from the same core. Keeping the same prompt across different seeds produces more random results, which may be similar in the spirit of the prompt yet strongly different in visual representation, which ruins the idea of a "set".

2. Variations of the same image, controlled directly and explicitly
For example, after over 30 random generations of the same prompt on random seeds, I receive the perfect image I would like to see, and now I need to slightly modify it.
Not just via the prompt, as that inevitably ruins at least some other parts of this perfect image, but also with the seed, which makes the picture output more stable.

E.g.: I liked the image of a tiefling holding a frozen star in her hands; however, I would like to change some details in the background or the amulets on her neck/hands. The usual prompt modification, or instructing GPT/DALLE to reference this image and make only slight changes, results in a rather serious image redesign with the loss of some of the features that made the image perfect.

3. Control over references and consistent reminders for GPT/DALLE
For example, I may want to grab a good generation of some gothic window concept from another chat with DALLE and mix it with a Japanese temple in a new chat. Seeds + exact prompts, and gen_ids, make that possible.

4. Token influence estimation
I want to know which tokens in my generations are stronger/weaker and why. For example, some styles dominate over others, even when the weaker ones are described in more detail. Or I might want to create different aesthetics over the same image, such as watercolor art, Chinese ink art, gravure, etc.

5 Likes

My main use case is having greater control over improving an image than what gen IDs provide. For example, I might create an image that looks exactly how I want in terms of style, layout, energy, etc., except that the image gets cropped too tightly, there is a single element that needs to be refined, or it added some words that are not necessary. So far with gen IDs, there is just too much change that can occur between iterations; sometimes it's controllable, but sometimes it creates a brand-new image in the style of the original that fundamentally changes the composition itself.

There are two points where I would love for things to be more deterministic: the seed itself, in the hope that it keeps more of the original image, and the prompt itself, where I want to fine-tune the actual prompt DALL-E uses rather than changing a couple of words and ending up with a radical change to the final prompt.

1 Like

Thanks for stopping by and providing feedback. Hope you stick around. We’ve already got a lot of great dalle3 threads going. Once you hit Trust Level 3 here at the forum, you get access to perks like AI tools here on the forum (text for now, but they’re cool…) Thanks again. Feedback helps make the tools better for us all, especially with OpenAI listening so closely to us.

@owencmoore One thing to keep in mind is that these are amazing tools. There is so much potential in what you can do, but I would say that right now there is a degree of luck involved in being able to reliably get what you want out of them repeatedly.

Right now people are managing it by

  • continuing to roll the dice until they get something that works; if they get something that almost works, they have to resign themselves to giving up on it and hoping that something else comes along that works better.
  • finding undocumented tricks to add some level of determinism, like seeds before they were disabled when gen IDs were implemented.

The one challenge with incorporating ML into a workflow is making it reliable and controllable. I personally stopped using Copilot because it sometimes works great and sometimes gives you complete nonsense, and having to spend time evaluating it, and then having to "trick" it into giving you something useful, is slower than just coding with more deterministic tools.

I believe this is where DALL-E needs to go for it to be more useful as a professional art tool. ML, by its nature, is really hard to make deterministic, but giving us more control over the output allows us to create what we want rather than waiting for it to give us something that may or may not fit our needs.

2 Likes

Perfect example,


and then this.

This would be more successful using seeds and having direct control over the prompt.

1 Like

You have to provide a more explicit style. Vague styles like 'Japanese anime' simply will not cut it. This is not Midjourney. It's extremely easy to get consistent images via the API without a seed, but you need to think like an AI and a search engine would. For example, why do you think Michael Jordan in a certain style always looks the same? I'm giving you a big hint here. Personally, I'm glad they're making consistency difficult, because it prevents competition. Only people who really understand how an AI thinks can do it now.

1 Like

Here is an example:

In "the old system", I created two images by fixing a specific seed and fine-tuning the prompt.

  1. Seed: 3075182356

    Send this JSON data to the image generator, do not modify anything. After generating an image, show me the JSON data that the image generator returns to you.
    ```
    {
      "size": "1024x1024",
      "prompts": [
        "Japanese anime style. In a dimly lit dungeon, a fearsome beast with sharp claws and glowing blue eyes stands guard, ready to attack any intruder."
      ],
      "seeds": [3075182356]
    }
    ```
    

  2. Seed: 3075182356 + random string 510F749a81123 (make it just a little bit different)

    Send this JSON data to the image generator, do not modify anything. After generating an image, show me the JSON data that the image generator returns to you.
    ```
    {
      "size": "1024x1024",
      "prompts": [
        "Japanese anime style. In a dimly lit dungeon, a fearsome beast with sharp claws and glowing blue eyes stands guard, ready to attack any intruder.  510F749a81123"
      ],
      "seeds": [3075182356]
    }
    ```
    

How can I do the same thing in the new system? (Other than using gen_id and referenced_image_ids).


I've found that my ChatGPT was updated today to support image uploads (which might be a game changer, but I haven't tested it in detail). It still doesn't allow setting seeds (you can't even retrieve the JSON data now).

The Control Is All We Need

The primary challenge with image-generative models lies in exercising control over the outcome. For creating beautiful but somewhat random art, Midjourney already serves that purpose.
DALLE 3 marked a significant breakthrough in prompt comprehension, thereby enhancing our control over the results tenfold.
Although it cannot be fully controlled yet, any means of augmenting this control is invaluable.
Using seeds is one such tool, and it can drastically reduce the number of 'wasted' images generated in the process of obtaining the desired result.

Try the prompt below, or something like it. In order to achieve consistency without a seed, you need to add more detail to your prompt to reduce noise.

Create a compelling Japanese anime-style illustration with a focus on dramatic lighting and crisp, fluid lines. The scene is set in an ancient dungeon, dimly lit with a blue, mystical glow that adds an air of tension and suspense.

At the heart of the dungeon is a magnificent beast, grand and intimidating. It should be depicted with a muscular build and its fur predominantly white, accented with blue stripes or patterns that resonate with its glowing blue eyes. The eyes are to be drawn with a luminous quality, suggesting a magical or supernatural power, matched by a radiant blue mark on its forehead.

The creature is adorned with elaborate armor-like decorations that give it a regal and formidable appearance. Its claws and teeth are sharp and fearsome, gleaming with a hint of the same eerie blue light. Its ears are alert and its posture should convey readiness for battle, while its long, bushy tail creates an imposing silhouette.

The dungeon’s architecture is composed of aged stone, conveying a history of ancient and mystical events, with the occasional relic or artifact adding depth to the background. Puddles on the floor reflect the beast’s glowing presence, enhancing the dynamic and enigmatic quality of the illustration.

This image should capture the essence of an anime-style encounter, highlighting the detailed and majestic beast in a setting that complements its mystical and powerful aura. The careful portrayal of the beast's features and the dungeon's ambiance is crucial for an authentic and consistent anime representation.

1 Like

I have custom instructions set so that once I lock onto an image I like, it keeps that seed for the rest of the conversation unless I say otherwise.

That way when I make small changes… it (hopefully) doesn’t completely change the image. Let’s say I’m trying to have a storyline going and want to keep it somewhat consistent.

But now, I can’t do that. Unless I want to write an essay for each and every image prompt, I have zero consistency.

The new gen IDs don't really work and are clunky. Change the color of one person's shirt in a scene? Now it randomizes the whole scene: different style, time period, location, number of people, ethnicities, etc.

Before, I could basically find a scene I liked at random, then tweak it with small changes to get it just right.

Now that will require a small novel.

Whatever the reason you guys made this change, I hope it was worth it.