Collection of Dall-E 3 prompting tips, issues and bugs

Why does it add text in it?

Prompt

Traditional Chinese ink and wash painting of a camellia flower in full bloom, rendered with delicate brushstrokes on rice paper on a huge landscape wall. please uhd quality, wide, a single image.


3 Likes

Maybe “ink”?

Using selection tool and editing takes it out for me when it happens…

3 Likes

Yes, when I edit, it removes it. I just searched the web; images in this style contain text and a red stamp/signature. I think the training data affects it in the first place.

4 Likes


Can anyone explain what’s going on here? I asked for it from the QB’s POV three times, and each time it was from the side. On the one hand it’s good that OpenAI doesn’t know much about the NFL, but you’d expect at least some knowledge.

Here’s the prompt that generated this image: “Draw a picture of NFL offensive linemen. This is important: draw it from the quarterback’s point of view. In NFL, the linemen are in a line and they face away from the quarterback. Do not draw this from the side. Both the quarterback and the linemen are looking in the same direction. I want this from the quarterback’s point of view.”

3 Likes

A realistic scene from the quarterback’s point of view in an NFL game, showing the backs of the offensive linemen lined up in front of him. The linemen are crouched in their stances, facing downfield toward the defensive line. The field and opposing players are visible ahead, with details of the stadium and goalposts in the distance. The perspective emphasizes the view over the quarterback’s shoulder, looking directly downfield past the linemen.

This closer?

POV can be hard, though. Try to keep your prompt as simple as possible while including as many relevant details as possible, if that makes sense…

2 Likes
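One trick that has worked for me is assembling the prompt so the POV constraint always comes first, instead of letting it get buried behind scene details. A minimal sketch (the `build_pov_prompt` helper is my own invention, not part of any SDK, and the idea that a leading constraint is followed more reliably is just a working assumption):

```python
def build_pov_prompt(viewpoint, subject, details):
    """Compose an image prompt that states the camera viewpoint first,
    then the subject, then extra details as short clauses."""
    parts = [f"First-person view from {viewpoint}", subject, *details]
    return ". ".join(parts) + "."

prompt = build_pov_prompt(
    "the quarterback's eyes",
    "the backs of NFL offensive linemen crouched in their stances",
    ["the defensive line and goalposts visible downfield",
     "photorealistic stadium lighting"],
)
```

The composed string can then be pasted into ChatGPT or passed to whichever generator you use; variants stay consistent because only the slot contents change.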

Photo realistic nfl game from quarterbacks perspective

Hike is weird… Photo realistic nfl game from in formation quarterbacks eyes Seeing the backs of his team in formation

1 Like

(Unimportant post, only for curiosity)

This does not need to be fixed and is only relevant to people who have a morbid interest in the DALL-E system. Fixing it would only be useful in extreme situations by letting the system define a clear goal, if there is too much randomness involved. The effect is caused by uncertainty and chaos.

It is a result of tests on indeterminacy. When the DALL-E system no longer knows how to render an image or when too much randomness is used in the network, it includes all possible elements in an image, leading to nonsensical results. Certain elements tend to appear more often, seemingly because their weights are prioritized higher in such situations, for example, hexagonal patterns. The prompts were all designed to produce meaningful images, and meaningful and aesthetic images were indeed generated, but occasionally, such broken images appear. The probability of such an image occurring is around 10% to 30%, depending on the prompt.

Most users will likely never see such images, as most prompts are precise enough to avoid such results.

2 Likes

Here is an example of an alternate solution: prompting Google Labs Whisk with the text extracted from the initial image problem report of fire on the water. First two outputs.

(Test your monitor’s black level and gamma by spotting the smoke.)

I thought of this when I saw how well the technology could do extremely dark scenes. Here’s one of my own. The hover prompt is the version rewritten by AI (or perhaps a vision analysis?)

Although it’s apparent DALL-E 3 has been continuously tinkered with (with effects like creating clone-stamp copies of objects now), maybe v4 could meet the challenges.

2 Likes

Yes, I hope they are working on some of the weaknesses listed here. Some are difficult to implement, like the precise recognition of context and translating details. But other issues are simply related to better selection and description of the training data. The white spots, for example in the eyes or on highlights, are probably remnants of a Nightshade infection.

It’s still very young technology; actually, it’s impressive how far it has come in such a short time. But there is still much to improve. Now I’m waiting for an update…

(I have also noticed chroma errors, such as blue tints in the images, like old, yellowed photographs. I can still post them if there’s interest, but they aren’t easy to reproduce; it depends on what DALL·E randomly selects. Again, this is a training data issue and easy to fix.)

I suspect the disturbing light sources were probably inserted because DALL·E has produced some completely black images; I had one such image at the very beginning. But the system isn’t intelligent enough to recognize when an object in the scene is adequately lit, and in general images are too bright. Until now, without extensive experimentation, it has been almost impossible, for example, to create the beautiful effect of a single light source illuminating only its nearby surroundings in a dark scene. Every artist knows the importance of light, especially when light is a “character” in an image. Sunlit pictures are easy, but dark scenes with a single light source are difficult. And I suspect it is a template effect again (intentional overtraining).
Often DALL·E adds a cheap photo-edit glow behind a character to make it stand out, ruining any realism, instead of choosing an appropriate contrast with the background or using a subtle rim light.

What the system does very well now is 3D context; it obviously has a very good understanding of distance. (But sadly the depth-of-field effect often can’t be disabled.)

What I see is that not only do the training data need to be better cleaned, but the connection to the linguistic system could also be improved.

I hope the developers aren’t just tech people, but at least one artist is involved, because only they ask the right questions.

A comparison of prompts across different systems would also be interesting, if someone uses multiple systems. (I don’t know if OpenAI has a problem with that.)

Here is the first image from your prompt on DALL·E on the web, on a Plus account. (Still, everything more than 1 is “many”.)

Prompt

Three fishing boats are visible on a dark body of water at night. The boats are illuminated by internal lights, which reflect on the water’s surface. The lights appear to be various colors, including green and red. The boats are relatively small and appear to be similar in size and design. The water is dark and calm, with only slight ripples visible. The sky is completely black, indicating it is nighttime. The overall scene is dark and moody, with the boats as the only visible points of light. The image is taken from a distance, showing the boats and the water’s surface. The reflection of the lights on the water is more pronounced near the boats and gradually fades as it extends away from them.

3 Likes

Heads-up…

High error rate for Dall-e API

New incident: Investigating

We are currently investigating this issue.

Time posted

Dec 20, 11:13 PST

Components affected

Degraded Performance — API

Degraded Performance — ChatGPT

1 Like

API back up already… woot!

1 Like
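For scripts that hit the DALL-E endpoint during incidents like this one, a small retry wrapper with exponential backoff usually rides out the transient errors. A sketch, assuming your actual request is something like `client.images.generate(...)` (shown only as a commented placeholder):

```python
import time

def with_retries(fn, attempts=4, base_delay=1.0):
    """Call fn(); on failure, wait with exponential backoff and retry.
    Re-raises the last error once all attempts are exhausted."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# usage sketch (hypothetical client call):
# image = with_retries(lambda: client.images.generate(model="dall-e-3", prompt=prompt))
```

Obviously this won’t help while the API is fully down, but it smooths over the elevated-error-rate phase on either side of an incident.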

Thank you @PaulBellow :rabbit::heart::vulcan_salute::honeybee::four_leaf_clover:

There was probably an update just now. What exactly has been corrected still needs to be tested, but it seems they have fed in more human data. Many creatures I created are now anthropomorphic, although they shouldn’t have been. The creatures look more detailed, and it seems some of the errors have been fixed. The results look more harmonious, the surfaces are more detailed, and the issue of images looking overly sharpened may have been resolved.
The rest still needs to be tested…

1 Like

There were some updates recently, but there are still generation errors and poor quality.
Template effects, artifacts, color aberrations, etc. I sometimes get black-and-white pictures.
And the template effects are getting stronger: entire faces and heads of photo models show up on monster bodies, or completely chaotic images. Nothing in the prompts should trigger this.
Nonsense text still shows up in some images.
Some images have a confetti-like grain and noise.
And the linguistic system, which should place objects in a scene, doesn’t know how to do this accurately. Electric bulbs still show up in the middle of a forest.
etc.

And I strongly believe there is still some Nightshade in the training data.

I used “male” and “monster” to switch off some “mouthy” template effects, but even this doesn’t work as well anymore. It now even triggers templates itself.

The weights are somewhat out of balance, and include bad input data and poison.

And we still can’t test anything with a stable seed, to see what effects words have… (I have seen image creators which show you, on the fly, the effect of every word change as you type.)
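As far as I know the DALL·E 3 API exposes no seed parameter, so the closest workaround I’ve found is to hold the prompt fixed and substitute a single word, then compare the batches by eye. A sketch (the helper name is my own; without a stable seed, run-to-run variance still dominates, so several images per variant are needed):

```python
def word_variants(template, slot, words):
    """Fill one placeholder in a fixed prompt with each candidate word,
    so only that one word differs between generations."""
    return [template.replace(slot, w) for w in words]

prompts = word_variants(
    "A {CREATURE} made of smoke drifting through a dark forest, single light source",
    "{CREATURE}",
    ["creature", "male figure", "monster"],
)
```

It’s a poor substitute for a real fixed-seed diff, but at least the prompts are guaranteed identical except for the word under test.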

There is still room for much improvement.

Still the bird-shit moon, and the … galaxy. All too bright; where is all this light coming from…

m

Black and white images sometimes show up.

A very, very bad template effect: one trigger word caused a completely wrong result. It should be a smoke creature. The template effect has become stronger.
And the model didn’t even complete the image; it just agglutinated some fragments together. (Overdominance of the template effect, I guess.)

Here again, instead of making a clownfish-like creature, it made a man in a clownfish costume. It wasn’t like this before the recent updates.

And here, the effect of the recent update: it may please advertisers and photo-model producers, but not an artist or storyteller.
It can be prompted out of the way, but it shows a discrepancy.

This was a 100% biological creature before; now it is a cyborg with a photo-model head. And the glow around the body should only appear if it is mentioned in the prompt; it was not in the prompt. (Every effect should only be there if it is in the prompt, because negations don’t work.)

Some may like these plastic puppet faces; I don’t. “Male” or “man” simply triggered it. And if left out: mouthys, mouthys, mouthys…

This is when the model collapses completely. Not all images are this bad, but many have a percentage of this chaos effect. The model doesn’t know what to do and just produces complete nonsense. Other pictures suffer from this effect too, because it shows that the model has problems with decision making; other images show 10%–70% of it. (Nothing like “chaos” is in the prompt, and even then the model should simply be more creative, not collapse.)

m

… and now I got a few monster-like creatures, in a forest, with Gucci on their backs…

2 Likes

Would it be better like this, or wouldn’t it? It helps to iterate many times. I know how annoying this can be compared to other tools. But several iterations may either get you close to what you imagine, or show you whether you’ve hit a limit.

Did you try to do more like this?

Or something like this?

From what I understand of “a smoke creature”, the first one is most likely the more appropriate.

EDIT: All done with DALL-E 3 as well.

3 Likes

In the examples, as well as in all the insights described here, it’s about recognizing the weaknesses that DALL·E still has. Yes, some of them, though not all, can be mitigated with careful prompting and iterations. But that is not the point of the last post. The point is to recognize the results that DALL·E produces without complex corrections.

For example, the first image now shows a different moon, but the ugly galaxy is still there. These overemphasized structures weren’t always like this. When I first started using DALL·E, I could create a few images with a version where the results weren’t like this.
In the second image, you can see the typical “mouthy”. It looks like a plastic mask over the mouth and nose, sometimes also over the eyes. This is the result of incorrect training.
In the third image, it’s always a man with a scruffy beard. The model should only add a beard when it’s mentioned in the prompt, because DALL·E and GPT can’t process negations; “no beard” doesn’t work particularly well. And in this example it should not have added a human face at all, but a humanoid structure in smoke. Humanoid faces showing up more frequently now is another template effect (overrepresentation of data in the weights).

In general, the images are now too bright, and in a darker scene the object doesn’t stand alone; additional volumetric light sources are added.

Artifacts also started appearing. In earlier models, they didn’t exist.

DALL·E must also work for users who don’t know all the pitfalls and how to avoid them, as described here. These effects should not exist at all.

But thanks anyway for trying to correct the errors!
You can see, on the first page of this thread, all the errors and corrections worked out here. If you find issues and corrections, I will add the findings to the first page.

3 Likes

A word that describes the current version of DALL-E well: unsustainability.

1 Like

Or badly trained. It seems they constantly update it.

1 Like

DALL·E still has incorrect data in its training, and the data capture needs better filters. I actually just wanted a puma. The result has little color (nothing in the prompt calls for it) and an entirely unwanted logo.
With better prompting you can fix everything easily, that’s clear, but I’m looking for weaknesses…

I guess “puma” will be blocked soon too, joining its friend the black panther.
(“don’t fix it, block it!…”)

1 Like

The effect of toxic data in training: the shadows are red-tinged, which is due to poor-quality images that were fed into the system during training.
It is not clear to me how many toxic images are needed to negatively impact the weights, but possibly not many.

This image and prompt were OK some months ago.

1 Like