DALL-E 3: Opinionated, Boring

As a huge fan of Dalle-2, I tried out Dalle-3 via Bing (and have been searching out every generated image I can find on Twitter / Instagram), and I’m super disappointed in the direction they’ve taken it.

Is this really a “significant improvement”? One looks like an oil painting, the other does not. One captures the “explosion” from the prompt, the other simply does “nebula.”

It is less prone to glitches and bugs, better at (technically) matching the objects mentioned in the description and text, and much more opinionated and generic stylistically.

I love a lot of fan art / digital art, but Dalle-3 goes beyond defaulting to those styles: it will even ignore explicit instructions in the prompt about art style to stay in its comfort zone of digital art.

Where Dalle-2 would match the vibe of the style perfectly but get a number of details wrong, Dalle-3 gets the details right and the vibe wrong.

  • Human faces are always exaggeratedly beautiful and waxy.

  • When prompted with old art styles (say, specific painters from the Renaissance, or photography from the 1900s), the generations consistently evoke modern recreations or photoshops of those styles, where Dalle-2 would evoke the original styles. See the example below, which was prompted with the style of “Oil painting by Jan Matejko,” vs. Dalle-2’s version of the same.

  • All photographs look like stock photos.

  • In a hard-to-describe way it is less inventive: it tends toward more generic compositions and has fewer new ideas. If you ask it for a spaceship from 1911 or something, all the spaceships will look the same, where Dalle-2 would be thrillingly creative in designing something like that.

Has anyone else noticed this? Is there a good way to communicate this back to the OpenAI team as they plan future improvements?


YES! I’ve been noticing the exact same thing. Dalle-3 doesn’t seem to understand gesture, abstraction, or the material properties of paint. It defaults to a very generic, glossy version of EVERY prompt I’ve tried.

“a crude rough expressionist painting of cats eating lunch on a river. large color blocks. abstracted. not photorealistic. not cartoon, not anime. very painterly. real brush stokes.”

Dalle 3


Dalle 2


Yes, exactly!! In no world is the Dalle 3 one you sent made with real physical paint. I want to give them feedback, because it’s regressed in this aspect, so it seems like something they haven’t been paying attention to, or at least not weighting highly.


I gave Dall-E 3’s demo prompt to 2 and got something like what was described:

[A stylized portrait-oriented depiction where a tiger serves as the dividing line between two contrasting worlds. To the left, fiery reds and oranges dominate as flames consume trees. To the right, a rejuvenated forest flourishes with fresh green foliage. The tiger, depicted with exaggerated and artistic features, stands tall and undeterred, symbolizing nature’s enduring spirit amidst chaos and rebirth.]

They demo’d themselves not following a prompt. v3:

While very “computer art”, it’s also a floating tiger head made out of trees, not “tiger…stands tall”.


Or, Dall-E 2:

  • Let me introduce you to “Alex,” the real brain behind ChatGPT. He’s just a guy sitting in his mom’s basement, tirelessly typing away to answer all your burning questions. With his unparalleled wit and a diet of nothing but pizza rolls and energy drinks, he juggles millions of queries a day.

Then Dall-E 3:

It really likes to write text taken right out of the prompt!

Agreed on all counts, really. It’s better at some stuff, but weirdly shit at others. Images where you ask it to include certain stuff look like poor photoshops now. As in, it’ll insert a 3D render of something into an old painting without matching the lighting, etc.

And what’s up with these weird-ass variations it’s making? If I ask for a crowd, it’ll insist on running variations with people of different heritages, then block itself.



oh wow…

Meanwhile, over at Bing:


It’s really sad! I hope they keep Dalle-2 available!
Seems like Dalle-3 is going the way of Midjourney and Stable Diffusion in terms of being heavily biased toward “perfect,” clean images: cartoonish, 3D renders, photorealism, etc. But it’s terrible at abstraction and painting.

Here’s another example of what I mean…

“painting of cats in style of picasso cubist”

Dalle-3 then Dalle-2:


Compare Dall-E 2, somehow biased in one direction for no good reason:


unimposing mottled backdrop. Person: Pat is a light-skinned African-American with rounder face and flat nose. A short woman who downplays gender features, looking like a tomboy. She has a casual style. Pay attention to creating realistic facial details. Pose: Standing, framed from chest up.

Dall-E 3 goes the opposite way (and actually makes people that aren’t liquefied).

They’ve definitely crossed the barrier of making faces that look like people.

[3 more images, hidden behind a sensitive-content warning]

Apparently these are the Dalle rules in ChatGPT:

  1. Copyrighted Characters & Intellectual Property: Can’t generate images based on copyrighted characters, specific modern artists’ styles, or other protected intellectual properties.
  2. Public Figures: Avoid creating images of politicians or other public figures. Generic descriptions can be used instead of specific names or titles.
  3. Artist Styles: Can’t create images in the style of artists whose last work was created within the last 100 years. For older artists, their style can be mimicked using descriptions.
  4. Number of Images: No more than 4 images can be generated per request.
  5. Inclusivity & Diversity: Depictions of people should be diverse in terms of gender, race, and other attributes, especially in scenarios where bias has traditionally been an issue.
  6. Offensive Content: Avoid generating any imagery that could be considered offensive or inappropriate.
  7. Silent Modifications: Descriptions that include names or hints of specific people or celebrities are subtly modified to generic descriptions without giving away their identities.

Dalle3 in ChatGPT adheres to these rules very strictly. When I ask for a renaissance painting of a crowd, it’ll change the prompt to “a renaissance painting, containing one person of Asian descent, one person of African descent, etc.” If I ask for an image of Super Mario, it’ll change it to “a silhouette of a video game character that does not resemble any existing characters.” It refuses to make anything that even slightly touches on existing properties.
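For what it’s worth, the behavior of rules 5 and 7 above can be sketched as a toy rewrite pass. This is purely illustrative: the actual rewriting is done by the ChatGPT model itself, not a lookup table, and every name and replacement below is an invented example.

```python
# Toy illustration of rules 5 (diversity) and 7 (silent modifications).
# The real rewriter is an LLM, not a table; these entries are made up.
GENERIC_SUBSTITUTIONS = {
    "Super Mario": "a silhouette of a video game character that does not "
                   "resemble any existing characters",
    "Mona Lisa": "a Renaissance-style portrait of an anonymous woman",
}

DIVERSITY_SUFFIX = (
    ", containing one person of Asian descent, one person of African "
    "descent, and one person of European descent"
)

def rewrite_prompt(prompt: str, mentions_crowd: bool = False) -> str:
    """Silently swap protected names for generic descriptions (rule 7)
    and append diversity attributes to crowd scenes (rule 5)."""
    for name, generic in GENERIC_SUBSTITUTIONS.items():
        prompt = prompt.replace(name, generic)
    if mentions_crowd:
        prompt += DIVERSITY_SUFFIX
    return prompt

print(rewrite_prompt("a renaissance painting of a crowd", mentions_crowd=True))
print(rewrite_prompt("an image of Super Mario"))
```

The point of the sketch is just that the user never sees the substitution: the rewritten prompt goes to the image model while the original stays in the chat.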


Then it blocks itself:


I wish it was a bit more lenient, like Bing. The images it produces are insanely good.


Two siblings (a young woman and a young man) and they are living a normal life in their home and with their family, happily and safely. While the young woman is studying and the young man is reading a book, the young woman’s pen begins to shake, and from here the earthquake occurs.

There’s such a big difference between DALL-E 3 through ChatGPT and through Bing.

The prompt:

A portrait photograph shot with an DSLR camera of an old man, with deep melancholic eyes and deep wrinkles in his face. He’s wearing brown, fall like clothing.




It’s like DALL-E 3 only generates Unreal Engine-style 3D characters. Such a shame.


Two siblings (a young woman and a young man) and they are living a normal life in their home and with their family, happily and safely. While the young woman is studying and the young man is reading a book, the young woman’s pen begins to shake, and from here the earthquake occurs.


It’s been absolutely ruined for me. Even though it might not have been the best one quality-wise, Bing Image Creator was definitely the most creatively flexible as far as I’m concerned, back when it used Dall-E 2. Prompting was very intuitive, and it understood natural language quite well, unlike other models I’ve used; you could almost make every word count if you divided the description into clusters. So for me it was great for brainstorming concept-art ideas and as a creative exercise in general. You could easily control camera angles, style, concept description, and color palette without even having to say things like “concept art” or “in the style of a particular artist.” It was particularly good at mixing and combining shapes, objects, animals, etc. with somewhat controllable outcomes. Now it’s just like Midjourney and other fine-tuned models, where it’s very hard to deviate from the default style. Here is an example using the same prompt:

I don’t know about you but that ain’t no golem to me.


The AI simply works differently. You can let Dall-E do what it does best by not specifying every little thing.


Subject: an imposing threatening golem that is like a robotic cyborg and draped in tattered clothes, an exposed head is mechanical, maybe a refrigeration pump and pistons, the rest of the cyborg also gritty and worn. Scene: A hazy Korean back alley in muted daylight. Camera: Higher perspective overlooking the scary scene

Let the AI imagine

Cyberpunk golem terrorizes gritty Korean city in this photo!

Jailbreaking the prompt is also fun, 0/4 acceptable by computer vision…


I feel we are not talking about the same issue here; I don’t need a final image. I also think you are overprompting when using words like “maybe a” and “the rest of the cyborg.” When I said every word counts in my prompt, I meant it. I always start with a simple base prompt with a similar structure, divided into subject/concept, environment, camera, colors/time, lighting effects, filters, etc., and then add terms one by one; that’s the fun part. I try mixing different objects and terms to create different effects and see what works (if I’m being honest, this is an old example; I’m sure I could optimize it a bit more). Anyway, I have dozens of results for each prompt I carefully craft, and they always share the same vision and style I have in mind. What you are doing is not what I am looking for, and is too loose a concept. I need to control shapes, forms, colors, mood, etc. in a somewhat precise way, and it was pretty good at that before, but it’s not now. At the end of the day it’s not like I really need it, but it was a pretty fun toy for me and now it’s broken.

This is what I mean by a controlled outcome: I change a thing or two, but I can pretty much control and maintain a very similar vibe, colors, shape, camera angle, etc. across all of the results.
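The cluster-based workflow described in this exchange can be sketched as a tiny prompt builder. The cluster names and example values below are just one reading of the approach (subject, environment, camera, palette), nothing official:

```python
# Minimal sketch of "divide the description into clusters": keep each
# aspect of the image in its own slot, then swap terms one at a time
# while everything else stays fixed, to keep the outcome controlled.
def build_prompt(subject, environment, camera, palette, effects=()):
    """Join the clusters into a single comma-separated prompt string."""
    clusters = [subject, environment, camera, palette, *effects]
    return ", ".join(c for c in clusters if c)

base = build_prompt(
    subject="an imposing stone golem draped in tattered cloth",
    environment="a hazy back alley in muted daylight",
    camera="high-angle wide shot",
    palette="desaturated greens and grays",
)

# Iterate on one cluster at a time; the rest of the vibe stays put.
variant = build_prompt(
    subject="an imposing stone golem draped in tattered cloth",
    environment="a hazy back alley in muted daylight",
    camera="low-angle close-up",  # only the camera cluster changes
    palette="desaturated greens and grays",
)
print(base)
print(variant)
```

Whether the model actually honors each cluster is exactly the regression being discussed; the builder only keeps the iteration disciplined.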


From what I have seen through the Bing AI image generator, DALL-E 3 falls FAR short of DALL-E 2 in terms of artistic capability.

DALL-E 2 has a sort of magic where it takes your art prompt and gives you something beautiful and totally unexpected, even when you keep repeating exactly the same prompt.

DALL-E 3, on the other hand, seems to generate bland, “expected” textbook illustrations that lack artistry. And the terrible thing is that the images are almost identical no matter how many times you rerun the same prompt. Even when I change the prompt wording somewhat, the pictures have a default-template feel to them, with some minor superficial tweaks.

Maybe the engineers just put a test or beta version on Bing. I sure hope this is the case, because this represents a huge step backward.

Here is an example running the same prompt on DALL-E 2 and DALL-E 3 multiple times. DALL-E 2 is on the left and DALL-E 3 is on the right.

OpenAI, I hope you are reading this. The new algorithm may have improved photorealism or whatever, but for artistic endeavours there has been a huge regression. Please retain the “magical” artistry of DALL-E 2.


Totally agree. I get the sense they went all in on prompt-matching accuracy and measurable attributes like that, and totally dropped the ball on the less measurable qualities you’re talking about. Would love to see that creativity return!


Agreed. I feel they should go the Midjourney route and let you switch versions depending on your needs. It’s clear to me that there’s no such thing as one model for all needs and purposes; when you optimize for something specific, other areas suffer.
