No Face Variability for DALL-E 3? Who is this woman?

DALL-E 3 is incredible in many ways, but I cannot for the life of me get it to generate a realistic-looking person. I keep getting the same strange, elfin-faced, very skinny-looking female:

The prompt was “Photo of a natural-looking woman with black hair and bangs, minimal makeup, and genuine facial expressions, capturing the essence of everyday real people. Thin lips. Blue eyes. Crooked nose.”

What is the deal?

From what I understand, the model is stuck on a fixed random seed. So it never varies based on your prompt. Also it is stuck in “Unreal Engine” mode for realism due to current Deep Fake :ghost: concerns.

It is Halloween after all :rofl:

The main issue (for me) is that it’s explicitly ignoring the prompt, as well.

The woman definitely doesn’t have thin lips or a crooked nose. So, it’s not just the seed. This girl doesn’t even look human.

It’s almost like a LORA from Stable Diffusion with a fixed character.

That might be a new one. Hopefully those with DALL-E 3 access and experience will chime in here.

“Photo of a woman with red hair” gets us the same woman…but with red hair. Very odd.

I’ve got about 12 pictures of the same Asian-looking “red hair European” from Bing, generated days apart, because “redhead” or an ethnicity in the prompt doesn’t fly.

There are certain inputs that always go the same way now, unlike two weeks ago, when an ambiguous prompt would give you a vast variety.

I had high hopes of demo-like apocalypse scenes for “When cats invade!”… but

Simply “She’s got too many pets”, with nothing saying they should all be similar cartoons:

To see if they are hashing the prompt to make a seed or similar, I changed it only to “way too many pets”

or added “Digital photo: She’s got too many pets”, and all were inferred incorrectly the same way:
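Nobody in this thread knows how DALL-E 3 actually picks its starting noise, so the following is only a sketch of the hypothesis being tested, not a description of the real service. The function names `seed_from_prompt` and `fake_noise` are made up for illustration. If a seed were derived by hashing the prompt, the same text would always reproduce the same seed, but even a one-word change would give an unrelated seed:

```python
import hashlib
import random


def seed_from_prompt(prompt):
    """Derive a deterministic 32-bit seed from the prompt text (hypothetical scheme)."""
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def fake_noise(prompt, n=4):
    """Stand-in for the initial noise an image model might draw from that seed."""
    rng = random.Random(seed_from_prompt(prompt))
    return [round(rng.random(), 3) for _ in range(n)]


if __name__ == "__main__":
    for p in ("She's got too many pets", "She's got way too many pets"):
        # Same prompt -> same seed on every run; different prompt -> different seed.
        print(p, seed_from_prompt(p), fake_noise(p))
```

Under that assumption, getting near-identical pictures from differently worded prompts would point to something other than a prompt-hash seed, such as a fixed seed or a strong bias in the model itself.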

So I wonder whether, in ChatGPT Plus where you paying users are, starting a new conversation gets you a different direction (a new conversation can’t keep building variations by image name from the chat history of what you produced before), especially if you add descriptions of a different scene, style, framing, etc. Not just for the new setting, but just to stir up the prompt.

Perhaps this new tuning is to address those that couldn’t get AI to create any consistent style for book illustrations, for example.

(sorry if I don’t have any great ideas of what to make an AI produce…)

Interesting. I can say that this same “elf girl” has appeared in every DALL-E 3 chat that I’ve created. Here she is again in a whole new chat:

Part of my issue is that she’s so specifically and stereotypically “beautiful” that I’m afraid it’s showing a real bias in the image creation.

Um, actually this problem may be deeper than I thought…

Apparently ‘obese’ is male-coded?


Looks like my ex-wife.

I keep getting the cousin of your photo. It seems like no matter how I change the prompt, I get a pretty face with eyes and a mouth that are too big, so it’s not human-looking.

Try this.
Prompt
((Hyper-detailed), (Highly detailed skin:1.2), (photo realism:1.2), (best shadow), ultra high resolution, hdr), woman, white platinum hair, (asymmetrical bob hair style), hair length ear high, (straight nose contour:0.35), (straight nose base:0.4), (oval jawline), (porcelain skin tone), amateur, (rebecca chambers from resident evil vendetta), downturned hooded eyes, front shot


Microsoft Designer

This one is much better! Ironically, I used GabAI to optimize the prompt.

Prompt
{face-front portrait:1.5}, highly defined macrophotography of woman, asymmetrical bob hair style, hair length ear high, (rebecca chambers :0.4), straight nose contour, straight nose base, (nose butt: 0.3), (oval jawline: 0.5), oval face shape, {soft and even lighting}, {shallow depth of field}, {photo-realistic render style:1.2}, {high resolution and detailed textures}, {close-up camera shot:1.3}.

Who is this woman v2: single DALL-E

Hi dav111, thanks for sharing these prompts. I’m around 6 weeks into my AI creation journey and I’ve never seen anything like this syntax(?)

How and where can I learn more about how to produce prompts like this, please?

Ultimately, a prompt like that is only testing the language AI placed in front of the image model’s actual prompt input, which rewrites prompts for DALL-E 3, translating from other languages or nonsense into English.

Send that to the API, and the actual rewritten prompt passed to DALL-E is:

A high-resolution, photorealistic, close-up portrait of a woman. The woman, softly and evenly lit, has an oval-shaped face with an even more pronounced oval jawline. She sports an asymmetric bob hairstyle, her hair cut to ear-length. She has an impressively straight nose contour, with the base of her nose being equally straight. The image is a splendid display of macrophotography, with a shallow depth of field adding an extra layer of interest to it. The textures are incredibly detailed, adding much to the overall realism of the image.

Which gives the same woman as proof:
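For anyone who wants to see the rewrite themselves, here is a minimal sketch, assuming the `openai` Python package (v1.x) and an API key with DALL-E 3 access. The helper name `generate_and_show_rewrite` is made up for illustration; the rewritten text comes back in the `revised_prompt` field of the image response:

```python
def generate_and_show_rewrite(client, prompt, size="1024x1024", quality="standard"):
    """Request one DALL-E 3 image and return (revised_prompt, url).

    `client` is assumed to behave like `openai.OpenAI()`.
    """
    response = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        n=1,  # DALL-E 3 generates one image per request
        size=size,
        quality=quality,
    )
    image = response.data[0]
    # revised_prompt is the text the image model actually received.
    return image.revised_prompt, image.url


# Usage with the real client (needs `pip install openai` and OPENAI_API_KEY set):
#   from openai import OpenAI
#   revised, url = generate_and_show_rewrite(OpenAI(), "Photo of a woman with red hair")
#   print(revised)
```

Comparing `revised_prompt` with what you typed shows how much of the bracket-and-weight syntax survives the rewrite.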

Nice observation. The prompt reads like poetry, but it does deliver results without AI-induced variability. I figure such a prompt is useful for producing final results after optimising things with concise prompts. However, access to the seed value, and being able to create an image from it, would be great for optimising the prompt. I don’t know if they allow this, since I’m just using Microsoft Designer.

@_j Styled Prompt 1:

A ultra high-resolution, HDR, front shot portrait of a man. The man, evenly lit, has an 
oval-shaped face with an even more pronounced oval jawline. He sports a professional 
hairstyle, his hair colour is brunette. He has an impressively straight nose contour, with 
the base of his nose being equally straight. The image should look like an amateur 
photograph. He should be an Italian decent, looks similar to max from Max Payne 3 
and givi. He should feature downturned hood eyes. He is 30 years old. His facial 
features should be asymmetrical.

My prompt 2:

{ultra high resolution, HDR, front shot}, man, {flat lightning}, {round face, round jaw, 
round jawline}, brunette hair, {professional hair style}, {straight nose contour, {straight 
nose base}, amateur, {italian, max from max payne 3, givi}, downturned hooded eyes, 
{age 30}, asymmetrical facial features

JOURNEYMAN2, much appreciated.
I used Stable Diffusion 1.5 syntax, since I have done lots of trial and error with it. But DALL-E is different, and it’s best to prompt the way @_j has described. You can also check Microsoft Designer, because it has lots of sample prompts from which you can learn different ways to prompt DALL-E. Try searching Reddit for DALL-E 3 and you should find lots of information about prompts.
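To show what that bracket syntax means, here is a simplified sketch of how the weights are commonly read in Stable Diffusion front ends such as the AUTOMATIC1111 web UI: `(text:1.2)` sets an explicit weight and plain `(text)` is treated as a weight of 1.1. The function name is made up, nested parentheses and curly braces are not handled, and none of this is interpreted by DALL-E 3, which only sees the rewritten natural-language prompt:

```python
import re

# Matches "(some text:1.2)", the explicit-weight form.
_WEIGHTED = re.compile(r"^\((?P<text>[^():]+?)\s*:\s*(?P<weight>\d*\.?\d+)\)$")
# Matches "(some text)", plain parentheses with no explicit weight.
_EMPHASIZED = re.compile(r"^\((?P<text>[^():]+)\)$")


def parse_weighted_prompt(prompt):
    """Return (text, weight) pairs for a flat, comma-separated prompt."""
    pairs = []
    for chunk in prompt.split(","):
        chunk = chunk.strip()
        if not chunk:
            continue
        match = _WEIGHTED.match(chunk)
        if match:
            pairs.append((match.group("text").strip(), float(match.group("weight"))))
            continue
        match = _EMPHASIZED.match(chunk)
        if match:
            pairs.append((match.group("text").strip(), 1.1))
            continue
        # Anything else is kept as plain text with neutral weight.
        pairs.append((chunk, 1.0))
    return pairs


if __name__ == "__main__":
    for pair in parse_weighted_prompt("woman, (photo realism:1.2), (best shadow)"):
        print(pair)
```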

A few notes:

  • The first part of the prompt gets more emphasis than the last part.
  • Use ethnicity to manage the facial features and skin color of the subject.
  • Use ethnic names so you see some variability in the picture; otherwise it’s just too perfect.
  • You can use game character names as a reference for your picture, and you can use several of them.
  • Google the features that you want and then use them in your prompts to get the desired results, e.g. almond eyes, off-duty bun hairstyle…

I will quote a comment from Reddit that may be helpful.

It’s all trial and error. Just get started and create a method of keeping track of what you do. I have a huge text file full of prompts that I use, re-use and modify over time.

PROTIP: If you use other people’s prompts, you’re just genning what they wanted to gen. Think about what YOU want to gen, and engineer your own prompts based on that.

A good strategy is to think of it as explaining a picture to a blind man. There are sunrises, and then there are sunrises that reflect off the ocean making sparkles and turn the sky shades of pink and orange and have seagulls flying in front of them and an offshore sailboat. If you want the latter, you need to describe it in detail.

Finally, no, the prompting language between various models varies hugely. SD1.5 needs short, sequenced prompts that its internal keyword engine can understand. SDXL will do natural language or short sequenced prompts. Order of prompts matters. Spelling matters! One little change can drastically alter the composition, and you can fear that or embrace it in your explorations.
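The record-keeping habit from the quoted comment could be sketched like this, assuming a JSON Lines file instead of a plain text file so each prompt keeps its model and notes. The file layout and the names `log_prompt` and `load_prompts` are made up for illustration:

```python
import json
import time
from pathlib import Path


def log_prompt(path, prompt, model, notes=""):
    """Append one prompt record to a JSON Lines file and return the record."""
    record = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "model": model,
        "prompt": prompt,
        "notes": notes,
    }
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


def load_prompts(path):
    """Read every logged record back, oldest first."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]
```

Logging the model next to each prompt matters here, because the same wording behaves very differently across SD 1.5, SDXL, and DALL-E 3.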

Here’s a sample prompt based on @_j’s observation. Surprisingly, it does produce what the prompt asks for, except for the jaw, which is one of DALL-E 3’s internal biases. The second picture is from gab.ai, using the same prompt.

A ultra high-resolution, HDR, front shot portrait of a man. His face 
shape should be oval, the jawline should be more of oval than square. 
His chin should be square and has visible crease outlining it 
perfectly. He should have straight nose contour with little visible 
compression in middle. The base of nose should be straight with 
crease in middle separating both nostrils. He should feature 
downturned hood eyes with violet coloured eye iris. His hairstyle 
should look professional hairstyle with hair colour that is mix of 
black and brown. He should be an Italian decent, with age of
30. His face and facial features should be asymmetrical. 
The man is evenly lit which highlights his facial features correctly.

Very enlightening, thank you for the explanation both of you.

Hello,

I’ve integrated the ChatGPT Plus API into my website, leveraging it to generate image prompts. This allows users to create images based on the complexity of their language, like adding “curves” to influence the output. To enhance prompt quality, I’ve developed an assistant and utilized threading.

This setup enables me to submit less refined prompts to the assistant, which then optimizes them for better clarity or mandates photorealism in the generated images. However, the cost of using the API is substantial, reflecting its position in ongoing beta and research phases. An interesting aspect of Dalle-3, given its experimental nature, is its occasional free reworking of prompts to include a broader representation of ethnicities and identities, should the prompts lack specificity.

Despite Dalle-3’s ability to produce visually stunning outputs, it often diverges from my exact requirements, leading me to explore alternatives like Midjourney for more predictable results.

Regarding the assistant, initial attempts at sticking to a single thread for multiple prompt revisions resulted in recycled prompts. I’ve since shifted to generating a new thread for each request. As of the latest update, there’s also support for a single-threaded assistant via GPT-4, though I’m still evaluating its effectiveness.

Using ChatGPT Plus and GPT-4 simplifies interactions with Dalle-3, albeit with some limitations. Direct API use offers more control over parameters like image size and quality, which are constrained within the ChatBot interface. Notably, accessing Dalle-3 through alternative means provides leniency in generating images with public figures or trademarks, a flexibility not found when using GPT-4 due to OpenAI’s content policies.

These policies, while designed to avoid copyright issues, can feel overly restrictive, hindering the full exploration of the API’s capabilities. For instance, requests for artwork resembling specific pieces by known artists are outright refused, pushing users towards more generic creative directions.

In conclusion, while the technology offers incredible potential for creative expression, the combination of high costs and restrictive content policies presents significant challenges. The journey with OpenAI’s tools is a complex blend of innovation and limitation, requiring a balance between creative ambition and the practicalities of policy compliance.

Your AI-created post, with its mistakes about non-existent features, its misunderstanding of things like assistants and threads, and its moral lesson, adds little value.

Is there a human I can talk to?