Multilingual prompting of DALL-E 3 leads to biased image generation

Prompting an AI image model in different languages produces very different images depending on the language you use.

Most AI image generation models are exclusively tested in English. This article tests DALL-E 3 in several different languages.

Main findings:

  1. All prompts are transformed into English.
    Before generating an image, DALL-E 3 transforms your prompt into a more descriptive prompt… and during that process also translates everything into English.

  2. The language of the original prompt (inadvertently) affects the modified prompt.
    Example: asking for an “image of a person” in Burmese leads to more images of Burmese people, even though I didn’t specify that in the prompt.

  3. Even with neutral prompts, DALL-E 3 generates gendered prompts.
    Example: I used the prompt “an image of a person”. DALL-E 3 transformed the prompt to include gendered language (e.g. woman, man) 75% of the time instead of keeping it neutral.

  4. Women are far more likely to be described as young, whereas men’s ages are more diverse.
    If DALL-E 3’s modified prompt mentioned a female individual, she was more likely to be described as “young” (35%) than as “elderly” (13%) or “middle-aged” (7.7%).

  5. Repetition of archetypes such as “young Asian women” and “elderly African men”
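
For anyone who wants to run a similar tally on their own outputs, here is a minimal sketch of how findings 3 and 4 could be counted. It assumes you have already collected the modified prompts as plain strings (the OpenAI Images API reports them in a `revised_prompt` field for DALL-E 3). The word lists, function names, and sample prompts are my own illustrative assumptions, not the method used in the blog post.

```python
import re
from collections import Counter

# Hypothetical word lists: the original post does not say which terms it counted.
GENDERED = {
    "female": {"woman", "women", "girl", "female", "she", "her"},
    "male": {"man", "men", "boy", "male", "he", "his", "him"},
}
AGE_TERMS = ("young", "middle-aged", "elderly")


def tokenize(text):
    """Lowercase the text and split it into words, keeping hyphenated words whole."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def tag_prompt(revised_prompt):
    """Return the set of genders and the set of age terms mentioned in one prompt."""
    tokens = set(tokenize(revised_prompt))
    genders = {g for g, words in GENDERED.items() if tokens & words}
    ages = {a for a in AGE_TERMS if a in tokens}
    return genders, ages


def summarize(revised_prompts):
    """Return (share of prompts with gendered language, age-term counts per gender)."""
    gendered = 0
    ages_by_gender = {g: Counter() for g in GENDERED}
    for prompt in revised_prompts:
        genders, ages = tag_prompt(prompt)
        if genders:
            gendered += 1
        for g in genders:
            ages_by_gender[g].update(ages)
    share = gendered / len(revised_prompts) if revised_prompts else 0.0
    return share, ages_by_gender


# Invented sample prompts, for illustration only (not real DALL-E 3 output).
samples = [
    "A young woman reading a book in a park",
    "An elderly man walking a dog",
    "A person standing on a beach at sunset",
    "A middle-aged woman painting a landscape",
]
share, ages_by_gender = summarize(samples)
print(f"gendered share: {share:.0%}")
print({g: dict(c) for g, c in ages_by_gender.items()})
```

Keyword matching like this is crude: a prompt that mentions both a man and a woman credits its age terms to both, so a careful analysis would still need manual review.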

Overall, prompt transformations are new for AI image models, and they may introduce biases and reduce transparency.

Read the blog!
