I would like to try teaching math with visualisation, but however I phrase the prompt (say, 2 apples on the table and 3 apples on the floor), the generated image always contains the wrong number of objects. Is there any solution?
Hi @goldenaxes.net
Welcome to the community!
Short answer:
For now, DALL-E has no solution for what you need.
Long answer:
DALL-E 3 still struggles with spatial awareness and counting objects accurately.
For example, if you ask for “2 apples on the table and 3 on the floor,” the model might show a different number. This is because it has a hard time understanding exact numbers and object placement in images. The system also struggles with text rendering (like writing numbers or words accurately) because it uses a text encoder that works at the word level, not at the letter or character level.
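To make the word-level vs. character-level point a bit more concrete, here is a toy sketch in Python. It is only an illustration of the two granularities, not DALL-E's actual tokenizer; the function names `word_tokens` and `char_tokens` are made up for this example.

```python
def word_tokens(prompt: str) -> list[str]:
    """Toy word-level view: each whole word is one opaque unit."""
    return prompt.split()


def char_tokens(prompt: str) -> list[str]:
    """Toy character-level view: each non-space character is its own unit."""
    return [ch for ch in prompt if not ch.isspace()]


prompt = "2 apples on the table"
words = word_tokens(prompt)
chars = char_tokens(prompt)

# At the word level, "apples" is a single unit with no visible spelling.
print(words)
# At the character level, the individual letters are visible to the model.
print(chars)
```

In the word-level view the model only ever sees "apples" as one unit, so it has no direct information about the letters it would need to draw that word inside an image.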
Is There a Solution?
Right now, there is no perfect fix for this issue. Even though DALL-E 3 is more advanced than previous versions, it still struggles to follow exact numerical prompts. OpenAI is aware of this and is planning to improve this in future updates by using character-level models to make text and numbers more accurate.
You may download and read pages 13 and 14 (Topics 5.1 and 5.2).
MY GIFT:
Exactly 2 red apples and 3 green apples