Misspellings or garbled text in generated illustrations

teccardt · March 4, 2025, 2:38pm

When ChatGPT creates an image with incorporated text labels, the text is often misspelled or garbled. ChatGPT explains that this happens because the image generator prioritizes visual rendering over text accuracy. My proposed solution (which ChatGPT agrees with) is to generate the image with blank speech bubbles and labels, then pass the location, size, and shape of these text areas to a subsequent step where proofread text is correctly generated and superimposed. This would significantly improve the usability of AI-generated infographics, comics, and labeled illustrations.
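The second step proposed above (placing proofread text into known blank regions) could be sketched roughly as follows. This is only an illustration under stated assumptions: `TextRegion` and `layout_text` are hypothetical names, the font is assumed to be monospaced with a fixed glyph size, and the actual drawing would be handed to whatever imaging library is in use. It is not how ChatGPT or DALL-E works internally.

```python
import textwrap
from dataclasses import dataclass


@dataclass
class TextRegion:
    """Hypothetical description of a blank bubble or label left in the image."""
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int   # region width, in pixels
    height: int  # region height, in pixels


def layout_text(region, text, char_w=8, line_h=16):
    """Wrap proofread text to fit a region and center it.

    Assumes a monospaced font: every glyph is char_w pixels wide and
    every line is line_h pixels tall. Returns a list of (x, y, line)
    placements. Raises ValueError if the text does not fit.
    """
    max_chars = region.width // char_w
    if max_chars < 1:
        raise ValueError("region is too narrow for any text")

    # textwrap breaks over-long words, so no line exceeds max_chars.
    lines = textwrap.wrap(text, width=max_chars)
    block_h = len(lines) * line_h
    if block_h > region.height:
        raise ValueError("text does not fit in the region")

    # Center the block vertically and each line horizontally.
    top = region.y + (region.height - block_h) // 2
    placements = []
    for i, line in enumerate(lines):
        left = region.x + (region.width - len(line) * char_w) // 2
        placements.append((left, top + i * line_h, line))
    return placements


bubble = TextRegion(x=100, y=50, width=80, height=48)
for x, y, line in layout_text(bubble, "Hello there world"):
    print(x, y, line)
```

Because the text is rendered from a string rather than by the image model, spelling is exactly whatever was proofread; the remaining hard part is getting reliable region coordinates out of the first step.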

_j · March 4, 2025, 3:18pm

That sounds like the kind of thing you can ask the AI to produce when it makes the image, or you can command ChatGPT to never ask for speech or captions in the image description it sends at all. Getting an image with no text will take a delicate choice of words, as the DALL-E image generation model has an uncanny ability to insert words from the prompt into images (like “coffee coffee”) even when not requested, and it doesn’t really understand “blank” or “empty”.
