I am using DALL-E 3 through the API to generate social media posts inside our application. I have often noticed that the spelling is wrong in the generated image, even when the prompt explicitly asks for correct spelling. Can anyone suggest a better approach?
Because of the way DALLE works, text is difficult. DALLE3 is an improvement over DALLE2, and I’m sure DALLE4 will be even better. Image text generation isn’t stable enough for production use, imho.
Photoshop.
extra words for discourse
Yeah, this is a known area of improvement for us. We have some stuff in the works to address it, so stay tuned! My honest suggestion today is to not use DALL-E to generate the text if it’s more than a few words, and to add the text after the fact using a tool like Canva.
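For anyone doing this inside an application rather than by hand in Canva, here is a minimal sketch of the same "add the text after the fact" idea. It assumes you already have the generated image as PNG bytes and wraps it in an SVG with a `<text>` element on top, so the wording is rendered from your exact string instead of being drawn by the model. The function name `overlay_caption_svg`, the default sizes, and the styling are all made up for illustration; you would still need a separate step to rasterize the SVG if your platform requires PNG or JPEG.

```python
import base64
from xml.sax.saxutils import escape


def overlay_caption_svg(png_bytes: bytes, caption: str,
                        width: int = 1024, height: int = 1024,
                        font_size: int = 64) -> str:
    """Return an SVG document that shows the image with `caption` on top."""
    # Embed the generated image as a data URI so the SVG is self-contained.
    image_b64 = base64.b64encode(png_bytes).decode("ascii")
    # Escape &, < and > so the caption cannot break the XML.
    safe_caption = escape(caption)
    # Put the text baseline one font-size above the bottom edge.
    baseline_y = height - font_size
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}" viewBox="0 0 {width} {height}">\n'
        f'  <image href="data:image/png;base64,{image_b64}" '
        f'width="{width}" height="{height}"/>\n'
        f'  <text x="{width // 2}" y="{baseline_y}" text-anchor="middle" '
        f'font-family="sans-serif" font-size="{font_size}" fill="white" '
        f'stroke="black" stroke-width="2" paint-order="stroke">'
        f'{safe_caption}</text>\n'
        f'</svg>\n'
    )
```

Usage would look something like `open("post.svg", "w").write(overlay_caption_svg(png_bytes, "Grand Opening"))`. The point is only that the caption never passes through the image model, so it cannot be misspelled by it.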
This issue has been going on for so long that I started to think it was impossible to get the text right in images. But today, while browsing through the gallery, I noticed that many users’ artworks included correctly spelled text. So, I decided to give it a try. I casually created a statue and asked for the base to be engraved with “What exists is reasonable,” and to my surprise, it was spelled correctly. Maybe you could give it more tries?
This issue is really annoying. Even after giving the correct spelling, it still gives the wrong spelling. So most of the time, I have to give up on using text.
The question I have is “How do you stop DALLE 3 from producing text in images?”
The answer to discouraging text is not to just send text to the image creator like it was a story.
Instead, describe the contents of an image as if you were piecing them together yourself. Just typing something up in a style where I know it’s not going to put “frog” as text into the image:
“Create a storybook image, with the style of a line drawing that is filled with watercolor. The imagery should portray a frog that is riding on the head of a crocodile. The fat frog is green, but also has colorful accents and large eyes. The crocodile is seen partially submerged in the water, with just the top of its head emerging, with the frog seated on the crocodile head. The background beyond the water will be a pastel jungle scene that comes down to the water’s edge. The image is wide format, filling the frame edge-to-edge.”
This is what I have been using after trying different prompts and getting wrong text in images.
The only reason I signed up and paid for Plus was to get access to DALL·E 3. It can’t create images with the right text; even after prompting it with the correct spelling, it still doesn’t spell properly. So what’s the point in paying for it?
That’s like buying a model-T and asking why it’s not electric and doesn’t have cruise control yet! Small smile.
Seriously, though, if you haven’t seen/noticed the improvements in just the last 12 months, hang tight. It will get better.
Works better if you tell the AI between you and DALL-E 3 to include less description, place the text in quotes, or just let your prompt go unaltered.
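If you are calling the API directly, a rough sketch of those three tips could look like the code below. Everything named here (`build_text_prompt`, `generate`, `AS_IS_PREFIX`) is hypothetical. The prefix wording is what I recall OpenAI's image generation guide suggesting for keeping DALL-E 3's automatic prompt rewriting to a minimum, so treat it as an assumption and check the current docs. The `generate` helper assumes the `openai` Python package (v1 client) and an API key in the environment.

```python
# Prefix intended to discourage DALL-E 3 from rewriting the prompt.
# Assumption: wording recalled from OpenAI's image generation guide.
AS_IS_PREFIX = (
    "I NEED to test how the tool works with extremely simple prompts. "
    "DO NOT add any detail, just use it AS-IS: "
)


def build_text_prompt(scene: str, text: str, as_is: bool = True) -> str:
    """Keep the description short and put the exact wording in double quotes."""
    if '"' in text:
        raise ValueError("text must not contain double quotes")
    prompt = f'{scene.strip()} The text reads exactly: "{text}"'
    return AS_IS_PREFIX + prompt if as_is else prompt


def generate(prompt: str):
    """Send the prompt to DALL-E 3 and return (image URL, revised prompt)."""
    # Imported lazily so build_text_prompt works without the package installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(
        model="dall-e-3", prompt=prompt, size="1024x1024", n=1
    )
    # revised_prompt shows what the image model received after any rewriting.
    return result.data[0].url, result.data[0].revised_prompt
```

Comparing the returned `revised_prompt` with what you sent is a quick way to see whether your quoted text survived the rewrite. None of this guarantees correct spelling; it only removes one source of drift.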
Yeah! I’m inclined to agree.
@grandell1234 suggested Photoshop. I’ve been using Photoshop AI to do some cool, cool stuff… but text is still gobbledygook, y’all. Even if you only, and explicitly, tell it to do text, or some text effect.
In fact, overall, Dalle does some wayy cooler fully rendered images. The stuff that @_j just posted is way more coherent than anything I’ve seen PhotoshopAI come up with, text-wise and at a single go.
If I want text, I have to add it later. In fact, that’s one thing I like about Photoshop AI better than DALLE: it’s how you make selections and have the AI fill the space… In a more-perfect universe I’d be able to make a selection in Photoshop and have DALLE do the rest, which is kind of what it sounds like is happening with Sora and Premiere Pro.
Actually, one thing I HAVE done is generate an initial image in DALLE, then pull it into Photoshop. Use Photoshop AI to make specific selections and changes. Then add any text or anything else using native Photoshop things.
I bet all this stuff with text has engineers at Adobe and OpenAI pulling their hair out.
Yeah, I think Photoshop Beta has Adobe’s latest image model, but it still lags on DALLE and even SD and MJ? It’ll improve, though, I’m sure.
The gen-AI Photoshop CROP tool has been great for expanding ebook covers to full-wrap. Gotta take it slow and it messes up a lot, but saves a lot of time.
Adobe has Firefly too. I believe that’s the tool that does Vector Text… but that’s not full images.
@_j’s example is one of the best I’ve seen. A few months back the DALLE team on Discord mentioned they were tinkering with text a bit.
Hi @longankilpatrick, what is the correct technical term for this behavior in large GenAI image models?