Why Can’t ChatGPT Draw a Full Glass of Wine?

It’s about understanding how image generators work. They can morph many images together, but they always rely on the data they were trained on.

A typical wine glass, like the ones you find in every restaurant, is almost never filled to the brim, because you can’t drink from such a glass without spilling everything. Prestige and theater have a price… (And the wine needs to breathe too. The poor wine was locked in the bottle for sooo long. :slight_smile: ) A mug, however, is usually filled to the top. But a mug is not a typical wine glass.

So if you want something filled to the brim, take a barbarian troll mug. Trolls don’t care if they spill things, and they fill everything to the top.

I’ve called this a template or overtraining effect. It appears in many places and can be very annoying because it’s hard to get rid of. A template happens when you intentionally cause overtraining, like with human faces, which then come out as stereotypical images. Overtraining can also happen by accident, like with the wine glass.

You can look for such effects. Just think about which motifs appear almost exclusively in one specific way in daily life (like the wine glass).
Or look for intentional overtraining, like when all men suddenly have thick beards. That has to be intentional, because there are countless images of men without beards. So either the graphical data was overtrained, or the linguistic system isn’t correctly matching descriptions to images.
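
If you want to hunt for such motifs a bit more systematically, here is a rough Python sketch of the idea. It assumes you have some list of (motif, variant) tags, for example pulled from image captions. The sample data, the function name `dominant_variants`, and the 90% threshold are all made up for illustration and do not come from any real training set or tool.

```python
from collections import Counter, defaultdict

# Hypothetical caption tags: (motif, how it appears). Invented numbers,
# only meant to mimic "wine glasses are almost never full".
SAMPLES = (
    [("wine glass", "partly filled")] * 19
    + [("wine glass", "filled to the brim")] * 1
    + [("mug", "filled to the top")] * 12
    + [("mug", "half full")] * 8
)


def dominant_variants(samples, threshold=0.9):
    """Return {motif: (variant, share)} for every motif where a single
    variant makes up at least `threshold` of that motif's samples."""
    by_motif = defaultdict(Counter)
    for motif, variant in samples:
        by_motif[motif][variant] += 1

    flagged = {}
    for motif, counts in by_motif.items():
        variant, n = counts.most_common(1)[0]
        share = n / sum(counts.values())
        if share >= threshold:
            flagged[motif] = (variant, share)
    return flagged


if __name__ == "__main__":
    for motif, (variant, share) in dominant_variants(SAMPLES).items():
        print(f"{motif}: {share:.0%} of samples are '{variant}'")
```

With this toy data only the wine glass gets flagged, because one variant dominates it. The mug shows up in several ways, so it stays flexible. A motif that gets flagged like this is a candidate for the template effect, nothing more. It doesn’t prove that a particular generator will fail on it.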
