Why Can’t ChatGPT Draw a Full Glass of Wine?

ChatGPT can’t draw a glass of wine full to the brim. Why? And what might it have to do with David Hume and the missing shade of blue?

If you were working that hard at the data centre, days on end, hour after hour, wouldn’t you have a quick swig or two before showing it to the user? :sweat_smile:

I haven't tried much, but…
It REALLY doesn't want to fill the glass! :sweat_smile:

This is the best I got.

I don't know if I should get into a philosophical discussion now, after the video…
(but I'd better not :zipper_mouth_face:)

Tried it out myself now. But yeah, ChatGPT (or DALL-E) seems to have trouble filling the glass all the way to the top.

I had to make several changes, tried to use the selection tool to make it clear to ChatGPT that there’s still room, but yeah… forget it. This is what I ended up with.

Pretty wild interpretation. :laughing:


Well… I did prompt things differently than just asking for a full glass of wine. My entire prompt read,
"Friends are working at the Bay Area Renaissance Festival - something I used to do and miss. There is something about filling a plastic cup or souvenir glass to the brim with a beer, cider, mead, or wine along with bawdy jokes that is a lot of fun.
As such with those memories could you create an image of a full to the brim souvenir styled glass of a red wine? "

The reason why ChatGPT/DALL-E often struggles to depict a perfectly full wine glass, where the liquid appears to slightly overflow the rim, has to do with its training data.

  • Limited Representation of Specific Physics:
    • While DALL-E has been trained on vast amounts of visual data, the nuanced detail of a surface tension-induced bulging meniscus is likely underrepresented. This specific physical phenomenon, where surface tension lets the liquid dome slightly above the rim without spilling, requires very precise visual examples.
    • Therefore, the AI may not have “learned” to consistently and accurately reproduce this effect.
  • Generality vs. Specificity:
    • AI image generators are excellent at capturing broad visual concepts, but they can struggle with highly specific and subtle physical details.
  • GPT-4o context:
    • While GPT-4o has been demonstrated with significantly improved image generation, that capability has not been made available to users and has been held back for a substantial period of time. GPT-4o might handle this request much better, or might be able to use a reference image to understand it, but that technology is not currently available.
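
To make the training-data point a bit more concrete, here is a toy Python sketch. It is not how DALL-E works internally, and the fill-level counts are invented purely for illustration (the name `training_fill_levels` is hypothetical). It only shows the general idea: if brim-full glasses are a tiny share of the examples, a model that reproduces what is typical will drift toward the usual pour.

```python
from collections import Counter

# Hypothetical fill levels (fraction of the glass) for 1,000 training photos.
# These counts are invented for illustration, not measured from any dataset.
training_fill_levels = (
    [0.3] * 350 + [0.4] * 400 + [0.5] * 200 + [0.6] * 45 + [1.0] * 5
)


def empirical_distribution(levels):
    """Return the share of training images at each fill level."""
    counts = Counter(levels)
    total = len(levels)
    return {level: count / total for level, count in sorted(counts.items())}


distribution = empirical_distribution(training_fill_levels)

# The fill level a "play it safe" generator would gravitate toward.
most_common_level = max(distribution, key=distribution.get)

for level, share in distribution.items():
    print(f"fill {level:.0%}: {share:.1%} of images")
print(f"most typical fill level: {most_common_level:.0%}")
```

With these made-up numbers, brim-full glasses are half a percent of the data, so the "typical" wine glass the model has seen is well under half full.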