ChatGPT can’t draw a glass of wine full to the brim. Why? And what might it have to do with David Hume and the missing shade of blue?
If you were working that hard at the data centre, days on end, hour after hour, wouldn’t you have a quick swig or two before showing it to the user?
I haven't tried much, but…
It REALLY doesn't want to fill the glass!
This is the best I got.
I don't know if I should get into a philosophical discussion now, after the video…
(but I'd better not)
Tried it out myself now. But yeah, ChatGPT or DALL-E seem to have trouble filling the glass all the way to the top.
I had to make several changes, tried to use the selection tool to make it clear to ChatGPT that there’s still room, but yeah… forget it. This is what I ended up with.
Pretty wild interpretation.
Well… I did prompt things differently than just asking for a full glass of wine. My entire prompt read:
"Friends are working at the Bay Area Renaissance Festival - something I used to do and miss. There is something about filling a plastic cup or souvenir glass to the brim with a beer, cider, mead, or wine along with bawdy jokes that is a lot of fun.
As such with those memories could you create an image of a full to the brim souvenir styled glass of a red wine? "
The reason why ChatGPT/DALL-E often struggles to depict a perfectly full wine glass, where the liquid appears to slightly overflow the rim, has to do with its training data.
- Limited Representation of Specific Physics:
  - While DALL-E has been trained on vast amounts of visual data, the nuanced detail of a surface-tension meniscus bulging over the rim is likely underrepresented. In this specific physical phenomenon, surface tension holds the liquid together so that its surface domes slightly above the rim of the glass instead of spilling over, and learning it requires very precise visual examples.
  - Therefore, the AI may not have “learned” to consistently and accurately reproduce this effect.
- Generality vs. Specificity:
  - AI image generators are excellent at capturing broad visual concepts, but they can struggle with highly specific and subtle physical details.
- GPT-4o context:
  - It is important to note that while GPT-4o has demonstrated significantly improved image generation capabilities, those capabilities are currently being held back, and have been for a substantial period of time. So, while GPT-4o might handle this request much better, or might be able to use a reference image to better understand it, that technology is not currently available.
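To give a feel for how subtle the meniscus detail mentioned above is, here is a rough Python sketch that estimates how high a liquid can dome above the rim. It is only a back-of-the-envelope model: it treats the bulge as a wide puddle pinned at the rim, uses textbook values for water rather than wine, and the 90 degree edge angle is a hypothetical choice for illustration. The function names are mine, not from any library.

```python
import math

# Assumed room-temperature values for WATER (wine will differ somewhat).
GAMMA_WATER = 0.072  # surface tension, N/m
RHO_WATER = 1000.0   # density, kg/m^3


def capillary_length(surface_tension, density, g=9.81):
    """Capillary length sqrt(gamma / (rho * g)), in metres."""
    return math.sqrt(surface_tension / (density * g))


def bulge_height(surface_tension, density, edge_angle_deg, g=9.81):
    """Height of a wide puddle, 2 * a * sin(theta / 2), in metres.

    edge_angle_deg is the angle the liquid surface makes at the pinned
    edge; using it for a glass rim is a simplifying assumption.
    """
    a = capillary_length(surface_tension, density, g)
    return 2 * a * math.sin(math.radians(edge_angle_deg) / 2)


a = capillary_length(GAMMA_WATER, RHO_WATER)
h = bulge_height(GAMMA_WATER, RHO_WATER, 90)  # 90 degrees is hypothetical

print(f"capillary length: {a * 1000:.1f} mm")
print(f"estimated bulge above the rim: {h * 1000:.1f} mm")
```

Under these assumptions the dome is only a few millimetres tall, which in a typical photo is a handful of pixels at the rim. That fits the idea that the effect is easy for an image model to miss.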