Hey all. I just got images working with the Assistants API, but the results are odd. GPT-4o and GPT-4o mini don't seem to know what is in an image. If I explicitly prompt with “what is in this png image”, it will sometimes extract the text from it. But if I give it an image of a basketball player and ask who it is, it doesn't know.
Is the vision capability basically limited to extracting text from a PNG?
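
For context, here is a minimal sketch of the kind of message content involved, assuming the image was already uploaded as a file and is referenced by its ID. The helper name `build_vision_content` and the file ID `file-abc123` are placeholders for illustration, not the exact code in use.

```python
from typing import Any


def build_vision_content(
    prompt: str, file_id: str, detail: str = "auto"
) -> list[dict[str, Any]]:
    """Build a user message content list with one text part and one image part.

    `file_id` is assumed to be the ID of an image uploaded beforehand;
    the value used below is a hypothetical placeholder.
    """
    return [
        # The text part carries the question about the image.
        {"type": "text", "text": prompt},
        # The image part points at the uploaded file by ID.
        {"type": "image_file", "image_file": {"file_id": file_id, "detail": detail}},
    ]


content = build_vision_content("What is in this PNG image?", "file-abc123")
```

With the official Python SDK, a list like this would typically be passed as the `content` argument when creating a user message on a thread.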