I have PDF files that contain images as well as text. I would like to ask ChatGPT, my custom GPT, or GPT-4 via the Assistants API questions about these documents, not only about the text but also about the images.
I suspect this is currently not possible. GPT says it can analyze the image content in the uploaded PDF, but its answers (e.g. when I ask what is shown in a particular image) read as if it guessed the content from the surrounding text.
So I would like to confirm: can GPT actually “see” (i.e. have access to) the images in PDF files, or is only OCR performed on the files?
Hi all. New user here, still learning the basics. I built a custom GPT and uploaded various PDFs containing text, images, drawings, etc., but it seems the custom GPT (on GPT-4o) still cannot “see” the images in a PDF. Is there a simple way for it to “see” images in PDFs without having to extract the images and upload them separately? That kind of defeats the purpose. I would have thought this was possible with the “Vision” capability that was showcased. Thanks