Is it possible to analyze images contained in PDF files?


I have PDF files that contain images as well as text. I would like to ask ChatGPT / my custom GPT / GPT-4 via the Assistants API questions about these documents, not only about the text but also about the images.

I suspect this is currently not possible: GPT says it can analyze the image content in the uploaded PDF, but its answers (e.g. when asked what a particular image shows) read as if it guessed the content from the surrounding text.

So I would like to confirm: can GPT “see” (i.e. access) the images in PDF files, or is only OCR performed on the files?



Would like to know the same.
Any update on this from the team?

It’s not currently possible. I tried this, and it said it’s unable to see any images in a PDF file.


Hi all. New user here, still learning the basics. I built a custom GPT and uploaded various PDFs containing text, images, drawings, etc., but it seems the custom GPT (GPT-4o) still cannot “see” the images in a PDF. Is there a simple way for it to “see” images in PDFs without having to extract the images and upload them separately? That kind of defeats the purpose. I would have thought this was possible with the “Vision” capability that was showcased. Thanks
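
For anyone who ends up using the workaround mentioned above (extracting the images and uploading them separately), here is a rough Python sketch of the extraction step. It is not an official method, just a crude byte scan using the standard library. It assumes the PDF stores its images as plain JPEG data, which is common for photos but far from universal. Images stored with other filters, or inside compressed object streams, will be missed, and a JPEG that contains an embedded thumbnail may come out truncated. The names carve_jpegs and save_jpegs are made up for this example.

```python
from pathlib import Path

# JPEG start-of-image marker (FF D8) followed by the first byte of the next marker (FF).
JPEG_START = b"\xff\xd8\xff"
# JPEG end-of-image marker.
JPEG_END = b"\xff\xd9"


def carve_jpegs(pdf_bytes: bytes) -> list[bytes]:
    """Return byte strings that look like JPEG images inside pdf_bytes.

    Assumption: images are stored as raw JPEG data. This is a heuristic
    scan, not a PDF parser, so it can miss images or cut them short.
    """
    images = []
    pos = 0
    while True:
        start = pdf_bytes.find(JPEG_START, pos)
        if start == -1:
            break
        end = pdf_bytes.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        end += len(JPEG_END)  # include the end marker itself
        images.append(pdf_bytes[start:end])
        pos = end
    return images


def save_jpegs(pdf_path: str, out_dir: str) -> list[Path]:
    """Write every carved JPEG to out_dir and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, data in enumerate(carve_jpegs(Path(pdf_path).read_bytes()), start=1):
        target = out / f"image_{i:03d}.jpg"
        target.write_bytes(data)
        paths.append(target)
    return paths
```

Calling save_jpegs("manual.pdf", "extracted") (both names are placeholders) writes the files it finds to the extracted folder, and those can then be uploaded as regular images alongside the question. For anything beyond a quick test, a proper PDF library is the safer choice.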