GPT-4o PDF upload vs API vision

Hi,

I’m trying to process documents via the API. For this, I convert the pdf to images and send them to the API with my prompt.

Unfortunately, for some of them this API call misses some details.
However, when I upload the PDF to ChatGPT and use the same prompt, it get’s it right.
Does anyone know how the ChatGPT interface does pdf processing vs. the API’s vision capability?

Hi and welcome to the Community!

Is your PDF a scan or machine readable?

yeah you’re right. I just realized that the PDF only works in ChatGPT if it has embedded text.
Thanks!

Wait, is that right? Take this example PDF:

https://pdf.datasheetcatalog.com/datasheets/2300/45014_DS.pdf

As far as Foxit and PyMuPDF can tell, it does not have embedded text, but ChatGPT parses it perfectly. What am I missing?