What is the exact method used by ChatGPT 4o to read PDFs?

vishrao · January 31, 2025, 9:52pm

Hey everyone,

I’m working on a research project that involves an individual opening the ChatGPT WebApp, uploading a PDF that contains some text, and then asking ChatGPT to summarize it. Let’s say everyone has access to 4o.

To convince everybody that doing this will result in getting all the relevant information in the summary, I am planning to run it on 1000 PDFs on my own and analyzing the results. Since I have over 1000 PDFs to run, I have set up a pipeline in python using OpenAI’s API. Now the OpenAI API does not allow for PDF uploads (yes there is the new Assistants beta version but they mention it is not good for summarization yet). So I can use some simple methods like using PyPDF2 to read and extract the text from the PDF and feed that in as part of the prompt. However, I want to be rigorous here and not make any assumptions. So I want to know the exact method used by OpenAI to parse the uploaded PDFs in ChatGPT so that I can write a python script to mimic that.

For instance, in Gemini’s API documentation (ai.google.dev/gemini-api/docs/document-processing?lang=python), they say that you can use ``
doc_data = base64.standard_b64encode(doc_file.read()).decode(“utf-8”)" to extract the data from the PDF and include that as part of the prompt. I want something concrete like this for ChatGPT as well.

Can someone please guide me?

Thanks!

Topic		Replies	Views
Can you explain how to analyze a PDF file in GPT-4? API	9	72199	December 13, 2023
What is the API equivalent of uploading a PDF? API gpt-4o	1	4982	June 20, 2024
GPT-4o PDF upload vs API vision API	3	9620	May 17, 2024
Question about extracting images from files with GPT4o API gpt-4	0	2474	May 20, 2024
Could you explain how to use chatGPT to upload and analyze PDF? API	3	7000	December 17, 2023

What is the exact method used by ChatGPT 4o to read PDFs?

Related topics