Scanned pdf with API and ask questions

gchikaidze · October 15, 2024, 10:51am

Hello, I want to give the ChatGPT API multiple scanned PDF files and ask questions about them. From what I’ve seen, I was only able to send images to the API. Can anyone help?

jr.2509 · October 15, 2024, 12:05pm

Hi there and welcome to the Community!

Your understanding is correct. To achieve that, you would have to supply the PDF pages as images to one of the models that support vision (e.g. gpt-4-turbo or the newer gpt-4o models).
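
As a rough illustration of turning PDF pages into images, here is a minimal sketch. It assumes the third-party PyMuPDF package (pip install pymupdf), which is my choice for the example and not something required by the API; any PDF renderer would do. The helper names to_data_url and pdf_pages_as_data_urls are hypothetical.

```python
import base64


def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Encode raw image bytes as a data URL usable in an image_url content part."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return f"data:{mime};base64,{encoded}"


def pdf_pages_as_data_urls(pdf_path: str, dpi: int = 150) -> list[str]:
    """Render every page of a PDF to PNG and return one data URL per page.

    Assumption: PyMuPDF is installed; the dpi value is an arbitrary default.
    """
    import fitz  # PyMuPDF, imported lazily so to_data_url works without it

    urls = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            pixmap = page.get_pixmap(dpi=dpi)  # rasterize the page
            urls.append(to_data_url(pixmap.tobytes("png")))
    return urls
```

Each returned string can be used as the "url" value of an image_url content part. A higher dpi may help with small print in scans, at the cost of larger requests.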

gchikaidze · October 15, 2024, 12:48pm

import base64
from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

# Read an image file and return its contents as a base64 string
def load_image_as_base64(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

# Example usage
image_path = "./Photos/10_I.png"  # Replace with the path to your local image
# The MIME type in the data URL should match the file type (PNG here)
encoded_image = f"data:image/png;base64,{load_image_as_base64(image_path)}"

# Send the question and the encoded image together in one user message
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Tell me these fields: Sendername, sender taxcode, sender address, Date, Receiverbankcountry, Receiver client address, Receiverbankcode, Receiverbankname, Receiveraccount, Receivername, Currency, Amount, Details of payment, Invoice number.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": encoded_image},
                }
            ],
        }
    ],
    max_tokens=500,
)

# Print the model's answer
print(response.choices[0].message.content)

This is the code I’m using now, but I want to provide PDF files as input because each PDF contains multiple images, and I want to upload more than one PDF. Is that not possible? Do I have to send the images one by one?

jr.2509 · October 15, 2024, 12:55pm

I’d recommend having a look at this OpenAI cookbook:
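
On the question of sending images one by one: a single user message can carry several image_url parts, so the pages of one or more PDFs can go into the same request. Below is a minimal sketch, assuming the pages have already been converted to base64 data URLs. The helper names chunked and build_messages and the batch size are hypothetical, and the limits on image count and request size should be checked against the current documentation.

```python
def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def build_messages(question, image_data_urls):
    """Build a messages list with one text part followed by one part per image."""
    content = [{"type": "text", "text": question}]
    for url in image_data_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return [{"role": "user", "content": content}]


# Example: five page images sent in batches of two (batch size is an assumption)
page_urls = [f"data:image/png;base64,PAGE{i}" for i in range(5)]  # placeholder data
for batch in chunked(page_urls, 2):
    messages = build_messages("What is the invoice number?", batch)
    # messages can then be passed to client.chat.completions.create(...)
    print(len(messages[0]["content"]) - 1, "image(s) in this request")
```

Batching keeps each request small when a PDF has many pages; for short documents, all pages can go into one call.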
