[Responses API] File Upload in Query vs ChatGPT

I have been trying out the new Responses API and wanted to include multiple PDF files in the conversation like this:

from openai import OpenAI

client = OpenAI()

# Start the conversation without any files attached
chat_starter = client.responses.create(
    model="gpt-4o-mini",
    instructions=prompt_instructions,
    input=initial_query,
    temperature=0.7
)

starter_id = chat_starter.id

# Upload the PDF so it can be referenced by ID
initial_file_response = client.files.create(
    file=open(file_path, "rb"), purpose="user_data")
initial_file_id = initial_file_response.id

# Attach the file and the question, chaining onto the previous turn
initial_response = client.responses.create(
    model="gpt-4o-mini",
    instructions=prompt_instructions,
    input=[
        {
            "role": "user",
            "content": [
                {
                    "type": "input_file",
                    "file_id": initial_file_id,
                },
                {
                    "type": "input_text",
                    "text": initial_query,
                },
            ]
        }
    ],
    previous_response_id=starter_id
)

I chained multiple responses this way, attaching a file at each turn. However, once I add more files (four, to be exact), I hit an error stating:

BadRequestError: Error code: 400 - {'error': {'message': 'The total token count of all files exceeds the maximum limit for this model. We can only stuff the first 4 files.', 'type': 'invalid_request_error', 'param': 'input', 'code': 'context_length_exceeded'}}

So ideally I would reduce the token count, but is there another way to go about this? On ChatGPT I can upload far more files and ask questions about each one in the same conversation. How do I reproduce that behaviour with the API?


Try RAG via File Search instead?

https://platform.openai.com/docs/guides/tools-file-search
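
Something like this, roughly (a sketch, assuming the current Python SDK's vector store helpers; file_paths, prompt_instructions, and initial_query stand in for your own values):

from openai import OpenAI

client = OpenAI()

# Put all the PDFs into one vector store. File Search chunks and
# embeds them, so only the relevant passages are pulled into context
# instead of every file in full.
vector_store = client.vector_stores.create(name="my_pdfs")

for file_path in file_paths:
    client.vector_stores.files.upload_and_poll(
        vector_store_id=vector_store.id,
        file=open(file_path, "rb"),
    )

# Ask questions with the file_search tool pointed at the store;
# chaining with previous_response_id works the same as before.
response = client.responses.create(
    model="gpt-4o-mini",
    instructions=prompt_instructions,
    input=initial_query,
    tools=[{
        "type": "file_search",
        "vector_store_ids": [vector_store.id],
    }],
)

Since only the retrieved chunks enter the prompt, the request stays within the context window no matter how many files are in the store.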

If I used RAG and turned the input files into a vector store, can the model still understand the tables and graphs in the uploaded PDF files? My understanding is that the vector store is created only from the text in the files.
