Your input exceeds the context window of this model

Hi,
I want to extract details from a PDF into JSON format. First I read the PDF content as a Base64-encoded string:

// const data = fs.readFileSync("invoice-1.pdf");   // ~352KB
const data = fs.readFileSync("invoice-2.pdf");   // ~12MB
const s = data.toString("base64");

then I defined these instructions:

You are a data parser. Parse the following details from the document accurately and return it in JSON format.

- Document date

and used them in this Responses API call:

const response = await client.responses.create({
  model: "gpt-4.1-2025-04-14",
  instructions,
  input: [
    {
      role: "user",
      content: [
        {
          type: "input_file",
          filename: "statement.pdf",
          file_data: `data:application/pdf;base64,${s}`,
        },
      ],
    },
  ],
});

The program works fine with invoice-1.pdf but fails with invoice-2.pdf (~12MB):

BadRequestError: 400 Your input exceeds the context window of this model. Please adjust your input and try again.
    at APIError.generate (file:///home/user/temp/openai/node_modules/.pnpm/openai@5.9.0_zod@3.25.76/node_modules/openai/core/error.mjs:41:20)
    at OpenAI.makeStatusError (file:///home/user/temp/openai/node_modules/.pnpm/openai@5.9.0_zod@3.25.76/node_modules/openai/client.mjs:156:32)
    at OpenAI.makeRequest (file:///home/user/temp/openai/node_modules/.pnpm/openai@5.9.0_zod@3.25.76/node_modules/openai/client.mjs:301:30)
    at process.processTicksAndRejections (node:internal/process/task_queues:105:5)
    at async file:///home/user/temp/openai/server.js:54:18 {
  status: 400,
  headers: Headers {},
  requestID: 'req_7121698917b295d2e9b3e565fcd278be',
  error: {
    message: 'Your input exceeds the context window of this model. Please adjust your input and try again.',
    type: 'invalid_request_error',
    param: 'input',
    code: 'context_length_exceeded'
  },
  code: 'context_length_exceeded',
  param: 'input',
  type: 'invalid_request_error'
}

How can I fix this and avoid the limitation?

PDF limits: 100 pages max and 32MB of total content per request.

It sounds like the document text extraction went haywire on you, since gpt-4.1 can accept a million tokens of input.

As an alternative diagnostic step, upload the PDF to the Files API first, then reference it in the user message by file ID instead. That removes any concern about improper base64 encoding that we can't see.
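A minimal sketch of that approach, reusing the model and instructions from your post (the `purpose: "user_data"` value is the one the Files API expects for files that will be referenced in model input):

```javascript
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI();

const instructions =
  "You are a data parser. Parse the following details from the document " +
  "accurately and return it in JSON format.\n\n- Document date";

// Upload the PDF once through the Files API instead of inlining base64 data.
const file = await client.files.create({
  file: fs.createReadStream("invoice-2.pdf"),
  purpose: "user_data",
});

// Reference the stored file by ID in the user message.
const response = await client.responses.create({
  model: "gpt-4.1-2025-04-14",
  instructions,
  input: [
    {
      role: "user",
      content: [{ type: "input_file", file_id: file.id }],
    },
  ],
});

console.log(response.output_text);
```

If the request still fails with `context_length_exceeded` via a file ID, the problem is the extracted text volume itself, not your encoding.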

Beyond that, the PDF may be broken, password-protected, or contain different searchable text than what you see rendered. I would use a Python PDF library to extract the document text yourself, replicating what the API might do, and check whether you get useful text out of it.
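For example, with the pypdf library (one common choice; the filename is your file, and this only inspects the text layer, not any OCR the API may or may not run):

```python
from pypdf import PdfReader

reader = PdfReader("invoice-2.pdf")  # raises if the file is structurally broken
print("encrypted:", reader.is_encrypted)  # password-protected PDFs need decrypt()
print("pages:", len(reader.pages))

# Extract the searchable text layer page by page; if this comes back empty
# or garbled, the API's own extraction will likely struggle too.
for i, page in enumerate(reader.pages[:3]):
    text = page.extract_text() or ""
    print(f"--- page {i + 1} ({len(text)} chars) ---")
    print(text[:500])
```

A scanned 12MB invoice with no text layer would print zero characters here, while a corrupt text layer often shows up as a huge run of junk characters, which would explain the token blowup.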