MAJOR BUG: Scanned PDFs are no longer working via the responses file input API

Also, have you checked max tokens? Make sure it is high enough. Markdown also uses fewer tokens, so it is better for longer PDFs.

We don’t set max tokens for any of the models - the default is the max limit. Didn’t know HTML stresses LLMs. I think that OpenAI datacenters are already stressed as MS indicated today on their earnings call :neutral_face:

The bottom line is, when file inputs were released for gpt-4.1, we jumped on it since it was a good fit for us. We tested extensively and had no issues until now. It seems the feature has become unstable. Besides, we are not the only ones having problems with this.

We have taken this out of production until it becomes stable - maybe when Sam is allocated more NVIDIA Blackwell chips.

@OpenAI_Support is there an ETA or update on this issue? We've been getting radio silence for almost two weeks.

Hi Jackie, I've been looking into this. It has to do with project storage size. This should hopefully be fixed by EOW. If you have any recent request IDs to share, that would be awesome!

Hey Gireesh, gotcha! Here’s a recent response_id that failed: resp_681e6cd5704c8191895b4c0c51b09e27037ea14e8cc1888a

One workaround is to convert the PDF into a Word doc and then use the doc.

This is definitely not solved yet. Sometimes it mysteriously works, but in most cases it doesn’t. Tested it with Chat Completions API and Responses API: o4-mini, o3 and GPT-4o.

@OpenAI_Support @Gireesh_Mahajan still having this issue. Did it end up being patched over the weekend?

@OpenAI_Support @Gireesh_Mahajan how is this still broken? It's been a month :sob:

Hello! We’re actively investigating and prioritizing this issue, but resolution will take some time. Thank you for your patience while we work toward a solution.

Thank you for your patience. Happy to share that we have rolled out a fix for this issue!

I’m seeing a large degradation in scanned PDF processing as of this morning (possibly coinciding with the recent update?)
Things were actually working ~mostly fine before this.

Specifically, I’m receiving responses that indicate the model can’t see the uploaded file at all (as opposed to being unable to read a received file)

Hello! Could you provide more details on the degradation you are seeing, such as the specific issue you are experiencing and a request ID?

I’m also experiencing new responses indicating pdf files are not received by the model.

import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

const completion = await openai.responses.parse({
    model: "gpt-4.1",
    input: [
        {
            role: "user",
            content: [
                {
                    type: "input_file",
                    filename: pdf.filename || "document.pdf",
                    file_data: `data:application/pdf;base64,${pdf.base64}`,
                },
                {
                    type: "input_text",
                    text: "summarize the document in a few sentences",
                },
            ],
        },
    ],
    temperature: 0,
})

return completion.output_text // "It appears there is no document provided for me to summarize. Could you please upload the document or share its content?"

// "Of course! Please provide the document or its main points, and I'll summarize it for you."

  • pdf.base64 data is definitely correct (size: 1488566, mimeType: application/pdf)
  • Tried with different PDFs
  • This call worked until late 06/24/2025. I used it with gpt-4.1-mini and gpt-4.1

Confirming PDF access outage: the model says it doesn't see the attached file. I'm providing a valid PDF via a raw S3 or Vercel Blob URL. It was working fine a couple of days ago.

GPT-4.1

We are having the same problem of the PDF not being received by the model. Apparently it started after the rollout.

It happens even in the Playground. I couldn't identify any difference between the PDFs that work and the ones that don't.

Playground resp: resp_685e993da9308198aa6b7077b324e907004056804f61d022

The problem is still there as of right now. We are extracting information from a bunch of PDFs, and if the PDF has selectable text, it works beautifully; if, however, the PDF is scanned, it straight up refuses to extract anything.

Things we’ve tried and did NOT help:

  1. Structured output vs. no structured output: doesn't make a difference.
  2. Different models: fails on both 4o and o1.
  3. Setting store to false.
  4. Giving different instructions.

We are using the responses API.
It works fine if I manually upload the exact same PDF to the chatgpt website.

I cannot upload the exact same PDF, since it’s private, but if @OpenAI_Support cannot reproduce the issue, I’ll create a new one.
For now, I’m really not sure what to do, I guess I’ll try converting these PDFs to images, and uploading that; but we’ve also been seeing image problems. If anyone has ideas, let me know.
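In case it helps anyone trying the same fallback, here's a minimal sketch of what the PDF-to-images route could look like. It assumes the scanned PDF has already been rendered to PNG pages (e.g. with `pdftoppm` or a rendering library, not shown here), giving one base64 string per page; the function and variable names are my own, not an official API.

```javascript
// Build a Responses API input that sends each rendered page as an
// input_image part, followed by the extraction instruction.
function buildImageInput(pageBase64s, instruction) {
    const imageParts = pageBase64s.map((b64) => ({
        type: "input_image",
        image_url: `data:image/png;base64,${b64}`,
    }));
    return [
        {
            role: "user",
            content: [...imageParts, { type: "input_text", text: instruction }],
        },
    ];
}

// Usage (hypothetical page data):
// const completion = await openai.responses.create({
//     model: "gpt-4.1",
//     input: buildImageInput(pages, "Extract the invoice fields as JSON."),
// });
```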

As of now, picture attachment URLs have stopped working too. The model doesn't understand the URLs at all. A base64-attached picture with a data: URL works well.
Testing with the Chat Completions API, gpt-4.1

@OpenAI_Support
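For anyone hitting the same URL issue, here's a small sketch of the base64 workaround described above: inline the image as a data: URL instead of passing a remote URL. The content-part shape follows the Chat Completions image_url format; the helper names and file path are my own.

```javascript
// Convert a raw image buffer into a base64 data: URL.
function toDataUrl(buffer, mimeType) {
    return `data:${mimeType};base64,${buffer.toString("base64")}`;
}

// Build a Chat Completions message carrying the inlined image plus a question.
function buildImageMessage(dataUrl, question) {
    return {
        role: "user",
        content: [
            { type: "image_url", image_url: { url: dataUrl } },
            { type: "text", text: question },
        ],
    };
}

// Usage (read the file with fs.readFileSync, then inline it):
// const buf = readFileSync("photo.jpg");
// const completion = await openai.chat.completions.create({
//     model: "gpt-4.1",
//     messages: [buildImageMessage(toDataUrl(buf, "image/jpeg"), "What is in this picture?")],
// });
```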

Hello,
Here is what I found so far.

  • Responses API + base64-encoded text-based PDF => document is read by the model

  • Responses API + base64-encoded image-based PDF => model can't read anything from the document (since 06/24/2025?)

  • Chat Completions API + base64-encoded text-based PDF => document is read by the model

  • Chat Completions API + base64-encoded image-based PDF => document is read by the model

It seems the image-to-text processing step previously performed by the Responses API is no longer applied.

Please have a look
https://github.com/maiwenn/pdfAnalyzeDebug/blob/master/index.js
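Based on the comparison above, a possible stopgap is to route image-based PDFs through the Chat Completions API until Responses is fixed. Here's a sketch of the message construction, using the Chat Completions file content part; the function name and filename are placeholders of my own.

```javascript
// Build a Chat Completions message carrying a base64 PDF plus a question.
function buildPdfMessage(base64Pdf, filename, question) {
    return {
        role: "user",
        content: [
            {
                type: "file",
                file: {
                    filename,
                    file_data: `data:application/pdf;base64,${base64Pdf}`,
                },
            },
            { type: "text", text: question },
        ],
    };
}

// Usage:
// const completion = await openai.chat.completions.create({
//     model: "gpt-4.1",
//     messages: [buildPdfMessage(pdf.base64, "document.pdf", "Summarize this document.")],
// });
```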

Hi everyone! We've resolved the regression that followed the initial fix and have tested the update. Please let us know if you're still experiencing any issues.
