It was working earlier this morning, but scanned PDFs (images of text) no longer work with the Responses API file input. Only PDFs with embedded text work.
This is a pretty critical issue for our application. Please investigate ASAP.
I'm also having this issue. Hope they implement a fix ASAP.
Still having this issue. Should I OCR the PDFs or turn the pages into images manually for now?
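In case it helps anyone else while this is broken, here is a rough sketch of the "turn the pages into images manually" route (assuming PyMuPDF for rendering; the model name, DPI, and file name are just placeholders):
```
# Render each PDF page to a PNG and send the pages as input_image parts
# instead of an input_file. Stopgap sketch only; not an official workaround.
import base64
import fitz  # PyMuPDF
from openai import OpenAI

client = OpenAI()

def pdf_pages_as_image_parts(path, dpi=150):
    parts = []
    for page in fitz.open(path):
        png = page.get_pixmap(dpi=dpi).tobytes("png")
        parts.append({
            "type": "input_image",
            "image_url": f"data:image/png;base64,{base64.b64encode(png).decode()}",
        })
    return parts

response = client.responses.create(
    model="gpt-4o",
    input=[{
        "role": "user",
        "content": pdf_pages_as_image_parts("scanned.pdf")
        + [{"type": "input_text", "text": "Please summarise this document."}],
    }],
)
print(response.output_text)
```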
Is there no way to escalate this issue? Image support for PDFs is a major feature.
Still waiting on a fix. Will swap to Gemini for now.
Sorry for the issues here. The team’s urgently looking into this now.
You can add this to the “concern” pile: only one PDF is being utilized at all, namely the last one in a role message of typed content parts.
This is on Chat Completions; I can demonstrate that switching the position of two PDFs within a user message’s content switches which one can be correctly answered about:
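(A minimal sketch of the kind of request I mean, so the swap test is concrete; the file IDs are placeholders, using the documented `type: "file"` content part for Chat Completions:)
```
from openai import OpenAI

client = OpenAI()

# Two PDFs as typed content parts in one user message. Swapping the order of
# the two "file" parts swaps which document the model can actually answer about.
completion = client.chat.completions.create(
    model="gpt-4.1",
    messages=[{
        "role": "user",
        "content": [
            {"type": "file", "file": {"file_id": "file-AAA"}},  # placeholder ID
            {"type": "file", "file": {"file_id": "file-BBB"}},  # placeholder ID
            {"type": "text", "text": "Answer one question about each attached PDF."},
        ],
    }],
)
print(completion.choices[0].message.content)
```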
Are there any updates or an ETA?
Hi, Thank you for sharing this. I tried reproducing the issue using GPT-4o and a scanned image but wasn't able to replicate it on my end. Could you share a few more details — such as the specific error you're encountering, an example PDF image you're using, and which model you're working with? I'd be happy to take a closer look.
One thing that I noticed in my PDF processor when using the Responses API: for longer PDFs this ONLY works when you are NOT using formatted results.
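A minimal sketch of the two variants being compared (assuming “formatted results” here means structured output via the Responses API’s `text.format`; the file ID and schema are placeholders):
```
from openai import OpenAI

client = OpenAI()
content = [
    {"type": "input_file", "file_id": "file-XXX"},  # placeholder
    {"type": "input_text", "text": "Summarise this document."},
]

# Plain request: long scanned PDFs come back fine.
plain = client.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": content}],
)

# Structured-output request: the same long scanned PDFs stop working.
formatted = client.responses.create(
    model="gpt-4.1",
    input=[{"role": "user", "content": content}],
    text={
        "format": {
            "type": "json_schema",
            "name": "doc_summary",
            "schema": {
                "type": "object",
                "properties": {"summary": {"type": "string"}},
                "required": ["summary"],
                "additionalProperties": False,
            },
            "strict": True,
        }
    },
)
```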
Here are the image-based PDF files that I used in the linked Chat Completions issue:
1. Screenshot of an AI-made joke website, where the punchline of the second joke is “revealed”
2. Screenshot of the OpenAI model price list, including gpt-4.1 (beyond the model’s knowledge cutoff)
Responses + gpt-4.1, sent via the Playground (‘get code’ wrecks the browser)
We can see that the AI answers, self-reports seeing the images, and does so successfully:
Certainly! Here are the answers based on the provided images:
1. Local File Location & Joke Answer
Local file location of your website:
From the browser bar in the screenshot, your website is being served from:
C:/chat/ai-examples/chat-utils...
This means your website files are located in the directory:
C:/chat/ai-examples/chat-utils
Answer to question 3 on your website:
- I told my wife she was drawing her eyebrows too high.
(The punchline is not hidden and is likely the classic:)
She looked surprised.
2. Cheapest GPT-4.1 Model & Cached Discount
Cheapest GPT-4.1 model:
From the pricing table, the models listed under GPT-4.1 are:
- gpt-4.1
- gpt-4.1-mini
- gpt-4.1-nano
The cheapest is:
- gpt-4.1-nano
- Input: $0.10 per 1M tokens
Cached discount percentage for gpt-4.1-nano:
- Input: $0.10
- Cached input: $0.025
Discount calculation:
\[ \text{Discount} = \left(1 - \frac{0.025}{0.10}\right) \times 100\% = 75\% \]
Answer:
- The cheapest GPT-4.1 model is gpt-4.1-nano.
- The cached input discount is 75%.
Let me know if you need more details!
“no longer work” is not enough information.
My issue: Immediate reproduction, continued symptom
The SAME request, simply switched to Chat Completions:
The AI cannot receive the PDF’s API pricing, a PDF that was clearly included in the API request using the most “official” method by a demonstrator. Instead, the AI has to make up an answer about gpt-4o.
1. Local File Location & Joke Answer
Local File Location:
- According to the screenshot on page 1 of your PDF, the browser’s address bar shows:
C:/chat/ai-examples/chat-utils...
This indicates your website is running locally from the directory:
C:\chat\ai-examples\chat-utils
(The full path may be longer, but this is the visible portion.)
Answer to Question 3:
- The third joke on your website (from the image on page 1) is:
- I told my wife she was drawing her eyebrows too high.
- The punchline (not shown, but commonly known) is:
She looked surprised.
2. Cheapest GPT-4.1 Model & Cached Discount
Cheapest GPT-4.1 Model (as of June 2024):
- The cheapest GPT-4.1 model is gpt-4o (also known as GPT-4 Omni).
- gpt-4o is significantly less expensive than previous GPT-4 models.
Cached Discount Percentage:
- OpenAI announced that gpt-4o is 50% cheaper than GPT-4 Turbo for both input and output tokens.
- This means the cached discount percentage is 50% compared to GPT-4 Turbo.
Summary Table

| Model | Cheapest? | Discount vs. GPT-4 Turbo |
| --- | --- | --- |
| gpt-4o | Yes | 50% |
| gpt-4-turbo | No | – |

References:
Let me know if you need more details or have other questions!
Hey OpenAI,
I am not using images of PDFs. I am uploading scanned PDFs to the Files API and using the file IDs in the Responses API, as per the official documentation.
This worked perfectly a week ago, until the Responses API stopped automatically extracting images from scanned PDFs.
I am having the same issue with base64 scanned PDFs.
https://platform.openai.com/docs/guides/pdf-files?api-mode=responses
Hey @Jackie_Ni — could you drop in one concrete example (PDF, model, and prompt) so we’re all working from the same setup?
Hi @_J — I ran your pricing-scan PDF on my end and couldn’t reproduce the issue. You might get more consistent results if the prompt explicitly says something like “Answer factually and only use information from the attached files.” I’ve attached the script I used and the output for reference.
```
The document is a pricing table for different GPT models, listing the cost per 1 million text tokens for input, cached input, and output. Here are key points:
- **gpt-4.1**: Input $2.00, Cached Input $0.50, Output $8.00
- **gpt-4.1-mini**: Input $0.40, Cached Input $0.10, Output $1.60
- **gpt-4.1-nano**: Input $0.10, Cached Input $0.025, Output $0.40
- **gpt-4.5-preview**: Input $75.00, Cached Input $37.50, Output $150.00
- **gpt-4o**: Input $2.50, Cached Input $1.25, Output $10.00
- **gpt-4o-audio-preview**: Input $2.50, Output $10.00
- **gpt-4o-realtime-preview**: Input $5.00, Cached Input $2.50, Output $20.00
- **gpt-4o-mini**: Input $0.15, Cached Input $0.075, Output $0.60
The document mentions "flex processing" as a new way to save on synchronous requests.
```
For reference: I've tested it myself using the below parameters and couldn't reproduce the issue:
from openai import OpenAI
import os
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
pdf_path = "scannedAPI-pricing.pdf"
file_id = client.files.create(file=open(pdf_path, "rb"), purpose="user_data").id

response = client.responses.create(
    model="gpt-4o",
input=[
{
"role": "user",
"content": [
{"type": "input_file", "file_id": file_id},
{"type": "input_text", "text": "Please summarise this document."},
],
}
],
)
print(response.output_text) # ← one-liner thanks to the SDK
Gotcha!
Here’s my scanned PDF file:
Here’s my OCRed PDF file:
Here’s the input I’m using:
"input": [
{
"role": "user",
"content": [
{
"type": "input_file",
"file_id": "[insert file_id]"
},
{
"type": "input_text",
"text": "What is this PDF about?"
}
]
}
]
Only the OCRed PDF works.
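(For anyone who wants the same fallback, a minimal sketch of producing the OCRed copy, assuming OCRmyPDF is installed; filenames are placeholders:)
```
# Write a text-embedded copy of a scanned PDF using OCRmyPDF's Python API.
import ocrmypdf

ocrmypdf.ocr("scanned.pdf", "ocred.pdf")
```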
I forgot to mention I’m using gpt-4.1 and gpt-4.1-mini. I tried using gpt-4o and it’s having the same issue.
My understanding is that, under the hood, OpenAI extracts an image and the text of each PDF page and uses both as context for the model. Did something recently change in the API so that images are no longer extracted from PDFs?
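A quick way to tell which of my files even have an extractable text layer (a rough check with pypdf; not claiming this is what the API does internally):
```
from pypdf import PdfReader

def has_text_layer(path):
    # Scanned-only PDFs typically yield empty text from every page.
    return any((page.extract_text() or "").strip() for page in PdfReader(path).pages)

print(has_text_layer("scanned.pdf"), has_text_layer("ocred.pdf"))  # placeholder filenames
```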
@Jackie_Ni:
- Are you able to consistently reproduce this issue? Could you capture a response ID when you can reproduce the issue again?
print("Response ID:", response.id)
- I tried using the scanned file with the Responses API + gpt-4.1, and here is the summary provided:
Certainly! Here’s a summary of the document:
---
**Title:** How The Market Dominance of PBMs is Hurting America
**Speaker:** Mark Cuban
**Date:** March 4, 2024
**Occasion:** PBM Discussion, White House
### Main Points:
- **Problem:**
The document addresses why Cost Plus Drugs exists in a market dominated by three major Pharmacy Benefit Managers (PBMs). It argues that these PBMs hurt patients, providers, employers, and especially independent pharmacies through policies that prioritize profit over care.
- **Key Issues with PBMs:**
1. **Zero Transparency:**
PBMs prevent any public discussion about their contracts, terms, or pricing, keeping critical information secret from providers, manufacturers, employers, and pharmacies.
2. **Magic Names/Specialty Pharmacies:**
PBMs label certain drugs as “specialty” to charge inflated prices—sometimes 100 times more than necessary—forcing purchases through select pharmacies they control.
3. **Rebates to Employers:**
Rebates given to employers are misunderstood; the sickest and oldest employees effectively pay for these rebates through higher out-of-pocket costs and deductibles.
4. **Rebates Determine Formularies:**
PBMs use rebates to control which medications are available, often excluding more affordable, effective alternatives.
5. **Hurting Independent Pharmacies:**
PBMs financially pressure independent pharmacies with arbitrary fees and poor reimbursement, sometimes putting them out of business or forcing them to lose money on every prescription.
- **Call to Action:**
Mark Cuban urges the government, states, and self-insured employers to stop doing business with the dominant PBMs and instead choose more transparent, transactional alternatives like CostPlusDrugs.com.
- **Cost Plus Drugs Model:**
Cost Plus Drugs brings transparency by showing the real cost, a fixed 15% markup, and a small pharmacy and shipping fee. No rebates, hidden fees, or secret contracts. The goal is to restore trust in the healthcare system.
- **Conclusion:**
The dominance and lack of transparency in PBMs have eroded public trust in healthcare. Changing to transparent models can save money, improve access, and rebuild trust.
---
**Summary:**
The document is a critique of the harmful influence and lack of transparency by the three major PBMs in the U.S. healthcare system. Mark Cuban advocates for a transparent, straightforward drug pricing model as a way to disrupt the industry, restore trust, and better serve patients, employers, and independent pharmacies.
I’m getting the same error every time I try to summarize an image-only PDF. The summary code works fine for PDFs with embedded text, but I get a variation of the following response for all image-only PDFs. I tried both file upload and base64 encoding.
response id: resp_680eafeed2e08191826661c558ea1e470f1a127b68e975f9
“I’m unable to access the content of ‘filename.pdf’. If you can provide any text or details from the document, I’d be happy to help summarize!”
// Node snippet (CommonJS, so __dirname is available) using the openai v4 SDK
const OpenAI = require('openai')
const path = require('path')
const fs = require('fs/promises')

const apiKey = process.env.OPENAI_API_KEY
const client = new OpenAI({
  maxRetries: 5,
  apiKey: apiKey
})
const filePath = path.join(__dirname, '..', 'cache', 'docs', 'image-based-pdf-sample.pdf')
const data = await fs.readFile(filePath)
const base64Content = data.toString('base64')
const response = await client.responses.create({
model: 'gpt-4o',
input: [
{
role: 'user',
content: [
{
type: 'input_file',
file_data: `data:application/pdf;base64,${base64Content}`,
filename: 'filename.pdf'
},
{
type: 'input_text',
text: 'Please summarize this document'
}
]
}
]
})
console.log(response)
Also ran a test with @Jackie_Ni’s file and got the same result.
Response Id: resp_680eb2b0b11881919eaea4d87aaa4c4e068d702f374ac6f9
response: “I’m unable to read the document “filename.pdf.” If you could provide some text or key details from the document, I’d be happy to help you summarize it!”