Hi everyone,
I’m working with the OpenAI Responses API to analyze scanned PDF invoices and extract structured data.
Context / constraints
-
All my interactions with the API are done via plain HTTP requests.
-
I cannot use SDKs or more advanced client-side helpers; only raw HTTP is possible in my application.
-
PDFs are scanned invoices (1–3 pages, ~2–8 MB).
What I observed
1) Responses API with input_file (base64) + prompt in the same request
When I send a request like:
-
input_file.file_data(PDF in base64) -
plus a prompt in the same
/v1/responsescall asking to analyze and extract data
I often get very inconsistent results:
-
Sometimes the extraction is correct
-
Other times the response is clearly wrong, hallucinated, or unrelated to the actual PDF
I’m not sure why this happens, but because of this inconsistency I decided not to combine upload + analysis in the same request anymore.
2) Decoupled approach: upload first, then analyze by file_id
To make things more deterministic, I switched to this flow:
-
Use Responses API only to upload the PDF (via
file_data) -
List files using Files API and retrieve the corresponding
file_id -
Call Responses API again, referencing the PDF via:
{ "type": "input_file", "file_id": "file-xxxx" }and send the prompt to analyze the document
This approach is conceptually much cleaner and avoids resending large base64 payloads.
The problem
With this file_id-based approach, I frequently get intermittent 500 errors from the Responses API:
{
"error": {
"message": "The server had an error processing your request...",
"type": "server_error"
}
}
Important details:
-
The same PDF + same prompt sometimes works, sometimes fails
-
Retrying often succeeds
-
Files are listed as
status: processed -
Errors appear during the analysis step, not during upload
Questions
-
Is the recommended / most stable pattern for PDF analysis:
- Upload once → always reference by
file_id?
- Upload once → always reference by
-
Are there known limitations or edge cases when using
file_idwith scanned PDFs? -
Is it expected that combining upload + analysis in one Responses request can lead to inconsistent outputs?
-
Any best practices to reduce 5xx errors when analyzing PDFs via Responses API over plain HTTP?
I’m mainly trying to understand the ideal architecture for this use case and whether what I’m seeing is expected behavior or not.
Thanks in advance ![]()