Invalid input: Expected file type to be a supported format: .pdf but got .docx

aryanchaurasia.348 · March 12, 2025, 4:49am

When I use file search with new Response API and instead of PDF i use docx in vector store i get error “Invalid input: Expected file type to be a supported format: .pdf but got .docx.”

With PDF new response api works but not with docx

However documentation mentions vector store supports docx.

Update:
Just wanted to add here incase it hepls anyone:
Basically I was adding my tools like this:

{‘type’: ‘file_search’, ‘vector_store_ids’: [‘vs_id’]}

And also I was adding file id to my content array:
“content”: [
{
“type”: “input_file”,
“file_id”: http://file.id,
},
{
“type”: “input_text”,
“text”: “What is the file about?”,
},
]

Removing from content array worked. Otherwise it was just giving api error and no request id. I have got all working just waiting for code interpreter availability then i can release it to prod.

smit · March 16, 2025, 10:57pm

I have the same problem! I am able upload all type of files via https://api.openai.com/v1/files and it gives a file_id.

however, when I ask via the API t convert into Markdown it only works with .pdf but not for .docx and .doc, while the front-end does.

Anybody a clue?

avowkind · March 20, 2025, 10:37pm

confirmed. The playground file upload allows me to upload a text file for example and I can ask questions about it in the same query. However programatically I get this error.

input

{ 'model': 'gpt-4o-mini', 'input': [{'role': 'user', 'content': [{'type': 'input_file', 'file_id': 
'file-SYjmkPFtqptWYLy8dDLt3D'}, {'type': 'input_text', 'text': 'summarise the release process for alleycat'}]}], 'temperature': 0.7, 'instructions': 'The 
user has attached a file for you to analyze.'}

error:

{'error': {'message': 'Invalid input: Expected file type to be a supported format: .pdf but got .md.', 'type': 
'invalid_request_error', 'param': 'input', 'code': None}}

mocapitan · March 31, 2025, 11:21am

I tested the examples from openai with different PDF files.
https://platform.openai.com/docs/guides/pdf-files?api-mode=responses

Result: Some PDF’s work, others don’t

My use case: I wanted to use openai for reading my PDF and producing summaries because the VectorStore does not understand PDFs that consists only of an image, so no text-retrieval from PDF but really do OCR. And exactly the PDFs that did not work with PDF Splitters around cannot be read by the API. Additional: if I do it manually in the chatGPT UI it works as expected but not using the file-uploads from openAI’s file API.

Let me know if you find out more!
Regards,
Michael

mocapitan · March 31, 2025, 11:25am

If I try jpg it states immidiatly
message: ‘Invalid input: Expected file type to be a supported format: .pdf but got .jpg.’,

So somehow the file upload from the API works different from the examples approach.

_j · March 31, 2025, 12:09pm

This API topic seems very easy to become confused in, or to think you are doing one thing and be doing the other.

There’s two possibilities.

Direct file attachment of a PDF from files endpoint id to a user message

this uses both text extraction and vision, placing the whole file contents (and possibly exceeding context if too large)

Using file_search tool, in combination with a vector store

where vector store file attachment is where you would encounter issues.

The symptom is “dumb file inspection” — which is also the cause.

mocapitan · April 2, 2025, 10:33am

Direct file attachment of PDF-Files is not working accurate. If I attach a PDF that has only a image in it with no text, e.g. if you printed out something with “save PDF” than it fails. That’s what I think I found out but maybe someone can test it as well? I get the error from the API that it cannot access the PDF. I cannot upload an example PDF here.

baldwin · May 12, 2025, 11:31pm

“Invalid input: Expected file type to be a supported format: .pdf but got .jsonl.”, “type”: “invalid_request_error”

I get the same issue but with jsonl files that I aim to use in the batch API. The docs say that the batch API requires jsonl.

nick_gic · June 30, 2025, 4:24pm

We’re trying to use docx/doc in the responses API via a file id. I asked the OpenAI support bot and here is what it said:

Hi!

I’m an AI support agent and happy to help. Currently, OpenAI’s API only supports certain file formats for reading and extracting content.

For the responses API and tools like the Assistants API, only PDFs are accepted for document analysis or referencing. When you upload a .docx or .doc file and try to use its file ID, you’ll see an error like the one you posted, stating that only PDF is supported.

How to proceed:
- Convert your Word documents (.docx/.doc) to PDF format before uploading.
- Upload the resulting PDF using the files API and reference the new file ID in your request. If you have any additional questions about file handling or supported formats, let me know!

Topic		Replies	Views
XLSX Files not supported for File Upload API API	9	6954	September 11, 2024
What file types are actually supported? Documentation assistants , file-uploads , assistants-files	23	8199	May 16, 2025
Md and txt file not uploading Bugs	16	831	March 18, 2025
How to use multiple files in Assistants API API	3	4636	November 20, 2023
OpenAI API for image text extraction API	6	17260	November 17, 2023

Invalid input: Expected file type to be a supported format: .pdf but got .docx

Related topics