Difficulty Creating a Vector Store with PDF Files Using the REST API

Hi, everyone!

I’m having trouble creating a vector store using the OpenAI REST API. As I understand the documentation, the /v1/files endpoint only accepts .jsonl files for upload, at least when working with fine-tuning. However, when I include .jsonl files in a request to the /v1/vector_stores endpoint, I get an error stating that this format is not supported for retrieval.

Here’s the error response:

{
    "error": {
        "message": "Files with extensions [.jsonl] are not supported for retrieval. See https://platform.openai.com/docs/assistants/tools/file-search#supported-files",
        "type": "invalid_request_error",
        "param": "file_ids",
        "code": "unsupported_file"
    }
}

In the documentation, examples using the Python SDK suggest that it’s possible to use formats like PDF and TXT to create a vector store. Here’s a snippet of the provided example:

# Python SDK Example
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

vector_store = client.beta.vector_stores.create(name="Financial Statements")

file_paths = ["edgar/goog-10k.pdf", "edgar/brka-10k.txt"]
file_streams = [open(path, "rb") for path in file_paths]

file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=vector_store.id, files=file_streams
)

I’d like to know how to achieve this same process using the REST API (without relying on the SDK).

  • What is the correct way to create a vector store with supported file formats like PDF, CSV, or TXT?
  • Is there a specific endpoint or configuration required to make this work?
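
To make the question concrete, here is a minimal sketch of the flow I think the REST API expects, written with only the Python standard library (no SDK). It rests on assumptions I have not confirmed: that files meant for retrieval are uploaded to /v1/files as multipart form data with purpose set to "assistants" rather than "fine-tune", that the returned IDs are then passed as file_ids to /v1/vector_stores, and that the vector store endpoints want the "OpenAI-Beta: assistants=v2" header. The helper names (build_multipart, post, upload_file, create_vector_store, main) are my own and not part of any library.

```python
import json
import os
import urllib.request
import uuid

API_BASE = "https://api.openai.com/v1"


def build_multipart(fields, filename, file_bytes):
    """Encode plain form fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def post(path, body, content_type, api_key):
    """POST raw bytes to the API and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{API_BASE}{path}",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": content_type,
            # Assumption: required by the vector store endpoints.
            "OpenAI-Beta": "assistants=v2",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def upload_file(path, api_key):
    """Upload one file and return its file ID."""
    with open(path, "rb") as handle:
        body, content_type = build_multipart(
            # Assumption: "assistants" is the purpose for retrieval files.
            {"purpose": "assistants"}, os.path.basename(path), handle.read()
        )
    return post("/files", body, content_type, api_key)["id"]


def create_vector_store(name, file_ids, api_key):
    """Create a vector store that references already-uploaded files."""
    body = json.dumps({"name": name, "file_ids": file_ids}).encode()
    return post("/vector_stores", body, "application/json", api_key)


def main():
    api_key = os.environ["OPENAI_API_KEY"]
    paths = ["edgar/goog-10k.pdf", "edgar/brka-10k.txt"]
    file_ids = [upload_file(path, api_key) for path in paths]
    store = create_vector_store("Financial Statements", file_ids, api_key)
    print(store["id"])
```

Calling main() with OPENAI_API_KEY set would upload the two files from the SDK example and then create the store. If the purpose value or the header is wrong, that is exactly the part I’d like corrected.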

Thank you in advance for your help!