OpenAI Assistant Vector Store File Batch Returned Failed Status

I followed the example in OpenAI's docs for creating a vector store from multiple files using a File Batch, as follows:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create a vector store called "Financial Statements"
vector_store = client.beta.vector_stores.create(name="Financial Statements")
 
# Ready the files for upload to OpenAI
file_paths = ["edgar/goog-10k.pdf", "edgar/brka-10k.txt"]
file_streams = [open(path, "rb") for path in file_paths]
 
# Use the upload and poll SDK helper to upload the files, add them to the vector store,
# and poll the status of the file batch for completion.
file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
  vector_store_id=vector_store.id, files=file_streams
)
 
# You can print the status and the file counts of the batch to see the result of this operation.
print(file_batch.status)
print(file_batch.file_counts)

It turned out that file_batch.status was 'failed'. But when I checked my OpenAI Dashboard, the files had been uploaded to Storage and attached to the Vector Store as intended.

When I ran client.beta.vector_stores.retrieve(vector_store_id=<VECTOR_STORE_ID>).file_counts, the files I had uploaded were already counted there. When I chatted with the Assistant using that Vector Store, it answered questions from the uploaded files just fine.

Yet the File Batch status was 'failed', and its file_counts property showed a total of zero files.

What happened to the FileBatch? Has anyone faced a similar issue?


I tried the same thing using OpenAI's front end. Everything worked fine there, but the response from the file_batches endpoint was the same.


I encountered the same error. Did you find a solution?

{
  "id": "vsfb_64f743db6a76425492fd86beca60f8b2",
  "created_at": 0,
  "file_counts": {
    "cancelled": 0,
    "completed": 0,
    "failed": 0,
    "in_progress": 0,
    "total": 0
  },
  "object": "vector_store.file_batch",
  "status": "failed",
  "vector_store_id": "vs_KuTyJK6nYKjvyIp6b7DM7dAZ"
}

I'm also facing the same issue: the file batch status is 'failed', but the assistant seems to work fine regardless. I'm using this in my live application, so I've just commented out the status check for now, but I hope OpenAI fixes this ASAP.


I did the same thing. To get the individual file statuses, I retrieved them from vector_stores.files instead of vector_stores.file_batches.
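
For anyone who wants the synchronous version of that workaround, here is a minimal sketch. It assumes an SDK version that still exposes client.beta.vector_stores, as used in this thread. The helper names count_file_statuses and per_file_status_counts are made up for illustration and are not part of the SDK.

```python
from collections import Counter


def count_file_statuses(vector_store_files):
    """Tally the per-file statuses of vector store files.

    Accepts any iterable of objects that expose a `status` attribute.
    """
    return Counter(f.status for f in vector_store_files)


def per_file_status_counts(client, vector_store_id):
    # Assumption: client is an openai.OpenAI instance whose SDK version has
    # client.beta.vector_stores. Iterating the list result walks every page.
    files = client.beta.vector_stores.files.list(vector_store_id=vector_store_id)
    return count_file_statuses(files)
```

Calling per_file_status_counts(client, vector_store.id) after the upload gives a tally per status, which can stand in for the batch-level file_counts while those come back as zero.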


I have the same problem: the API returns status: failed.

Thanks, this one indeed works!

I attempted to create and upload files to a vector store through the API following the File Search guide. The code from the guide printed out:

failed
FileCounts(cancelled=0, completed=0, failed=0, in_progress=0, total=0)

However, I was able to see the vector store I tried to create in the storage section on the OpenAI playground and create an assistant that could access and search the files I put in the vector store.

It seems that the bug is in reporting the file_batch status rather than in creating the vector store.
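
A rough way to tell this reporting glitch apart from a real failure is to compare the batch against the vector store's own counts. The sketch below is only that, under the assumption that both objects expose the file_counts fields shown in this thread; batch_failure_looks_spurious is a hypothetical helper name, not an SDK function.

```python
def batch_failure_looks_spurious(file_batch, vector_store):
    """Return True when the batch says 'failed' with all-zero counts while the
    vector store itself reports completed files and no failed ones."""
    batch_counts = file_batch.file_counts
    store_counts = vector_store.file_counts
    return (
        file_batch.status == "failed"
        and batch_counts.total == 0
        and store_counts.completed > 0
        and store_counts.failed == 0
    )


# Hypothetical usage, with client and file_batch as in the guide's code:
# store = client.beta.vector_stores.retrieve(vector_store_id=vector_store.id)
# if batch_failure_looks_spurious(file_batch, store):
#     print("Batch status looks wrong; the vector store has the files.")
```

A True result only says that the store-level counts disagree with the batch; it does not prove that every file from this particular batch was processed.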


It’s ridiculous that this works.

Worked for me too, here’s a code snippet:

import asyncio

# Runs inside an async function; openai_client is assumed to be an AsyncOpenAI instance.

# Upload the batch
await openai_client.beta.vector_stores.file_batches.upload_and_poll(
    vector_store_id=openai_vector_store.id, files=upload_file_paths
)

# List all files in the store. Results are paginated, unfortunately.
vector_store_files = []
vector_store_file_list_response = await openai_client.beta.vector_stores.files.list(
    vector_store_id=openai_vector_store.id
)
vector_store_files.extend(vector_store_file_list_response.data)
# Keep fetching while more pages exist, so the last page is included too.
while vector_store_file_list_response.has_next_page():
    vector_store_file_list_response = await vector_store_file_list_response.get_next_page()
    vector_store_files.extend(vector_store_file_list_response.data)

# Use asyncio.gather to wait until every individual file in the vector store has finished processing.
# Don't poll too fast if you have many files, because there's a separate per-minute limit for polling.
await asyncio.gather(
    *[
        openai_client.beta.vector_stores.files.poll(
            file_id=vector_store_file.id,
            vector_store_id=openai_vector_store.id,
            poll_interval_ms=60 * 1000,
        )
        for vector_store_file in vector_store_files
    ]
)

Apologies for the trouble here. We’ve identified the bug and are rolling out a fix shortly!


This should be fixed now. Thanks for your patience, and apologies for the trouble once again. :pray:


Indeed! Thanks a lot for the quick fix :raised_hands: