OpenAI assistant vector store file_batches returns a completed status, but some files failed

When I use file_batches to upload multiple files to a vector store, the batch status is occasionally 'completed' even though file_batches.file_counts reports failed files, for example 'FileCounts(cancelled=0, completed=53, failed=1, in_progress=0, total=54)'. What could cause this, and how can I resolve it?
My code is as follows:

def upload_file(file_paths, client, vector_store_ids):
    file_streams = []
    try:
        for path in file_paths:
            file_streams.append(open(path, "rb"))

        file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
            vector_store_id=vector_store_ids[0], files=file_streams
        )
        if file_batch.status == "completed":
            print(f"success:{file_batch.file_counts}")
            return True
        print(f"{file_batch.status}:{file_batch.file_counts}")
        return False
    finally:
        for file_stream in file_streams:
            file_stream.close()

I would check your file types and file extensions. For instance, .doc is not supported but .docx is; in my case, I use LibreOffice to convert such files. I also found that extensions must be lowercase, so I fix that as needed before attempting an upload.


All my files are Markdown files with the .md extension. When I re-upload the files that failed, they upload successfully, so the failure is intermittent. I am not sure what the cause is.
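
Whatever the root cause turns out to be, the counts quoted in the question show that a 'completed' status alone does not mean every file was processed. Here is a minimal sketch of a stricter check. The helper name batch_fully_succeeded is made up for illustration, and it only assumes the returned object exposes the status and file_counts fields shown in the question:

```python
from types import SimpleNamespace


def batch_fully_succeeded(file_batch) -> bool:
    """Return True only if the batch finished and every file was processed.

    Hypothetical helper: it relies solely on the `status` and `file_counts`
    fields that appear in the question's output.
    """
    counts = file_batch.file_counts
    return (
        file_batch.status == "completed"
        and counts.failed == 0
        and counts.cancelled == 0
        and counts.completed == counts.total
    )


# Stand-in object carrying the counts quoted in the question.
example_batch = SimpleNamespace(
    status="completed",
    file_counts=SimpleNamespace(
        cancelled=0, completed=53, failed=1, in_progress=0, total=54
    ),
)
print(batch_fully_succeeded(example_batch))  # False, because failed == 1
```

In upload_file, a check like this could replace the bare `file_batch.status == "completed"` comparison so that a partially failed batch is not reported as a success.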

The same thing is happening with the Node SDK. The batch only returns the count of failed files, so you can't tell which ones to retry. Because of that, I'm going to upload each file individually so I know when one fails and can retry it on the spot.
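
For the Python code in the question, the per-file approach could look roughly like the sketch below. The names upload_each_with_retry and make_openai_uploader are made up for illustration, and the retry count and delay are arbitrary. The uploader assumes the SDK's per-file `client.beta.vector_stores.files.upload_and_poll(vector_store_id=..., file=...)` call is available in your version and returns an object with a `status` field, so check that against the SDK you have installed:

```python
import time
from typing import Callable, Dict, Iterable, List


def upload_each_with_retry(
    paths: Iterable[str],
    upload_one: Callable[[str], str],
    max_attempts: int = 3,
    delay_seconds: float = 1.0,
) -> Dict[str, List[str]]:
    """Upload files one at a time, retrying any that do not end as 'completed'.

    `upload_one` takes a path and returns a status string. Returns the paths
    grouped under 'completed' and 'failed' so failures can be identified.
    """
    result: Dict[str, List[str]] = {"completed": [], "failed": []}
    for path in paths:
        status = "failed"
        for attempt in range(1, max_attempts + 1):
            try:
                status = upload_one(path)
            except Exception as exc:  # treat a raised error like a failed upload
                status = f"error: {exc}"
            if status == "completed":
                break
            if attempt < max_attempts:
                # Simple linear backoff before the next attempt.
                time.sleep(delay_seconds * attempt)
        key = "completed" if status == "completed" else "failed"
        result[key].append(path)
    return result


def make_openai_uploader(client, vector_store_id: str) -> Callable[[str], str]:
    """Build an `upload_one` callable around the (assumed) per-file SDK call."""

    def upload_one(path: str) -> str:
        with open(path, "rb") as stream:
            vs_file = client.beta.vector_stores.files.upload_and_poll(
                vector_store_id=vector_store_id, file=stream
            )
        return vs_file.status

    return upload_one
```

Usage would be something like `upload_each_with_retry(file_paths, make_openai_uploader(client, vector_store_ids[0]))`. Uploading one file at a time is slower than a batch, but the result tells you exactly which paths still failed after all attempts.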
