Creating vector stores via threads vs. fileBatches

Our product uses the OpenAI API for a chat feature that runs user prompts against a client’s data set. The client data is split across hundreds of files to provide a good level of granularity for reference citations.

The client may specify an ad hoc set of data to query against, which requires a new vector store each time the data set changes. Good performance is essential; delays of several minutes while creating a new vector store result in a poor user experience.

We have found that calling vectorStores.fileBatches.createAndPoll() generally performs poorly. Even with just a handful of files, latency is always at least 3–5 seconds and sometimes spikes to 2 minutes for no apparent reason.
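For reference, the slow path looks roughly like this (a minimal sketch against the OpenAI Node SDK; `createBatchTimed` is a hypothetical helper name, the `beta` namespace is how older SDK versions expose vector stores, and the client is passed in so the logic can be exercised against a stub):

```javascript
// Hypothetical helper: run fileBatches.createAndPoll and report wall-clock latency.
// `client` is an OpenAI SDK instance (or any stub with the same shape).
async function createBatchTimed(client, vectorStoreId, fileIds) {
  const t0 = Date.now();
  // createAndPoll attaches already-uploaded files (from files.create) to the
  // store and polls until the batch reaches a terminal status.
  const batch = await client.beta.vectorStores.fileBatches.createAndPoll(
    vectorStoreId,
    { file_ids: fileIds },
  );
  return { batch, elapsedMs: Date.now() - t0 };
}
```

The elapsed time measured here is where we see the unexplained 3-second-to-2-minute spread.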

On the other hand, when a set of fileIds is passed to threads.create(), performance is much better. I have called this with over 100 fileIds successfully in around 1 second.
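The fast path is the thread-creation helper, roughly as follows (a sketch; `createThreadWithFiles` is a hypothetical name, and the `tool_resources.file_search.vector_stores` shape is the Assistants API mechanism for spinning up a new vector store at thread-creation time):

```javascript
// Hypothetical helper: create a thread whose file_search tool seeds a
// brand-new vector store from the given fileIds in a single call.
async function createThreadWithFiles(client, fileIds) {
  return client.beta.threads.create({
    tool_resources: {
      file_search: {
        vector_stores: [{ file_ids: fileIds }],
      },
    },
  });
}
```

Note that this call returns as soon as the thread exists; the backing vector store may still be ingesting files, which is where the reliability problems below come in.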

Unfortunately, threads.create() does not seem to be reliable. There is a limit of 500 fileIds, and when I pass it a large number of larger files, many of them fail; sometimes the upload never finishes and the vector_store status remains ‘in_progress’ indefinitely. There also doesn’t seem to be a “poll” version of thread create that returns once the vector store has been successfully created.
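In the absence of a built-in poll variant, a generic polling loop is one workaround (a sketch; `pollVectorStore` is a hypothetical helper, and the retrieve function is injected, e.g. `(id) => client.beta.vectorStores.retrieve(id)`, so the loop itself has no SDK dependency):

```javascript
// Hypothetical helper: poll a vector store until it leaves 'in_progress',
// giving up after timeoutMs so a stuck store cannot hang the caller forever.
async function pollVectorStore(
  retrieve,
  vectorStoreId,
  { intervalMs = 1000, timeoutMs = 120000 } = {},
) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const vs = await retrieve(vectorStoreId);
    if (vs.status !== 'in_progress') return vs; // e.g. 'completed' or 'expired'
    if (Date.now() >= deadline) {
      throw new Error(
        `vector store ${vectorStoreId} still in_progress after ${timeoutMs} ms`,
      );
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

The timeout turns the “in_progress indefinitely” failure mode into an explicit error the application can handle, rather than an indefinite wait.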

So here are my questions:

Why is the fileBatches.createAndPoll() API so slow?

Is there a way to make threads.create() reliable?

Can files be added to the thread vector store incrementally while maintaining the performance advantage?

What, ultimately, is the preferred way of uploading a large number of files to a vector store in a performant manner?