Weird 409 error: concurrency issue with vector stores

Hi everyone!

I’m using the Assistants API to extract information from a few PDFs. I have the following workflow (using the python SDK):

  • I make a single openai client and create a vector store
  • I upload the files to a vector store once using upload_and_poll
  • I then make a number of async calls in parallel; each one creates a separate thread and run, and each uses the file_search tool with this vector store

The error I’m getting is (only occasionally, and without any specific pattern):
Error code: 409 - {'error': {'message': 'The vector store was updated by another process. Please reload and try again.', 'type': 'concurrent_modification', 'param': None, 'code': None}}

I’m not modifying the vector store myself, so I suspect something is happening when too many threads access the same vector store? Does anyone have experience dealing with this 409 error?
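The `concurrent_modification` error type looks like server-side optimistic concurrency control: each write to the vector store must be based on its latest version, and a write that races with another update gets rejected with a 409. That is speculation about what the server does internally, but the pattern itself is easy to sketch (all names here are illustrative, not real API objects):

```python
class VersionedStore:
    """Toy model of optimistic concurrency: a write must present the
    version it was based on; a stale version is rejected, like an HTTP 409."""

    def __init__(self):
        self.version = 0

    def write(self, expected_version):
        if expected_version != self.version:
            raise RuntimeError(
                "409: The vector store was updated by another process."
            )
        self.version += 1

store = VersionedStore()

# Two "processes" read the current version at the same time...
seen_by_a = store.version
seen_by_b = store.version

# ...the first write wins and bumps the version...
store.write(seen_by_a)

# ...so the second write, based on a now-stale version, is rejected.
try:
    store.write(seen_by_b)
except RuntimeError as err:
    print(err)  # 409: The vector store was updated by another process.
```

If something on the server side (e.g. bookkeeping when a thread attaches the store) counts as a write, then many parallel thread creations against one store would occasionally collide exactly like this.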


For reference – the error occurs when I try to create the thread.


Did you solve this? I’ve got the same issue under the same circumstances. Seems like concurrent queries to the same vector store with the same assistant are the problem?

I didn’t; I’m just retrying when it fails, and that seems to work.
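The retry workaround can be wrapped in a small helper with exponential backoff. This is a generic sketch: `ConflictError` here is a stand-in for whatever your SDK raises on a 409 (the Python SDK has an `openai.ConflictError`, but catch whatever you actually observe), and `flaky_create_thread` is a fake call for demonstration:

```python
import time

class ConflictError(Exception):
    """Stand-in for the SDK's 409 exception."""

def retry_on_conflict(fn, attempts=5, base_delay=0.5):
    """Call fn(); on a 409 conflict, back off exponentially and retry."""
    for attempt in range(attempts):
        try:
            return fn()
        except ConflictError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error
            time.sleep(base_delay * 2 ** attempt)

# Demo: a fake thread-creation call that 409s twice, then succeeds.
calls = {"n": 0}

def flaky_create_thread():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConflictError("The vector store was updated by another process.")
    return "thread_abc123"

result = retry_on_conflict(flaky_create_thread, base_delay=0.01)
```

The backoff matters: retrying immediately tends to re-collide with the same in-flight update, while a short growing delay lets the other write finish first.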


Can you check what version of the Python lib you have?

pip show openai

Just checked; I’m on version 1.42.0!

I had a similar issue/error using the Node SDK and the Uploads API.

error: {
  message: 'The vector store was updated by another process. Please reload and try again.',
  type: 'concurrent_modification',
  param: null,
  code: null
},
code: null,
param: null,
type: 'concurrent_modification'

Retrying did work, but on average it was taking me 3 attempts to get through 20 small file uploads.

My initial assumption was that I was misusing async/await somewhere, and possibly that I need to complete all Uploads before updating the Vector Store to add the uploaded files to it.

I was using a .map() operation on an array of files, and it was processing the files using concurrent async operations.

The solution I settled on was to remove the .map() and replace it with:

for (const file of array) {
  ...
  await upload_process_for_each(file); // must run inside an async function
}

Doing so converted all of the concurrent async operations into sequential processing operations, completely processing each uploaded file before moving onto and starting the next file’s upload process.

It takes significantly longer, but it is much cleaner. I stopped receiving the { type: 'concurrent_modification' } error, and all of my larger (600–1000 KB) .pdf files stopped failing on upload.

I’d love some additional details on this issue.

We are getting the “The vector store was updated by another process. Please reload and try again.” error even when we do not touch the vector stores themselves.

We work with pre-configured vector stores that are then assigned to an Assistant thread on demand, using a fixed Assistant; see the following code snippet:

        const threadCreateParams: ThreadCreateParams = {
            tool_resources: {
                file_search: {
                    vector_store_ids: vectorStoreId ? [ vectorStoreId ] : [],
                },
            },
            messages: [ { role: 'user', content: userPrompt } ],
            metadata: { projectId, sessionId },
        };
        const thread = await this.openai.beta.threads.create(
            threadCreateParams,
        );

Then we run the assistant using the standard createAndPoll call:

        const run = await this.openai.beta.threads.runs.createAndPoll(
            thread.id,
            runCreateParams,
            { pollIntervalMs: POLL_INTERVAL_MS },
        );

Using OpenAI JS SDK v4.63.

Do the calls above modify the vector store and warrant the error?

Can the Assistant object re-use cause these issues?