Assistant with document_search enabled - long response time

We have an Assistant with document search enabled, connected to a vector store. We use the gpt-4o model and API version 2, and we connect to the Assistant through the OpenAI Assistants API. The vector store holds about 30 PDF documents. We experience very long response times from the Assistant. We believe the delay comes from having to check the run status repeatedly until it is completed before we can fetch the response message. It takes around 7 polling iterations before the status of the thread run becomes completed, and only then can we get the messages data.

Here are the steps our code follows:

• First, it retrieves the assistant based on the assistant id.

• Then, it creates a thread.

• If the thread is created successfully, it creates a message in the thread containing the user's text from the chat box.

• If the thread message is posted successfully, it creates a run with the thread id, assistant id and instructions. The run is what causes the posted message to be processed.

• To track the run, it checks the run status in a loop for as long as the status is still queued or in progress.

• Once the run status API returns completed, it gets the list of messages in the thread, which contains the response from the AI, and displays it in the chat box.
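The polling step in the list above could be sketched roughly as follows. This is a minimal sketch, not the poster's actual code: it assumes the official `openai` Python SDK, where a run is fetched with `client.beta.threads.runs.retrieve(...)`. The helper name `wait_for_run` and the `interval`, `timeout` and `sleep` parameters are hypothetical and exist only for illustration.

```python
import time

# Statuses that mean the run is still working; any other status ends the loop.
PENDING_STATUSES = {"queued", "in_progress", "cancelling"}


def wait_for_run(client, thread_id, run_id, interval=0.5, timeout=120.0, sleep=time.sleep):
    """Poll a run until it leaves the pending statuses.

    `client` is assumed to behave like `openai.OpenAI()`.
    Returns (run, polls), where polls is the number of status checks made.
    Raises TimeoutError if the run is still pending once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    polls = 0
    while True:
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
        polls += 1
        if run.status not in PENDING_STATUSES:
            # May be "completed", but also "failed", "expired", etc.
            return run, polls
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} still {run.status} after {timeout}s")
        sleep(interval)
```

The caller still has to check that `run.status == "completed"` before listing the thread messages. Each pass through the loop adds up to `interval` seconds on top of the time the run itself takes, so the number of iterations mostly reflects how long the run is and how often it is checked; a shorter interval does not make the file search itself faster.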

I would be very interested in any comments on the process. Is our logic and thinking wrong? The current response time is about 20 seconds.

//Thanks

If you are already tracking the IDs of the assistants and threads that exist, you can submit messages and runs directly to them without making extra calls to see if they work.
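As a rough sketch of that idea, assuming the official `openai` Python SDK and that your application already stores the assistant ID and (for returning users) the thread ID, something like this skips the assistant lookup entirely. The function name `submit_message` is hypothetical.

```python
def submit_message(client, assistant_id, thread_id, user_text):
    """Post a user message and start a run using stored IDs only.

    `client` is assumed to behave like `openai.OpenAI()`.
    No call is made to retrieve the assistant; a thread is created
    only when no stored thread_id is available.
    Returns (thread_id, run).
    """
    if thread_id is None:
        # First message of a conversation: one unavoidable extra call.
        thread_id = client.beta.threads.create().id

    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_text,
    )
    run = client.beta.threads.runs.create(
        thread_id=thread_id,
        assistant_id=assistant_id,
    )
    return thread_id, run
```

If an ID turns out to be stale, the message or run call itself will raise an error, so a separate check up front is not needed to find that out. This only saves a round trip or two per question; most of the 20 seconds is likely spent inside the run.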

You can also investigate receiving a streaming response. All the file searching still goes on, but you can give the illusion of a faster answer by displaying the final response as it is received.
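A minimal sketch of the streaming approach, assuming the official `openai` Python SDK and its `client.beta.threads.runs.stream(...)` helper, which yields text chunks through `text_deltas`. It also assumes the user message has already been posted to the thread. The names `stream_reply` and `on_text` are hypothetical.

```python
def stream_reply(client, assistant_id, thread_id, on_text):
    """Start a run and forward the reply text chunk by chunk.

    `client` is assumed to behave like `openai.OpenAI()`.
    `on_text` is called with each text fragment as it arrives, for
    example to append it to the chat box. Returns the full reply text.
    """
    parts = []
    with client.beta.threads.runs.stream(
        thread_id=thread_id,
        assistant_id=assistant_id,
    ) as stream:
        for delta in stream.text_deltas:
            on_text(delta)       # show the fragment immediately
            parts.append(delta)  # keep it so the full text can be returned
    return "".join(parts)
```

For a quick console test, `on_text` could be `lambda t: print(t, end="", flush=True)`. This replaces the status polling loop: the stream ends when the run finishes, so there is no separate call to list the messages afterwards.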