Assistant v2 not using vector store via API

I built an app using the Assistants API with a vector store attached to the assistant. If I ask a question via the Playground, the assistant is able to retrieve data from the vector store (via embeddings). But if I use the API call, it does not use the vector store to retrieve relevant data.
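One thing worth double-checking: the Playground enables the file_search tool for you, but an API-created assistant/run only retrieves if the tool and the vector store are wired up explicitly. A minimal sketch of the relevant request parameters, shown as plain dicts with placeholder IDs (the IDs and model name are illustrative, not from the original post):

```python
# Parameters for creating/updating the assistant: file_search must be in
# "tools" AND the vector store must be attached via "tool_resources".
# If either is missing, the API-side assistant will answer without retrieval.
assistant_params = {
    "model": "gpt-4o",
    "tools": [{"type": "file_search"}],  # enable retrieval explicitly
    "tool_resources": {
        "file_search": {"vector_store_ids": ["vs_PLACEHOLDER"]}
    },
}

# A run can also pass tools; if a run created via the API overrides tools
# with an empty list, retrieval is silently skipped even though the same
# assistant works in the Playground.
run_params = {
    "assistant_id": "asst_PLACEHOLDER",
    "tools": [{"type": "file_search"}],
}

print(assistant_params["tools"][0]["type"])  # file_search
```

These dicts map onto the create-assistant and create-run calls; comparing them against what your code actually sends is a quick way to rule out a missing tool configuration.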

Remember, the same assistant works in the Playground, but not via the API. Any idea why this may be happening?


I am having the same issue. No clue how to fix it yet.


Seeing that this is a known issue without any solution at the moment, I’ve decided to mix manual embeddings with the Assistants API, and the result is excellent. Token usage drops from an average of 10k (in the Playground with the vector store) to just 1k–2k with manual embeddings.

Here’s my workflow, if anyone is interested:

From Admin Dashboard

  • Admin selects file(s) to upload
  • Files are uploaded to web server storage and also uploaded to the OpenAI Files API
  • The OpenAI file ID is attached to the vector store and also saved to the local DB
  • A queue worker is fired to extract file content from local file storage on a page-by-page basis
  • An embedding is created for each page via the Embeddings API (ensuring there is overlap across pages)
  • The embedding is saved to a local database table together with the text

On the Chat Interface

  • User asks a question
  • An embedding is created for the question
  • A vector search is carried out on my local embedding store and relevant context is retrieved (with distance < 0.6)
  • The context (if found) and the question are added to the thread
  • The thread is executed and the result is retrieved
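The local vector search step can be sketched as follows. I am assuming "distance" means cosine distance (1 minus cosine similarity); since OpenAI embeddings are unit-normalized, the dot product alone would also work, but the full formula is shown for clarity:

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity; 0 = identical direction, 2 = opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def retrieve(question_vec, store, max_distance=0.6):
    """store: list of (text, embedding) rows from the local table.
    Returns texts within the threshold, closest first."""
    scored = [(cosine_distance(question_vec, vec), text) for text, vec in store]
    return [text for dist, text in sorted(scored) if dist < max_distance]

# Toy 2-d vectors standing in for real embedding rows.
store = [
    ("relevant passage", [1.0, 0.0]),
    ("unrelated passage", [0.0, 1.0]),
]
print(retrieve([0.9, 0.1], store))  # ['relevant passage']
```

In production this loop would be replaced by an indexed query (e.g. pgvector or a similar extension), but the threshold logic is the same.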

**Not much different from manual embedding**
I know this process is not much different from the old approach of using a vector DB with manually created embeddings; however, it serves the following purposes:

  • For now, it is a temporary workaround for the vector store issue; once the issue is fixed, I can disable the manual-embedding module
  • Unlike with the Completions API, I don’t have to worry about sending the conversation history, since that is already part of the Assistants API thread. I only need to include the new question and any relevant context retrieved via embedding
  • Token usage is much lower and more controllable for now. Again, once the vector store issue is sorted, I can rely on OpenAI to retrieve the relevant context