Assistant gets data from wrong vector store file

I have an assistant that is supposed to extract data from various pdf files.
For this matter, the files are receipts and the assistant is supposed to extract the total payment amount.
The instructions are the same for all files, but the id of the file is sent in the message via the assistant API.
I have the following flow (all done sequentially via the OpenAI API):

  1. upload a file
  2. add the file to the vector store
  3. use the assistant API to send the file name to the assistant and get the extracted data in the response
  4. delete the file from the vector store
  5. delete the file from the (general) files

The problem is that often I see that the assistant extracting data that does not exist in the requested file, but only exists in a different file than the file that was requested.
Any idea why this happens and if there is a way to avoid this?

1 Like

+1 to this issue. I have embeddings in my vector store, and sometimes when you ask it a question that can give more than one result, it works fine, but if you ask it about something specifically and the result is 1 record, it gets it wrong.
For simplicity: my data is json with “id” and “name”, when I ask it a question pertaining to a record, it matches the name just fine, but the id I get is the wrong one…

1 Like