I have an assistant that is supposed to extract data from various pdf files.
For this matter, the files are receipts and the assistant is supposed to extract the total payment amount.
The instructions are the same for all files, but the id of the file is sent in the message via the assistant API.
I have the following flow (all done sequentially via the OpenAI API):
- upload a file
- add the file to the vector store
- use the assistant API to send the file name to the assistant and get the extracted data in the response
- delete the file from the vector store
- delete the file from the (general) files
The problem is that often I see that the assistant extracting data that does not exist in the requested file, but only exists in a different file than the file that was requested.
Any idea why this happens and if there is a way to avoid this?