I have a vector store connected to an assistant. The data in the vector store is gathered by converting images to text using vision technology.
My question is: I want to update parts of the data periodically. I have the updated data, but I am unable to locate the old data in the store in order to replace it.
Does anyone have a solution or recommendation for this?
Hey. I think the solution is to store the vector store file ID that corresponds to each image, so that you know which file ID has to be replaced when the information about an image changes. I had the same issue and decided to use CloudKit to store the references (I'm working in Xcode / Apple): create a database with a simple dictionary (or something more complex if you want). Remember that the file ID changes every time you update, so you need to update the database every time as well. Or is there a way to really update a file in one call? I am currently deleting and re-uploading.
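A rough sketch of that bookkeeping, in Python rather than CloudKit. The class name `FileIdRegistry` and the `upload` / `delete` callables are hypothetical placeholders for whatever actually uploads the new file and removes the old one; the in-memory dictionary stands in for a real database.

```python
from typing import Callable, Dict, Optional


class FileIdRegistry:
    """Remembers which file ID currently holds the text for each image."""

    def __init__(self) -> None:
        # image key -> current file ID (a real app would persist this)
        self._ids: Dict[str, str] = {}

    def current(self, image_key: str) -> Optional[str]:
        return self._ids.get(image_key)

    def replace(
        self,
        image_key: str,
        upload: Callable[[], str],
        delete: Callable[[str], None],
    ) -> str:
        """Upload the new version, record its ID, then remove the old one."""
        old_id = self._ids.get(image_key)
        new_id = upload()              # hypothetical: returns the new file ID
        self._ids[image_key] = new_id  # the ID changes on every update
        if old_id is not None:
            delete(old_id)             # hypothetical: removes the stale file
        return new_id
```

Uploading before deleting means the store is never left without a version of the document; if a brief overlap of old and new chunks is a problem, the order could be swapped.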
Hi @ephraim1
Thank you for your reply.
How about adding a modification date as metadata to each piece of data? Then, when I want to add an updated version, I just add the whole piece with the newer modification date, without removing the outdated data.
Would ChatGPT know to prefer the latest modification date when there are two nearly identical pieces of data with different modification dates?
As far as I am aware, you need to delete the old one and upload the new one. When entries differ only in metadata and dates, there is no way to tell which of them the vector search will include.
Currently, OpenAI does not support direct updates to vector stores. To make updates, you must delete the existing vector store and recreate it with the updated set of files.
Currently, OpenAI supports active updates to vector stores. There is no need to delete an existing vector store to add more documents or to remove others.
To remove a document that you have attached to a vector store, you would use the delete vector store file method.
Delete a vector store file. This will remove the file from the vector store but the file itself will not be deleted. To delete the file, use the delete file endpoint.
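As a minimal sketch of those two calls, using only Python's standard library: the helper names are made up for illustration, and the `OpenAI-Beta: assistants=v2` header is an assumption that may not be needed on your API version. The functions only build the requests; nothing is sent until you pass one to `urllib.request.urlopen`.

```python
import urllib.request

API_BASE = "https://api.openai.com/v1"


def _headers(api_key: str) -> dict:
    # The beta header is an assumption; drop it if your API version
    # does not require it.
    return {
        "Authorization": f"Bearer {api_key}",
        "OpenAI-Beta": "assistants=v2",
    }


def build_detach_request(
    vector_store_id: str, file_id: str, api_key: str
) -> urllib.request.Request:
    """Remove the file from the vector store; the file itself remains."""
    url = f"{API_BASE}/vector_stores/{vector_store_id}/files/{file_id}"
    return urllib.request.Request(url, method="DELETE", headers=_headers(api_key))


def build_delete_file_request(file_id: str, api_key: str) -> urllib.request.Request:
    """Delete the underlying file via the delete file endpoint."""
    url = f"{API_BASE}/files/{file_id}"
    return urllib.request.Request(url, method="DELETE", headers=_headers(api_key))
```

To actually run it, something like `urllib.request.urlopen(build_detach_request(store_id, file_id, key))` would send the request.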
If you have lost the association between original names and file IDs, you can use the List vector store files method:
GET https://api.openai.com/v1/vector_stores/{vector_store_id}/files
Returns a list of vector store files. From that list, you can get the file IDs and then look up each file's original file name.
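A sketch of rebuilding the name-to-ID association from that list. It assumes the parsed JSON response has a `data` array whose entries carry the file ID in `id`, and that the original name is looked up separately per file (for example from the `filename` field returned when retrieving a file by ID). `map_names_to_ids` and the `get_filename` callable are hypothetical names, and pagination is left out.

```python
from typing import Callable, Dict, List


def map_names_to_ids(
    list_response: dict, get_filename: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group vector store file IDs by their original file name."""
    mapping: Dict[str, List[str]] = {}
    for item in list_response.get("data", []):
        file_id = item["id"]
        # One name can map to several IDs if the same document
        # was uploaded more than once.
        mapping.setdefault(get_filename(file_id), []).append(file_id)
    return mapping
```

Any name that ends up with more than one ID is a candidate for cleanup, since both versions are still attached to the store.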
You do not actually need to delete a file to add another, even one with the same file name, because files are referenced by ID. It simply means that chunks from both versions of the document can be returned by searches. You can decide whether that is acceptable for your application. The AI can read any metadata you include in a text file, such as an updated date, provided the total file length is smaller than the chunk size, so that the in-context metadata cannot be split from the content it describes.
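To illustrate the in-file metadata idea, here is a small sketch that prepends an updated date to the text and makes a rough guess at whether the result stays within one chunk. The 800-token chunk size and the 4-characters-per-token ratio are assumptions for illustration only (check the chunking settings you actually use, and use a real tokenizer for anything precise); the function names are made up.

```python
from datetime import date

ASSUMED_CHUNK_TOKENS = 800     # assumption: adjust to your chunking settings
ASSUMED_CHARS_PER_TOKEN = 4    # assumption: very rough average for English


def with_updated_header(body: str, updated: date) -> str:
    """Put the modification date in the text itself so it travels with it."""
    return f"Last updated: {updated.isoformat()}\n\n{body}"


def probably_fits_one_chunk(
    text: str,
    chunk_tokens: int = ASSUMED_CHUNK_TOKENS,
    chars_per_token: int = ASSUMED_CHARS_PER_TOKEN,
) -> bool:
    """Heuristic only: estimates tokens from the character count."""
    return len(text) / chars_per_token <= chunk_tokens
```

If the check fails, the date header may land in a different chunk from some of the content, which is the split described above.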
You can then proceed to upload your new file with data, and attach it to the vector store.
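A minimal sketch of the attach step, assuming the new file has already been uploaded and you have its file ID. It builds a POST to the vector store's files endpoint with the file ID in the JSON body; the helper name is hypothetical and the `OpenAI-Beta` header is an assumption that may not be needed. The request is only constructed here, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.openai.com/v1"


def build_attach_request(
    vector_store_id: str, file_id: str, api_key: str
) -> urllib.request.Request:
    """Attach an already-uploaded file to a vector store."""
    body = json.dumps({"file_id": file_id}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/vector_stores/{vector_store_id}/files",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "OpenAI-Beta": "assistants=v2",  # assumption: may be optional
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; remember to record the new file ID wherever you track the mapping.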