If the documents uploaded to the vector store started life as web pages (or anything else accessed through a link), it makes sense for a query response to return the original link rather than some local file name. However, there doesn’t seem to be any way to store arbitrary metadata with the files in the vector store. Even if query by metadata isn’t supported (yet), can’t there be a standard way to store that metadata?
Hi!
Adding metadata to the embeddings is quite normal when using a separate vector database. This is currently not supported by the Assistants API, unless you store each file with an ID as its filename that lets you easily map back to the URL.
What separate vector databases do you recommend?
I’ve been using Pinecone because of its good integration of metadata and vector search, but I haven’t done extensive comparisons.
I prefer to not endorse any specific provider without a specific reason.
Here is a rather comprehensive list from Wikipedia:
And another list: Vector Databases (are All The Rage) | by Christoph Bussler | Google Cloud - Community | Medium
Or just build your own in less than 10 minutes (maybe an hour if you have little experience).
If you know how to normalize a vector and how to sort a list, you should be good up to a thousand entries per store.
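For what it's worth, here is roughly what that could look like: a rough sketch using only the standard library, not a production design. The class name TinyVectorStore and the metadata layout are my own assumptions, and the embeddings are hand-written toy vectors; in practice they would come from whatever embedding model you use. Once the vectors are normalized, the dot product is the cosine similarity, so a query is just "score everything, then sort".

```python
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


class TinyVectorStore:
    """Brute-force in-memory store; every entry carries arbitrary metadata."""

    def __init__(self) -> None:
        self._entries: list[tuple[list[float], dict]] = []

    def add(self, embedding: list[float], metadata: dict) -> None:
        # Normalize once at insert time so queries only need a dot product.
        self._entries.append((normalize(embedding), metadata))

    def query(self, embedding: list[float], top_k: int = 3) -> list[tuple[float, dict]]:
        q = normalize(embedding)
        scored = []
        for vec, meta in self._entries:
            if len(vec) != len(q):
                raise ValueError("embedding dimensions do not match")
            # Dot product of unit vectors equals cosine similarity.
            score = sum(a * b for a, b in zip(q, vec))
            scored.append((score, meta))
        # Highest similarity first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]


store = TinyVectorStore()
store.add([1.0, 0.0], {"url": "https://example.com/a"})
store.add([0.0, 1.0], {"url": "https://example.com/b"})
best_score, best_meta = store.query([0.9, 0.1], top_k=1)[0]
print(best_meta["url"])
```

Because the metadata is just a dict stored next to each vector, the original URL comes back with every hit, which is exactly what the original question was after.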