Vector Store Database recommendations for chat app (not assistant?)

I’m looking at adding a vector store for RAG lookups in an internal chat assistant. I’m not currently planning to use OpenAI Assistants for this, as we want to be able to add other LLMs in the future.

Does anyone have a recommendation for a good solution (open source or paid) that can handle short-lived vector stores (i.e. for files uploaded within threads) as well as longer-lived vector stores for knowledge-base-style assistants with more of an “upload once and query” workflow?

Looking at Pinecone, CouchDB, and Weaviate, what I have noticed is that these DBs don’t tend to support an “expiry time” out of the box, so they will need some manual clean-up.

Example use cases:

  • Global assistant for multi-user chats - a single persistent vector store, potentially with a thread-specific data store for thread files
  • User Knowledge Base - a single persistent vector store that is available to any assistant/bot choice
  • Threads - short-lived data stores, kept for 14 days or so

Any recommendations?

Weaviate is a good choice. You can easily add an “expiry time” to the metadata and clean these objects in your code.
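A minimal sketch of that expiry-in-metadata idea, written against a plain in-memory dict rather than any real vector DB client. The names `StoredChunk`, `expires_at`, and `purge_expired` are hypothetical and only illustrate the pattern; in Weaviate the equivalent would presumably be a date property on each object plus a filtered delete run on a schedule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class StoredChunk:
    """One embedded chunk plus its metadata (hypothetical shape)."""
    chunk_id: str
    vector: list[float]
    text: str
    expires_at: Optional[datetime] = None  # None means "keep forever"


def purge_expired(chunks: dict[str, StoredChunk], now: datetime) -> list[str]:
    """Delete every chunk whose expires_at has passed; return the removed ids."""
    expired = [
        cid for cid, chunk in chunks.items()
        if chunk.expires_at is not None and chunk.expires_at <= now
    ]
    for cid in expired:
        del chunks[cid]
    return expired


# Example: one persistent knowledge-base chunk and two thread uploads.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
store = {
    "kb-1": StoredChunk("kb-1", [0.1, 0.9], "Knowledge base text"),
    "thread-1": StoredChunk("thread-1", [0.8, 0.2], "Old thread upload",
                            expires_at=now - timedelta(days=1)),
    "thread-2": StoredChunk("thread-2", [0.5, 0.5], "Recent thread upload",
                            expires_at=now + timedelta(days=13)),
}
removed = purge_expired(store, now)
```

Running `purge_expired` from a daily job would cover the "14 days or so" case: thread uploads get an `expires_at` 14 days out at write time, while knowledge-base chunks leave it unset.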

Ace, thanks! From a setup perspective with Weaviate, are you creating separate collections for each thread and then cleaning them up, or using filter key(s) on a single collection for each user/tenant?

I use my local db for chat histories (threads). I keep them as I may eventually use them to fine-tune – or add them as embeddings to enhance search results. Haven’t quite decided yet.

I think you should post your question on the Weaviate forum – I believe they may have an out-of-the-box solution for you.

I’m also keeping chat histories local. It’s the files within those threads, and making them vector-searchable, that I’m trying to figure out.

I.e. the user uploads a file, it gets uploaded to the vector DB, and queries then go to the thread vector DB or the persistent vector DB depending on the request type.
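Roughly, the routing step could look like the sketch below. The request types and the store naming scheme (`thread-<id>`, `knowledge-base`) are assumptions made up for illustration, not anything a particular vector DB prescribes.

```python
from enum import Enum


class RequestType(Enum):
    """Hypothetical request categories for deciding where to search."""
    THREAD_FILES = "thread_files"
    KNOWLEDGE_BASE = "knowledge_base"
    BOTH = "both"


def stores_for_request(request_type: RequestType, thread_id: str) -> list[str]:
    """Return the names of the vector stores to query for this request."""
    thread_store = f"thread-{thread_id}"   # short-lived, per-thread uploads
    persistent_store = "knowledge-base"    # long-lived, shared content
    if request_type is RequestType.THREAD_FILES:
        return [thread_store]
    if request_type is RequestType.KNOWLEDGE_BASE:
        return [persistent_store]
    return [thread_store, persistent_store]
```

For example, `stores_for_request(RequestType.BOTH, "42")` names both the thread store and the persistent store, so the results can be merged before being passed to the LLM.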

Will post in the Weaviate forum, thanks.

I see now. Mine is a completely different use case. But if I were to use your approach, it would be through tenancy or filtering. I seem to recall that Weaviate came out with something that may address your need.
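To make the filtering option concrete, here is a small library-agnostic sketch: one shared collection where every row carries a `thread_id` key, and searches are restricted to a single thread before ranking. The row layout and the `search` helper are hypothetical, and the brute-force cosine ranking only stands in for whatever index a real vector DB would use.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(rows: list[dict], query: list[float],
           thread_id: Optional[str] = None, top_k: int = 3) -> list[str]:
    """Rank rows by similarity to query, keeping only rows with this thread_id.

    thread_id=None selects the shared rows (those stored without a thread).
    """
    candidates = [r for r in rows if r["thread_id"] == thread_id]
    ranked = sorted(candidates, key=lambda r: cosine(r["vector"], query),
                    reverse=True)
    return [r["text"] for r in ranked[:top_k]]


# One collection holding two threads plus shared knowledge-base content.
rows = [
    {"thread_id": "t1", "vector": [1.0, 0.0], "text": "t1 invoice"},
    {"thread_id": "t1", "vector": [0.0, 1.0], "text": "t1 notes"},
    {"thread_id": "t2", "vector": [1.0, 0.0], "text": "t2 invoice"},
    {"thread_id": None, "vector": [1.0, 0.0], "text": "shared handbook"},
]
t1_hits = search(rows, [1.0, 0.1], thread_id="t1")
```

The trade-off versus one collection per thread is that clean-up becomes a filtered delete instead of dropping a whole collection, but there is only one schema to manage.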