This is a general question about how to approach my project. I assume this is an incredibly common issue. I have a website with two types of data:
1. Company data. This is public information and would be available to all our users.
2. User data. This is private, and access is limited to the logged-in user.
Each user should be able to chat with a unique combination of 1 & 2.
I’ve been building a threaded chat tool using the Assistants API and integrating tool usage, but I haven’t progressed to using data sources yet. So far, all information in the chat threads has been passed in via the prompt.
What I’m looking for is the best way to tackle this problem. If my dataset were only the public company data, I could solve it by using files in Assistants or a vector DB like Pinecone (I’m very new to vector DBs). But because combining the public and private data makes every user’s dataset unique, I’m not sure of the right approach.
Do I build something using Pinecone that maintains a separate vector DB for every single user on the website, or is there a better approach?
Could this be accomplished using Assistant files, given the large number of potential users?
Am I missing basic concepts of how this all works (totally possible)?
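To make the question concrete, the alternative to one vector DB per user might look roughly like the sketch below: a single shared store where every record carries an owner tag, and each query is restricted to public records plus the asking user's own. This is only a toy illustration in plain Python. `SharedStore`, `Record`, `PUBLIC`, and the letter-count `embed` function are all hypothetical names invented for the example. They are not part of Pinecone or the Assistants API, and a real setup would use an actual embedding model.

```python
from __future__ import annotations

import math
from dataclasses import dataclass

PUBLIC = "public"  # hypothetical owner tag marking shared company data


@dataclass
class Record:
    owner: str           # PUBLIC or a user id
    text: str
    vector: list[float]


def embed(text: str) -> list[float]:
    """Toy stand-in for a real embedding model: letter-frequency counts."""
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SharedStore:
    """One store for everyone; access is enforced by filtering on owner."""

    def __init__(self) -> None:
        self.records: list[Record] = []

    def add(self, owner: str, text: str) -> None:
        self.records.append(Record(owner, text, embed(text)))

    def query(self, user_id: str, question: str, top_k: int = 3) -> list[str]:
        q = embed(question)
        # Filter first: only public records and this user's own records.
        visible = [r for r in self.records if r.owner in (PUBLIC, user_id)]
        ranked = sorted(visible, key=lambda r: cosine(q, r.vector), reverse=True)
        return [r.text for r in ranked[:top_k]]


store = SharedStore()
store.add(PUBLIC, "Company annual report")
store.add("alice", "Alice private notes")
store.add("bob", "Bob private notes")

# Alice can retrieve public data and her own data, but never Bob's.
print(store.query("alice", "notes", top_k=10))
```

The point of the sketch is that the owner filter is applied before ranking, so another user's private records can never appear in the results, whatever the similarity scores are. Whether a hosted vector DB or Assistant files can support this kind of per-record filtering at scale is exactly what I'm unsure about.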