Optimizing Chatbot Conversations: Strategies for Effective User-Refinement Integration

Hello, I am in the process of developing a chatbot that utilizes RAG to respond to questions. My system includes a vector store containing all relevant data, and I archive all conversations. My goal is to enhance specific AI-generated responses through user refinement. To achieve this, I am contemplating creating a new entry in the vector database with the following format:
question: {a question}
answer: {an answer}
I intend for this data to carry more weight than other information in the database. Initially, I attempted to communicate this prioritization to GPT by incorporating rules in the prompt, but encountered inconsistency in the responses.

Subsequently, I explored fine-tuning the model with examples that encompassed both general and user-refinement contexts. While this approach yielded success, particularly in refining responses based on context, it faced challenges with simpler queries such as greetings.

An alternative consideration was maintaining two distinct vector stores—one for general knowledge and the other for user refinement. The idea is to first search for a response in the user-refinement vector store. If unsuccessful, the general knowledge vector store would be the fallback option. However, I encountered an issue with the similarity search, as it sometimes yielded unrelated vectors.
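For reference, the two-store fallback described above could be sketched roughly as follows. This is only an illustration of the control flow, not tied to any particular vector database: `retrieve_with_fallback`, the two search callables, and the `min_score` value of 0.8 are all hypothetical names and numbers, and the sketch assumes each store returns `(text, score)` pairs where a higher score means a closer match.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: a search function takes the query text and returns
# (text, score) pairs, where a higher score means a closer match.
Hit = Tuple[str, float]
SearchFn = Callable[[str], List[Hit]]


def retrieve_with_fallback(
    query: str,
    search_refinements: SearchFn,
    search_general: SearchFn,
    min_score: float = 0.8,  # arbitrary placeholder; tune on real data
) -> Tuple[str, List[Hit]]:
    """Prefer user-refinement hits; fall back to the general store."""
    # Keep only refinement hits that clear the minimum score.
    refined = [hit for hit in search_refinements(query) if hit[1] >= min_score]
    if refined:
        refined.sort(key=lambda hit: hit[1], reverse=True)
        return "refinement", refined

    # Nothing in the refinement store was close enough, so use general knowledge.
    general = sorted(search_general(query), key=lambda hit: hit[1], reverse=True)
    return "general", general
```

The returned label makes it possible to tell the model in the prompt whether the context came from a user refinement or from general knowledge.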

I am curious if there is a way to establish a minimum score threshold for the similarity search. What would be the most effective approach to implement this feature? Any insights or suggestions would be greatly appreciated.

Is there any other way to think about this problem?

Welcome to the community!

Hmm… this one’s tough.
There’s a reason that none of the AI companies have personalized their chat experiences for users (yet). This isn’t an easy problem to solve with the tools available.

I actually think fine-tuning is the safer bet here, although it comes with its own complexities and difficulties. As you’ve seen, it has had the most success so far for something like this.

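To directly answer your question about similarity search, I would look into cosine similarity (Cosine similarity - Wikipedia). It gives you a score for every match, so you can discard anything that falls below a cutoff you choose. This is probably the best tool in your arsenal for something like that.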
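As a minimal sketch of what a cosine-similarity cutoff could look like, here is a plain-Python version with no vector database involved. The function names, the `(text, vector)` entry layout, and the default cutoff of 0.8 are assumptions made up for illustration; a good cutoff depends on the embedding model and has to be found by testing on real queries.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search_above_threshold(
    query_vec: Sequence[float],
    entries: List[Tuple[str, Sequence[float]]],  # hypothetical (text, embedding) pairs
    min_score: float = 0.8,  # arbitrary placeholder cutoff
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return up to top_k entries whose similarity to the query is >= min_score."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in entries]
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

An empty result here means "no sufficiently similar entry", which is the signal to skip the refinement data for that query instead of passing unrelated text to the model.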

However, I’m not so sure that’s going to actually solve your problem in the greater scheme of things. Maybe implementing user refinement as contextual data would be good, but it’s probably going to give marginal results at best. The RAG approach can’t be done in isolation either; you would still need to make sure that each user’s refinement data is only accessible to their own instances, which would involve its own data structure. Otherwise, you have no way of clearly verifying that it will retrieve the right vector store for the right user 100% of the time, and that could lead to some big problems. Remember too that knowledge retrieval is just adding to the contextual data, so in the end, is this really the most efficient way to do this?
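To make the per-user isolation point concrete, one possible data structure is to tag every stored entry with an owner and filter on that tag before any similarity search runs. This is only a sketch under assumed names (`Entry`, `owner_id`, `visible_entries` are hypothetical); many vector databases offer some form of metadata filtering that could play the same role, but the details vary by product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    text: str
    owner_id: Optional[str]  # None marks shared general knowledge


def visible_entries(entries: List[Entry], user_id: str) -> List[Entry]:
    """Keep shared entries plus the refinements owned by this user only."""
    return [e for e in entries if e.owner_id is None or e.owner_id == user_id]


# Example: alice never sees bob's refinement, because it is filtered out
# before any similarity scoring would happen.
store = [
    Entry("General product FAQ", None),
    Entry("Alice prefers short answers", "alice"),
    Entry("Bob wants answers in French", "bob"),
]
alice_view = visible_entries(store, "alice")
```

Filtering first, and scoring second, means a high similarity score can never pull in another user's data.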

If it were me, I would suggest going back to fine tuning approaches. However, please be aware this doesn’t scale well. So, unfortunately, there’s no solid solution for this yet. This is likely going to be a big undertaking and a lot of advanced ML research in order to come up with an efficient solution on your own. Handling this for yourself or for personal use is more doable, but in terms of an actual user base, this becomes extremely tough to figure out.