Handling Long Conversations with Context Management

I am building out a chatbot that needs contextual awareness of the current conversation as well as history since the conversation started.
To do this I am employing a hybrid approach: the last N messages plus vector-based retrieval of relevant Q/A pairs from earlier in the same conversation.
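For readers who want something concrete, here is a minimal sketch of that hybrid context builder. It is an illustration under assumptions, not my actual implementation: `embed` is a toy bag-of-words stand-in for a real embedding model, and `build_context`, `last_n`, and `top_k` are hypothetical names.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model (assumption): word counts.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_context(qa_pairs, new_message, last_n=2, top_k=1):
    """qa_pairs: chronological list of (question, answer) tuples.

    Returns the top_k most similar older pairs followed by the last_n pairs.
    """
    recent = qa_pairs[-last_n:] if last_n > 0 else []
    older = qa_pairs[:-last_n] if last_n > 0 else list(qa_pairs)
    query = embed(new_message)
    scored = [(cosine(query, embed(q + " " + a)), i)
              for i, (q, a) in enumerate(older)]
    # Drop pairs with no overlap, then rank by similarity (ties: oldest first).
    scored = sorted((s for s in scored if s[0] > 0), key=lambda x: (-x[0], x[1]))
    # Re-sort the winners by position so the context stays chronological.
    retrieved = [older[i] for _, i in sorted(scored[:top_k], key=lambda x: x[1])]
    return retrieved + recent
```

In a real system the retrieved pairs and the recent pairs would be formatted into the prompt; only the selection step is shown here.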

When a message is received, I classify its intent to route it into one of 5 main buckets to be handled. This all works fine, but in an extended conversation the user can change the intent of their question at any point. Should I be feeding the entire conversation into the classification function at each new message to determine the new intent? Or should I first build an assessment that decides whether each new message belongs to the existing session or starts a new one, and only then determine the intent?

Additionally, is there a best practice for using the API to query the user until sufficient information is available to complete a function call / is deemed sufficient by your system?
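To make the question concrete, this is roughly the "ask until complete" loop I have in mind, written as a sketch. Everything here is hypothetical: `REQUIRED_SLOTS`, the `book_table` function, and the action dictionaries are illustrative names, not part of any API.

```python
# Hypothetical schema: which arguments each function needs before it can run.
REQUIRED_SLOTS = {"book_table": ["date", "time", "party_size"]}

def missing_slots(function_name, collected):
    # A slot counts as missing if it is absent, None, or an empty string.
    return [s for s in REQUIRED_SLOTS[function_name]
            if collected.get(s) in (None, "")]

def next_action(function_name, collected):
    """Decide whether to ask the user for more info or make the call."""
    missing = missing_slots(function_name, collected)
    if missing:
        slot = missing[0]
        return {"type": "ask_user", "slot": slot,
                "prompt": f"Could you tell me the {slot.replace('_', ' ')}?"}
    return {"type": "call_function", "name": function_name,
            "arguments": dict(collected)}
```

The handler would call `next_action` after every user reply, merging whatever the model extracted into `collected`, and only execute the function once a `call_function` action comes back.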

user message > classifier > handler
user message + previous q/a > classifier > handler
user message > new session or existing session relevance > classifier > handler
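One cheap way to approximate the third pipeline is sketched below, under loud assumptions: the keyword-based `classify` is a placeholder for a real intent classifier, and the two buckets and the `session` dictionary layout are made up for illustration. The message is classified on its own first; if it carries no clear intent it is treated as a follow-up and inherits the session's intent, and if it carries a different intent a new logical session is started.

```python
import re

# Hypothetical intent buckets; a real system would use a trained classifier.
INTENT_KEYWORDS = {
    "billing": {"invoice", "refund", "charge", "payment"},
    "shipping": {"delivery", "shipping", "package", "tracking"},
}

def classify(text):
    # Return the bucket with the most keyword hits, or None if there are none.
    words = set(re.findall(r"[a-z']+", text.lower()))
    scores = {intent: len(words & kws) for intent, kws in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def route(message, session):
    """session: dict with keys 'intent' (str or None) and 'messages' (list)."""
    detected = classify(message)
    if detected is not None and detected != session["intent"]:
        # Clear signal of a different topic: start a new logical session.
        session["intent"] = detected
        session["messages"] = []
    # Otherwise the message is a follow-up and keeps the current intent.
    session["messages"].append(message)
    return session["intent"] or "fallback"
```

With this shape, a message like "How long will that take?" after a refund request stays in the billing bucket, while "Where is my package?" resets the session to shipping.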


My advice would be:

  1. Know your users. Learn what they want, and build that.
  2. Simply offering the user the choice to create a new chat/thread/session should be enough. Then just run RAG on the messages within that conversation.

That's not within the parameters of my project; this should all be done automatically, without user input. By chatbot I literally mean a bot that responds to user messages via chat in a single thread, e.g. in an iMessage conversation.

I think what sps was trying to say was, why are you managing threads for the user?

That is perhaps what is perplexing me when I look at this at least. It should also be known that if you do this automatically, a user may not realize they changed their intent, and suddenly context has been swapped around, and then they become either frustrated or confused. Or both. And when the user doesn’t understand what is going on, they will definitely express their discontent. It will also make it nearly impossible to troubleshoot when there’s a mismatch or misinterpretation / misunderstanding.

So, to add, I think allowing users to handle a new chat or session themselves reduces a lot of friction both in what you’re asking for and in unanticipated issues. Now, you can certainly categorize them by intent to use for your own purposes, but don’t use that to replace the user’s typical ability to refresh a chat, start a new one, etc.

I appreciate the explanation of why it is simpler to let users manage threads and sessions, and I totally agree. However, for my use case that is not within the scope of the project.

I think session management is the key here, and it may also overlap heavily with context management. Bringing in the right time-sensitive information as well as the right information from long-term memory, and determining when the time-sensitive information is or is not relevant, are the core tasks for a robust solution.

I haven’t tried what you are doing, but I’ll throw out an idea. You could try a conversation summary memory (to get a little more conversation into the context) plus a RAG function that lets the assistant look up from the history if it chooses. This would be much simpler and it might be worth seeing how good the results are.
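Here is a rough sketch of what I mean, not something I have tested against a real model. `SummaryMemory` and `naive_summarize` are hypothetical names; in practice the summarizer would be an LLM call and `lookup` would be a vector search exposed to the assistant as a tool, but simple stand-ins keep the example self-contained.

```python
class SummaryMemory:
    """Keeps the last few messages verbatim and folds older ones into a summary."""

    def __init__(self, summarize, max_recent=4):
        self.summarize = summarize    # callable: (summary, (role, text)) -> summary
        self.max_recent = max_recent
        self.summary = ""
        self.recent = []              # messages still shown verbatim
        self.archive = []             # full history, available for lookup

    def add(self, role, text):
        self.recent.append((role, text))
        self.archive.append((role, text))
        # Fold the oldest verbatim messages into the running summary.
        while len(self.recent) > self.max_recent:
            self.summary = self.summarize(self.summary, self.recent.pop(0))

    def lookup(self, keyword):
        # Stand-in for a retrieval tool: case-insensitive substring match.
        return [(r, t) for r, t in self.archive if keyword.lower() in t.lower()]

    def context(self):
        lines = ["Summary so far: " + self.summary] if self.summary else []
        lines += [f"{role}: {text}" for role, text in self.recent]
        return "\n".join(lines)

def naive_summarize(summary, message):
    # Placeholder summarizer (assumption): keep the first 40 characters.
    role, text = message
    return (summary + " " + f"{role} said: {text[:40]}").strip()
```

The string from `context()` would go into the prompt on every turn, and `lookup` would only run when the assistant decides it needs something from further back.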