Context-aware retrieval for follow-up questions

Scenario: I have a chatbot with memory built on GPT-4.
Each time the user asks a new question, I retrieve context via embeddings for that particular question and inject it into the request.

Example
Q1: Who is the CEO?
[the about-us page content is retrieved via embeddings and added to the context]
A1: The CEO is John

Q2: Since when?
[embeddings retrieve completely irrelevant context for the sentence "Since when?", which is then passed for completion]

A2: generic nonsense

The issue is that a follow-up question may not carry the context of what the conversation has been about so far.

What is the recommended approach to providing context for a follow-up question, so that relevant context is retrieved for the completion?
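
Roughly, my current flow looks like this (a minimal sketch; embed_and_search is a hypothetical stand-in for the vector-store lookup):

from openai import OpenAI

client = OpenAI()

def embed_and_search(question: str) -> str:
    """Hypothetical stand-in for the embeddings search over my documents."""
    raise NotImplementedError  # wire up your vector store here

def answer(question: str) -> str:
    # Naive flow: retrieve context for the raw question only, so a
    # follow-up like "Since when?" retrieves irrelevant chunks.
    context = embed_and_search(question)
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer using this context:\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content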


Typically, you pull back an embedding for the entire Q/A sequence, not just one message at a time: just as GPT needs context, so does the embedding retrieval. This is an active area of research, so you may discover better ways or methods as you go.
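
A minimal sketch of that idea, assuming the OpenAI embeddings endpoint (the helper shape and model name are illustrative):

from openai import OpenAI

client = OpenAI()

def retrieval_query(history: list[dict], new_question: str) -> list[float]:
    # Embed the whole Q/A sequence, not just the latest question,
    # so the retrieval query carries the conversation context.
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    result = client.embeddings.create(
        model="text-embedding-ada-002",
        input=f"{transcript}\nuser: {new_question}",
    )
    return result.data[0].embedding  # search your vector store with this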


Thanks for the response!

Correct me if I'm wrong.
Imagine a long conversation about one topic, with the last question about something different.
Retrieving embeddings is pure math: it will mostly retrieve context for the first topic, because the density of keywords about that topic dominates. Hence the probability of retrieving context for the last question is quite low, isn't it?

Absolutely, and this is why embeddings are not the main use case for chat-based systems; everything is a compromise at this stage. You rely on the Q/A session remaining on topic. You could pass the current question to the AI, along with all of the context so far, and ask it to return a Yes/No answer to the question "Is this question off topic given the context?" If it returns Yes, you can throw the old context away and start anew.
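
Something like this (a sketch; the prompt wording is illustrative):

from openai import OpenAI

client = OpenAI()

def is_off_topic(context: str, question: str) -> bool:
    # Ask the model for a strict Yes/No verdict; if Yes, discard the
    # accumulated context and start retrieval anew.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "Answer with exactly 'Yes' or 'No'."},
            {"role": "user",
             "content": f"Context so far:\n{context}\n\n"
                        f"Question: {question}\n\n"
                        "Is this question off topic given the context?"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")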

I tested your scenario. I created a mock-up company for testing, to be sure the model would not get the info from its training data. I used the Chat Completions API for the last part so that I could include the chat history. It was able to answer the second question.

Is the embedding looped in this way?
messages=[
{"role": "user", "content": content},
{"role": "system", "content": content}
]

Your issue is simple: you need to create a standalone question to submit on your subsequent queries. I made a fairly rambling video about it here: https://youtu.be/B5B4fF95J9s

This is how it was explained to me: Chat Completion Architechture - #7 by SomebodySysop


It really depends how questions are posed.
For instance you can ask:
"What is the weather in Paris?"

And then:
"Same for Rome?"

Did it work?

Nevertheless, I solved it with another API call: providing the previous exchange and asking the model to create a standalone question from the follow-up question, retaining the context from the previous exchange.
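
The extra call looks roughly like this (a sketch; the prompt wording is illustrative):

from openai import OpenAI

client = OpenAI()

def make_standalone_question(previous_exchange: str, follow_up: str) -> str:
    # Rewrite the follow-up into a self-contained question that keeps
    # the context of the previous exchange, then use it for retrieval.
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system",
             "content": "Rewrite the user's follow-up question as a "
                        "standalone question, retaining all context "
                        "from the previous exchange."},
            {"role": "user",
             "content": f"Previous exchange:\n{previous_exchange}\n\n"
                        f"Follow-up question: {follow_up}"},
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()

So "Same for Rome?" after the Paris exchange should come back as something like "What is the weather in Rome?", which embeds and retrieves cleanly.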


Yes, it does depend on how you construct the question. I think the challenge for us devs is to handle such cases gracefully, like how you solved it by doing another API call. Anyway, as for your weather sample, I tested it using function calling and it works.
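
For reference, the test looked roughly like this (a sketch using tools-style function calling; get_weather is a hypothetical function you would implement yourself):

from openai import OpenAI

client = OpenAI()

# Hypothetical weather tool; the implementation is up to you.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# With the full chat history passed, the model resolves "Same for Rome?"
# into a get_weather call with city="Rome".
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "What is the weather in Paris?"},
        {"role": "assistant", "content": "It is 18 C and sunny in Paris."},
        {"role": "user", "content": "Same for Rome?"},
    ],
    tools=tools,
)
print(response.choices[0].message.tool_calls)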

Now: What if I posed a question like this:

"That may be true, but what about for the last one we talked about before that temporary diversion where I discussed Beckham and where the football term “bend it” came from?

(where the topic I’ve returned to is antique typewriter parts)

You can see that an appropriately minimized, context-aware conversation history to pass for AI understanding can be hard to discern without some intelligence. Even a classifier able to answer "how reliant is this question on prior context" is foiled. Naive embeddings are going to return the opposite of the context that is needed. The articulation of the topic may be far back, while I am now talking bails and whiffletrees.

One needs to create a tree-like context structure of conversational topics that allows for rejoining, plus AI language skills to discern where to begin anew, while letting the AI know that it has been talking about other things we muted. And all without using more expensive compute than simply having given the AI chatbot more context in the first place.
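
Purely as a sketch of what such a structure might look like (the fields and helpers here are assumptions, not a tested design):

from dataclasses import dataclass, field

@dataclass
class TopicNode:
    """One conversational topic; children are temporary diversions."""
    summary: str                                      # short articulation of the topic
    turns: list[str] = field(default_factory=list)    # raw messages on this topic
    children: list["TopicNode"] = field(default_factory=list)
    parent: "TopicNode | None" = None

def rejoin(current: TopicNode) -> TopicNode:
    # When the user returns to an earlier topic, pop back up the tree
    # instead of searching embeddings of the diversion.
    return current.parent or current

def context_for(node: TopicNode) -> str:
    # Build minimized context: this topic's turns, plus a note that
    # other (muted) topics were discussed in between.
    diversions = ", ".join(child.summary for child in node.children)
    note = f"(We briefly discussed: {diversions})\n" if diversions else ""
    return note + "\n".join(node.turns)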

It really depends what stack you're using. I'm using LangChain and have stacked LLMs to generate the query vector in the following way:

template_2 = """
You are an AI Assistant for Enhanced Questions: Leverage chat history insights and user input to synthesize a
refined question. Summarize the chat history below to capture key points. Combine this summary with the user's
question to craft a new, concise question that adds depth. Make the user's question more informative by
integrating only relevant chat history insights.

Chat History:
{chat_history}

User's Question:
{question}

Newly Generated Question:"""

template_3 = """
You are an AI assistant for answering questions. Given the context and the generated question,
provide a conversational answer to the user. Do not make up any answers; if you do not know, simply say you do not know.

Context:
{context}
Generated Question:
{Generated_Question}
Do not answer in Markdown; give your answer in plain text. Represent quotations with actual quotation marks; newline
spaces need not be included:"""
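
To wire the two templates together, something like the classic LLMChain API works (a sketch; LangChain APIs vary across versions, and chat_history, question, and vector_store are assumed inputs standing in for your own history, query, and retriever):

from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

llm = ChatOpenAI(model_name="gpt-4", temperature=0)

# Chain 1: condense chat history + follow-up into a standalone question.
question_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(template_2))
# Chain 2: answer using the retrieved context and the generated question.
answer_chain = LLMChain(llm=llm, prompt=PromptTemplate.from_template(template_3))

generated = question_chain.run(chat_history=chat_history, question=question)
docs = vector_store.similarity_search(generated)  # vector_store: your retriever
answer = answer_chain.run(context=docs, Generated_Question=generated)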

Thank you!

I will give this a try:

You are an AI Assistant for Enhanced Questions: Leverage chat history insights and user input to synthesize a refined question. Summarize the chat history below to capture key points. Combine this summary with the user's question to craft a new, concise question that adds depth. Make the user's question more informative by integrating only relevant chat history insights.

Hey,

There is a very simple solution that can remember a long conversation (more than 100 messages of history) and will answer the questions. It's a mix of prompt engineering, a database, and logical flow.

I have created a detailed article on how to implement it. Link below:

https://kanishka-sahu.medium.com/build-context-aware-chatbot-that-remembers-previous-chat-openai-909b2ccb27b9

If you have doubts about anything, let's discuss.