Memory issue when semantic search with embeddings

I have a related question on embeddings. Every time a user asks the bot a question, the bot searches the knowledge base and answers. The following use case doesn't work:
USER: How can I register for starter training?
BOT: If you meet all the conditions, you can register from the link …
USER: What does it cost?
BOT: I don't know.

How can I deal with this problem? Since the user specified the course name in the previous question but not in this one, there will be no match from the knowledge base. How can I search while keeping the context from the previous question?


I’m looking for a similar solution. My assumption is that integrating ChatGPT with a Knowledge Base (KB) will help.

I solve this issue with an additional module that transforms a contextual question into a stand-alone one. Using your example: “What does it cost?” is a contextual question. It only makes sense in the context of an ongoing conversation (all the previous utterances). The equivalent stand-alone question would be something like: “How much does starter training cost?” This stand-alone question can be understood even without the prior context. Therefore, it can be embedded and matched against existing knowledge in the knowledge base.

Therefore, when my users ask a question, I run this contextual → stand-alone question module first. I give it the previous utterances as context, together with the last question, and ask it to output the equivalent stand-alone version of that question. Then you can conduct your semantic search as usual.
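As a concrete illustration, here is a minimal Python sketch of that rewrite step. The helper name `build_rewrite_messages`, the system-prompt wording, and the model name in the comment are all my own illustrative choices, not the poster's exact setup:

```python
def build_rewrite_messages(chat_history, new_question):
    """Assemble the messages for the question-rewriting call.

    chat_history: list of {"role": ..., "content": ...} dicts from the
    ongoing conversation, oldest first.
    """
    system = (
        "Rewrite the user's last question as a stand-alone question that "
        "can be understood without the conversation. If it is already "
        "stand-alone, return it unchanged. Output only the question."
    )
    return (
        [{"role": "system", "content": system}]
        + chat_history
        + [{"role": "user", "content": new_question}]
    )

# The actual rewrite call would then look roughly like this, depending
# on your client library version (illustrative, not tested):
#
# from openai import OpenAI
# client = OpenAI()
# resp = client.chat.completions.create(
#     model="gpt-4o-mini",
#     messages=build_rewrite_messages(history, question),
#     temperature=0,
# )
# standalone = resp.choices[0].message.content.strip()
```

The stand-alone question that comes back is what you embed and match against the knowledge base.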

It works really well, even with some basic prompt engineering. Let me know if you still have doubts. Hope it helps! 🙂


Do you always do this for all questions? If not, how do you distinguish whether a question is contextual or stand-alone?

All of them; it makes things easier. You just have to instruct the model to return the question unchanged if it is already a stand-alone one.

I tried it as you said, with a prompt, but it doesn't always produce the right question, or it changes the content of questions that were already stand-alone. I couldn't get it to work reliably. I think it may be due to my prompt; can you share yours with me?

I’m sure you’ve already figured it out by now, but for those searching for this answer:

a. You maintain the user questions and bot answers in chat history.
b. I don’t use a separate module; I make an API call to an OpenAI model: I send the new question + chat history and ask the AI to give me a standalone question. This standalone question now contains the context of the conversation.
c. I search my vector store using the standalone question for relevant documents.
d. I send standalone question + relevant docs to the AI for a response.
e. I add new standalone question + AI response to chat history.

Rinse and Repeat.
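The steps above can be sketched as a single loop. In this Python sketch, `rewrite_question`, `search_vector_store`, and `answer` are placeholders for your own LLM and retrieval calls; only the control flow is shown:

```python
def chat_turn(chat_history, new_question,
              rewrite_question, search_vector_store, answer):
    # b. turn the contextual question into a stand-alone one
    standalone = rewrite_question(chat_history, new_question)
    # c. retrieve relevant documents with the stand-alone question
    docs = search_vector_store(standalone)
    # d. answer using the stand-alone question plus the documents
    response = answer(standalone, docs)
    # e. store the stand-alone question and the answer (this is
    #    what keeps step a's chat history context-complete)
    chat_history.append({"role": "user", "content": standalone})
    chat_history.append({"role": "assistant", "content": response})
    return response
```

Because the history stores the stand-alone version rather than the raw question, every later rewrite sees a self-contained transcript.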

Since the standalone questions always contain the context of the conversation and are always in the chat history, this should mitigate the issue of losing context.

If the chat history starts to get too large, I start removing the earliest entries from the history to reduce its size.
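That trimming step can be as simple as keeping only the most recent messages. A tiny sketch (the limit of 20 messages is an arbitrary example; token-based budgets would be more precise):

```python
def trim_history(chat_history, max_messages=20):
    """Drop the oldest entries once the history exceeds the limit."""
    return chat_history[-max_messages:]
```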

Hope this helps.


I am by no means saying this is the best or even a good prompt. It’s been working for me so far.

    $instruction = "Generate a standalone question which is based on the new question plus the chat history. Just create the standalone question without commentary. New question: " . $question;
    $chatHistory[] = ["role" => "user", "content" => $instruction];

    // Define the data for the API request
    $data = array(
        'messages'    => $chatHistory,
        'model'       => $model,
        'temperature' => $temperature,
    );
Has anyone come up with a more robust, permanent solution?

Oh my, I knew there was a meta way of solving this. But every time I thought I was getting close, I only got more frustrated and had to take breaks. If I can break it down into basic conceptual pieces, I could create a rhythm pattern to use effectively.

What if you collect each sentence as a dedicated embedding and, when you want to “sum” them up, you just take their average?
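Mechanically, that averaging idea is just an element-wise mean over the embedding vectors. A sketch with plain Python lists (whether the mean preserves enough of the individual sentences' meaning for retrieval is an open question; this only shows the arithmetic):

```python
def average_embeddings(embeddings):
    """Element-wise mean of a list of equal-length embedding vectors."""
    dim = len(embeddings[0])
    n = len(embeddings)
    return [sum(vec[i] for vec in embeddings) / n for i in range(dim)]
```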