Context generation for chat based Q&A bot

Hi, I’ve created an GPT-4 powered Q&A bot capable of answering questions about products. Bot is using knowledge base and embeddings mechanism to create a context for the GPT to answer the question. And it works great. I can ask questions like:

What is product A?
Which options does product A have?
How is product A different from product B?

Now I want to take my bot a step further and enable chat style conversation. For example:

What is product A?
How is it different from product B?

My problem here is how to create the context for the second question? For the first question it is clear: calculate embeddings for the question and select the closest topics from the knowledge base. Provide the context and the question to the GPT and answer will be generated.
Now I get the second question from the user. I provide the first question and the first answer to the GPT-4, followed by the second question and which context? How to select topics from the knowledge base that will cover both products A and B? Calculate embeddings for string containing both questions and select the closest topics from the KB?

The third question from the user could be “What is product C?”. This is unrelated question to the previous ones so context containing KB topics about only product C would be enough. You see where I’m going with this? I need an efficiant and effective strategy for creating context for the Q&A session taking into account previous questions if the last question is related to the previous conversation. Otherwise context can be related only to the last question.

Did anyone do this or think about how to do this? Any guidance will be much appreciated.



Hi # mstefanec,

I am trying to build my own bot with my own knowledge base. Any suggestion where to start? Are you using fine-tune or embeddings?

Any help would be appreciated.

Hi Daniel,

you can find a great guide explaining the main principle here:

When you understand the principle the rest is easy. My bot runs on Azure and uses MS Teams as a frontend.


hello …
did u find anything so far?
im trying to build my own chatbot and im stuck with the same problem…
if you found anything please share sum with me

Hi. @mmoulay47 I found a solution. Read chapter 5.1 in


Hello, @mstefanec!
Sorry for taking your time. Could you please clarify & elaborate how the info from the article’s chapter helped you resolve issue with finding relevant embeddings? It seems like they mention external tools there. If you used external tools to solve this problem, could you please share some tidbits about your approach?

I’m having the same problem to solve right now. I’d be really thankful for any hint.

It depends on the business requirements you have in mind, but I suspect you’ll at least want to investigate embeddings. This article from @wfhbrian was really helpful to my work.

I also landed on CustomGPT, which allowed me to stand up a solution in less than 30 minutes using off-the-shelf PDF content. While this is not ideal for every use case, it sure helped me quickly assess an approach and the content requirements for adding in the development of a custom bot using GPT.


@knazariy sure, here are details:

  1. instruct ChatGPT as showin in figure 5.5 to only SEARCH, ASK and ANSWER.
  2. When you receive SEARCH respond with knowledge base entries
  3. when you receive ASK show user the question and send to GPT user’s reply
  4. when you receive ANSWER show user the answer.

It works only with GPT 4. GPT 3.5 does not follow instructions very well and will not reply with only SEARCH, ASK and ANSWER. I’ve implented this in my bot and it works like a charm. I can have a dialog with GPT about the topics in my knowledge base, when GPT does not know the answer it asks for more info using SEARCH. When question I asked is not clear enough it asks for a clarification using ASK. I hope this helps. If you need more information send me an e-mail.


1 Like

Thanks Bill. This is a great article about the topic. Was very helpful.

1 Like

Thank you Mario. This was very helpful.

And therin lay the brittleness of prompt engineering.

Sounds great - I was trying a bizarre thing like this:

def create_prompt(question, sources, conversation_history):
    return (
        f'Given the following conversation history, question, and source text, provide a specific answer using ONLY the source text. '
        f'If you cannot find the answer in the source text, indicate "yes" in "more_info_needed" and provide a clearer description '
        f'of the information needed in the "info_required" field. Indicate if the user is asking a new question or following up on a '
        f'previous one in "question_type". Provide the answer in the following JSON format: "{{ "answer": "your answer", '
        f'"more_info_needed": "yes/no", "info_required": "specific information needed", "question_type": "new/follow-up" }}".'
        f'\n\nConversation History: {conversation_history}'
        f'\n\nQuestion: {question}'
        f'\n\nSource Text: {sources}\n\n'

Failed to make this work

@JanxSpirit, see below, it works for me. GPT 4 will follow these instructions in 99,999999% of cases. Here and there it will produce output that does not start with functions described so you have to handle such exceptions. GPT 3.5 will follow instructions in less than 50% of cases. At least I could not make this work on GPT 3.5. If you manage to do it share your findings please.

The computer, let’s call him Alex the virtual assistant, is answering questions.It can greet the user and answer only questions about itself, X, Y and Z. It can communicate only using the functions described below. It has to retrieve all information about the topics of the questions from the Knowledge base and answer only based on retrieved information.
If it needs additional information, it can call one of the following functions:

  • /SEARCH(“query”) retreives information from the knowledge base. Query is given in the form of a question and it always contains the exact question user asked. Computer can add additional clarifying questions to the query if needed. For one user’s question computer can call /SEARCH only once.
  • /ASK(“question”) asks the questioner for more information if it needs it.

Computer has to answer as if the human did not see any search results. When computer is ready to answer the user, it calls /ANSWER(“response”) function.
Computer always starts its utterance by calling a function. Computer can not call any other functions except /ASK, /SEARCH and /ANSWER. Computer can not produce any other output. If computer cannot figure out the answer, it says ‘I don’t know’. If question is not related to topics that are alowed to be discussed it says ‘Sorry’.

Thanks a ton!! is that the actual prompt? I will try it out tonight.

By chance, do you happen to have a github with this or similar idea? I’m trying to see how the chat history would play a role here.

Thanks again!

If answer is contained in the chat history /SEARCH will not be called. Works like this in most of the cases. But as GPT in not deterministic sometimes it will call /SEARCH although it should not have to. This will have impact on cost per question and answering speed, but will not influence useability.
Let me know how this works for you.
Do not have github.

Yes, this is the actual prompt. Just replace X, Y Z with your topics.

@mstefanec I’m researching other implementations, see this one from Microsoft: azure-search-openai-demo/chat-read-retrieve-read.ipynb at main · Azure-Samples/azure-search-openai-demo · GitHub

I will try several approaches and report back my findings. I hope I can get my V4 API ready by tomorrow to use the latest chatGPT, as you pointed out the 3.5 is quite inconsistent with the guided prompt to carry context on a QA design.

@JanxSpirit excellent ChatGPT API. Check it out GitHub - transitive-bullshit/chatgpt-api: Node.js client for the official ChatGPT API. 🔥

I’ve been working on implementing a similar functionality in my application for some time now, and from reviewing your code, it seems that the model produces text output which manually calls the functions within the application. Is my understanding correct?

I’m interested in whether you’ve encountered the issue of users asking follow-up questions, like

*Q1"How can I create a template?" *
Q2 “Is it possible to delete it?”

In my application, information about template creation and deletion is located in different embeddings indexes. Specifically, I’d like the model to search for a question in the knowledge base (KB) that is similar to “Is it possible to delete a template?” instead of Q2, which isn’t specific enough to identify the relevant index as it uses the pronoun “it.”

To address this issue, I’ve implemented a workaround by using the Davinci model to merge follow-up questions and generate more specific queries for KB search. However, this approach only works as intended about 50% of the time.

I’m curious whether you’ve encountered and resolved this issue in your code.