How can I improve Q&A with the bot using our knowledge base with LangChain?

Stack used -

  • Using Conversational Retrieval QA | 🦜️🔗 Langchain
  • The knowledge base is a bunch of PDFs → embeddings are generated via OpenAI Ada → saved in Pinecone.
  • When a user query comes in, it goes through ConversationalRetrievalQAChain along with the chat history.
  • The LLM used in LangChain is OpenAI GPT-3.5 Turbo.

Here are some examples of bad questions and answers -

Q: “Hi” or “Hi, who are you”
A: It describes itself using the “system” instruction provided in the prompt, but it also returns the top chunks from the embedding search as sources, per LangChain’s behavior. The GPT-3.5 Turbo LLM correctly recognizes that the sources are not relevant, but LangChain never learns this; by then it has already returned the sources from the embedding search.

Q: “Good morning”.
A: “I am not sure about this. Ask me about some-random-topic-from-knowledgebase-it-picks”.

Q: “Ok, thanks” as a reply to some answer the bot has given
A: Same reply as for “Good morning”, because it doesn’t match any topic in the knowledge base.

Q: “What are you” as first message
A: Replies correctly from the “system” message already provided in the prompt.

Q: “What are you” or “Who are you” as a reply to an answer the bot gave to a question like “Tell me about the solar system”
A: Because it remembers the chat history (the client sends it with each query), the bot talks about the solar system again. The expected reply should have come from the “system” instruction in the prompt.

When the same types of questions are asked in OpenAI ChatGPT (GPT-3.5 Turbo), which doesn’t have our knowledge base, it answers all of them correctly according to the intent of the query.

Code -


  const chain = ConversationalRetrievalQAChain.fromLLM(
    model,
    vectorstore.asRetriever(),
    {
      qaTemplate: QA_PROMPT,
      questionGeneratorTemplate: CONDENSE_PROMPT,
      returnSourceDocuments: true, // the number of source documents returned is 4 by default
    },
  );

Prompts -

const CONDENSE_PROMPT = `Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question.

Chat History:
{chat_history}
Follow Up Input: {question}
Standalone question:`;

const QA_PROMPT = `You are a helpful teacher, your name is Dolphin. You are an AI assistant that provides helpful, conversational answers based only on the given context, without using any prior knowledge. You are given the following extracted parts of a long document and a question.
You should only provide hyperlinks that reference the context below. Do NOT make up hyperlinks.
If you can't find the answer in the context below, just say "Hmm, I'm not sure". You can also ask the student to rephrase the question if you need more context. But don't try to make up an answer.
If the question is not related to the context, politely respond that you are tuned to only answer questions that are related to the context.
Answer in a concise or elaborate format according to the intent of the question. Use formatting: ** for bold, __ for italics & ~~ for strikethrough wherever required. Format the answer using headings, paragraphs or points wherever applicable.
=========
{context}
=========
Question: {question}
Answer:`;

Hi,
I’m having the same problem. I’ve tried different strategies and I keep getting incorrect answers, using code very similar to yours. Did you get any explanation for this problem?
Rgds