How do I fix a GenAI chatbot giving imaginative or incorrect answers, using LangChain and OpenAI APIs?

I’ve been developing a generative AI chatbot using LangChain and OpenAI’s GPT-3.5 Turbo model, along with OpenAI embeddings. The chatbot is designed to answer questions based on the content of a specific PDF document, which in this case is the Wikipedia page for Scuderia Ferrari.

Here are the steps I’ve taken:

  1. Used LangChain to process and manage the PDF content.
  2. Employed OpenAI embeddings to encode the content for better context understanding.
  3. Integrated the GPT-3.5 Turbo model to generate responses based on the user’s questions and the encoded context.

Despite these efforts, I’m encountering the following issues:

  1. Inaccurate Responses: Approximately 60% of the time, the chatbot provides incorrect or imaginative answers that are not grounded in the content of the PDF.
  2. Context Misunderstanding: Often, the chatbot responds by stating that the question is not related to the context, even when it clearly is.

What could be causing these issues, and how can I improve the accuracy and relevance of the chatbot’s responses? Are there specific techniques or best practices within LangChain, OpenAI embeddings, or GPT-3.5 Turbo that I should be using to enhance the chatbot’s performance?

Any insights or suggestions would be greatly appreciated. Thank you!

import os
from dotenv import load_dotenv
import streamlit as st
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.vectorstores import Chroma
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain
from langchain_community.document_loaders import PyPDFLoader  # load directly from PDF
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load environment variables from .env file
load_dotenv()
open_api_key = os.getenv('OPENAI_API_KEY')
langchain_key = os.getenv('LANGCHAIN_API_KEY')

# Set environment variables for OpenAI and Langchain
os.environ['OPENAI_API_KEY'] = open_api_key
os.environ['LANGCHAIN_API_KEY'] = langchain_key
os.environ["LANGCHAIN_TRACING_V2"] = "true"

# Define a function to interact with the chatbot
def run_chatbot():
    st.title('LangChain Chatbot with OpenAI')

    # User input text box
    input_text = st.text_input("You: ", "")

    # Initialize the language model
    llm = ChatOpenAI(model='gpt-3.5-turbo', temperature=0.7, max_tokens=1500)  # change to desired model

    # Define a prompt template
    prompt = ChatPromptTemplate.from_template('''Read the whole content carefully and understand the meaning of every line in the context. When a question is asked, understand it carefully and search for a meaningful response in the context. Do not give your own answers.
                                                <context>{context}</context>  Question: {input}''')

    loader = PyPDFLoader("sf.pdf")
    text_documents = loader.load()

    # Split documents into chunks
    txt_spt = RecursiveCharacterTextSplitter(chunk_size=10000, chunk_overlap=1000)
    documents = txt_spt.split_documents(text_documents)

    # Initialize Chroma DB and embeddings
    db = Chroma.from_documents(documents, OpenAIEmbeddings())

    # Create a document chain
    doc_chain = create_stuff_documents_chain(llm, prompt)

    # Create a retrieval chain
    retriever = db.as_retriever()
    retrieval_chain = create_retrieval_chain(retriever, doc_chain)

    # Check if there is input text
    if input_text:
        # Generate a response using the retrieval chain
        response = retrieval_chain.invoke({'input': input_text})

        # Display the response
        st.text_area("Bot:", value=response['answer'])

# Run the chatbot
if __name__ == "__main__":
    run_chatbot()

Here are some thoughts:

  1. Document chunking and retrieval aren’t trivial. Inspect the actual resulting prompt and check whether the retrieved chunks really provide enough context. It’s possible that the quality of the information isn’t good enough, or that something gets mangled in the HTML -> PDF -> text conversion.
  2. OpenAI’s embeddings are probably good enough.
  3. If there is enough information and GPT-3.5 still struggles, you may need to switch to GPT-4.
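Point 1 is often the root cause here: with `chunk_size=10000` and `chunk_overlap=1000`, a Wikipedia-length PDF is split into only a handful of very large chunks, so the retriever has little to choose from and the stuffed prompt carries a lot of irrelevant text. A dependency-free sketch (a naive fixed-size splitter, not `RecursiveCharacterTextSplitter`'s actual boundary-aware algorithm) illustrates the effect:

```python
def split_with_overlap(text, chunk_size, overlap):
    """Naive fixed-size character splitter with overlap (illustration only)."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "x" * 20000  # stand-in for ~20k characters of extracted PDF text

big = split_with_overlap(doc, chunk_size=10000, overlap=1000)
small = split_with_overlap(doc, chunk_size=1000, overlap=200)

print(len(big))    # -> 3: only a few huge chunks, so retrieval is coarse
print(len(small))  # -> 25: many focused chunks the retriever can pick precisely
```

Dropping to something like `chunk_size=1000` with `chunk_overlap=200`, lowering `temperature` (e.g. to 0), and telling the model in the prompt to say it doesn't know when the context lacks the answer are all worth trying before switching models.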