Need some help with context and accuracy

Having some trouble getting accurate results and not sure how to fix it. I am using LlamaIndex to build an index over my content, then querying it with a QuestionAnswerPrompt in the following format:

    Assume you are having a conversation with a customer.

    We have provided the context information below.
    -----------------------------
    {context_str}
    -----------------------------

    Given this information, please answer the question: {query_str}
    Provide your answers only based on information found in the context.
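
In code, I wrap that template roughly like this, following the LlamaIndex custom prompts docs (the import path may differ between versions, so treat this as a sketch):

    from llama_index import QuestionAnswerPrompt  # import path may vary by LlamaIndex version

    # Wrap the raw template string so the index can fill in
    # {context_str} and {query_str} at query time.
    QA_PROMPT_TMPL = (
        "Assume you are having a conversation with a customer.\n\n"
        "We have provided the context information below.\n"
        "-----------------------------\n"
        "{context_str}\n"
        "-----------------------------\n\n"
        "Given this information, please answer the question: {query_str}\n"
        "Provide your answers only based on information found in the context.\n"
    )
    qa_prompt = QuestionAnswerPrompt(QA_PROMPT_TMPL)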

For my knowledge base I have indexed the following document: schengen_area.pdf,

which is simply the first 2 pages of the Wikipedia article converted to PDF format.

The problem I am having is that I get the correct answer to one question, but a similar question produces an inaccurate answer that directly contradicts the first one. The second answer seems to come from older information that is no longer accurate. I want to prevent this; I have already prompted the model to answer only from my context.

Hi @arouzak

Does this pertain to the OpenAI API?

What endpoint are you calling?

Hi @sps, yes, I am using LlamaIndex to query OpenAI. The following is in Python.

    from llama_index import GPTSimpleVectorIndex  # import path may vary by LlamaIndex version

    # Load the previously built index and query it with the custom QA prompt.
    index = GPTSimpleVectorIndex.load_from_disk('/gpt-data/index.json')
    response = index.query(user_message, text_qa_template=qa_prompt, response_mode="default")

qa_prompt is the QuestionAnswerPrompt. Here is the documentation example I followed:

https://gpt-index.readthedocs.io/en/latest/how_to/customization/custom_prompts.html

I haven’t used that particular library/tool, so I can’t comment on what changes you should make to your code. Here’s a list of official and community libraries for the API.

As for the prompt, it’ll vary with the model being used for completions.

Using a lower temperature will definitely help the model stick to the data provided in the context; however, I am not aware whether that library lets you set it.
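
For reference, with the raw API the temperature is just a parameter on the completion call. A minimal sketch with the pre-1.0 openai Python package (prompt_with_context is a placeholder for your filled-in template):

    import openai

    resp = openai.Completion.create(
        model="text-davinci-003",
        prompt=prompt_with_context,  # placeholder: your template with context and question filled in
        temperature=0,               # low temperature keeps answers closer to the provided context
        max_tokens=256,
    )
    print(resp["choices"][0]["text"])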

If I am correctly interpreting your use case, you’re looking for factual Q&A.

If you were using the chat completions endpoint, the conversation would look like this (a minimal sketch follows the list):

  1. User message with the context as its content.
  2. System message with the content set to “Answer the following question within the context of this conversation: {question}”
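
Something along these lines with the pre-1.0 openai Python package (context_str and question are placeholders):

    import openai

    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        temperature=0,
        messages=[
            # 1. the context goes in as a user message
            {"role": "user", "content": context_str},
            # 2. the instruction plus the question as a system message
            {"role": "system",
             "content": f"Answer the following question within the context of this conversation: {question}"},
        ],
    )
    print(response["choices"][0]["message"]["content"])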

@sps, thanks! The issue is that the full context can be much larger than the token limit, so LlamaIndex chunks the text and creates embeddings for the chunks. The problem is specific to the way LlamaIndex does this with the PDF, because when I paste the context in manually as text it works fine.
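
For reference, my indexing step looks roughly like this (paths are placeholders and the constructor form depends on the LlamaIndex version, so this is only a sketch):

    from llama_index import GPTSimpleVectorIndex, SimpleDirectoryReader

    # The reader parses the PDF into Document objects; the index then splits
    # them into chunks and embeds each chunk.
    documents = SimpleDirectoryReader('/gpt-data/docs').load_data()
    index = GPTSimpleVectorIndex.from_documents(documents)
    index.save_to_disk('/gpt-data/index.json')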

Hi there,

I am facing the same issue. Did you find a workable solution?
I got better responses by querying the data with temperature=0, response_mode="tree_summarize" and top_k=5.
But even then, the data is sometimes incomplete or inconsistent with other answers.
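
Roughly, the query I ended up with looks like this (as far as I can tell the actual keyword in LlamaIndex is similarity_top_k, and temperature=0 is set on the underlying LLM rather than on query(); parameter names may differ between versions):

    response = index.query(
        user_message,
        text_qa_template=qa_prompt,
        response_mode="tree_summarize",  # summarize retrieved chunks instead of the default refine
        similarity_top_k=5,              # "top_k" above refers to this argument
    )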

Perhaps it just needs a lot of tweaking with different options.