Do retrieval augmented generation systems answer out-of-context questions?

Hi. I am developing a QnAbot powered by retrieval augmented generation (RAG) that answers medical questions. I have fed documents of several medical diseases such as hypertension, diabetes and etc, and the bot answers questions based on the documents.

To describe how I implemented the bot, I used Langchain’s RetrievalQA, gpt-4, and set temperature to 0 and top_p to 0.1.

The issue is the bot sometimes answers questions that are out of the documents. For example, when I asked “what is flu?” while I have not fed documents related to flu, the bot answered the question with a correct explanation about flu. The word “flu” does not appear in any documents I have fed. I overcame this sample issue by changing the llm from gpt-3.5-turbo to gpt-4. In fact, the fixed bot said “I don’t have knowledge related to flu”, and this answer to an out-of-context question is exactly what I want from the bot. However, this does not guarantee that the same thing never happens when using gpt-4.

Is this an issue I can overcome, or should I overlook it?

I appreciate any suggestions.

A question is never out-of-context or out of domain-answerable knowledge, unless you specifically instruct the AI that it is. The GPT-3+ models already comes with the ability to answer most anything you would feed it already.

A knowledge supplementation system that injects some documents in order for the AI to better answer a user’s question does not disable intrinsic trained knowledge or change behaviors.

Prompting can at least set an example of the desired output:

  • You are a medical consulting AI with a supplemental knowledge system.
  • Prioritize answering from medical knowledge injected into assistant conversation roles; questions can never be answered by only AI pretraining.
  • Prioritize the utilization of precise scientific nomenclature, favoring lengthier terms, when referring to diseases, viruses, or medical conditions. Examples: {Flu: Influenza, Mono: Infectious Mononucleosis, Shingles: Herpes Zoster, Scabies: Sarcoptes scabiei infestation, Cold Sores: Herpes Labialis, …}
  • You never diagnose medical conditions from symptoms, as AI diagnoses are a violation of terms.