I have created an enterprise RAG bot with a LangChain OpenAI multi-function agent, and I have created tools for it. The bot also maintains conversation history for each individual user. Sometimes, while a conversation is going on, the LLM uses its own knowledge and generates hallucinations without doing the retrieval. How can I reduce that?
What is your prompt (besides the RAG bits)? You need to tell it to only answer from the context you provide.
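A minimal sketch of what that can look like: the function name and exact wording below are illustrative assumptions, not something from the original post, but the idea is to prepend an explicit grounding instruction and give the model a defined "I don't know" escape hatch.

```python
# Hypothetical grounding instruction for a RAG prompt (wording is an
# illustration, not a canonical LangChain or OpenAI recipe).
GROUNDING_INSTRUCTIONS = (
    "Answer ONLY using the context provided below. "
    "If the context does not contain the answer, reply exactly: "
    '"I don\'t know based on the provided documents." '
    "Do not use any outside knowledge."
)

def build_prompt(context: str, question: str) -> str:
    """Assemble a prompt that restricts answers to the retrieved context."""
    return (
        f"{GROUNDING_INSTRUCTIONS}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(build_prompt("Acme's refund window is 30 days.",
                   "What is the refund window?"))
```

Giving the model an explicit fallback phrase matters: without a sanctioned way to say "I don't know", it is more likely to fill the gap from its own training data.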
Apart from what @stevenic suggested, you may also try giving a few examples of questions and answers in your prompt (basically few-shot prompting).
So basically, manually write a few examples of question, context, and answer (based on the context) and include them in the prompt. Hope this helps reduce hallucination.
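A sketch of how those few-shot examples might look in OpenAI chat-message format (the example Q/A pairs and helper function are made up for illustration). Note the second example deliberately shows the model refusing when the context doesn't contain the answer, which is the behaviour you want it to imitate:

```python
# Hypothetical few-shot examples in chat-message format.
few_shot_messages = [
    {"role": "system", "content": "Answer only from the given context."},
    # Example 1: the answer IS in the context.
    {"role": "user", "content": (
        "Context: The warranty lasts 12 months.\n"
        "Question: How long is the warranty?")},
    {"role": "assistant", "content": "The warranty lasts 12 months."},
    # Example 2: the answer is NOT in the context -> model should refuse.
    {"role": "user", "content": (
        "Context: The warranty lasts 12 months.\n"
        "Question: Who is the CEO?")},
    {"role": "assistant",
     "content": "I don't know based on the provided documents."},
]

def with_real_question(context: str, question: str) -> list[dict]:
    """Append the real user turn after the few-shot examples."""
    return few_shot_messages + [
        {"role": "user", "content": f"Context: {context}\nQuestion: {question}"}
    ]
```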
You can also adjust the temperature parameter of the LLM. A very low temperature produces near-deterministic output, which means the model will produce the same answer for the same prompt, context, and question asked multiple times. A higher temperature will give different answers in such a case, which can sometimes create confusion. A lower temperature does not mean less hallucination, but it helps avoid inconsistency in the LLM's output.
@hiteshsom I'm using 0.1. Should I increase the temperature parameter?
Hard to answer this. Most LLM temperatures range between 0 and 1. You can experiment with different temperature settings and see what works best for you.
@axen I use the parameters below, hope they work for you.

TEMPERATURE = 0.01
PRESENCE_PENALTY = -2
TOP_P = 1
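For reference, these map directly onto the sampling parameters of the OpenAI chat completions API. A minimal sketch of how they would be passed (the model and messages are placeholders, and the comments on each value are my reading of the docs, not the poster's rationale):

```python
# Sampling parameters collected into request kwargs.
# Valid ranges per the OpenAI API reference:
#   temperature: 0 to 2, presence_penalty: -2 to 2, top_p: 0 to 1.
request_kwargs = {
    "temperature": 0.01,     # near-deterministic sampling
    "presence_penalty": -2,  # negative value biases toward tokens already in the text
    "top_p": 1,              # no nucleus-sampling truncation
}

# These would be passed alongside model/messages, e.g.:
# client.chat.completions.create(model=..., messages=..., **request_kwargs)

# Sanity-check the values are in range before sending the request.
assert 0 <= request_kwargs["temperature"] <= 2
assert -2 <= request_kwargs["presence_penalty"] <= 2
assert 0 <= request_kwargs["top_p"] <= 1
```

Note that it is usually recommended to tune temperature or top_p, not both at once.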