I’m about to deploy a company RAG system aimed at potential customers. However, I’m realising now that I couldn’t quite stop the bot from falling into a big misconception: even though I explain in the system message that this is not the case, it answers questions as if the retrieved documents were exhaustive with respect to the question asked. For instance, if I ask the bot how many new projects were started in my company last year, it will only count those described in the retrieved documents, even though, again, I instruct the bot to consider that there may be more information than what was retrieved, and that it should suggest the user write us an email rather than give a definitive answer.
This is puzzling me quite a bit, because I don’t want the bot to give the impression that we do only a fraction of our actual activities. Has anyone faced this problem before? How did you solve it?
I understand your struggle. Just a couple of quick questions:
Why not give the bot an API to query the number of projects and get the exact result? That way it is the database, not the bot, that provides the definitive answer - see the tool-calling sketch after these questions.
If you prefer that the bot ask the user to send an email for this type of information, does the bot have clear instructions about what kind of information is answerable and what kind needs to be provided by a human via email or other means?
If you consider that the bot should not answer this type of question at all, why haven’t you implemented filter logic to keep such questions from reaching the bot in the first place? (After all, the bot’s nature pushes it to answer whatever the user asks.) There’s a filter sketch below as well.
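On the first point, here is a minimal sketch of such a tool, assuming a SQLite database with a hypothetical `projects` table and a `started_year` column - all names are placeholders for whatever your stack actually uses:

```python
import sqlite3

DB_PATH = "company.db"  # placeholder path to your projects database

def count_projects(year: int) -> int:
    """Exact count straight from the database, not from retrieved docs."""
    with sqlite3.connect(DB_PATH) as conn:
        row = conn.execute(
            "SELECT COUNT(*) FROM projects WHERE started_year = ?",
            (year,),
        ).fetchone()
    return row[0]

# OpenAI-style function definition; adapt to your framework's tool format.
COUNT_PROJECTS_TOOL = {
    "type": "function",
    "function": {
        "name": "count_projects",
        "description": "Return the exact number of projects started in a given year.",
        "parameters": {
            "type": "object",
            "properties": {"year": {"type": "integer"}},
            "required": ["year"],
        },
    },
}
```

When the model decides to call `count_projects`, you execute it and feed the result back, so the count no longer depends on what the retriever happened to surface.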
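And on the filter idea: it doesn’t need to be fancy. A keyword router like the one below (the patterns and the email address are made up) already catches the obvious aggregate questions before they ever reach the bot:

```python
import re

# Made-up patterns for questions that need an exhaustive view of the data.
AGGREGATE_PATTERNS = [
    r"\bhow many\b",
    r"\btotal number of\b",
    r"\ball (of )?(your|the) projects\b",
]

EMAIL_FALLBACK = (
    "Answering that would require a complete view of our records. "
    "Please write to us at info@example.com and we'll follow up."  # placeholder address
)

def route_question(question: str) -> str | None:
    """Return a canned reply for aggregate questions, or None to pass
    the question on to the RAG bot."""
    if any(re.search(p, question.lower()) for p in AGGREGATE_PATTERNS):
        return EMAIL_FALLBACK
    return None
```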
Thank you both for your answers. I managed to keep the bot from giving definitive answers based on incomplete information just by tweaking the prompt to stress that directive more. It wouldn’t comply until I repeated the instruction in all caps, but now it does.
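For anyone hitting the same wall, the directive that finally stuck was along these lines (paraphrased and anonymised, not my literal prompt):

```python
# Paraphrase of the directive that worked; the wording is illustrative only.
SYSTEM_PROMPT_EXCERPT = """\
The retrieved documents are a SAMPLE of our records, NOT a complete list.
NEVER present counts, totals, or 'how many' answers based on them as
definitive. If a question needs a complete view of our records, suggest
the user email us instead of answering."""
```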
Unfortunately, my project doesn’t have the budget for most of the solutions you suggested, and in any case we don’t really expect users to ask questions that require an exhaustive scan of the database - my example was more of an edge case than anything - so I think we are good now.