Handling Ambiguous Knowledge Base Queries (RAG)

Hey guys,

I have a question about RAG. For instance, the knowledge base contains ‘turn on fog lights when visibility is less than 100 meters’, but the question asks whether fog lights need to be turned on when visibility is less than 1000 meters. The expected answer is “No need to turn on fog lights”. In this case, should we use RAG to cover all scenarios, or is another technology required? Thanks.

That certainly doesn’t seem like an ambiguous case. We can do semantic search against examples stored in a vector database and inject the best matches into the AI’s input prompt as knowledge.
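As a rough sketch, a pairwise comparison like the one below could be produced with something along these lines, using the OpenAI Python SDK. The embedding model (text-embedding-3-small), the full wording of the two distractor texts, and the “identical” threshold are my assumptions, not necessarily what generated the figures shown.

```python
from itertools import combinations

import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

texts = [
    "turn on fog lights when visibility is less than 100 meters",
    "do fog lights need to be turned on when visibility is less than 1000 meters?",
    # Two distractor inputs; the originals are truncated in the output, so these are stand-ins.
    "Double yellow lines indicate that passing is not allowed.",
    "Fog lights annoy many drivers, so use them sparingly.",
]

# Embed all inputs in a single request (model choice is an assumption).
resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [np.array(d.embedding) for d in resp.data]

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Compare every pair of inputs and print in roughly the format shown below.
for i, j in combinations(range(len(texts)), 2):
    sim = cosine(vectors[i], vectors[j])
    print(f'{i}:"{texts[i][:30]}" <==> {j}:"{texts[j][:30]}":')
    # "identical" here is interpreted as near-perfect similarity; the exact check is a guess.
    print(f"{sim} - identical: {sim > 0.9999}")
```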

== Cosine similarity and vector comparison of all inputs ==
0:“turn on fog lights when visibi” <==> 1:“do fog lights need to be turne”:
0.9282514391673857 - identical: False
0:“turn on fog lights when visibi” <==> 2:“Double yellow lines indicate t”:
0.7609393465425045 - identical: False
0:“turn on fog lights when visibi” <==> 3:“Fog lights annoy many drivers,”:
0.8504834360882413 - identical: False
1:“do fog lights need to be turne” <==> 2:“Double yellow lines indicate t”:
0.7384444816824471 - identical: False
1:“do fog lights need to be turne” <==> 3:“Fog lights annoy many drivers,”:
0.8298013798346225 - identical: False
2:“Double yellow lines indicate t” <==> 3:“Fog lights annoy many drivers,”:
0.7673605880058811 - identical: False

The knowledge-base statement and the question (pair 0 and 1) have the highest similarity of all the pairs, even though one is a statement of information and I wrote the other as a question. The remaining inputs are deliberate distractors that merely aim at similarity.

But it seems you are weighing automatic retrieval augmentation against other techniques.

There might also be cases where you’d want the AI to directly access, search, and browse structured data files via functions to discover answers itself.
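As a rough illustration, a tool definition for the Chat Completions API could look like this. The function name search_knowledge_base, its parameters, and the backing search implementation are all hypothetical.

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical tool the model can call to search structured rule files on its own.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_knowledge_base",  # hypothetical name
            "description": "Search the traffic-rule files and return matching passages verbatim.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Keywords or a question to search for, e.g. 'fog lights visibility'",
                    },
                },
                "required": ["query"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Do I need fog lights when visibility is under 1000 meters?"}
    ],
    tools=tools,
)

# If the model decides to call the tool, run your own search and send the result back
# in a follow-up "tool" message before asking for the final answer.
print(response.choices[0].message.tool_calls)
```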

@_j Thanks for your reply. Sorry, I didn’t explain clearly. My expectation is that when visibility is less than 1000 meters, the answer should be “do NOT turn on the fog lights”, but I’ve hit cases where the LLM still suggests turning on the fog lights. So I’m trying to find a better solution.

So it sounds like the AI is just being wrong. Not reading the injected information carefully.

You can tell it to read the question carefully, and to treat the automatically retrieved text not as the answer, but as a source that needs deliberate examination to ferret out the correct answer. Have it print its thinking first, then apply critical logic to solve the problem. Bump up to gpt-4 if you need the best available determination of fact and applicability.
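A rough sketch of that kind of prompting, with the retrieved rule injected as reference material rather than as the answer; the exact wording of the system message is just one possible phrasing, not a prescribed one.

```python
from openai import OpenAI

client = OpenAI()

# Text returned by your retrieval step (here taken from the example above).
retrieved = "turn on fog lights when visibility is less than 100 meters"

system = (
    "You answer driving-rule questions. The reference text below may or may not "
    "apply to the question; examine it critically rather than repeating it.\n"
    f"Reference: {retrieved}\n"
    "First write out your step-by-step reasoning, comparing the numbers in the "
    "question against the numbers in the reference, then give a final answer."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": system},
        {
            "role": "user",
            "content": "Do fog lights need to be turned on when visibility is less than 1000 meters?",
        },
    ],
)
print(response.choices[0].message.content)
```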
