Mitigating Hallucinations in RAG - a 2025 review

Stumbled upon this review while doing literature research on LLM hallucinations. Recommended reading if you’re working with RAG. https://www.mdpi.com/2227-7390/13/5/856

3 Likes

Thanks for the reference. The paper is actually interesting, but there is one thing that bothers me:

So many people, including developers, are trapped (IMHO) by the misleading terminology introduced in the investor pitch decks of early LLM start-ups.

I’m talking about “knowledge” and “understanding” in particular…

But if you switch those to what they should be in reality, “learned language patterns” and “inference-time context”, then suddenly “hallucinations”, instead of being something natural and almost inevitable, become flags of poor context quality (or unclear prompts), low-quality training/fine-tuning, or an application workflow that wasn’t thought through well enough.

But many developers tend to avoid those topics, as they sit in the “complex” domain of linguistics, even though they are still very much part of the dev work.

So several years later, we are still at the point where the new generation of developers with a linguistics background is, figuratively speaking, somewhere near entering middle school…

Just that little “mind shift” resolved the hallucination problem for me back in late 2022, because I see it like this:

When the model “hallucinates”, check what you’ve designed wrong in your workflow that corrupted either the context you presented or the training you did.

That doesn’t mean hallucinations don’t happen to me.

It’s just that with this approach I use them to warn me that I’m hammering the screws instead of using a screwdriver.

You can’t ask a pre-trained large language model (basically a grammar book on steroids) to think like a human in a broken context.

Either fix the context, simplify the task, or train the model to work its way through the low-quality context. Or all three at the same time.

It’s often all three.
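
A rough illustration of that mindset (just a sketch, not production code; the helper names and the crude check are made up for this example): when an answer looks hallucinated, log the exact prompt the model saw and sanity-check the retrieved context before blaming the model.

```python
# Sketch: treat a "hallucination" as a flag to inspect the context the
# model actually received, rather than as a model defect.
# build_prompt and context_looks_broken are illustrative placeholders.

def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble the exact prompt the model will see, so it can be logged."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using ONLY the numbered context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def context_looks_broken(question: str, chunks: list[str]) -> bool:
    """Crude check: does the retrieved context share any content words with
    the question? If not, retrieval (not the model) is the likely culprit."""
    stop = {"the", "a", "an", "of", "to", "in", "is", "for", "what", "how", "and"}
    terms = {w.strip("?.,!").lower() for w in question.split()} - stop
    blob = " ".join(chunks).lower()
    return not any(t in blob for t in terms)

if __name__ == "__main__":
    question = "What is the refund window for annual plans?"
    chunks = ["Our office hours are 9-5 CET.", "Support is reachable by email."]
    if context_looks_broken(question, chunks):
        print("Fix retrieval/context first; the model never had a chance.")
    else:
        print(build_prompt(question, chunks))
```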

4 Likes

Love this, @sergeliatko. I find the same to be true most of the time.

2 Likes

Oh, those fancy words: sometimes they sell, sometimes they undermine the industry…

I think hallucinations are mostly a problem of expectations. It’s OK to hit them all the time while experimenting; that’s how you know you’re pushing the model out of its comfort zone…

But damn, so many investors and businesses got sold on those “knowledge” and “understanding” magic trigger words that now they are asking for results, and poor devs (who bought into the same sh.t) can’t deliver anything viable and predictable.

Luckily, they came up with the term “hallucinations”.

Otherwise they would have to own up to “we know nothing about how it works”… or rather “magic is what we’ve built in but cannot control”.

2 Likes

I believe the review supports your argument nicely. And I couldn’t agree more: “learned language patterns” and “inference-time context” are important to understand.

With the disclaimer that I’m not a developer, just someone reading research for fun, here’s how I came to see the potential reasons for poor outcomes:

  1. Biased training data (the perpetual 10:10 wristwatch and the seahorse emoji are great examples)
  2. Fine-tuning + RL (overfitting to narrow datasets, inducing sycophancy, etc.)
  3. Prompt engineering + context accumulation + external tools (so many pitfalls here, from initial errors in LLM outputs snowballing into severe hallucinations, to prompt injection concerns, to the RAG-mediated hallucinations discussed in the review mentioned earlier)
  4. Any combination of the three above.

You do have to be a magician to figure them all out.

1 Like

Search this forum for “Abra Kadabra” just for fun :rofl::rofl::rofl:

It’s pretty easy to mitigate hallucinations, especially when it comes to RAG.

The trick is in understanding that the entered data takes priority over the training data.

This is especially true when it comes to reference data, like [a list of things].
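
For example, something like this (a minimal sketch; the reference entries and the exact wording are made up for illustration) makes that priority explicit in the prompt itself:

```python
# Sketch of a RAG prompt that makes the supplied reference data explicitly
# override whatever the model "remembers" from training.
# The reference entries below are invented for illustration.

REFERENCE_DATA = [
    "Product X ships in matte black only (updated 2025-01).",
    "Product X firmware 2.3 removed Bluetooth pairing via NFC.",
]

def grounded_prompt(question: str) -> str:
    """Build a prompt where the reference data wins over training data."""
    refs = "\n".join(f"- {r}" for r in REFERENCE_DATA)
    return (
        "Answer from the reference data below.\n"
        "If the reference data conflicts with anything you learned during "
        "training, the reference data wins.\n"
        "If the answer is not in the reference data, reply exactly: "
        '"Not covered by the reference data."\n\n'
        f"Reference data:\n{refs}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(grounded_prompt("Does Product X still support NFC pairing?"))
```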

1 Like