Mitigating Hallucinations in RAG - a 2025 review

Stumbled upon this review while doing literature research on LLM hallucinations. Recommended reading if you’re working with RAG. https://www.mdpi.com/2227-7390/13/5/856

3 Likes

Thanks for the reference. The paper is actually interesting, but there is one thing that bothers me:

So many people, including developers, are trapped (IMHO) by the misleading terminology introduced in the investor pitch decks of early LLM start-ups.

I’m talking about “knowledge” and “understanding” in particular…

But if you switch those to what they should be in reality, “learned language patterns” and “inference-time context”, then suddenly “hallucinations”, instead of being something natural and almost inevitable, become flags of poor context quality (or unclear prompts), low-quality training/fine-tuning, or an application workflow that wasn’t thought through well enough.

But many developers tend to avoid those topics, as they sit in the “complex” domain of linguistics, even though they are still very much part of the dev work.

So several years later, we are still at the point where the new generation of developers with a linguistics background is, figuratively speaking, somewhere near entering middle school…

Just that little “mind shift” resolved the hallucination problem for me back in late 2022, because I see it like this:

When the model “hallucinates”, check what you’ve designed wrong in your workflow that corrupted either the context you presented or the training you did.

That doesn’t mean hallucinations don’t happen to me.

It’s just that with this approach I use them to warn me that I’m hammering the screws instead of using a screwdriver.

You can’t ask a pre-trained large language model (basically a grammar book on steroids) to think like a human in a broken context.

Either fix the context, simplify the task, or train the model to work its way through the low-quality context. Or all three at the same time.

It’s often all three.
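
A rough illustration of that mindset (just a sketch, not production code; the helper names and the crude check are made up for this example): when an answer looks hallucinated, log the exact prompt the model saw and sanity-check the retrieved context before blaming the model.

```python
# Sketch: treat a "hallucination" as a flag to inspect the context the
# model actually received, rather than as a model defect.
# build_prompt and context_looks_broken are illustrative placeholders.

def build_prompt(question: str, chunks: list[str]) -> str:
    """Assemble the exact prompt the model will see, so it can be logged."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using ONLY the numbered context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

def context_looks_broken(question: str, chunks: list[str]) -> bool:
    """Crude check: does the retrieved context share any content words with
    the question? If not, retrieval (not the model) is the likely culprit."""
    stop = {"the", "a", "an", "of", "to", "in", "is", "for", "what", "how", "and"}
    terms = {w.strip("?.,!").lower() for w in question.split()} - stop
    blob = " ".join(chunks).lower()
    return not any(t in blob for t in terms)

if __name__ == "__main__":
    question = "What is the refund window for annual plans?"
    chunks = ["Our office hours are 9-5 CET.", "Support is reachable by email."]
    if context_looks_broken(question, chunks):
        print("Fix retrieval/context first; the model never had a chance.")
    else:
        print(build_prompt(question, chunks))
```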

4 Likes

Love this, @sergeliatko. I find the same to be true most of the time.

2 Likes

Oh, those fancy words: sometimes they sell, sometimes they undermine the industry…

I think hallucinations are mostly a problem of expectations. It’s OK to hit them all the time while experimenting; that’s how you know you’re pushing the model out of its comfort zone…

But damn, so many investors and businesses got sold on those “knowledge” and “understanding” magic trigger words that now they are asking for results, and poor devs (who bought into the same sh.t) can’t deliver anything viable and predictable.

Luckily, they came up with the term “hallucinations”.

Otherwise they would have to own up to “we know nothing about how it works”… or rather “magic is what we’ve built in but cannot control”.

2 Likes

I believe the review supports your argument nicely. And I couldn’t agree more: “learned language patterns” and “inference-time context” are important to understand.

With the disclaimer that I’m not a developer, just someone reading research for fun, here’s how I came to see the potential reasons for poor outcomes:

  1. Biased training data (the perpetual 10:10 wristwatch and the seahorse emoji are great examples)
  2. Fine-tuning + RL (overfitting to narrow datasets, inducing sycophancy, etc.)
  3. Prompt engineering + context accumulation + external tools (so many pitfalls here, from initial errors in LLM outputs snowballing into severe hallucinations, to prompt injection concerns, to the RAG-mediated hallucinations discussed in the review mentioned earlier)
  4. Any combination of the three above.

You do have to be a magician to figure them all out.

1 Like

Search this forum for “Abra Kadabra” just for fun :rofl::rofl::rofl:

It’s pretty easy to mitigate hallucinations, especially when it comes to RAG.

The trick is in understanding that the entered data takes priority over the training data.

This is especially true when it comes to reference data, like [a list of things].
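
For example, something like this (a minimal sketch; the reference entries and the exact wording are made up for illustration) makes that priority explicit in the prompt itself:

```python
# Sketch of a RAG prompt that makes the supplied reference data explicitly
# override whatever the model "remembers" from training.
# The reference entries below are invented for illustration.

REFERENCE_DATA = [
    "Product X ships in matte black only (updated 2025-01).",
    "Product X firmware 2.3 removed Bluetooth pairing via NFC.",
]

def grounded_prompt(question: str) -> str:
    """Build a prompt where the reference data wins over training data."""
    refs = "\n".join(f"- {r}" for r in REFERENCE_DATA)
    return (
        "Answer from the reference data below.\n"
        "If the reference data conflicts with anything you learned during "
        "training, the reference data wins.\n"
        "If the answer is not in the reference data, reply exactly: "
        '"Not covered by the reference data."\n\n'
        f"Reference data:\n{refs}\n\n"
        f"Question: {question}\nAnswer:"
    )

print(grounded_prompt("Does Product X still support NFC pairing?"))
```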

1 Like