Hi, I’ve been working with an assistant using gpt-4o-mini to retrieve data from a file through file_search. Currently it’s just one XML file, but the idea is to load around 10 files of the same type, reaching about 2 MB of data.
As expected, we’ve run into hallucination issues (making up names, creating full descriptions out of thin air, and the worst part—completely ignoring or sometimes badly formatting URLs).
I’ve tried different approaches to reduce these hallucinations, but it’s led to increasingly complex processes:
- Having the assistant review its own responses
- Repeating incorrect requests
- Using multiple assistants
- Leveraging function_calls to request URLs from an API (see the sketch after this list)
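For reference, here is roughly what that last approach looks like with the OpenAI Python SDK (Assistants API). This is only a minimal sketch: the get_item_url function and its item_id parameter are hypothetical placeholders for whatever your API actually exposes.

```python
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    model="gpt-4o-mini",
    instructions=(
        "Answer only from the attached XML files. "
        "Never write a URL yourself; call get_item_url to fetch it instead."
    ),
    tools=[
        {"type": "file_search"},
        {
            "type": "function",
            "function": {
                # Hypothetical wrapper around our internal API.
                "name": "get_item_url",
                "description": "Look up the canonical URL for an item by its id.",
                "parameters": {
                    "type": "object",
                    "properties": {"item_id": {"type": "string"}},
                    "required": ["item_id"],
                },
            },
        },
    ],
)
```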
All of these have significantly slowed down response times and ruined the user experience.
While researching, I came across LangChain. From what I understand, it can be used to integrate data from files into the ChatGPT model, but I don’t fully grasp it, and I’m not sure it’s worth the effort to explore and implement.
So, my question is: Is it worth implementing LangChain to reduce hallucinations? Does it offer any advantages over simply using file_search?
I really appreciate any advice or help you can offer.
Hi @joaquin.marroquin!
I was a very early adopter of LangChain (and LlamaIndex). Over time I found that I could do lots of things very quickly myself (in a few lines of code) and not deal with various abstractions that LangChain introduces. With that said, it still has lots of uses and a very thriving community.
Short answer: it doesn’t necessarily solve your hallucination issues - it may even make them worse.
Here are some of my takeaways that may hopefully be of use to you:
- LangChain won’t help you reduce hallucinations! In fact, because of how it abstracts lots of things (like summarizations, composable prompts, etc), it may actually increase hallucinations if you are not careful!
- Regardless of the solution, you have to implement some kind of context/conversation constraint (in your system prompt) to keep the model answering from your knowledge base (your XML files) only. You also need to experiment with temperature values and with the chunking strategy for the vector storage. For optimal minimization of hallucinations, you usually need to build more complex mechanisms around it (e.g. some pre-filtering of ingress data / queries); see the first sketch after this list.
- LangChain does have great data-ingestion support, which is useful if you are dealing with many different document types, for example
- There are various helper functions, e.g. for summarizing very large documents that won’t fit in the context window using a “map-reduce” procedure (see the second sketch after this list) - but as I mentioned before, you have to be careful
- There are other LLMOps bells and whistles like monitoring (LangSmith), multi-agent support (LangGraph), etc.
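To make the first point concrete, here is a minimal sketch of those constraints using the OpenAI Python SDK directly (no LangChain needed). The prompt wording, chunk sizes, and temperature are assumptions you’d have to tune, not recommended values, and the file name is a placeholder; the exact client path (client.beta.vector_stores vs. client.vector_stores) depends on your SDK version:

```python
from openai import OpenAI

client = OpenAI()

# Vector store with an explicit chunking strategy for file_search.
vector_store = client.beta.vector_stores.create(name="xml-knowledge-base")

uploaded = client.files.create(file=open("catalog.xml", "rb"), purpose="assistants")
client.beta.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id=uploaded.id,
    chunking_strategy={
        "type": "static",
        "static": {"max_chunk_size_tokens": 800, "chunk_overlap_tokens": 200},
    },
)

assistant = client.beta.assistants.create(
    model="gpt-4o-mini",
    temperature=0.2,  # lower temperature tends to reduce invented details
    instructions=(
        "Answer strictly from the retrieved XML content. "
        "If the answer is not in the files, say you don't know. "
        "Never invent names, descriptions, or URLs."
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```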
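And here is a minimal sketch of the map-reduce summarization helper, using the classic load_summarize_chain API (newer LangChain versions steer you toward LCEL-based chains instead). Again, the file name and chunk sizes are placeholders:

```python
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import TextLoader
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Split a document that is too large for a single context window.
docs = TextLoader("catalog.xml").load()  # hypothetical file name
chunks = RecursiveCharacterTextSplitter(
    chunk_size=2000, chunk_overlap=200
).split_documents(docs)

# Map: summarize each chunk; reduce: combine the partial summaries.
chain = load_summarize_chain(llm, chain_type="map_reduce")
result = chain.invoke({"input_documents": chunks})
print(result["output_text"])
```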