Hi, I’ve been working with an assistant using gpt-4o-mini to retrieve data from a file through file_search. Currently it’s just one XML file, but the idea is to load around 10 files of the same type, reaching about 2 MB of data.
As expected, we’ve run into hallucination issues (making up names, creating full descriptions out of thin air, and the worst part—completely ignoring or sometimes badly formatting URLs).
I’ve tried different approaches to reduce these hallucinations, but it’s led to increasingly complex processes:
- Having the assistant review its own responses
- Repeating incorrect requests
- Using multiple assistants
- Leveraging function_calls to request URLs from an API (see the sketch after this list)
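For reference, here is roughly what that last approach looks like with the OpenAI Python SDK (Assistants API). This is only a minimal sketch: the get_item_url function and its item_id parameter are hypothetical placeholders for whatever your API actually exposes.

```python
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    model="gpt-4o-mini",
    instructions=(
        "Answer only from the attached XML files. "
        "Never write a URL yourself; call get_item_url to fetch it instead."
    ),
    tools=[
        {"type": "file_search"},
        {
            "type": "function",
            "function": {
                # Hypothetical wrapper around our internal API.
                "name": "get_item_url",
                "description": "Look up the canonical URL for an item by its id.",
                "parameters": {
                    "type": "object",
                    "properties": {"item_id": {"type": "string"}},
                    "required": ["item_id"],
                },
            },
        },
    ],
)
```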
All of these have significantly slowed down response times and ruined the user experience.
While researching, I came across LangChain. From what I understand, it can be used to integrate data from files into the ChatGPT model, but I don’t fully grasp it, and I’m not sure it’s worth the effort to explore and implement.
So, my question is: Is it worth implementing LangChain to reduce hallucinations? Does it offer any advantages over simply using file_search?
I really appreciate any advice or help you can offer.
Hi @joaquin.marroquin!
I was a very early adopter of LangChain (and LlamaIndex). Over time I found that I could do lots of things very quickly myself (in a few lines of code) and not deal with various abstractions that LangChain introduces. With that said, it still has lots of uses and a very thriving community.
Short answer: it doesn’t necessarily solve your hallucination issues - it may even make them worse.
Here are some of my takeaways that may hopefully be of use to you:
- LangChain won’t help you reduce hallucinations! In fact, because of how it abstracts lots of things (like summarizations, composable prompts, etc), it may actually increase hallucinations if you are not careful!
- Regardless of the solution, you have to implement some kind of context/conversation constraint (in your system prompt) to keep the model answering from your knowledge base (your XML files) only. You also need to experiment with temperature values and with the chunking strategy for the vector storage. For optimal minimization of hallucinations, you usually need to build more complex mechanisms around it (e.g. some pre-filtering of ingress data / queries); see the first sketch after this list.
- LangChain does have great data-ingestion support, which is useful if you are dealing with many different document types, for example
- There are various helper functions, e.g. for summarizing very large documents that won’t fit in the context window using a “map-reduce” procedure (see the second sketch after this list) - but as I mentioned before, you have to be careful
- There are other LLMOps bells and whistles like monitoring (LangSmith), multi-agent support (LangGraph), etc.
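To make the first point concrete, here is a minimal sketch of those constraints using the OpenAI Python SDK directly (no LangChain needed). The prompt wording, chunk sizes, and temperature are assumptions you’d have to tune, not recommended values, and the file name is a placeholder; the exact client path (client.beta.vector_stores vs. client.vector_stores) depends on your SDK version:

```python
from openai import OpenAI

client = OpenAI()

# Vector store with an explicit chunking strategy for file_search.
vector_store = client.beta.vector_stores.create(name="xml-knowledge-base")

uploaded = client.files.create(file=open("catalog.xml", "rb"), purpose="assistants")
client.beta.vector_stores.files.create(
    vector_store_id=vector_store.id,
    file_id=uploaded.id,
    chunking_strategy={
        "type": "static",
        "static": {"max_chunk_size_tokens": 800, "chunk_overlap_tokens": 200},
    },
)

assistant = client.beta.assistants.create(
    model="gpt-4o-mini",
    temperature=0.2,  # lower temperature tends to reduce invented details
    instructions=(
        "Answer strictly from the retrieved XML content. "
        "If the answer is not in the files, say you don't know. "
        "Never invent names, descriptions, or URLs."
    ),
    tools=[{"type": "file_search"}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
```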
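And here is a minimal sketch of the map-reduce summarization helper, using the classic load_summarize_chain API (newer LangChain versions steer you toward LCEL-based chains instead). Again, the file name and chunk sizes are placeholders:

```python
from langchain.chains.summarize import load_summarize_chain
from langchain_community.document_loaders import TextLoader
from langchain_openai import ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Split a document that is too large for a single context window.
docs = TextLoader("catalog.xml").load()  # hypothetical file name
chunks = RecursiveCharacterTextSplitter(
    chunk_size=2000, chunk_overlap=200
).split_documents(docs)

# Map: summarize each chunk; reduce: combine the partial summaries.
chain = load_summarize_chain(llm, chain_type="map_reduce")
result = chain.invoke({"input_documents": chunks})
print(result["output_text"])
```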