Hi all!! So I’m working on an application where an LLM takes in an audio transcript and uses it to fill out a JSON object.
For a while I was using gpt-3.5-turbo; however, it would mess up some small things (not using certain abbreviations, not knowing that DNR is a code status, etc.).
So, I fine-tuned gpt-4o-mini on 30 transcript/JSON pairs. While the fine-tuned model makes fewer small mistakes and always gets the JSON structure correct, it hallucinates content more frequently.
Ultimately, my first priority is minimal hallucination, and my second is correct abbreviations. Any advice on what I should do? Should I create more examples, or try fine-tuning a different model?
Fine-tuning via the OpenAI endpoint is not intended to teach the model new knowledge, such as abbreviations.
To ensure the model takes this specific information into account, you either want to include it as context in your prompt or implement some form of RAG pipeline that retrieves the relevant abbreviations based on the content of the audio transcript. The prompt-context approach can be as simple as the sketch below.
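Something like this, roughly (an untested sketch; the glossary contents, model name, and prompt wording are just placeholders for your own):

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical domain glossary -- swap in your own abbreviations.
ABBREVIATIONS = """\
DNR: Do Not Resuscitate (a code status)
HTN: hypertension
SOB: shortness of breath
"""

def fill_json(transcript: str) -> str:
    # Put the abbreviation list directly into the system prompt so the
    # model doesn't have to "know" it from training.
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "Fill out the requested JSON fields from the transcript. "
                    "Only use information present in the transcript. "
                    "Use these abbreviations where applicable:\n"
                    + ABBREVIATIONS
                ),
            },
            {"role": "user", "content": transcript},
        ],
    )
    return response.choices[0].message.content
```

With a full RAG pipeline you'd retrieve only the glossary entries relevant to each transcript instead of pasting the whole list, but for a small abbreviation set the static prompt is usually enough.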
As for the JSON structure, you can either continue to rely on a fine-tuned model or try out the new Structured Outputs feature, which might make the fine-tuning redundant. If you do continue with fine-tuning, I would personally increase the number of training examples to closer to 100.
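In case it's useful, here's roughly what Structured Outputs looks like with the Python SDK (a sketch; the PatientRecord fields are made up, yours will differ):

```python
from pydantic import BaseModel
from openai import OpenAI

client = OpenAI()

# Illustrative schema -- define the fields your JSON actually needs.
class PatientRecord(BaseModel):
    name: str
    code_status: str
    medications: list[str]

transcript = "..."  # your audio transcript goes here

completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Fill out the record from the transcript."},
        {"role": "user", "content": transcript},
    ],
    response_format=PatientRecord,  # the SDK enforces this schema
)

record = completion.choices[0].message.parsed  # a PatientRecord instance
```

Since the schema is enforced at the API level, you no longer need fine-tuning just to get the structure right.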
Hi! That all makes sense, and I’m actually already using Structured Outputs. What is fine-tuning used for, then, if not to teach writing style (such as using abbreviations)? Also, do you have any insight into why fine-tuning would increase hallucination? Thank you so much for responding!
Typically you’d use fine-tuning to get the model to respond in a certain style (e.g. language, tone) or if the task requires very specific steps to be followed.
Here are links to a couple of resources discussing when to consider fine-tuning:
It can depend on a couple of factors. First, as discussed, fine-tuning does not teach the model new knowledge. So unless you give the fine-tuned model access to information about the abbreviations as part of the context, it will likely make something up. Your temperature setting may also play a role: a higher temperature can reinforce this behaviour. But really, I think it is most likely because you tried to use fine-tuning for something it is not intended for.
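On the temperature point: for extraction tasks like this, it's common to pin temperature at or near 0 to reduce run-to-run variation. Something like this (again a sketch; model name and prompts are placeholders):

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",  # or your fine-tuned model ID
    temperature=0,        # near-greedy decoding; less creative drift in extraction
    messages=[
        {"role": "system", "content": "Extract the fields from the transcript."},
        {"role": "user", "content": "..."},
    ],
)
print(response.choices[0].message.content)
```

Lowering temperature won't fix knowledge gaps, but it does cut down on the random variation that can compound hallucination.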