You can line up the timestamps and cut out the moments when the VAD is off, in post-processing.
For live transcripts you’d need a buffer that records only while the VAD is active.
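If it helps, here is a minimal sketch of that kind of VAD-gated buffer using the webrtcvad package. The frame source incoming_frames is hypothetical; webrtcvad itself expects 16-bit mono PCM in 10, 20, or 30 ms frames at 8/16/32/48 kHz.

import webrtcvad

# Sketch of a VAD-gated buffer: frames are kept only while speech is
# detected, so silent stretches never reach the transcriber.
# Assumes 16 kHz, 16-bit mono PCM audio in 30 ms frames.
SAMPLE_RATE = 16000
FRAME_MS = 30
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2  # 2 bytes per 16-bit sample

vad = webrtcvad.Vad(2)  # aggressiveness 0-3; 2 is a middle ground

def gate_frames(frames):
    """Yield only the frames that contain speech."""
    for frame in frames:
        if len(frame) == FRAME_BYTES and vad.is_speech(frame, SAMPLE_RATE):
            yield frame

# incoming_frames is a hypothetical iterable of raw PCM frames:
# speech_audio = b"".join(gate_frames(incoming_frames))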
Thanks @RonaldGRuckus for your input. At the moment, I am using VAD to exclude audio clips where there is no voice activity at all. Because I am working in a live environment, those clips can simply be skipped, so the hallucination issue doesn’t get that bad. What really improved things for me was using prompts: I currently pass in the transcript of the previous 30 seconds of audio.
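For reference, here is a minimal sketch of that rolling-prompt approach with the openai Python SDK. The chunking loop and chunk_paths are assumptions for illustration; only the prompt parameter itself is part of the API.

from openai import OpenAI

client = OpenAI()
previous_text = ""  # transcript of the last ~30 s window

# chunk_paths is a hypothetical list of ~30 s audio files
for path in chunk_paths:
    with open(path, "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            prompt=previous_text,  # condition the model on the prior window
        )
    previous_text = transcription.text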
Curiously enough, it seems like hallucinations can be triggered by a sort of prompt injection whenever the same word is repeated in the audio over and over again. For instance, 4 or 5 NOs in the input audio will result in tons of NOs in the output transcript.
In my experience, this is usually caused by poor-quality microphone input.
It is still happening as of 08/May/2024. Has anyone solved this issue? Should I remove all the silence from the mp3 file?
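For anyone who wants to try removing the silence first, here is a minimal sketch using pydub (which requires ffmpeg). The threshold and duration values are guesses you would need to tune, not recommendations from this thread.

from pydub import AudioSegment
from pydub.silence import split_on_silence

audio = AudioSegment.from_mp3("audio.mp3")

# Split on stretches quieter than -40 dBFS lasting at least 500 ms;
# keep 100 ms of padding so word edges aren't clipped. Tune for your input.
chunks = split_on_silence(
    audio,
    min_silence_len=500,
    silence_thresh=-40,
    keep_silence=100,
)

trimmed = AudioSegment.empty()
for chunk in chunks:
    trimmed += chunk
trimmed.export("audio_trimmed.mp3", format="mp3")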
My apologies, I realize I was addressing a different issue. I was experiencing hallucinations during silence. I am going to check the ‘no_speech_prob’ attribute, and I believe it will help me.
Can someone please explain how the ‘no_speech_prob’ attribute is incorporated into code like this:
from openai import OpenAI

client = OpenAI()
audio_file = open("/path/to/file/audio.mp3", "rb")
transcription = client.audio.transcriptions.create(
    model="whisper-1",
    file=audio_file,
)
print(transcription.text)
Thanks…
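In case it helps: no_speech_prob is a per-segment field, so you need to request the verbose_json response format to see it. A minimal sketch, where the 0.6 cutoff is only an illustrative threshold, not an official recommendation:

from openai import OpenAI

client = OpenAI()

with open("/path/to/file/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
        response_format="verbose_json",  # exposes per-segment metadata
    )

# Keep only segments the model thinks contain speech; 0.6 is an
# illustrative threshold to tune against your own audio.
kept = [s.text for s in transcription.segments if s.no_speech_prob < 0.6]
print(" ".join(kept).strip())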