Whisper hallucinations + dropped sentences: Help?

keithwhor · November 6, 2023, 5:20am

I’m trying to use Whisper to transcribe audio files that contain lots of background noises – mostly forest noises, birds and crickets – and lots of dead air. The audio quality of the speaker varies, sometimes they are close to the mic and sometimes further away.

When attempting to use Whisper (at temperature: 0, 0.01, 0.2 …) I mostly get garbage out. It can successfully transcribe a few sentences and then will just barf out hallucinations, often just repeats of previous phrases. It often gets stuck repeating a single word or phrase and then won’t transcribe any other speech at all.
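A minimal sketch of one way to catch the repetition loops described above after the fact, assuming the transcript is available as a list of segment strings. The function names are hypothetical, and the 2.4 cutoff is an assumption borrowed from the default compression-ratio threshold in open-source Whisper; it will likely need tuning for very short texts.

```python
import zlib


def compression_ratio(text: str) -> float:
    """Raw size divided by zlib-compressed size; repetitive text scores high."""
    raw = text.encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


def looks_like_loop(text: str, threshold: float = 2.4) -> bool:
    """Flag text that compresses suspiciously well (threshold is an assumption)."""
    return compression_ratio(text) > threshold


def collapse_repeats(segments: list[str]) -> list[str]:
    """Drop empty segments and segments that repeat the previously kept one."""
    kept: list[str] = []
    last_key = None
    for seg in segments:
        # Compare case-insensitively with whitespace normalized.
        key = " ".join(seg.lower().split())
        if key and key != last_key:
            kept.append(seg.strip())
            last_key = key
    return kept


if __name__ == "__main__":
    print(looks_like_loop("thank you " * 50))
    print(collapse_repeats(["Hello there.", "hello  there.", "Next line."]))
```

This only hides the symptom: it can flag or trim a loop, but it cannot recover speech that the model skipped while it was stuck.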

I thought this might be an audio quality issue, so I used ffmpeg to clean up the files: I normalized the speech volume and then ran an RNN to remove most non-speech noise. To a human listener the audio files sound pristine. It barely helped.
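For anyone wanting to reproduce a cleanup pass like this, here is a rough sketch that only builds the ffmpeg command. The post does not say which filters were used, so `loudnorm` and `arnndn` are assumptions, as are the hypothetical model file `model.rnnn` and the 16 kHz mono output.

```python
import shlex


def build_cleanup_command(src: str, dst: str, rnn_model: str = "model.rnnn") -> list[str]:
    """Build (but do not run) an ffmpeg command: normalize loudness, then denoise."""
    # Filter order follows the post: volume normalization first, RNN denoiser second.
    # Paths containing ':' or ',' would need escaping inside the filter string.
    filters = ",".join(["loudnorm", f"arnndn=m={rnn_model}"])
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-af", filters,
        "-ar", "16000",  # assumption: 16 kHz output, the rate Whisper works at
        "-ac", "1",      # assumption: mono output
        dst,
    ]


if __name__ == "__main__":
    cmd = build_cleanup_command("forest_raw.wav", "forest_clean.wav")
    print(shlex.join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Keeping the command as a list makes it easy to pass to `subprocess.run` without shell quoting problems.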

Is there something I’m doing wrong or is the technology for transcription only adapted for high-quality, consistent audio? I played with other models which performed far worse. I’m really disappointed that the first helpful use-case I’ve had for these models is a non-starter, and I’m hoping there might be something that I’m missing here. Is there something open source I can use that has better configuration options?
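On the open-source question: the `openai-whisper` Python package exposes more decoding options than a single temperature value. The sketch below is one possible starting point, not a known fix for this kind of audio. The values are assumptions to tune, and turning off `condition_on_previous_text` is a commonly suggested way to reduce repetition loops that may cost some consistency between segments.

```python
def decoding_options() -> dict:
    """Options for open-source Whisper's transcribe(); all values are assumptions."""
    return {
        # Start greedy, then fall back to higher temperatures if a segment fails checks.
        "temperature": (0.0, 0.2, 0.4),
        # Do not feed the previous segment's text back in as a prompt.
        "condition_on_previous_text": False,
        # Thresholds used to reject likely-silent or degenerate segments.
        "no_speech_threshold": 0.6,
        "logprob_threshold": -1.0,
        "compression_ratio_threshold": 2.4,
    }


def transcribe(path: str, model_name: str = "base") -> str:
    """Transcribe one file; requires `pip install openai-whisper` and ffmpeg."""
    import whisper  # imported lazily so the options can be inspected without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path, **decoding_options())
    return result["text"]


if __name__ == "__main__":
    print(decoding_options())
```

With long stretches of dead air, splitting the audio on silence before transcription may matter as much as any of these settings.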

[Edit: Have figured out some workarounds using different models, but wish there was better accuracy for my input type.]

Fusseldieb · November 6, 2023, 11:22am

Whisper really lags massively behind in terms of “AI”. It isn’t as smart as their GPT-3.5 or 4 counterparts.

I’m looking forward to a real model that can take a system prompt and then follow it.

One can dream.

satoshinakashu · January 21, 2024, 1:29pm

Try AssemblyAI. I got much better results with it than with OpenAI’s Whisper on our website, AI.OpenSubtitles.com. We turned off the Whisper option for now because of the number of complaints we were getting, and we are waiting for OpenAI’s customer support to fix the problems we are having before enabling it again.

Fusseldieb · February 29, 2024, 1:50pm

Without training a new model, I think that will be difficult. There’s not much they can do to fix this with the current version.
