Whisper sometimes randomly skips sentences

When creating transcriptions with the Whisper API, we ran into a strange error. Occasionally (a handful of times per hour of audio) a sentence is skipped. The timing of the previous and next sentences is adjusted to cover the missing sentence without a gap: the previous sentence is wrongly timed a few seconds longer, and the following sentence starts a few seconds earlier. When we run the same file again, the errors appear in different places.
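For anyone who wants to check their own output for this symptom, here is a rough sketch of a heuristic: since the neighbouring sentences get stretched to cover the gap, a segment that lasts much longer than its word count suggests may be hiding a dropped sentence. It assumes the segments are available as dicts with `start`, `end`, and `text` fields (as in a verbose JSON style response); the function name, the `ratio` threshold, and `min_words` are made up for illustration and would need tuning on real data.

```python
from statistics import median


def flag_stretched_segments(segments, ratio=2.0, min_words=3):
    """Return indices of segments whose seconds-per-word is far above the median.

    `segments` is assumed to be a list of dicts with "start", "end" (seconds)
    and "text" keys. `ratio` and `min_words` are illustrative defaults.
    """
    paces = []
    for seg in segments:
        words = len(seg["text"].split())
        duration = seg["end"] - seg["start"]
        # Very short segments give unreliable pace estimates, so skip them.
        paces.append(duration / words if words >= min_words else None)

    known = [p for p in paces if p is not None]
    if not known:
        return []

    typical = median(known)
    # Flag segments that take much longer per word than the typical segment.
    return [i for i, p in enumerate(paces) if p is not None and p > ratio * typical]
```

A flagged segment is only a hint: long pauses or slow speech will trigger it too, so the audio around each flagged timestamp still has to be checked by ear.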

Has anybody else encountered this error?

This skipping happens to me quite often, usually when the speaker I am transcribing is quoting something, almost as if Whisper were avoiding potential plagiarism or copyright issues.

I’ve been using the Whisper API for some time, and I’ve noticed that it has started acting “lazy.” It skips important parts of the transcription, which didn’t happen before. (I tested the same audio on a model installed on my local machine, and that transcription is perfect, with nothing missing.)
Furthermore, it seems to be random: if I transcribe the same audio file again, it sometimes transcribes the part it skipped in the previous attempt.
I transcribe phone calls, so I don’t believe copyright is the cause.
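Since the skipped parts differ between runs, one way to locate them is to diff two transcripts of the same file (for example, an API result against a local model's result, or two API runs against each other). Below is a minimal sketch using Python's standard `difflib`; the function names and the `min_words` cutoff are hypothetical, and the comparison is deliberately crude (lowercased words, punctuation ignored).

```python
import difflib
import re


def _words(text):
    # Lowercase and strip punctuation so minor formatting differences don't count.
    return re.findall(r"[\w']+", text.lower())


def missing_runs(reference, candidate, min_words=4):
    """Return word runs that appear in `reference` but are absent from `candidate`.

    Only gaps of at least `min_words` words are reported, to ignore
    single-word recognition differences. `min_words` is an illustrative default.
    """
    ref, cand = _words(reference), _words(candidate)
    matcher = difflib.SequenceMatcher(a=ref, b=cand, autojunk=False)
    gaps = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "delete": words only in the reference; "replace": reference words
        # swapped for fewer candidate words. Both can hide a dropped sentence.
        if tag in ("delete", "replace") and (i2 - i1) - (j2 - j1) >= min_words:
            gaps.append(" ".join(ref[i1:i2]))
    return gaps
```

Running it in both directions (A against B, then B against A) shows what each run dropped relative to the other. It cannot tell which transcript is correct, only where they disagree.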

Yes, I get this issue too. I noticed recently that some sentences are being dropped at random in the middle of longer transcriptions. This is a real shame, because it casts doubt on the quality of any transcription. A workaround for now is to use the phone's built-in voice transcription service instead of the transcription button in the OpenAI app. For pre-recorded content, otter.ai is another option.

Does your transcription also contain quoted passages?

How is the audio quality at the time of the missing sentences?

Could you share a small snippet of the audio?

Is this repeatable?

If you wish to share private details, you can use the forum DM feature to send them to me privately.