Whisper AI doesn't transcribe

fmikele · December 13, 2023, 10:08am

Whisper AI doesn’t transcribe approximately the first 10 minutes of the audio file I provide as input (the audio is in Italian).

Bai_Lan_Blues · December 13, 2023, 1:43pm

I think a little more information is needed for someone to be able to understand and help with the issue you are facing.

I work with the Whisper API a lot. If you can share the audio file and show the parameters you are using in the API request, I can check whether I get the same issue on my end.

fmikele · December 13, 2023, 4:13pm

Hi, these are the parameters I use:

model: "large-v3",
translate: false,
temperature: 0,
transcription: "plain text",
suppress_tokens: "-1",
logprob_threshold: -1,
no_speech_threshold: 0.6,
condition_on_previous_text: true,
compression_ratio_threshold: 2.4,
temperature_increment_on_fallback: 0.2

As for the audio file (MP3) I use as input, it’s quite long (around an hour). The voice isn’t very clear in the first few minutes, but the transcription only starts after about 10 minutes.

Is there a parameter setting that can solve this issue?

Thanks

Bai_Lan_Blues · September 25, 2024, 7:48pm

Sorry, I totally missed notifications from this forum. I’m sure you have long since moved on from this issue.
But for what it’s worth, I would first have cut out the first 10 minutes (the part that didn’t transcribe) and sent that to the API by itself, just to see what would happen.
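A rough sketch of that test, assuming ffmpeg is installed and on the PATH. The helper name build_trim_command and the file names are made up for illustration; the function only builds the command, so nothing runs until you pass it to subprocess.run.

```python
import subprocess  # only needed for the commented-out call at the bottom


def build_trim_command(src, dst, start_s=0, duration_s=600):
    """Build an ffmpeg command that copies `duration_s` seconds of `src`,
    starting at `start_s`, into `dst` without re-encoding.

    Hypothetical helper for illustration; the defaults (first 600 s)
    match the "first 10 minutes" case from this thread.
    """
    return [
        "ffmpeg", "-y",          # overwrite dst if it already exists
        "-ss", str(start_s),     # seek to the start offset (input option)
        "-t", str(duration_s),   # read only this many seconds
        "-i", src,
        "-c", "copy",            # stream copy, no re-encode
        dst,
    ]


# To actually cut the file (file names are placeholders):
# subprocess.run(build_trim_command("input.mp3", "first_10_min.mp3"), check=True)
```

If the 10-minute clip transcribes fine on its own, the problem is more likely related to how the long file is processed than to the audio quality itself.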

I’ve had issues with parts of the audio not being transcribed. This happens whenever the audio starts with an “aah” or “ooh” sound, or any other vocalization/exclamation that is not a word. The transcription then may skip a long portion of the audio after this.

The same also happens if such a non-word sound follows a long pause, though not as often.

In any case, when dealing with an hour of audio, I would probably add some code to automatically cut it up and send it as segments, and then resend any segment where there was an error.
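Something along these lines, as a minimal sketch. The function names, the 5-minute chunk length, and the retry count are all my own assumptions, and I am treating an empty result as an "error" too, since that is what a skipped section looks like. The transcribe argument is a placeholder for whatever function cuts out the segment and calls the API you use.

```python
from typing import Callable, List, Tuple


def segment_bounds(total_s: float, chunk_s: float = 300.0) -> List[Tuple[float, float]]:
    """Split [0, total_s] into consecutive (start, end) windows of at most chunk_s seconds."""
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    bounds = []
    start = 0.0
    while start < total_s:
        end = min(start + chunk_s, float(total_s))
        bounds.append((start, end))
        start = end
    return bounds


def transcribe_segments(
    bounds: List[Tuple[float, float]],
    transcribe: Callable[[float, float], str],  # placeholder: your own API call
    max_retries: int = 2,
) -> List[str]:
    """Transcribe each window, resending it if the call raises or returns nothing."""
    texts = []
    for start, end in bounds:
        text = ""
        for _attempt in range(max_retries + 1):
            try:
                text = transcribe(start, end)
            except Exception:
                text = ""  # treat a failed request like an empty result
            if text.strip():
                break  # got something back, move on to the next window
        texts.append(text)  # stays "" if every attempt failed
    return texts
```

A segment that still comes back empty after the retries is worth listening to by hand. Cutting at fixed times can also split a word in half, so overlapping the windows slightly or cutting at pauses may give cleaner results.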

Something else you can do (even though I haven’t tried this myself) is slow down the audio before you send it to the API. You’d do this automatically with ffmpeg or something similar. I’ve seen some anecdotes stating that this increases accuracy.
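For the slowdown, ffmpeg's atempo audio filter changes the speed without shifting the pitch. A sketch of how the command could be built, again untested against the API; the helper name and the 0.8 factor are arbitrary choices of mine, not recommended values.

```python
def build_slowdown_command(src, dst, tempo=0.8):
    """Build an ffmpeg command that slows `src` down by the factor `tempo`
    (1.0 = original speed, 0.5 = half speed) and writes the result to `dst`.

    Hypothetical helper for illustration. atempo does not accept values
    below 0.5, so the range is checked here.
    """
    if not 0.5 <= tempo <= 1.0:
        raise ValueError("tempo must be between 0.5 and 1.0 to slow the audio down")
    return [
        "ffmpeg", "-y",                   # overwrite dst if it already exists
        "-i", src,
        "-filter:a", f"atempo={tempo}",   # change tempo, keep pitch
        dst,
    ]


# Run it with, for example:
# import subprocess
# subprocess.run(build_slowdown_command("input.mp3", "slow.mp3"), check=True)
```

Keep in mind that slowing the audio makes the file longer, so any timestamps in the output would need to be scaled back by the same factor.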

aaditya1 · December 25, 2024, 1:01pm

Hi @Bai_Lan_Blues, @fmikele, I’m facing a similar issue. Could you please help me resolve it? Here is the issue: whisper-asr-model-skipping-chunks-in-audio-transcription/1067744
