Whisper hallucination - how to recognize and solve?

I use the API and have never run into issues either, even with Spanish. I have been running it on my phone with Silero VAD and only get hallucinations when a stray word or two is accidentally caught.
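Since the hallucinations here show up on very short captures, one possible mitigation is to drop those segments before they ever reach Whisper. This is only a rough sketch: it assumes the VAD hands back a list of dicts with `start` and `end` in samples, and that the audio is 16 kHz. The helper `drop_short_segments` and the 0.6 second cutoff are made up for illustration, not part of Silero VAD or Whisper.

```python
SAMPLE_RATE = 16_000  # assumed sample rate of the captured audio


def drop_short_segments(segments, min_seconds=0.6, sample_rate=SAMPLE_RATE):
    """Keep only speech segments long enough to be worth transcribing.

    `segments` is assumed to be a list of {"start": int, "end": int}
    dicts measured in samples. The cutoff is an illustrative guess.
    """
    kept = []
    for seg in segments:
        duration = (seg["end"] - seg["start"]) / sample_rate
        if duration >= min_seconds:
            kept.append(seg)
    return kept


segments = [
    {"start": 0, "end": 4_000},        # 0.25 s, likely a stray word or noise
    {"start": 16_000, "end": 64_000},  # 3.0 s, a real utterance
]
filtered = drop_short_segments(segments)
print(filtered)
```

The right cutoff probably depends on the language and the microphone, so it is worth tuning against real recordings.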

Strange. Whisper actively tries to prevent this exact issue by using beam search and a dynamic temperature setting (if you have set the temperature to 0). Whisper has a ~13% error rate with Croatian.

So, three questions:

  1. Are you using a prompt to prime the transcription process?
  2. What is your temperature setting?
  3. How are you starting the audio? You said it’s live. How are you capturing the audio?
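For context on the temperature question, the dynamic temperature idea can be sketched roughly like this: decode at temperature 0 first, and retry at a higher temperature only if the output looks repetitive or low-confidence. This is an illustration, not Whisper's actual code. The `decode` callable and its result dict are hypothetical, and the thresholds (2.4 and -1.0) are what I recall as defaults in the open-source implementation, so treat them as assumptions.

```python
def transcribe_with_fallback(
    decode,
    audio,
    temperatures=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
    max_compression_ratio=2.4,  # assumed threshold for "too repetitive"
    min_avg_logprob=-1.0,       # assumed threshold for "too uncertain"
):
    """Retry decoding at increasing temperatures until the result looks sane.

    `decode(audio, temperature)` is a hypothetical callable returning a dict
    with "text", "compression_ratio" and "avg_logprob" keys.
    """
    result = None
    for temperature in temperatures:
        result = decode(audio, temperature)
        too_repetitive = result["compression_ratio"] > max_compression_ratio
        too_uncertain = result["avg_logprob"] < min_avg_logprob
        if not (too_repetitive or too_uncertain):
            break  # good enough, stop escalating
    return result  # last attempt if every temperature failed the checks


def fake_decode(audio, temperature):
    """Stand-in decoder: loops on a phrase until the temperature reaches 0.4."""
    if temperature < 0.4:
        return {"text": "thank you thank you thank you",
                "compression_ratio": 3.1, "avg_logprob": -0.3,
                "temperature": temperature}
    return {"text": "hello there",
            "compression_ratio": 1.2, "avg_logprob": -0.4,
            "temperature": temperature}


result = transcribe_with_fallback(fake_decode, audio=None)
print(result["temperature"], result["text"])
```

If the temperature is pinned to a fixed nonzero value, this kind of escalation may not happen at all, which is why the setting matters.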