I’m using whisper-large-v3-turbo to transcribe voice inputs in both English and Arabic. However, I’m encountering an issue where the Arabic word “نعم” (which means “yes”) is consistently being transcribed incorrectly as “Naah” or “Naahe”.
Has anyone else experienced this behavior with Whisper? If so, what strategies or configurations have you found effective in improving transcription accuracy for short Arabic words like this?
Any insights or suggestions would be greatly appreciated.
You can try a different model.
While Whisper has fewer truncation problems, other models can perform better for specific languages.
https://openai.com/index/introducing-our-next-generation-audio-models/
Also, all models perform significantly worse than usual when the audio is very short, such as a clip containing only a single word.
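With a single-word clip, automatic language detection has very little audio to work with, so one configuration that may be worth trying is pinning the language explicitly instead of letting Whisper guess. Below is a minimal sketch, assuming the Hugging Face transformers pipeline is what runs the model and that the app already knows which language to expect for a given input. The helper names and the file name are hypothetical, and this is not guaranteed to fix the “نعم” case.

```python
def whisper_generate_kwargs(language: str) -> dict:
    """Build generation options that pin Whisper to one language.

    Hypothetical helper, not part of any library.
    """
    return {"language": language, "task": "transcribe"}


def transcribe(audio_path: str, language: str) -> str:
    """Transcribe one audio file with the language fixed up front."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3-turbo",
    )
    # generate_kwargs is forwarded to the model's generate() call.
    result = asr(audio_path, generate_kwargs=whisper_generate_kwargs(language))
    return result["text"]


# Example call ("yes_ar.wav" is a placeholder file name):
# print(transcribe("yes_ar.wav", "arabic"))
```

For mixed English and Arabic input this only helps if the expected language is known per request, for example from a user setting.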
My priority is to use a free model for transcription. Can you suggest any?
Nothing comes to mind at the moment, but I’ve heard that some people fine-tune Whisper to improve performance.