Whisper Transcription Inconsistencies and Guidance Needed for Production Use

We are currently integrating Whisper for speech transcription in our Unity-based VR game, and overall the performance has been great. However, we are running into some inconsistent behavior that is preventing us from confidently moving to production.

In some cases, Whisper returns text that is completely unrelated to the audio. A few examples we have seen:

  1. `{"text": "वा पल भॉळबा사�ा सं चर सर तककुव जानती साना टललरीster"}` (garbled mixed-script output, unrelated to the spoken audio)
  2. `{"text": ":fire::fire::fire:"}` (emoji shortcodes only)
  3. Many responses where the output is only the word "you", even though the audio clearly contains a full spoken sentence.

These issues happen randomly and are difficult to reproduce, which makes reliability a concern for live user interactions.
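One pattern we have found while debugging (a sketch, not a definitive fix): hallucinations like the examples above tend to come from clips that are silent or noise-only, so gating out near-silent audio before calling the API removes many of them. The function name and the RMS threshold below are our own assumptions and need tuning against your actual mic levels:

```python
# Sketch: skip near-silent clips before sending them to the transcription API.
# Assumes 16-bit PCM WAV input; `rms_threshold` is a tunable guess, not a
# value recommended by OpenAI.
import array
import math
import wave


def is_probably_silent(path: str, rms_threshold: float = 0.01) -> bool:
    """Return True if the 16-bit PCM WAV at `path` is near-silent."""
    with wave.open(path, "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wf.readframes(wf.getnframes())
    samples = array.array("h", frames)
    if not samples:
        return True
    # Normalized RMS in [0, 1]: 0 is digital silence, ~0.7 is a full-scale sine.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0
    return rms < rms_threshold
```

We call this on each captured clip and simply skip the API request (and show nothing to the user) when it returns `True`, which is cheaper than trying to detect the hallucination after the fact.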

We are using `whisper-1` through the API, and it generally works well for our use case. At the same time, we are open to trying other models if they provide better consistency.

We would really appreciate any guidance from the community or the OpenAI team on best practices for getting stable, accurate transcriptions. For example: recommended audio formats, preprocessing steps, parameter choices, or API usage patterns that help avoid these hallucinations and keep the transcription aligned with the actual speech.

Any insights would be super helpful as we are preparing to ship this in production soon.

Thanks in advance!