Whisper is translating my audios for some reason

juansoldi · March 7, 2023, 1:31am

I just got started using the new Whisper API (the one with the endpoint at https://api.openai.com/v1/audio/transcriptions). It works incredibly well when it gets the language right, but for some reason, it will sometimes give me an Arabic or Hindi transcription.

I’m a native Spanish speaker so my English pronunciation may have caused the AI to think that I’m speaking another language. But when I translated the Arabic transcriptions into English, the translation was exactly what I said! So the AI actually did understand what I said in English, and then translated it into Arabic. I have zero idea why this is happening. I don’t know if there is a way to specify the languages I want to use, let alone how to tell Whisper not to translate what I say into another language.

Any idea why it’s doing this and how I can prevent it? I also tried filling the prompt with English text, but that doesn’t seem to make much of a difference.

curt.kennedy · March 7, 2023, 1:42am

You could try also setting the prompt variable to contain a few sentences in English to get it to stay on track, from the Docs:

curt.kennedy · March 7, 2023, 1:57am

Oops, just read this. Hmm. Stumped on this one.

luke · March 7, 2023, 11:39pm

Are you specifying the language parameter? If you know the input language, this will help with more consistent results.

juansoldi · March 20, 2023, 12:59am

Yes, I didn’t know about that before but now I’m using just ‘en’. I would like to use multiple languages though, so I wonder if I can use a comma-separated list.

Also, I thought a bit more about this and now think that maybe what happened is not that it was translating but just switching character sets, since it’s weird that it never translated into a language that uses the Roman script. I’m assuming there is some standard way to map between character sets? This also sounds like a simpler explanation. But the issue hasn’t occurred since I added the language parameter.

uvzu · October 8, 2023, 2:20pm

This is still happening. I say something in English and it shows me what I said, but in Russian.

juansoldi · October 9, 2023, 10:17am

Yup, it’s still happening to me too. Sometimes it seems to help to speak very slowly. I’m convinced it’s related to the character sets since it always happens to translate into languages that use different characters. I think the AI has a kind of separate “brain area” that decides which charset to use based on your accent, and if it sounds a little Arabic it’ll switch to that charset and then be forced to translate into any language that uses this charset so that the output sentence makes sense.

kittichoteshane · October 10, 2023, 4:26pm

Using a prompt that contains both languages in roughly equal amounts seems to help a bit for me, although it is not perfect. In my case, the user might speak Japanese only, English only, or both.

For example, my prompt was “私の名前は山本です。My name is Yamamoto.”

Then, I tried saying “よく覚えてないんですけど、 I think I did it for around 5 years.” and Whisper returned the expected response in both Japanese and English. (よく覚えてないんですけど、 I think I did it for around 5 years.)

ref:
formData.append(
  "prompt",
  "私の名前は山本です。My name is Yamamoto"
);

daveckw · December 16, 2023, 6:03am

I faced the same issue. I am from Malaysia. When I speak in English, it shows me Malay in the transcription. The meaning is exactly the same as what I said in English.

clju · January 15, 2024, 10:24am

Same here! It seems to occasionally translate my English input into French (note that I do have a French accent when I speak English :))

taiyodayo · April 10, 2024, 6:09am

Has a resolution been found?

It seems that whenever Whisper detects an accent, it translates from English into the speaker’s native language. We’ve observed it with Japanese, Spanish, Italian, Russian, and Indonesian.

We want a consistent transcription in English and this problem is a deal breaker for us.
(Interesting behaviour though.)

harshvadhiya · May 6, 2024, 5:24am

I’m having the same trouble. When I speak in English, it keeps converting it to Hindi. I’m from India, so I suppose my accent might have some Hinglish influence.

Any solutions so far?

sps · May 6, 2024, 6:53am

Welcome @harshvadhiya

If the language of the audio is known, use the language parameter to specify it in ISO-639-1 format in your API call. This ensures that the model only transcribes in the specified language and also increases accuracy.
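
A minimal sketch of this suggestion in Python, assuming the official `openai` package (v1 or later) is installed and `OPENAI_API_KEY` is set in the environment. The helper names `build_transcription_params` and `transcribe` are made up for illustration; `model`, `file`, `language`, and `prompt` are the request fields discussed in this thread. The two-letter check only validates the format of the code, not whether the language is supported.

```python
import re


def build_transcription_params(language=None, prompt=None, model="whisper-1"):
    """Assemble keyword arguments for a transcription request (hypothetical helper)."""
    params = {"model": model}
    if language is not None:
        # Format check only: ISO-639-1 codes are two lowercase letters ("en", "es").
        if not re.fullmatch(r"[a-z]{2}", language):
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {language!r}")
        params["language"] = language
    if prompt:
        params["prompt"] = prompt
    return params


def transcribe(path, language=None, prompt=None):
    # Imported lazily so the helper above can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            file=audio_file,
            **build_transcription_params(language=language, prompt=prompt),
        )
    return result.text
```

With this in place, `transcribe("output.wav", language="en")` would request an English-only transcription. Leaving `language` unset falls back to the automatic detection that this thread is about.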

harshvadhiya · May 6, 2024, 7:25am

Hi @sps

Thank you for answering. I tried another approach and it worked for me.

transcription = whisper_transcriber(r"output.wav", generate_kwargs={"task": "transcribe", "language": "en"})

Passing the generate_kwargs parameter forced the transcription to be in English only.

brollsroyce_tek · June 12, 2024, 3:39pm

Same issue as @harshvadhiya, as I am also Indian. However, using the language parameter would restrict the output to one specific language, and what I am working on requires the transcription to be in whatever language is spoken. That seems far from perfect at the moment.

anon10827405 · June 12, 2024, 10:45pm

You can use your own classifier to determine the language and set the language parameter that way.
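
One way this could look, sketched in Python. The classifier itself is left out: `scores` stands for whatever per-language probabilities your own language-identification step returns, and `choose_language`, `allowed`, `min_confidence`, and `fallback` are hypothetical names. The return value is meant to be passed as the `language` parameter, with `None` meaning "leave it unset".

```python
def choose_language(scores, allowed, min_confidence=0.5, fallback=None):
    """Pick the language code to send as the `language` parameter.

    scores: dict mapping language codes to probabilities from your own classifier.
    allowed: the set of languages your users are expected to speak.
    """
    # Ignore anything outside the expected set, e.g. a spurious "ar" for accented English.
    candidates = {code: p for code, p in scores.items() if code in allowed}
    if not candidates:
        return fallback
    best = max(candidates, key=candidates.get)
    # Below the threshold, use the fallback (for example a user-selected language).
    return best if candidates[best] >= min_confidence else fallback
```

Restricting the choice to an expected set also covers the multi-language case raised earlier in the thread: an app for Japanese and English speakers could call `choose_language(scores, {"ja", "en"}, fallback="en")`. This does not help if the classifier confuses two languages that are both allowed.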

alainryckelynck · December 12, 2024, 10:10pm

This is still happening now.
It sounds insane to me that a system that can transcribe and then translate cannot offer the option to just transcribe.
And the suggestion is that we should use another product in between… huh.
Speak in English with whatever accent, and you will get a translation of the speech into the language of the accent. It could be considered racist.

anon10827405 · December 12, 2024, 11:09pm

It does give the option to just transcribe; you just need to set it.

If you don’t know which language it’s transcribing and it’s failing to classify it correctly, you can use your own classifier.

alainryckelynck · December 13, 2024, 7:56am

I’m not sure you understand: when I speak English with an accent, it understands what I say, but returns a transcription of the translation of what was said into another language, which apparently is detected from the accent.
I say “how are you” with a French accent, and it transcribes “comment allez-vous”. Get it?
That means it has understood the voice, and decides on its own to change the transcription to another language.
And I cannot force it to use one language because my users need to be able to use whatever language.
What is the setting you use?
const response = await openai.audio.transcriptions.create({
  model: "whisper-1",
  file: audioFile,
});
const transcription = response.text;

anon10827405 · December 13, 2024, 5:16pm

So use your own classifier to determine the language if the whisper classifier is failing you. Or let the users select the language.
