It is giving a translation of the transcription instead of just the transcription. (It is doing extra work that was not asked for; not sure "failing" is the right word.)
What setting would make the following code always return the transcription, and never attempt a translation based on the speaker's accent:
const response = await openai.audio.transcriptions.create({
model: "whisper-1",
file: audioFile,
});
const transcription = response.text;
You can use the language parameter:
language (string, Optional)
The language of the input audio. Supplying the input language in ISO-639-1 format will improve accuracy and latency.
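For example, pinning the language in the original call might look like the sketch below. The helper name and the "en" code are illustrative assumptions; use the ISO-639-1 code that matches your audio.

```javascript
// Sketch: build the transcription request with the output language pinned,
// so Whisper transcribes in that language instead of auto-detecting.
function buildTranscriptionParams(audioFile, languageCode) {
  return {
    model: "whisper-1",
    file: audioFile,
    language: languageCode, // ISO-639-1, e.g. "en", "hi", "fr"
  };
}

// const response = await openai.audio.transcriptions.create(
//   buildTranscriptionParams(audioFile, "en")
// );
// const transcription = response.text;
```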
You can also influence the beginning of the transcription with a prompt in the same language as the audio:
prompt (string, Optional)
An optional text to guide the model’s style or continue a previous audio segment. The prompt should match the audio language.
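Combining both parameters, a minimal sketch might look like this. The concrete values here ("en" and the prompt text) are assumptions for an English-language caller; substitute values that match your audio.

```javascript
// Sketch: request params combining a pinned language with a same-language
// prompt to steer the decoder toward transcription in that language.
function buildGuidedParams(audioFile) {
  return {
    model: "whisper-1",
    file: audioFile,
    language: "en", // ISO-639-1 code of the input audio (assumption)
    prompt: "The following is an English-language voice note.", // same language as the audio (assumption)
  };
}

// const response = await openai.audio.transcriptions.create(
//   buildGuidedParams(audioFile)
// );
```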
That's the extent of the control available.
I do not know in advance what languages my users will speak.
The main issue remains: the system understands the speech, but instead of returning the transcription, it decides to return a translation of that transcription based on the accent of the voice.
Did you test this? If you want to, think of Apu from The Simpsons speaking English with an accent: he would get text back in Hindi instead of English, and the Hindi text would be a translation of what he said in English.
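Since the languages are not known in advance, one workaround is a two-pass flow: probe with `response_format: "verbose_json"` (which includes a detected `language` field, reported as a full name such as "english"), map that name to an ISO-639-1 code, then re-run with `language` pinned. The lookup table below is a partial, illustrative mapping; extend it for your user base.

```javascript
// Sketch: map Whisper's detected-language name (from verbose_json) to the
// ISO-639-1 code expected by the `language` request parameter.
// Partial table -- an assumption, extend as needed.
const NAME_TO_ISO = {
  english: "en",
  hindi: "hi",
  spanish: "es",
  french: "fr",
};

function toIso639(languageName) {
  return NAME_TO_ISO[languageName.toLowerCase()] ?? null;
}

// Two-pass flow (illustrative):
// 1) const probe = await openai.audio.transcriptions.create({
//      model: "whisper-1", file: audioFile, response_format: "verbose_json" });
// 2) const iso = toIso639(probe.language);
// 3) if (iso) re-run the request with { language: iso } added, forcing the
//    transcription to stay in the detected language.
```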