Unexpected Welsh Language Output from English Audio Inputs?

I am reaching out to seek guidance on an issue I am encountering with the Whisper audio-to-text service. I have been utilizing your service to transcribe English audio files; however, I have noticed that the output is being generated in Welsh rather than English. Given that the input files are in English, this is quite unexpected and problematic for my project requirements.

with open("audio.mp4", "rb") as f:
    response = openai.Audio.transcribe(
        api_key=api_key,
        model=model_id,
        file=f,
        prompt="I am English, always transcribe in English",
        options={
            "language": "en",
            "temperature": "0",
        },
    )

I would greatly appreciate it if you could assist me with the following:

  1. Are there specific settings or parameters within the Whisper API that I can adjust to ensure the output is restricted to English?
  2. Is there a known issue regarding language detection that might be causing this problem?

Your expert advice would be incredibly valuable in helping me to navigate this issue. I am eager to resolve this as soon as possible to maintain the progress and efficacy of my project.

Thank you in advance for your assistance. I look forward to hearing from you soon.

Best regards, Ander


Thread with 90 replies:

The prompt is not treated as instructions to the AI. It is described as the text of the transcript leading up to the point where the audio transcription begins.

Try a different prompt format, as that’s the only straightforward method.

“Hi, I’m Joe, a native English speaker and today I’ll be having an English language conversation on a topic you might find quite interesting.”
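
Here is a rough sketch of how that could look with the same legacy `openai.Audio.transcribe` call as in the question. It assumes the `openai<1.0` Python package and the `whisper-1` model. The names `build_transcribe_params`, `transcribe_english` and `PRIOR_TRANSCRIPT` are made up for illustration. As far as I can tell from the API reference, `language` and `temperature` are top-level request fields (with `temperature` a number) rather than keys inside an `options` dict, so the sketch passes them that way.

```python
# Prompt written as if it were the transcript that precedes the audio,
# not as an instruction to the model.
PRIOR_TRANSCRIPT = (
    "Hi, I'm Joe, a native English speaker and today I'll be having an "
    "English language conversation on a topic you might find quite interesting."
)


def build_transcribe_params(model_id, prompt_text, language="en", temperature=0):
    """Collect the request fields as top-level keyword arguments (hypothetical helper)."""
    return {
        "model": model_id,
        "prompt": prompt_text,
        "language": language,        # ISO-639-1 code, passed at the top level
        "temperature": temperature,  # a number, not the string "0"
    }


def transcribe_english(path, api_key, model_id="whisper-1"):
    """Send one audio file for transcription (assumes the legacy openai<1.0 SDK)."""
    import openai  # imported here so the helper above works without the package

    params = build_transcribe_params(model_id, PRIOR_TRANSCRIPT)
    with open(path, "rb") as f:
        return openai.Audio.transcribe(file=f, api_key=api_key, **params)
```

Calling `transcribe_english("audio.mp4", api_key)` would then send the request. I have not verified that this alone stops the Welsh output for every file.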

Another technique is to join a warm-up script to the start of the audio: 5 seconds of a speaker's clear speech that is always transcribed the same way, so your software can remove it from the transcript afterwards.
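
The removal step could be sketched like this, assuming you already know the exact words spoken in the warm-up clip. `strip_warmup` is a hypothetical helper, not part of any library, and joining the clip to the audio itself is left to whatever audio tool you use. It compares words while ignoring case and punctuation, and leaves the transcript unchanged if the warm-up is not found at the start.

```python
import re


def _norm(word):
    """Lowercase a word and drop punctuation so small differences don't matter."""
    return re.sub(r"[^\w']", "", word).lower()


def strip_warmup(transcript, warmup_text):
    """Remove the known warm-up words from the start of a transcript."""
    expected_words = [w for w in (_norm(w) for w in warmup_text.split()) if w]
    words = transcript.split()
    i = 0  # position in the transcript
    for expected in expected_words:
        # Skip tokens that are pure punctuation after normalising.
        while i < len(words) and not _norm(words[i]):
            i += 1
        if i >= len(words) or _norm(words[i]) != expected:
            return transcript  # warm-up not found: return the text untouched
        i += 1
    return " ".join(words[i:])
```

If the transcript does not begin with the warm-up words, that is also a useful signal that the file was transcribed in the wrong language and may need a retry.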
