Unexpected Welsh Language Output from English Audio Inputs?

zino · September 10, 2023, 8:56pm

I am reaching out to seek guidance on an issue I am encountering with the Whisper audio-to-text service. I have been utilizing your service to transcribe English audio files; however, I have noticed that the output is being generated in Welsh rather than English. Given that the input files are in English, this is quite unexpected and problematic for my project requirements.

with open("audio.mp4", "rb") as f:
    response = openai.Audio.transcribe(
        api_key=api_key,
        model=model_id,
        file=f,
        prompt="I am English, always transcribe in English",
        options={
            "language": "en",
            "temperature": "0",
        },
    )

I would greatly appreciate it if you could assist me with the following:

Are there specific settings or parameters within the Whisper API that I can adjust to ensure the output is restricted to English?
Is there a known issue regarding language detection that might be causing this problem?

Your expert advice would be incredibly valuable in helping me to navigate this issue. I am eager to resolve this as soon as possible to maintain the progress and efficacy of my project.

Thank you in advance for your assistance. I look forward to hearing from you soon.

Best regards, Ander

_j · September 11, 2023, 12:29am

The prompt is not described as a set of instructions to the AI. It is described as the transcript text that leads up to the point where the audio transcription begins.

Try a different prompt format, as that’s the only straightforward method.

“Hi, I’m Joe, a native English speaker and today I’ll be having an English language conversation on a topic you might find quite interesting.”
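As a rough sketch of that idea, the request below writes the prompt as if it were the preceding transcript rather than a command. It assumes the pre-1.0 `openai` Python package used in the original post. It also passes `language` and `temperature` as top-level request parameters, which is where the API reference documents them, instead of nesting them in an `options` dict. The helper names and the `whisper-1` default are illustrative assumptions, not something confirmed in this thread.

```python
# Prompt phrased as earlier transcript text, not as an instruction.
CONTEXT_PROMPT = (
    "Hi, I'm Joe, a native English speaker and today I'll be having an "
    "English language conversation on a topic you might find quite interesting."
)


def build_transcription_params(model_id: str, context_prompt: str) -> dict:
    """Collect keyword arguments for a transcription request (hypothetical helper)."""
    return {
        "model": model_id,
        "prompt": context_prompt,
        "language": "en",   # ISO-639-1 code, passed at the top level
        "temperature": 0,   # a number, not the string "0"
    }


def transcribe_english(path: str, api_key: str, model_id: str = "whisper-1"):
    """Send the file for transcription; assumes the pre-1.0 openai library."""
    import openai  # imported here so the helper above works without the package

    params = build_transcription_params(model_id, CONTEXT_PROMPT)
    with open(path, "rb") as f:
        return openai.Audio.transcribe(file=f, api_key=api_key, **params)
```

Calling `transcribe_english("audio.mp4", api_key)` would send the request. Whether this fully prevents the Welsh output is not confirmed here, but it at least makes sure the language hint reaches the API.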

Another technique is to join a warm-up script to the start of the audio: about 5 seconds of a speaker's clear speech that is always transcribed correctly, so your software can remove it from the transcript afterward.
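The removal step of the warm-up approach could look something like the sketch below. It assumes the warm-up clip has already been joined to the front of the audio with a tool of your choice, and that you know its exact wording. `WARMUP_TEXT` and `strip_warmup` are hypothetical names chosen for illustration. Matching is done word by word, ignoring case and punctuation, since the transcription may punctuate the warm-up differently.

```python
import re

# Hypothetical wording of the warm-up clip prepended to every recording.
WARMUP_TEXT = "This is an English recording. The transcript follows."


def _normalize(word: str) -> str:
    """Lower-case a word and drop punctuation so comparisons are lenient."""
    return re.sub(r"[^\w]", "", word).lower()


def strip_warmup(transcript: str, warmup_text: str = WARMUP_TEXT) -> str:
    """Remove the known warm-up sentence from the start of a transcript.

    If the transcript does not begin with the warm-up words, it is returned
    unchanged so that nothing is cut by mistake.
    """
    warmup_words = [_normalize(w) for w in warmup_text.split()]
    words = transcript.split()
    if len(words) < len(warmup_words):
        return transcript
    head = [_normalize(w) for w in words[: len(warmup_words)]]
    if head != warmup_words:
        return transcript
    # Rejoin the remaining words; runs of whitespace collapse to single spaces.
    return " ".join(words[len(warmup_words):])
```

A transcript that comes back without the warm-up wording at its start is returned untouched, which can also serve as a signal that the language was detected incorrectly.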
