I have a Discord bot that transcribes conversations for users. I just switched over to gpt-4o-mini-transcribe, and the results are mixed. When it transcribes an utterance correctly, it's better than nova and I love it. However, there have been two issues:
- Sometimes the model behaves as if no audio was passed in, even though I have a check that drops captures that are too short to send to the transcription endpoint. In that case the completion is a random guess (e.g. “The following is a transcribed audio chunk”).
- It randomly returns completions in a language other than the one passed in the language option. I’ve gotten Japanese, Korean, Arabic, etc. Bottom line: it’s not respecting the language option. I’ve double-checked that the language value logs successfully and is a valid ISO-639-1 code (e.g. “en”).
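For the first issue, one thing I’ve been experimenting with is gating on signal energy in addition to my length check, since near-silent chunks seem to be what triggers the guessed completions. A minimal sketch, assuming the buffer is 16-bit little-endian mono PCM (the same format my duration math assumes) and using an untuned threshold I picked arbitrarily:

```javascript
// Returns true if a 16-bit LE mono PCM buffer is effectively silence.
// `threshold` is RMS amplitude normalized to [0, 1]; 0.01 is an assumed
// starting point, not a tuned value.
const isSilence = (buffer, threshold = 0.01) => {
  const sampleCount = Math.floor(buffer.length / 2);
  if (sampleCount === 0) return true; // no samples at all
  let sumSquares = 0;
  for (let i = 0; i < sampleCount; i++) {
    const sample = buffer.readInt16LE(i * 2) / 32768; // normalize to [-1, 1)
    sumSquares += sample * sample;
  }
  const rms = Math.sqrt(sumSquares / sampleCount);
  return rms < threshold;
};
```

Then I skip the API call entirely when `isSilence(buffer)` is true, instead of relying on the model to handle empty audio.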
Here is the function I use for transcription:
import fs from "node:fs";
import { setTimeout as sleep } from "node:timers/promises"; // promise-based setTimeout, so it can actually be awaited
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: OPENAI_API_KEY
});

const transcribe = async (buffer, language, prompt) => {
  const maxRetries = 5;
  const baseDelay = 1000; // starting delay of 1 second
  const maxDelay = 16000; // maximum delay of 16 seconds
  let attempts = 0;

  while (attempts < maxRetries) {
    // Write the audio buffer to a temporary file for upload
    const tempFilePath = `./data/temp_audio_${Date.now()}.wav`;
    try {
      fs.writeFileSync(tempFilePath, buffer);
      // Create a file object compatible with the OpenAI API
      const file = fs.createReadStream(tempFilePath);

      // Call OpenAI's transcription API
      const response = await openai.audio.transcriptions.create({
        file: file,
        model: "gpt-4o-mini-transcribe",
        language: language,
        // prompt: prompt, // disabled for now
        response_format: "json",
      });
      console.log(response);

      if (!response.text) {
        console.log("No transcription received from OpenAI");
        return null;
      }

      // Estimate duration from the audio buffer (16 kHz, 16-bit mono PCM)
      const sampleRate = 16000;    // 16,000 samples per second
      const bytesPerSample = 2;    // 16-bit samples = 2 bytes per sample
      const durationMs = (buffer.length / (sampleRate * bytesPerSample)) * 1000;
      console.log(`OpenAI transcription: ${response.text}`);
      console.log(`Estimated audio duration: ${durationMs}ms`);

      // Return both the transcription and duration
      return {
        text: response.text,
        duration: durationMs
      };
    } catch (e) {
      console.log(`Transcription attempt ${attempts + 1} failed: ${e}`);
      attempts++;
      if (attempts < maxRetries) {
        // Exponential backoff (2^attempt * baseDelay, capped at maxDelay)
        // plus small random jitter to prevent synchronized retries
        const exponentialDelay = Math.min(
          maxDelay,
          Math.pow(2, attempts) * baseDelay + Math.random() * 100
        );
        console.log(`Retrying in ${exponentialDelay}ms...`);
        await sleep(exponentialDelay);
      } else {
        console.log(`Max retries (${maxRetries}) reached. Giving up.`);
        return null;
      }
    } finally {
      // Clean up the temporary file on every path, including failures
      if (fs.existsSync(tempFilePath)) fs.unlinkSync(tempFilePath);
    }
  }
};
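One more detail that might matter: the buffer comes straight out of my capture pipeline as raw PCM (that’s what the 16 kHz / 16-bit duration math assumes), so the temp file has a .wav extension but no RIFF header. I’m not sure whether that could be confusing the model into guessing. A quick sanity check I added (my own helper, nothing from the SDK):

```javascript
// Returns true if the buffer starts with a RIFF/WAVE header, i.e. it is
// an actual WAV file rather than headerless raw PCM bytes.
const hasWavHeader = (buffer) =>
  buffer.length >= 12 &&
  buffer.toString("ascii", 0, 4) === "RIFF" &&
  buffer.toString("ascii", 8, 12) === "WAVE";
```

If this returns false for my buffers, I’d need to prepend a proper WAV header before writing the file.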
Any insight would be greatly appreciated!!