Hi, I am recording audio in the browser using MediaRecorder and sending the file to the OpenAI Whisper API for transcription. For some reason it only picks up one word, or other times returns just a bunch of random characters, when I am using an iPhone. It works well on Android and on my computer.
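Roughly, my setup looks like this (a simplified sketch, not my exact code):

// Records a few seconds of microphone audio into a single Blob, which I
// then POST to the /v1/audio/transcriptions endpoint as multipart form
// data (file + model=whisper-1).
async function recordAudio(durationMs: number): Promise<Blob> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const chunks: Blob[] = [];

  recorder.ondataavailable = (event) => chunks.push(event.data);
  const stopped = new Promise<void>((resolve) => {
    recorder.onstop = () => resolve();
  });

  recorder.start();
  setTimeout(() => recorder.stop(), durationMs);
  await stopped;

  return new Blob(chunks, { type: recorder.mimeType });
}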
I am having the exact same issue: it works in Chrome/Safari on desktop and it works on Android, but on iOS I just get strange results. A simple audio file of me saying "test test test" results in a jumble of Chinese characters.
I too am facing this exact issue on iPhone, and I was also able to reproduce it in Safari on a MacBook. Since Safari and Chrome on iPhone essentially run on the same engine, I think this is a Safari-related issue.
Were you able to find any workarounds or solutions?
Hi!
Below this topic is a bunch of "Related" topics.
This has been a recurring issue and I suggest you work through those first.
Note: not the "Suggested" topics.
Hey, I am encountering the same issue on iOS. Has anyone here been able to resolve this?
I'm also experiencing this same issue. Has anything worked for you? I've tried multiple encoding formats, but nothing works in Chrome or Safari on iPhone.
I had the same issue.
Be sure to set the timeslice option when starting, like: componentMediaRecorder.start(1000);
This resolved the problem for me.
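In context, the suggestion looks roughly like this (simplified sketch; variable names are illustrative):

// Passing a timeslice (in ms) to start() makes MediaRecorder fire
// dataavailable periodically instead of once at stop(), so chunks are
// flushed as they are produced.
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const componentMediaRecorder = new MediaRecorder(stream);
const chunks: Blob[] = [];

componentMediaRecorder.ondataavailable = (event) => {
  if (event.data.size > 0) chunks.push(event.data);
};
componentMediaRecorder.onstop = () => {
  // Reassemble the periodic chunks into a single file for upload.
  const blob = new Blob(chunks, { type: componentMediaRecorder.mimeType });
  // ... then send blob for transcription
};

componentMediaRecorder.start(1000); // emit a chunk every 1000 ms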
componentMediaRecorder.start(1000); does not work for me. The file is recorded in webm format, and it works in Firefox/Chrome on desktop and on Android. But on iPhone it returns a 400 invalid format error.
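For anyone else hitting the 400: it's worth logging recorder.mimeType, because Safari's MediaRecorder does not support webm at all (it records audio/mp4), so the file may not actually be what its name claims. A rough sketch of picking a supported container up front (the candidate list here is illustrative):

// Sketch: ask the browser which container it can actually produce,
// rather than assuming webm everywhere. Candidate list is illustrative.
const candidates = ["audio/webm;codecs=opus", "audio/webm", "audio/mp4"];
const mimeType = candidates.find((t) => MediaRecorder.isTypeSupported(t));

const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const recorder = new MediaRecorder(stream, mimeType ? { mimeType } : undefined);

// When uploading, match the filename extension to what was recorded;
// the API infers the format from the extension, so a file named
// "audio.webm" with mp4 bytes inside is exactly the kind of mismatch
// that can come back as a 400.
const extension = recorder.mimeType.includes("mp4") ? "mp4" : "webm";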
Is there any solution yet? I get totally weird transcriptions on the iPhone, always something like "Lee Deok-Young from MBC News speaking" or similar. This only happens on the iPhone.
Had all these issues as well. It looks like Whisper works best with a specific kind of file: mono channel, 16 kHz sample rate, and pcm_s16le (16-bit little-endian PCM) encoding. You can use ffmpeg to convert your audio to these settings, or do it in the browser with the Web Audio API. This is what worked for me:
async function convertAudioToMono(file: File | Blob): Promise<Blob> {
  // Decode the recording; with the context at 16 kHz the browser
  // resamples during decode.
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const arrayBuffer = await file.arrayBuffer();
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);
  await audioContext.close(); // release it; Safari caps concurrent contexts

  // Create a mono offline context for processing. Sizing by duration
  // keeps the length correct even if decode did not resample.
  const frameCount = Math.ceil(audioBuffer.duration * 16000);
  const offlineContext = new OfflineAudioContext(1, frameCount, 16000);
  const source = offlineContext.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(offlineContext.destination);
  source.start();

  // Render audio
  const renderedBuffer = await offlineContext.startRendering();

  // Convert to WAV format: 16-bit PCM is 2 bytes per sample, mono
  const length = renderedBuffer.length * 2;
  const buffer = new ArrayBuffer(44 + length);
  const view = new DataView(buffer);

  // WAV header
  const writeString = (view: DataView, offset: number, string: string) => {
    for (let i = 0; i < string.length; i++) {
      view.setUint8(offset + i, string.charCodeAt(i));
    }
  };
  writeString(view, 0, "RIFF");
  view.setUint32(4, 36 + length, true); // RIFF chunk size
  writeString(view, 8, "WAVE");
  writeString(view, 12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format: PCM
  view.setUint16(22, 1, true); // channels: mono
  view.setUint32(24, 16000, true); // sample rate
  view.setUint32(28, 32000, true); // byte rate = 16000 * 2 bytes * 1 channel
  view.setUint16(32, 2, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeString(view, 36, "data");
  view.setUint32(40, length, true); // data chunk size

  // Write audio data: clamp each float sample, scale to signed 16-bit
  const data = renderedBuffer.getChannelData(0);
  let offset = 44;
  for (let i = 0; i < data.length; i++) {
    const sample = Math.max(-1, Math.min(1, data[i]));
    view.setInt16(offset, sample < 0 ? sample * 0x8000 : sample * 0x7fff, true);
    offset += 2;
  }

  return new Blob([buffer], { type: "audio/wav" });
}
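To wire it up, I call it roughly like this before uploading (simplified sketch; helper names other than convertAudioToMono are illustrative, and the endpoint and model name are the standard Whisper API values):

// Sketch: convert the MediaRecorder output first, then upload the WAV.
// Naming the file ".wav" matters because the API infers the format
// from the filename extension.
async function transcribe(recording: Blob, apiKey: string): Promise<string> {
  const wav = await convertAudioToMono(recording);

  const form = new FormData();
  form.append("file", wav, "recording.wav");
  form.append("model", "whisper-1");

  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!res.ok) throw new Error(`Transcription failed: ${res.status}`);
  const { text } = await res.json();
  return text;
}

If you'd rather convert server-side, the ffmpeg equivalent of the conversion above is: ffmpeg -i input -ac 1 -ar 16000 -c:a pcm_s16le output.wav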