Whisper API is not able to transcribe audio recorded on iOS

Hi there!

I have run into a very frustrating issue: even though this transcription API generally works perfectly, and the downloaded audio file is always intelligible, recordings made on iOS are not transcribed correctly. Here is my FastAPI route:

async def transcribe_route(
    lang: str, file: UploadFile = File(...), duration: float = Form(...)
):
    """User sends a WebM file and the Whisper API converts it to text"""

    print(f"lasted {duration} seconds")

    # Check that the file is a webm file
    if "audio/webm" not in file.content_type:
        raise HTTPException(status_code=400, detail="Only .webm files are supported.")

    # Read the audio file data
    recording_content = await file.read()
    recording = io.BytesIO(recording_content)
    recording.name = file.filename

    # Save the audio file locally
    save_path = f"{duration}-{file.filename}"  # Use a unique filename
    with open(save_path, "wb") as audio_file:
        audio_file.write(recording_content)

    # Call the Whisper API
    transcription = openai.Audio.transcribe(
        "whisper-1", recording, language=languages_mapping.get(lang, "en")
    )

    # Return the transcribed text
    return transcription["text"]

For some reason, when I send an audio recorded on iOS, Whisper is only able to transcribe the first 1-2 seconds.
I think this may be caused by the different encoding used on iOS, but there seems to be no way of fixing it client-side.
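One way to check that theory is to look at the magic bytes of the uploaded data: iOS Safari's MediaRecorder typically produces an MP4/AAC container even when the rest of your clients send WebM, so a file can carry a .webm name while actually being MP4. A minimal sketch (the helper name `sniff_container` is my own, not part of any library):

```python
def sniff_container(data: bytes) -> str:
    """Guess the audio container from its magic bytes.

    Useful for spotting iOS recordings that arrive as MP4
    despite a .webm filename or content type.
    """
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header -> WebM/Matroska
        return "webm"
    if data[4:8] == b"ftyp":  # ISO BMFF 'ftyp' box -> MP4/M4A
        return "mp4"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":  # RIFF/WAVE header
        return "wav"
    return "unknown"
```

You could call it right after `await file.read()` and log the result for uploads that fail to transcribe fully.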

What can I do to solve it?

Thanks in advance,

So you actually have the failing audio files logged for analysis, and they are intelligible when played back but can't be transcribed?

Here is a re-encoding you could do server-side. It also recodes down to roughly voice-over-IP audio bandwidth, so anything like noise shaping in high-definition audio would be stripped.
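A minimal sketch of that re-encoding, assuming ffmpeg is installed on the server (the function names are my own; the exact sample rate and codec are one reasonable choice, not the only one):

```python
import subprocess


def build_reencode_cmd(src_path: str, dst_path: str) -> list:
    """Build an ffmpeg command that re-encodes any input container
    to 16 kHz mono 16-bit PCM WAV -- close to voice bandwidth,
    stripping high-definition extras in the process."""
    return [
        "ffmpeg", "-y",       # overwrite the output file if it exists
        "-i", src_path,       # input: whatever the client uploaded
        "-ac", "1",           # downmix to mono
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # plain 16-bit little-endian PCM
        dst_path,
    ]


def reencode(src_path: str, dst_path: str) -> None:
    """Run the re-encode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_reencode_cmd(src_path, dst_path), check=True)
```

You would call `reencode(save_path, wav_path)` after saving the upload, then send the resulting WAV file to Whisper instead of the original recording.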