Whisper API is not able to transcribe audio recorded on iOS

Hi there!

I have run into a very frustrating issue: even though this transcription API generally works perfectly, and the downloaded audio file is always intelligible, recordings made on iOS are not transcribed correctly. Here is my FastAPI route:

async def transcribe_route(
    lang: str, file: UploadFile = File(...), duration: float = Form(...)
):
    """User sends a WebM file and the Whisper API converts it to text"""

    print(f"lasted {duration} seconds")

    # Check that the file is a webm file
    if "audio/webm" not in file.content_type:
        raise HTTPException(status_code=400, detail="Only .webm files are supported.")

    # Read the audio file data
    recording_content = await file.read()
    recording = io.BytesIO(recording_content)
    recording.name = file.filename

    # Save the audio file locally
    save_path = f"{duration}-{file.filename}"  # Use a unique filename
    with open(save_path, "wb") as audio_file:
        audio_file.write(recording_content)

    # Call the Whisper API
    transcription = openai.Audio.transcribe(
        "whisper-1", recording, language=languages_mapping.get(lang, "en")
    )

    # Return the transcribed text
    return transcription["text"]

For some reason, when I send an audio recorded on iOS, Whisper is only able to transcribe the first 1-2 seconds.
I think this may be caused by the different encoding used on iOS, but there seems to be no way of fixing it client-side.
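One way to check that theory is to look at the magic bytes of the uploaded data: iOS Safari's MediaRecorder typically produces an MP4/AAC container even when the rest of your clients send WebM, so a file can carry a .webm name while actually being MP4. A minimal sketch (the helper name `sniff_container` is my own, not part of any library):

```python
def sniff_container(data: bytes) -> str:
    """Guess the audio container from its magic bytes.

    Useful for spotting iOS recordings that arrive as MP4
    despite a .webm filename or content type.
    """
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header -> WebM/Matroska
        return "webm"
    if data[4:8] == b"ftyp":  # ISO BMFF 'ftyp' box -> MP4/M4A
        return "mp4"
    if data[:4] == b"RIFF" and data[8:12] == b"WAVE":  # RIFF/WAVE header
        return "wav"
    return "unknown"
```

You could call it right after `await file.read()` and log the result for uploads that fail to transcribe fully.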

What can I do to solve it?

Thanks in advance,

So you actually have the failing audio files logged for analysis, and they are intelligible when played back but can't be transcribed?

Here is a re-encoding you could do server-side. It also recodes down to roughly voice-over-IP audio bandwidth, so anything like noise shaping in high-definition audio would be stripped.
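A minimal sketch of that re-encoding, assuming ffmpeg is installed on the server (the function names are my own; the exact sample rate and codec are one reasonable choice, not the only one):

```python
import subprocess


def build_reencode_cmd(src_path: str, dst_path: str) -> list:
    """Build an ffmpeg command that re-encodes any input container
    to 16 kHz mono 16-bit PCM WAV -- close to voice bandwidth,
    stripping high-definition extras in the process."""
    return [
        "ffmpeg", "-y",       # overwrite the output file if it exists
        "-i", src_path,       # input: whatever the client uploaded
        "-ac", "1",           # downmix to mono
        "-ar", "16000",       # resample to 16 kHz
        "-c:a", "pcm_s16le",  # plain 16-bit little-endian PCM
        dst_path,
    ]


def reencode(src_path: str, dst_path: str) -> None:
    """Run the re-encode; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_reencode_cmd(src_path, dst_path), check=True)
```

You would call `reencode(save_path, wav_path)` after saving the upload, then send the resulting WAV file to Whisper instead of the original recording.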