Whisper API breaks on AWS Lambda

scourge · December 5, 2023, 2:25pm

Hey there, I am having issues getting the Whisper API to run when deployed to AWS Lambda. I keep getting:

openai.BadRequestError: Error code: 400 - {error: {message: Invalid file format. Supported formats: [flac, m4a, mp3, mp4, mpeg, mpga, oga, ogg, wav, webm], type: invalid_request_error, param: None, code: None}}

Running this locally on my mac works without issue, even when using the same python:3.9 Docker image that I use when deploying this thing on Lambda.

@api_bp.post("/audio")
def transcribeRoute():
    audio_file = request.files.get('audio', None)

    if audio_file:
        print(audio_file) # <FileStorage: 'blob' ('audio/webm;codecs=opus')>

        buffer = BytesIO(audio_file.read())
        buffer.name = "test.webm"

        transcript = client.audio.transcriptions.create(model="whisper-1", file=buffer).text

        print(f"Transcribed audio: {transcript}")
        return {"transcription": transcript}, 200

    print("No audio file found")
    return {"transcription": "error"}, 400

Any help is greatly appreciated!

curt.kennedy · December 5, 2023, 2:39pm

I am running something like this on Lambda. The problem I had from day one was in the OpenAI SDK, so I just skip it and call the API directly with requests.

In Lambda, you just use /tmp as your local directory.
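For readers who want to try this route, here is a minimal sketch of what "requests directly, with /tmp as the local directory" could look like. The helper names `save_to_tmp` and `transcribe_with_requests` are made up for illustration and are not from the original post; the endpoint and form fields are those of the public transcription API, and the 60 second timeout is an arbitrary choice.

```python
import os


def save_to_tmp(data: bytes, filename: str, tmp_dir: str = "/tmp") -> str:
    """Write uploaded bytes under tmp_dir and return the full path."""
    # basename() drops any directory parts a client may have put in the name.
    path = os.path.join(tmp_dir, os.path.basename(filename))
    with open(path, "wb") as fh:
        fh.write(data)
    return path


def transcribe_with_requests(path: str, api_key: str) -> str:
    """POST the file at `path` to the transcription endpoint and return the text."""
    # Third-party dependency, imported here so save_to_tmp works without it.
    import requests

    with open(path, "rb") as fh:
        resp = requests.post(
            "https://api.openai.com/v1/audio/transcriptions",
            headers={"Authorization": f"Bearer {api_key}"},
            # The file name (with its extension) is sent along with the bytes.
            files={"file": (os.path.basename(path), fh)},
            data={"model": "whisper-1"},
            timeout=60,  # arbitrary; keep it below the Lambda timeout
        )
    resp.raise_for_status()
    return resp.json()["text"]
```

Inside the route this would be something like `transcribe_with_requests(save_to_tmp(audio_file.read(), "test.webm"), api_key)`, with the key read from wherever the function keeps its secrets.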

scourge · December 5, 2023, 7:16pm

Hmm, that didn’t seem to help, neither when saving it to /tmp nor when just using BytesIO. I am thinking that maybe it has something to do with dependencies. I have installed opus-tools, libopus0, and ffmpeg, but that didn’t do anything either. I am gonna try to convert it to a different format with ffmpeg first, maybe that is gonna work.

curt.kennedy · December 5, 2023, 7:19pm

I pretty much only use WAV files. But I hear it can be finicky about exact formats. So convert it to a known “good” format, then troubleshoot from there.
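A rough sketch of that conversion step, assuming an `ffmpeg` binary is on the PATH of the Lambda image. The function names are hypothetical, and 16 kHz mono is just one reasonable choice of "known good" WAV settings, not something the thread specifies.

```python
import subprocess


def ffmpeg_to_wav_cmd(src: str, dst: str) -> list:
    """Build an ffmpeg command that converts src to a 16 kHz mono WAV at dst."""
    # -y overwrites dst if present, -ac sets channels, -ar sets the sample rate.
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


def convert_to_wav(src: str, dst: str) -> str:
    """Run the conversion and return dst; raises CalledProcessError on failure."""
    subprocess.run(ffmpeg_to_wav_cmd(src, dst), check=True, capture_output=True)
    return dst
```

For example, `convert_to_wav("/tmp/test.webm", "/tmp/test.wav")`. If ffmpeg itself rejects the input, that already suggests the bytes were damaged before they reached the function.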

scourge · December 6, 2023, 8:16pm

After much trying and researching, the problem turned out to be a mix of 2 issues:
a) In order for the Whisper API to work, the buffer with the audio bytes has to have a name (which happens automatically when you write it to a file and read it back, just make sure you have the right extension).
b) The AWS API Gateway doesn’t support binary data in requests by default, and you have to manually allow it. For example, if you are using Serverless Framework for deployment, add this:

provider:
  apiGateway:
    binaryMediaTypes:
      - '*/*'
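To check whether issue (b) is what you are hitting, the sketch below may help. It assumes the Lambda proxy integration event shape, where `body` is a string and `isBase64Encoded` tells you whether it was base64 encoded (which is what happens for binary media types once they are enabled). The helper names are hypothetical. The magic bytes are the EBML header that WebM/Matroska files start with.

```python
import base64

# First four bytes of any WebM/Matroska file (EBML header ID).
EBML_MAGIC = b"\x1a\x45\xdf\xa3"


def raw_body(event: dict) -> bytes:
    """Return the request body of a Lambda proxy event as bytes."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        return base64.b64decode(body)
    # Without binary support the gateway hands the body over as text.
    return body.encode("utf-8")


def looks_like_webm(data: bytes) -> bool:
    """True if data starts with the EBML magic bytes."""
    return data.startswith(EBML_MAGIC)
```

With a Flask app behind a WSGI adapter you normally never call `raw_body` yourself, and for a multipart upload the raw body is the form envelope rather than the audio. The useful check is `looks_like_webm(audio_file.read())` inside the route: if it is True locally but False on Lambda, the upload was most likely mangled in transit.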

olivier.cuny · March 28, 2024, 6:03pm

Lifesaver, thank you! This did the trick.

    # Create a buffer with the file content and set the file name
    buffer = BytesIO(file_content)
    buffer.name = file_name
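A slightly more defensive variant of the same idea, as a sketch: derive the extension from the upload's MIME type instead of hard-coding it. `named_buffer` and the `EXTENSIONS` table are made up for illustration, and the table only covers a handful of the formats listed in the error message.

```python
from io import BytesIO

# Hypothetical lookup table; extend it for the formats you actually accept.
EXTENSIONS = {
    "audio/webm": "webm",
    "audio/wav": "wav",
    "audio/mpeg": "mp3",
    "audio/ogg": "ogg",
    "audio/mp4": "m4a",
    "audio/flac": "flac",
}


def named_buffer(data: bytes, mimetype: str, stem: str = "audio") -> BytesIO:
    """Wrap data in a BytesIO whose .name carries a matching extension."""
    # 'audio/webm;codecs=opus' -> 'audio/webm'
    base_type = mimetype.split(";", 1)[0].strip().lower()
    ext = EXTENSIONS.get(base_type)
    if ext is None:
        raise ValueError(f"unsupported audio type: {mimetype!r}")
    buffer = BytesIO(data)
    buffer.name = f"{stem}.{ext}"
    return buffer
```

In the route from the first post this would replace the two buffer lines with `buffer = named_buffer(audio_file.read(), audio_file.mimetype)`.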
