Help processing MP3 files with Whisper API

Hi everyone,
I’m working on a Python script that uses OpenAI’s Whisper API to transcribe audio files.
I can successfully transcribe .wav files, but when I try to process .mp3 files, the API returns either an error or an empty transcript.

Here’s a simplified version of my code:

import openai

audio_file = open("test.mp3", "rb")
transcript = openai.audio.transcriptions.create(
    model="gpt-4o-mini-transcribe",
    file=audio_file,
)
print(transcript.text)

Do I need to convert the MP3 to WAV before sending it, or should the API handle MP3 directly?
Also, is there a size or bitrate limit for uploaded files?

Thanks for any insights!

Are you sure the input is really a .mp3 file?

While whisper-1 can auto-detect the file format, other models such as gpt-4o-mini-transcribe will return an error if you send a .wav file renamed as .mp3 (or the reverse):

Error code: 400 - {'error': {'message': 'Audio file might be corrupted or unsupported', 'type': 'invalid_request_error', 'param': 'file', 'code': 'invalid_value'}}

Try running your sample code again with whisper-1 as the model. If it responds normally, a mismatched file extension is probably the cause.
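
If you want to check the file locally before calling the API, something like the following might help. It is only a rough sketch: it looks at the first few bytes of the file and compares the result with the extension. The function names sniff_audio_format and extension_matches_content are made up for this example, and the check is a heuristic based on common WAV and MP3 headers, not a full validator.

```python
from pathlib import Path


def sniff_audio_format(path):
    """Guess the audio format from the first bytes of the file.

    Returns "wav", "mp3", or "unknown". Heuristic only, not a full parser.
    """
    with open(path, "rb") as f:
        header = f.read(12)

    # WAV: a RIFF container whose form type is WAVE.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # MP3 that starts with an ID3v2 metadata tag.
    if header[:3] == b"ID3":
        return "mp3"
    # Bare MPEG audio frame: 11 sync bits set (0xFF, then the top 3 bits
    # of the next byte). Some other formats share this pattern, so treat
    # the result as a hint rather than proof.
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"


def extension_matches_content(path):
    """Return True if the file extension agrees with the sniffed format."""
    return Path(path).suffix.lower().lstrip(".") == sniff_audio_format(path)
```

If extension_matches_content("test.mp3") returns False, renaming the file to match its real format (or re-encoding it) is worth trying before changing anything else.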
