gpt-4o-transcribe: "unsupported format" error with .wav

I've been using whisper-1 for my audio processing and it has worked fine so far. Today I tried switching to gpt-4o-transcribe to test it. I know it is stricter about output formats, but on the input side the API docs say it accepts .wav, which is what I've been using without problems with whisper-1. As soon as I switch to gpt-4o-transcribe, I keep getting:

ERROR:main:Audio processing error: Error code: 400 - {'error': {'message': 'This model does not support the format you provided.', 'type': 'invalid_request_error', 'param': 'messages', 'code': 'unsupported_format'}}

Here’s my code:

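import logging
import os
import tempfile

from openai import OpenAI

# Setup below is assumed; it was not shown in the original snippet.
client = OpenAI()
logger = logging.getLogger(__name__)


def process_audio(contents):
    temp_path = None
    try:
        # Write the uploaded bytes to a temporary .wav file
        with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as temp:
            temp.write(contents)
            temp_path = temp.name  # Store the file path for later use

        # Reopen the file in binary mode and pass it to the API
        with open(temp_path, "rb") as temp_file:
            transcript = client.audio.transcriptions.create(
                model="gpt-4o-transcribe",  # whereas before it was whisper-1
                file=temp_file,
                response_format="json",
            )
        return transcript
    except Exception as e:
        logger.error(f"Audio processing error: {e}")
        raise
    finally:
        # Clean up the temporary file whether or not the request succeeded
        if temp_path and os.path.exists(temp_path):
            os.unlink(temp_path)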

I have been having the exact same issue.

This would happen if it was some other format (e.g. mp3) named as .wav. Could you double check that the file inside is indeed wav and not some other format?
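One way to run the check suggested above is to look at the first few bytes of the upload before writing it to a .wav temp file. The sketch below is only a heuristic based on well-known container signatures (a WAV file starts with "RIFF" and has "WAVE" at bytes 8 to 11); the function names `sniff_audio_format` and `make_test_wav` are made up for illustration and are not part of the OpenAI SDK. The MP3 branch in particular is approximate, since other MPEG audio streams can begin with a similar sync pattern.

```python
import io
import wave


def sniff_audio_format(data: bytes) -> str:
    """Guess the container format from leading magic bytes (heuristic only)."""
    if len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE":
        return "wav"
    # MP3: an ID3v2 tag, or an MPEG frame sync (11 set bits) at the start
    if data[:3] == b"ID3" or (
        len(data) >= 2 and data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
    ):
        return "mp3"
    if data[:4] == b"OggS":
        return "ogg"
    if data[:4] == b"fLaC":
        return "flac"
    if data[:4] == b"\x1a\x45\xdf\xa3":  # EBML header used by WebM/Matroska
        return "webm"
    if len(data) >= 8 and data[4:8] == b"ftyp":  # ISO base media (mp4, m4a)
        return "mp4/m4a"
    return "unknown"


def make_test_wav() -> bytes:
    """Build a tiny valid mono 16-bit WAV in memory, for trying the sniffer."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(16000)
        w.writeframes(b"\x00\x00" * 160)  # 10 ms of silence
    return buf.getvalue()


if __name__ == "__main__":
    print(sniff_audio_format(make_test_wav()))
```

In `process_audio`, calling `sniff_audio_format(contents)` and logging the result before the API call would show whether the bytes really are WAV. If it reports something else (for example browser recordings are often WebM), saving the temp file with the matching suffix instead of ".wav" is worth trying.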
