Hi everyone!
I’m currently facing an issue with the new transcription capability of the OpenAI API using the `gpt-4o-transcribe` and `gpt-4o-mini-transcribe` models.
When I try to submit an audio file (in a supported format, as specified in the official documentation), I receive the following error response:
```json
{
  "request_id": "req_1e3293a3f121081a41273fdc2394c0c2",
  "error": {
    "message": "Audio file might be corrupted or unsupported",
    "type": "invalid_request_error",
    "param": "file",
    "code": "invalid_value"
  }
}
```
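For reference, this is roughly how I’m structuring the request — a minimal sketch using the official `openai` Python SDK, where the file path and default model name are just the values I’m testing with:

```python
def transcribe(path: str, model: str = "gpt-4o-transcribe") -> str:
    """Send an audio file to the /v1/audio/transcriptions endpoint."""
    # Import here so the sketch is self-contained; requires the `openai` package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    # The endpoint expects the file opened in binary mode.
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

Calling this with the same file but `model="whisper-1"` succeeds, so the request shape itself seems fine.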
However, I’ve confirmed that the file is not corrupted and is in a valid format (I’ve tested with `.mp3` and `.wav`). The audio plays fine in various players, and I was also able to transcribe it successfully using `whisper-1`.
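If anyone wants to run the same sanity check, here is a quick stdlib-only sketch that inspects the container’s magic bytes (the function name is mine; note it only verifies the header, not that the audio decodes cleanly):

```python
def sniff_audio_format(path: str) -> str:
    """Best-guess container format from the file's magic bytes."""
    with open(path, "rb") as f:
        header = f.read(12)
    # WAV: RIFF chunk header followed by the WAVE form type.
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    # MP3: either an ID3 metadata tag or a raw MPEG frame sync (0xFFEx).
    if header[:3] == b"ID3":
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        return "mp3"
    return "unknown"
```

Both of my test files come back as the expected format with this check.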
Is anyone else encountering this issue with the GPT-4o transcription models? Are there any undocumented limitations, or suggestions on how to structure the request differently?
Any insights or help would be much appreciated!