Gpt-4o-transcribe returns "audio file might be corrupted or unsupported"

Thanks! It seems file types and names are strictly enforced, which is good. The issue got resolved.


It seems that there is an error in the openai-java library. I still get the following error when switching to gpt-4o-mini-transcribe, while it works perfectly with whisper-1.

File audioFile = new File(getCacheDir(), "audio.mp3");

TranscriptionCreateParams transcriptionCreateParams = TranscriptionCreateParams.builder()
        .file(audioFile.toPath())
        .model(AudioModel.GPT_4O_MINI_TRANSCRIBE)
        .responseFormat(AudioResponseFormat.JSON)  // gpt-4o-transcribe only supports json
        .build();

Transcription transcription = client.audio().transcriptions().create(transcriptionCreateParams).asTranscription();
String resultText = transcription.text();
com.openai.errors.BadRequestException: 400: Audio file might be corrupted or unsupported
at com.openai.errors.BadRequestException$Builder.build(BadRequestException.kt:87)
at com.openai.core.handlers.ErrorHandler$withErrorHandler$1.handle(ErrorHandler.kt:48)
at com.openai.services.blocking.audio.TranscriptionServiceImpl$WithRawResponseImpl$create$1.invoke(TranscriptionServiceImpl.kt:74)
at com.openai.services.blocking.audio.TranscriptionServiceImpl$WithRawResponseImpl$create$1.invoke(TranscriptionServiceImpl.kt:72)
at com.openai.core.http.HttpResponseForKt$parseable$1.parse(HttpResponseFor.kt:14)
at com.openai.services.blocking.audio.TranscriptionServiceImpl.create(TranscriptionServiceImpl.kt:41)
at com.openai.services.blocking.audio.TranscriptionService.create(TranscriptionService.kt:22)
at net.devemperor.dictate.core.DictateInputMethodService.lambda$startWhisperApiRequest$24$net-devemperor-dictate-core-DictateInputMethodService(DictateInputMethodService.java:792)

It’s really weird, because the same request with curl also works fine.

curl https://api.openai.com/v1/audio/transcriptions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F file="@src.mp3" \
  -F model="gpt-4o-mini-transcribe" -F response_format="json"

Maybe you could take a look at the openai-java library, because the new transcription models don’t seem to work at all there.


Hi,

Thank you very much; it appears to be resolved.
That works for me.
Best regards,

Julien


It's not fixed for me. My code works if the model is “whisper-1”, and I get an error if it is “gpt-4o-mini-transcribe”. That's all I change.

Error code: 400 - {'error': {'message': 'Audio file might be corrupted or unsupported', 'type': 'invalid_request_error', 'param': 'file', 'code': 'invalid_value'}}

Not sure if that matters, but it happens on “openai==1.68.2” and “openai==1.68.1”

Happens with mp3 and webm.

Okay, I actually found what's wrong. I did not have an MP3 file or a WebM file. I had an OGG file from Telegram that I had named .mp3. Somehow the Whisper model can still transcribe it even though the actual file format and the file name extension (mp3) do not match. It seems that “gpt-4o-mini-transcribe” cannot do that.

Or maybe Whisper supports OGG and gpt-4o-mini-transcribe does not.


Hello, same issue happens with some WAV files.

Works with whisper-1:
req_f02d777a48c1f0e447706219eb7ce694

But doesn’t work with gpt-4o-transcribe:
req_16be933c7dfa04d91e47e0e357c59ecf

Thanks for mentioning this. Seems like this was the problem for me too, now it works great with the new models.

I'm also using the Spring Boot OpenAI library in Java, and I found this in the library sources:

MultiValueMap<String, Object> multipartBody = new LinkedMultiValueMap<>();
multipartBody.add("file", new ByteArrayResource(requestBody.file()) {

    @Override
    public String getFilename() {
        return "audio.webm"; // <-- the file name extension is hard-coded and may not match the actual format
    }
});
multipartBody.add("model", requestBody.model());
multipartBody.add("language", requestBody.language());
multipartBody.add("prompt", requestBody.prompt());
multipartBody.add("response_format", requestBody.responseFormat().getValue());
multipartBody.add("temperature", requestBody.temperature());

But my input stream is a WAV file. Can that be the issue?

Can you please look into my request?
req_6d02b398e398255afe892411e39d919d

The output is:
BadRequestError: Error code: 400 - {'error': {'message': 'Audio file might be corrupted or unsupported', 'type': 'invalid_request_error', 'param': 'file', 'code': 'invalid_value'}}

Environment:
openai-1.50.2
WSL2 Ubuntu 22.04
file properties:
Container: MP3
Codec: libmp3lame, mono
Sample rate: 24 kHz
Bit-rate: 32 kb/s
Size: 16.4 MB

I used this command to convert the file:
ffmpeg -i clean.wav -ac 1 -ar 24000 -codec:a libmp3lame -b:a 32k clean.mp3
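One thing worth checking, besides the container format: at that bitrate the file is quite long. For a constant-bitrate MP3, duration is size in bits divided by bitrate, so 16.4 MB at 32 kb/s is roughly 68 minutes of audio. I'm not certain the gpt-4o transcribe models enforce a shorter maximum duration than whisper-1, so treat this as a guess, but a file that long could be rejected even when the encoding itself is fine:

```python
def cbr_duration_seconds(size_bytes: int, bitrate_bps: int) -> float:
    """Rough duration of a constant-bitrate (CBR) file: bits / bits-per-second."""
    return size_bytes * 8 / bitrate_bps

seconds = cbr_duration_seconds(16_400_000, 32_000)  # 16.4 MB at 32 kb/s
print(f"{seconds / 60:.1f} minutes")                # about 68.3 minutes
```

Splitting the audio into shorter chunks before transcribing would rule this out.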

My code:

import openai
from openai import OpenAI

OPENAI_API_KEY = "the key is here"
client = OpenAI(api_key=OPENAI_API_KEY)

try:
    with open("clean.mp3", "rb") as f:
        transcription = client.audio.transcriptions.create(
            model="gpt-4o-transcribe",
            file=f,
        )
except openai.BadRequestError as e:
    print(e.response.headers["x-request-id"])

Thank you.