Error with Whisper when trying to get a transcription using a Java HTTP POST and a streamed audio file

I’ve been trying to write a server endpoint that receives an audio file’s binary data as an InputStream and uses it to call the transcriptions endpoint, but I’ve been running into various issues throughout the process. I’ve worked through most of them, but now I’m consistently hitting this error:

{
  "error": {
    "message": "Invalid file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

I’ve played around with various file types but always receive this error. I’ve looked at every thread I can find here and elsewhere on the matter but can’t figure out the issue. Looking for guidance on how I can triage/fix this. Any help is appreciated!

I’ve tried sending mp3, webm, and wav files in the request. I’ve gotten the endpoint to work when using the example from the docs:

curl --request POST \
  --url https://api.openai.com/v1/audio/transcriptions \
  --header 'Authorization: Bearer TOKEN' \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/openai.mp3 \
  --form model=whisper-1

But in my case the audio file won’t be stored on the local machine, so I can’t follow the same format.
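For context, my endpoint pulls the upload out of the incoming multipart request before handing it to the OpenAI call. Here’s a minimal sketch of that step (the servlet API and the "file" part name are assumptions that mirror my test requests below):

    import java.io.IOException;
    import java.io.InputStream;

    import jakarta.servlet.ServletException;
    import jakarta.servlet.http.HttpServletRequest;
    import jakarta.servlet.http.Part;

    // Hypothetical sketch (assumed servlet API): how the endpoint extracts
    // the uploaded audio before handing it to the OpenAI call further down.
    public class AudioUploadHelper {
      /** "file" matches the part name used in the test requests below. */
      public static InputStream openAudioStream(final HttpServletRequest request)
          throws IOException, ServletException {
        final Part filePart = request.getPart("file");
        // filePart.getSubmittedFileName() and filePart.getSize() supply the
        // fileName and fileSize values passed to OpenAIFileContentBody below.
        return filePart.getInputStream();
      }
    }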

To test my endpoint, I’ve been sending requests like the one below, generally matching what I expect client requests to look like:

curl 'MY_ENDPOINT_URI' \
  -H 'content-type: multipart/form-data; boundary=----keykeykey' \
  --data-raw $'------keykeykey\r\nContent-Disposition: form-data; name="file"; filename="audio.webm"\r\nContent-Type: audio/webm\r\n\r\n' \
  --data-binary @audio.webm \
  --data-raw $'\r\n------keykeykey\r\nContent-Disposition: form-data; name="filesize"\r\nContent-Type: text/plain\r\n\r\n186406' \
  --data-raw $'\r\n------keykeykey\r\nContent-Disposition: form-data; name="model"\r\nContent-Type: text/plain\r\n\r\nwhisper-1\r\n------keykeykey--\r\n' \
  --compressed
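To sanity-check what actually reaches the endpoint, I’ve also tried dumping the first bytes of the raw request body before any parsing (a debugging sketch, again assuming servlet-style access to the stream; readNBytes requires Java 9+):

    // Debugging sketch: print the first bytes the endpoint receives, to
    // verify the multipart framing and the audio bytes survive the trip
    // from curl intact.
    final byte[] head = request.getInputStream().readNBytes(512);
    System.out.println(new String(head, java.nio.charset.StandardCharsets.ISO_8859_1));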

Then I’m routing that request to my API, which looks like this:

    // Create an HttpClient to execute the request we're about to build.
    try (final CloseableHttpClient httpClient = HttpClients.createDefault()) {
      // Create a POST request to the OpenAI transcriptions endpoint.
      final HttpPost request = new HttpPost("https://api.openai.com/v1/audio/transcriptions");

      // Attach our API key as a bearer token.
      request.addHeader("Authorization", "Bearer " + getAPIKey());

      // The file we're transcribing is stored within an InputStream we received from our endpoint request. In order to let OpenAI
      // know how to parse our byte stream, we need to build this multipart entity.
      request.setEntity(MultipartEntityBuilder.create()
                                              .setContentType(ContentType.MULTIPART_FORM_DATA)
                                              .addPart(OPEN_AI_FILE_KEY, new OpenAIFileContentBody(inputStream, fileName, Long.valueOf(fileSize)))
                                              .addPart(OPEN_AI_MODEL_KEY, new StringBody(model, ContentType.DEFAULT_TEXT))
                                              .build());

      // Execute our request, then retrieve the response and return the output as a JSON string.
      try (final CloseableHttpResponse response = httpClient.execute(request);
           final ByteArrayOutputStream output = new ByteArrayOutputStream()) {
        ... get response & return ...
      }
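
For reference, OpenAIFileContentBody is a small custom ContentBody that streams the endpoint's InputStream straight into the multipart body instead of buffering the whole file. A minimal sketch of it (assuming Apache HttpClient 4.x's httpmime module; the audio/webm content type is hard-coded here for illustration):

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;

    import org.apache.http.entity.ContentType;
    import org.apache.http.entity.mime.MIME;
    import org.apache.http.entity.mime.content.AbstractContentBody;

    public class OpenAIFileContentBody extends AbstractContentBody {

      private final InputStream inputStream;
      private final String fileName;
      private final long contentLength;

      public OpenAIFileContentBody(final InputStream inputStream, final String fileName, final long contentLength) {
        // Illustrative assumption: in practice the content type should match
        // the actual encoding of the uploaded bytes.
        super(ContentType.create("audio/webm"));
        this.inputStream = inputStream;
        this.fileName = fileName;
        this.contentLength = contentLength;
      }

      @Override
      public String getFilename() {
        // Whisper uses this extension to infer the audio format, so it must
        // agree with the real format of the bytes (e.g. "audio.webm").
        return fileName;
      }

      @Override
      public long getContentLength() {
        return contentLength;
      }

      @Override
      public String getTransferEncoding() {
        return MIME.ENC_BINARY;
      }

      @Override
      public void writeTo(final OutputStream out) throws IOException {
        // Copy the request's InputStream into the outgoing multipart body.
        final byte[] buffer = new byte[8192];
        int read;
        while ((read = inputStream.read(buffer)) != -1) {
          out.write(buffer, 0, read);
        }
        out.flush();
      }
    }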