Calling the Whisper API with a curl request keeps giving an error

I tried to call the Whisper API from JavaScript with a POST request, but it did not work, so I proceeded to make a curl-style request from Windows PowerShell with the following code, and it still did not work. The error message is attached below.

Invoke-RestMethod -Uri "https://api.openai.com/v1/audio/transcriptions" `
  -Method POST `
  -Headers @{
    "Authorization" = "Bearer TOKEN"
    "Content-Type" = "multipart/form-data"
  } `
  -Body @{
    "file" = Get-Item -Path "./audio.mp3"
    "model" = "whisper-1"
  }

Error:

Invoke-RestMethod : {
    "error": {
        "message": "Could not parse multipart form",
        "type": "invalid_request_error",
        "param": null,
        "code": null
    }
}
At line:1 char:1
+ Invoke-RestMethod -Uri "https://api.openai.com/v1/audio/transcription ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-RestMethod], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeRestMethodCommand

Note that I managed to get this file transcribed in Python using Whisper, and the same kind of request to the ChatGPT endpoint with the same API key works properly.

For full context, here is the JS request:

const url = "https://api.openai.com/v1/audio/transcriptions"

const form = new FormData();
form.append("file", new File([audioURL], "audio-transcribe"));
form.append("model", "whisper-1");
form.append("response_format", "text");

const requestOptions = {
    method: "POST",
    headers: {
      "Authorization": "Bearer TOKEN",
      "Content-Type": "multipart/form-data"
    },
    body: form
}

fetch(url, requestOptions)

Try removing the "Content-Type" = "multipart/form-data" header from your POST. I ran into a similar issue using the Rust library reqwest and removing that header fixed my issue.
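
For reference, here is roughly what the JS request from the first post looks like with that header removed (an untested sketch; audioBytes stands in for however you load the audio data):

const form = new FormData();
form.append("file", new File([audioBytes], "audio.mp3", { type: "audio/mpeg" }));
form.append("model", "whisper-1");

const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
  method: "POST",
  // No hand-written Content-Type: fetch generates multipart/form-data plus the boundary itself.
  headers: { "Authorization": "Bearer TOKEN" },
  body: form,
});
console.log(await res.json());

The multipart Content-Type needs a boundary parameter that is generated per request; if you write the header yourself it carries no boundary, so the server cannot split the body into parts, hence “Could not parse multipart form”.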


I moved a step further, but now I get:
“Invalid file format. Supported formats: [‘m4a’, ‘mp3’, ‘webm’, ‘mp4’, ‘mpga’, ‘wav’, ‘mpeg’]”

async function callWhisper(file: Blob, name: string) {
  console.log("before call");
  const formData = new FormData();
  formData.append("model", "whisper-1");
  formData.append("file", new File([file], name));
  formData.append("response_format", "text");
  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    body: formData,
    headers: {
      Authorization: "Bearer " + Deno.env.get("OPENAPI_API_KEY")!,
    },
  });
  console.log("callChatGPTRes", res.status);
  const json = await res.json();
  console.log("callChatGPTRes", json);
  return json;
}

I think it may be related to sending the file itself in the wrong format.
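
One guess: new File([file], name) ends up with no extension unless name already carries one, and the error lists file extensions, so the endpoint seems to infer the format from the filename. Something like this might help (a sketch; the .mp3 fallback is an assumption, use whatever your audio actually is):

// Give the File a name with a supported extension so the format can be inferred.
const supported = /\.(m4a|mp3|webm|mp4|mpga|wav|mpeg)$/i;
const fileName = supported.test(name) ? name : name + ".mp3"; // assumed MP3 data
formData.append("file", new File([file], fileName));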


The reason my requests did not go through was the wrong encoding; I needed to decode from base64 first, so it was not related to the OpenAI API.

Could you share the code you used to do the decoding? Thanks!

So, I have a file that is base64-encoded and stored on disk in React Native + Expo.

import { Buffer, Blob } from 'buffer';
import * as FileSystem from 'expo-file-system';
const file = Buffer.from(
  await FileSystem.readAsStringAsync(uri, {
    encoding: FileSystem.EncodingType.Base64,
  }),
  'base64',
)

Essentially, this part:

const file = Buffer.from(
  "string_encoded_as_base_64",
  'base64',
)
// now it is a Buffer

const blob = new Blob([file])
// now it is a Blob
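
End to end, the flow looks roughly like this (a sketch; the file name and API-key handling are placeholders, and it assumes a WHATWG-style FormData; in bare React Native you may instead need the plain { uri, name, type } object approach that shows up later in this thread):

import { Buffer, Blob } from 'buffer';
import * as FileSystem from 'expo-file-system';

// Read the base64-encoded file from disk, decode it to binary,
// wrap it in a Blob, and send it as multipart form data.
async function transcribe(uri: string, apiKey: string) {
  const base64 = await FileSystem.readAsStringAsync(uri, {
    encoding: FileSystem.EncodingType.Base64,
  });
  const blob = new Blob([Buffer.from(base64, 'base64')]);

  const form = new FormData();
  form.append('model', 'whisper-1');
  form.append('file', blob, 'audio.mp3'); // the extension matters to the API
  form.append('response_format', 'text');

  const res = await fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'POST',
    headers: { Authorization: 'Bearer ' + apiKey }, // no hand-written Content-Type
    body: form,
  });
  return res.text();
}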


Thanks! Unfortunately I am still having trouble getting it to work. I am getting my audio data from MediaRecorder chunks and creating a blob like this:

const blob = new Blob(chunks, {
  type: "audio/mp3",
}); 

Then I add it to the form like this:

form.append("file", blob);

However, I still get the error: Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg'].

Any tips? Thanks!

edit: In the meantime I’ve tried a lot of things, such as passing mimeType: "audio/webm" to the MediaRecorder and/or the Blob, but no luck yet.
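
For reference, the recording side is a standard MediaRecorder setup, roughly (a sketch; error handling elided):

// Capture microphone audio into an array of Blob chunks.
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
const recorder = new MediaRecorder(stream);
const chunks: Blob[] = [];

recorder.ondataavailable = (e) => chunks.push(e.data);
recorder.onstop = () => {
  const blob = new Blob(chunks);
  // ...wrap in a File, append to the form, and POST as above
};

recorder.start();
// later, when done: recorder.stop();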

I finally got it, based on this; the main parts being:

const blob = new Blob(chunks);
const file = new File([blob], "input.wav", { type: "audio/wav" });
form.append("file", file);

I was going insane but this made it work, thanks!


Could you please provide more of your code? I have been at this for a few days now and am going crazy. I have something similar but am still getting the invalid file format error.


Sure, what context do you need exactly? What does your own code look like?

The “chunks” portion. I have a base64-encoded .wav file that works, but I’ve tried every which way to feed the blob the data and it always ends up corrupted. It would help a lot!

Here’s some demo code that I’m using for Node.js with the OpenAI library (version 3.2.1).

const transcription = await openai.createTranscription(
    fs.createReadStream(filePath),
    "whisper-1",
    undefined,       // prompt
    "verbose_json",  // response_format
    undefined,       // temperature
    undefined,       // language
    {
      maxBodyLength: Infinity, // axios option, allows large audio uploads
    }
  )

I’m using React Native; I’m new to it, but it’s a little more complicated than a backend because you can’t use the file system. I’ll give the library a try though!

I am using an HTTP action and here is my code, but I am getting an error.


Where file is the file content.

When using Python, you need to send the binary form of the audio file. I initially failed, but later succeeded after modifying the code as follows.

import os
import requests  # assumption: the files=/data= shapes below match the requests library

# i is the path to the audio file; parse_prompt and api_key are defined elsewhere.
with open(i, 'rb') as f:
    file_content = f.read()  # raw binary, not base64
    files = {
        'file': (os.path.basename(i), file_content),
    }
    data = {
        'model': 'whisper-1',
        'prompt': parse_prompt,
        'response_format': 'text',
    }
    response = requests.post(
        'https://api.openai.com/v1/audio/transcriptions',
        headers={'Authorization': 'Bearer ' + api_key},
        files=files,
        data=data,
    )

hello
I’m trying to call /v1/audio/transcriptions by sending an audio.wav file through a multipart request in Python, but every time I get the error “Could not parse multipart form”.
This is what I do:

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
    'Authorization': 'Bearer ' + ai_key,
    'Content-Type': f'multipart/form-data; boundary={boundary}',
    # 'Content-Length': str(len(body))
}

fields = [
    ('model', 'whisper-1'),
    ('response_format', 'text'),
]

# WAV audio file
audio_file = 'audio.wav'
with open(audio_file, 'rb') as file:
    audio_data = file.read()

# Payload
boundary = '----boundary----'
body = b''
for field, value in fields:
    body += f'--{boundary}\r\n'.encode()
    body += f'Content-Disposition: form-data; name="{field}"\r\n\r\n{value}\r\n'.encode()

# Add the audio file to the request body
print(body)

mimetype = mimetypes.guess_type(audio_file)[0] or 'application/octet-stream'
body += f'--{boundary}\r\n'.encode()
body += f'Content-Disposition: form-data; name="audio"; filename="{audio_file}"\r\n'.encode()
body += f'Content-Disposition: form-data; name="file"; filename="{audio_file}"\r\n'.encode()
body += f'Content-Type: {mimetype}\r\n\r\n'.encode()
body += audio_data + b'\r\n'

# Create an HTTP connection through the proxy
conn = http.client.HTTPSConnection(proxy_host, proxy_port)

thanks
regards
diego

@mykyta.chernenko @sarahkatewoessner Can one of you maybe help me see what I’m missing here?

            const { uri } = await FileSystem.downloadAsync(
                'https://firebasestorage.googleapis.com/v0/b/<yatta-yatta-yatta>',
                FileSystem.documentDirectory + 'test.m4a'
            );
            const base64String = await FileSystem.readAsStringAsync(uri, { encoding: FileSystem.EncodingType.Base64 });
            const buffer = Buffer.from(base64String, 'base64');
            const blob = new Blob([buffer], { type: 'audio/m4a' });
            const file = new File([blob], 'test.m4a', { type: 'audio/m4a' });
            console.log(file);
            const formData = new FormData();
            formData.append('model', 'whisper-1');
            formData.append('file', file);
            formData.append('response_format', 'text');
            fetch('https://api.openai.com/v1/audio/transcriptions', {
                headers: {
                    Authorization: 'Bearer ' + process.env.OPENAI_API_KEY,
                    'Content-Type': 'multipart/form-data',
                },
                method: 'POST',
                body: formData,
            })
                .then((res) => res.text())
                .then(console.log)
                .catch(console.error);

Here’s what I see when I log “file” to the console:

And I’m getting the same error as you:

{
  "error": {
    "message": "Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']",
    "type": "invalid_request_error",
    "param": null,
    "code": null
  }
}

I don’t see an immediate issue, but I see a few differences from my code:

'Content-Type': 'multipart/form-data': I don’t have it.

const blob = new Blob([buffer], { type: 'audio/m4a' });
const file = new File([blob], 'test.m4a', { type: 'audio/m4a' });

I don’t specify the type.

Also, are you sure that the file you keep in FileSystem really is m4a, and that it is base64-encoded only once?

Thanks for getting back to me! Here’s what ended up working:

            const fileUri = uri.replace('file://', '');
            const file = { uri: fileUri, name: 'recording.m4a', type: 'audio/m4a' };
            const formData = new FormData();
            formData.append('model', 'whisper-1');
            formData.append('file', file as unknown as Blob);
            formData.append('response_format', 'text');
            const res = await fetch('https://api.openai.com/v1/audio/transcriptions', {
                headers: { Authorization: 'Bearer ' + Constants.expoConfig?.extra?.apiKey, 'Content-Type': 'multipart/form-data' },
                method: 'POST',
                body: formData,
            });
            const text = await res.text();

Seems like I didn’t need to do anything with buffers or blobs after all :man_shrugging: (presumably because React Native’s FormData accepts a plain { uri, name, type } object and uploads the file from disk itself).
