How to send file to Whisper API when you can't save files locally

Hello. I am developing a web application (through bubble.io) and I want to make an API request to transcribe some audio. As you might imagine, the files can’t be stored locally. They are stored in Amazon AWS.

So here’s my question: how do I send them to the API?
The curl request is:

curl https://api.openai.com/v1/audio/transcriptions \
  -X POST \
  -H 'Authorization: Bearer TOKEN' \
  -H 'Content-Type: multipart/form-data' \
  -F file=@/path/to/file/audio.mp3 \
  -F model=whisper-1

I have them stored in Amazon AWS and I have the URL that is something like: https://s3.amazonaws.com/app/xxxxxxxx/TESTAUDIO.mp3

Thanks

I think the API expects the raw file bytes to be sent. I don't know of a way to pass it just a URL, but I'm interested if anyone has found a workaround. In my case, I download the file from S3 and send the bytes to the API. Works great!


Hey @logankilpatrick is there any solution?

@curt.kennedy do you have an example where you are sending the bytes to the API? I’m trying to do this with Node. It works fine if I save the audio to disk first and then send the file, but if I just try to send the data from the buffer, it doesn't work. I’d like to do the following, but no dice.

// Get the mp3 file buffer from req.file
var audioBuffer = req.file.buffer;

// Create a form data instance
const form = new FormData();

// Append the mp3 file buffer and other parameters to the form data
form.append('file', audioBuffer);
form.append("model", "whisper-1");

// Set the headers for the axios request
const headers = {
    Authorization: "Bearer sk-xxx",
    ...form.getHeaders(),
};

// Make a POST request to the API endpoint
axios
    .post("https://api.openai.com/v1/audio/transcriptions", form, { headers })
    .then((r) => {
        // Send back a JSON response with the transcription
        res.json(r.data.text);
        console.log(r);
    })
    .catch((error) => {
        // Handle error response
        console.error(error);
    });

Any help would be great. Thanks!

@danvonfeldt

Here is more context on doing this in Python from S3:

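import os
import time

import requests

# Assumption: the API key is read from the environment; Uri (below) is defined elsewhere
headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}

NANOwav = time.time_ns()  # nanosecond timestamp, generated to avoid clobber
fname_local = f"/tmp/{NANOwav}.wav"  # unique local name per request

# Uri is the S3 location of the file; download it to local disk in chunks
with requests.get(Uri, stream=True) as r:
    r.raise_for_status()
    with open(fname_local, "wb") as binary_file:
        for chunk in r.iter_content(chunk_size=8192):
            binary_file.write(chunk)

# Re-open for reading only after the write handle is closed (and flushed), then upload
with open(fname_local, "rb") as file_data:
    files = {"file": file_data, "model": (None, "whisper-1")}
    response = requests.post("https://api.openai.com/v1/audio/transcriptions", headers=headers, files=files)

WhisperResponse = response.json()
VoiceTranscript = WhisperResponse.get("text", "")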

I probably could have avoided the write-to-disk operation by using a BytesIO object, but didn’t have time to code that version.
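For anyone who wants to try the BytesIO route, here is a minimal sketch of what it might look like. It is untested against the live API, so treat it as a starting point rather than a confirmed solution. The helper names (`build_whisper_files`, `transcribe_from_url`), the default filename `audio.mp3`, the `audio/mpeg` content type, and the timeouts are all my own assumptions, not something from the posts above. The one detail that seems to matter is passing an explicit filename: a BytesIO object has no name, so without it the multipart part goes out with no filename at all, which may also be what trips up the Node buffer version.

```python
import io

import requests

WHISPER_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_whisper_files(audio_bytes, filename="audio.mp3", model="whisper-1"):
    """Build the multipart fields for requests.post(files=...) from in-memory bytes.

    The filename and content type are assumptions; adjust them to the real format.
    """
    buffer = io.BytesIO(audio_bytes)
    return {
        # (filename, file-like object, content type): the filename is given
        # explicitly because BytesIO has no .name for requests to pick up.
        "file": (filename, buffer, "audio/mpeg"),
        # (None, value) makes requests send a plain form field, not a file.
        "model": (None, model),
    }


def transcribe_from_url(audio_url, api_key):
    """Download audio into memory and post it to the transcription endpoint."""
    download = requests.get(audio_url, timeout=60)
    download.raise_for_status()

    response = requests.post(
        WHISPER_URL,
        headers={"Authorization": f"Bearer {api_key}"},
        files=build_whisper_files(download.content),
        timeout=300,
    )
    response.raise_for_status()
    return response.json().get("text", "")
```

Calling `transcribe_from_url(Uri, api_key)` with the S3 URL should return the transcript text without touching the filesystem. The whole file is held in memory, so this only makes sense for files small enough to fit comfortably in RAM.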

Thank you @curt.kennedy, that is helpful. I am hosting with Vercel, whose filesystem is read-only, but it does allow storing some data in the /tmp dir, so I took advantage of that and things are working OK for me. I'm still writing to disk, but I delete the file right away. Thanks for your response.


This works for me in PHP

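// $OPENAI_API_KEY is assumed to be defined elsewhere
$token = 'Bearer ' . $OPENAI_API_KEY;

// Set the endpoint URL and model name
$url = 'https://api.openai.com/v1/audio/transcriptions';
$model_name = 'whisper-1';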

// Create a new cURL resource
$ch = curl_init();

// Set the cURL options
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
  'Authorization: '.$token,
  'Content-Type: multipart/form-data'
));

$file_path = '1.mp3';
$fileContent = file_get_contents($file_path);
// Pass the bytes to CURLFile as a base64 data: URI instead of a path on disk
$r = new \CURLFile('data:audio/mp3;base64,' . base64_encode($fileContent), 'audio/mp3', $file_path);

curl_setopt($ch, CURLOPT_POSTFIELDS, array(
  'name' => basename($file_path),
  'file' => $r,
  'response_format' => 'json',
  'prompt' => 'transcribe this Chapter',
  'language' => 'de',
  'model' => $model_name,
));

curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);

// Execute the cURL request
$response = curl_exec($ch);

// Close the cURL resource
curl_close($ch);

// Handle the response
echo ($response);

In bubble.io, just check the sendfile option on the Whisper API param definition, and then you will be able to send mp3 files using just the URL. So it works even if they are stored on your own S3.
