Whisper - Can we transcribe from URL and File upload?

Hey there!

I was previously using the Replicate API for OpenAI’s Whisper. It took a file URL as the parameter, rather than requiring the raw file to be sent directly over HTTP.

It would be great if we could get an option to provide either a file or a direct URL to a storage service like Google Bucket etc. Some people are using services which cannot save files locally.

For most applications (especially high-scale ones), it really isn’t practical to store files locally versus using S3/GCB. Right now, I have to act as a middleman between GCP and OpenAI, downloading and then uploading a file every time I need to perform a transcription.

OpenAI has made the best software of the century, and I love you guys. Thank you for all the work you’ve already done.

Yes, also looking for ability to transcribe from file URL…

+1, it’ll be a massive help when working with web apps!

+1 I’m running into this exact issue where I have the URL and don’t want to download/upload the audio file for transcription. It’s going to force me to write and host a function.

Same here. It would be great if the file parameter also accepted a URL from cloud storage (AWS, for example), so we don’t have to upload twice: we could upload to the cloud once and manage everything from there. We also don’t get any response URL back to check the file.

Also looking for the same thing. Do we know if they added this functionality yet?

I am still looking for this feature in March 2024.

I’m here to dogpile this issue. Any traction on this would be nice.

+1 I’m facing the same issue too.
For now, I’ve been using this :frowning: workaround until the OpenAI team can provide a solution.

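import axios from 'axios';
import path from 'node:path';
// Note: libOpenAI is not defined in this post. It looks like the poster's own
// wrapper around the official `openai` package (an assumption).

// Download the audio into memory instead of saving it to disk.
const response = await axios({
    method: 'get',
    url: audioFileUrl,
    responseType: 'arraybuffer'
});
const audioBuffer = Buffer.from(response.data);
// path.extname returns only the extension (e.g. ".mp3"); it is used as the file name here.
const audioFormat = path.extname(audioFileUrl);
const audioFile = await libOpenAI.toFile(audioBuffer, audioFormat);
const transcription = await libOpenAI.audio.transcriptions(audioFile, prompt);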

What is this libOpenAI? What does the .toFile function return?

Hello everyone,

I had the same issue. The following approach works for me:

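import OpenAI from 'openai';
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// The fetch Response is passed directly as the file; nothing is written to disk.
const audio_file = await fetch(URL_TO_AUDIO_FILE);
const transcription = await openai.audio.transcriptions.create({
        file: audio_file,
        model: "whisper-1",
        response_format: "text"
});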

Best regards,
Elias

Where are you running this code? I am using the HTTP action in a Power Automate cloud flow.

I run this code in a Node.js 20 environment, hosted as a Google Cloud Function. You may want to check which runtime environment your Power Automate flow is built on. Since the OpenAI library seems to expect a ReadableStream object, you could also try replacing file: audio_file with file: audio_file.body (I haven’t tested this yet).
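
For runtimes where the openai package isn't available, the same download-then-forward flow can be sketched with only the web-standard fetch, FormData and Blob globals (Node.js 18+). This is a rough sketch, not an official client: the helper names (fileNameFromUrl, buildTranscriptionForm, transcribeFromUrl) and the 'audio.mp3' fallback are made up for illustration, and it assumes the API infers the audio format from the file name's extension.

```javascript
// Derive a file name (with extension) from the URL path, ignoring any query string.
function fileNameFromUrl(audioUrl) {
  const pathname = new URL(audioUrl).pathname;
  const name = pathname.slice(pathname.lastIndexOf('/') + 1);
  return name || 'audio.mp3'; // fallback name is an assumption
}

// Build the multipart body for POST /v1/audio/transcriptions.
function buildTranscriptionForm(audioBytes, fileName, model = 'whisper-1') {
  const form = new FormData();
  form.append('file', new Blob([audioBytes]), fileName);
  form.append('model', model);
  return form;
}

// Download the audio into memory, then forward it to the transcription endpoint.
async function transcribeFromUrl(audioUrl, apiKey) {
  const download = await fetch(audioUrl);
  if (!download.ok) throw new Error(`Download failed: ${download.status}`);
  const audioBytes = await download.arrayBuffer();

  const response = await fetch('https://api.openai.com/v1/audio/transcriptions', {
    method: 'POST',
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audioBytes, fileNameFromUrl(audioUrl)),
  });
  if (!response.ok) throw new Error(`Transcription failed: ${response.status}`);
  // The default JSON response carries the transcript in its `text` field.
  return (await response.json()).text;
}
```

The request is a plain multipart/form-data POST with a file part and a model field plus a Bearer token header, so the same shape should carry over to any HTTP client that can send multipart bodies. The file still passes through your own code; this only avoids touching the local disk.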

This worked perfectly for me:

import OpenAI, { toFile } from 'openai';
const openai = new OpenAI();

// Download the audio, then wrap the response as a named file for the upload.
const response = await fetch(URL_TO_AUDIO_FILE);
const transcription = await openai.audio.transcriptions.create({
        file: await toFile(response, 'File.ogg'),
        model: "whisper-1"
});
console.log(transcription.text);
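
The snippet above hard-codes 'File.ogg', which only fits Ogg audio. If the URLs can point at other formats, the name could be chosen from the download's Content-Type header instead. A minimal sketch, assuming the API goes by the file name's extension; uploadFileName and the MIME table are hypothetical and cover only a few common audio types:

```javascript
// Hypothetical lookup table: common audio MIME types and a matching extension.
const EXTENSION_BY_MIME = new Map([
  ['audio/mpeg', 'mp3'],
  ['audio/ogg', 'ogg'],
  ['audio/wav', 'wav'],
  ['audio/webm', 'webm'],
  ['audio/flac', 'flac'],
  ['audio/mp4', 'm4a'],
]);

// Pick an upload file name from the Content-Type header, falling back to the
// extension in the URL path. Returns null when neither gives an answer.
function uploadFileName(audioUrl, contentType) {
  // Strip parameters such as "; codecs=opus" before the lookup.
  const mime = (contentType || '').split(';')[0].trim().toLowerCase();
  const known = EXTENSION_BY_MIME.get(mime);
  if (known) return `audio.${known}`;

  const pathname = new URL(audioUrl).pathname;
  const dot = pathname.lastIndexOf('.');
  const slash = pathname.lastIndexOf('/');
  // Use the URL's extension only if the dot belongs to the last path segment.
  return dot > slash ? `audio${pathname.slice(dot).toLowerCase()}` : null;
}
```

With the code above, this could be used as toFile(response, uploadFileName(URL_TO_AUDIO_FILE, response.headers.get('content-type')) ?? 'File.ogg'), keeping the original name as a last resort.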

Hey, I want to upload regular PDF files. How can I apply this to a PDF file instead of an audio file?