I was previously using the Replicate API for OpenAI’s Whisper. It took a file URL as the parameter rather than requiring the raw file to be sent directly over HTTP.
It would be great if we could get an option to provide either a file or a direct URL to a storage service such as a Google Cloud Storage bucket. Some people are using services that cannot save files locally.
For most applications (especially high-scale ones), it really isn’t practical to store files locally rather than in S3 or Google Cloud Storage. Right now, I have to act as a middleman between GCP and OpenAI, downloading and then re-uploading a file every time I need to perform a transcription.
OpenAI has made the best software of the century, and I love you guys. Thank you for all the work you’ve already done.
+1 I’m running into this exact issue where I have the URL and don’t want to download/upload the audio file for transcription. It’s going to force me to write and host a function.
Same. It would be great if the file parameter also accepted a URL from cloud storage or AWS storage, so we don’t have to upload twice. We could just upload to the cloud once and manage everything from there. We also don’t get any URL back in the response to check the file.
I run this code in a Node.js 20 environment, hosted as a Google Cloud Function. However, you may want to check which runtime environment your PA app is built on. Since the OpenAI lib seems to expect a ReadableStream object, you could also try replacing file: audio_file with file: audio_file.body (I haven’t tested that yet).
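For anyone landing here, below is a minimal sketch of the download-then-forward workaround described in this thread, written against the public transcription endpoint with plain fetch instead of the OpenAI lib. It assumes Node.js 18+ (global fetch, FormData and Blob). The names transcribeFromUrl and fileNameFromUrl and the "audio.mp3" fallback are made up for illustration, and the whole file is buffered in memory, so treat it as a starting point rather than a finished implementation.

```typescript
// Sketch only: pull audio from a storage URL and forward it for transcription
// without touching local disk. Assumes Node.js 18+ (global fetch, FormData, Blob).

// Hypothetical helper: take the last path segment of the URL as the upload's
// file name so it keeps its extension; fall back to a default if there is none.
function fileNameFromUrl(audioUrl: string, fallback = "audio.mp3"): string {
  const last = new URL(audioUrl).pathname.split("/").pop() ?? "";
  return last.includes(".") ? decodeURIComponent(last) : fallback;
}

// Hypothetical helper: fetchImpl is injectable so the function can be exercised
// without network access; by default it uses the global fetch.
async function transcribeFromUrl(
  audioUrl: string,
  apiKey: string,
  fetchImpl: typeof fetch = fetch,
): Promise<string> {
  // Step 1: download the audio from the storage URL (e.g. a signed bucket URL).
  const download = await fetchImpl(audioUrl);
  if (!download.ok) {
    throw new Error(`Download failed with status ${download.status}`);
  }
  const audio = await download.blob(); // buffers the whole file in memory

  // Step 2: re-upload it as multipart form data.
  const form = new FormData();
  form.append("file", audio, fileNameFromUrl(audioUrl));
  form.append("model", "whisper-1");

  const res = await fetchImpl("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: form,
  });
  if (!res.ok) {
    throw new Error(`Transcription failed with status ${res.status}`);
  }
  const data = (await res.json()) as { text: string };
  return data.text;
}
```

Inside a Cloud Function handler this could be called as await transcribeFromUrl(signedUrl, process.env.OPENAI_API_KEY ?? ""), where signedUrl is whatever readable URL your storage service hands out. It still moves the bytes twice, so it only hides the middleman step rather than removing it.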