Hello everyone, I currently want to use Whisper to transcribe speech in videos, but I’ve encountered a few issues.
I am a Plus user. I split a video into one-minute files and then batch-processed them through the paid API using the code below. However, the code uses model="whisper-1". How can I modify it to use the latest Whisper v3?
Why are some instances of Whisper shared for free on the internet, but I have to pay for using Whisper through the API?
Does the current version of Whisper still have a limitation where each analysis can only process files up to 25MB?
If my 40-minute speech file is over 500MB, does that mean I have to split and process it in batches?
I used to be able to split the file into batches for processing, but there were issues with integrating the timestamps in the batched SRT file. How can I address the timestamp integration issue?
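On the timestamp question: one common approach is to shift every cue in chunk i by that chunk's start offset and renumber the cues before joining the files. Here is a minimal sketch, assuming the chunks are in order and each one is exactly chunk_seconds long (if your cuts are not exact, you would pass the real offsets instead). The function names are just for illustration.

```python
import re

# Matches an SRT timestamp such as 00:01:02,345
_TS = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def _shift(match, offset_ms):
    """Return the matched timestamp moved forward by offset_ms milliseconds."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def merge_srt_chunks(chunks, chunk_seconds=60):
    """Merge per-chunk SRT strings into one SRT string.

    Assumption: chunk i starts exactly i * chunk_seconds into the original file.
    """
    merged = []
    counter = 1
    for i, srt_text in enumerate(chunks):
        offset_ms = int(i * chunk_seconds * 1000)
        # Cues are separated by blank lines.
        for block in re.split(r"\n\s*\n", srt_text.strip()):
            lines = block.strip().splitlines()
            if len(lines) < 2:
                continue  # empty chunk or malformed cue
            # lines[0] is the old cue number, lines[1] is the timing line.
            timing = _TS.sub(lambda mt: _shift(mt, offset_ms), lines[1])
            merged.append("\n".join([str(counter), timing, *lines[2:]]))
            counter += 1
    return "\n\n".join(merged) + "\n"
```

You would read each per-minute .srt file into a string, pass the list to merge_srt_chunks, and write the result out as a single file.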
At present, the transcription/translation API only offers whisper-1. However, Whisper v3 is free to use as a Python module. Check the Whisper GitHub page for how to install it on your computer.
Thank you for your reply. So, are you saying that I can simply open the terminal and install it directly? I’m using Windows 11 with Python 3.10.6, but I don’t have a dedicated GPU, only a CPU.
PS C:\User\XXX> python --version
Then, can I directly install the program you provided?
You know how MP3 can take a CD and make it 1/10th the size? That’s 25-year-old technology now. Opus, which I gave an example of, has encoding modes optimized for voice, and by limiting the input to just phone-call quality, where voice audio lives, you can even improve the transcription.
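To put rough numbers on that, here is a small sketch that builds an ffmpeg command for a mono, voice-tuned Opus file and estimates the resulting size. It assumes ffmpeg with libopus is installed and on your PATH; the 16 kbps bitrate and the file names are illustrative choices, not recommendations from this thread.

```python
import subprocess


def opus_command(src, dst, bitrate_kbps=16):
    """Build an ffmpeg command that keeps only a mono, voice-tuned Opus track."""
    return [
        "ffmpeg", "-i", src,
        "-vn",                      # drop the video stream
        "-ac", "1",                 # downmix to mono
        "-c:a", "libopus",          # encode with the Opus codec
        "-b:a", f"{bitrate_kbps}k", # target audio bitrate
        "-application", "voip",     # libopus tuning for speech
        dst,
    ]


def estimated_mb(minutes, bitrate_kbps=16):
    """Approximate audio size in MB, ignoring container overhead."""
    return minutes * 60 * bitrate_kbps * 1000 / 8 / 1_000_000


# Example (hypothetical file names); uncomment to actually run ffmpeg:
# subprocess.run(opus_command("talk.mp4", "talk.ogg"), check=True)
```

By this estimate, 40 minutes at 16 kbps comes to roughly 4.8 MB, which is well under a 25MB limit, so a file like that would not need splitting just because of its size.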
Given how many questions you have, and that you are still building a foundation, the OpenAI services would be a good place to start, although it doesn’t timestamp or make subtitle files.
Yes, you can install it from the terminal. I have a very old Mac, and it can translate/transcribe audio files, but of course it takes a very long time and I cannot do near-real-time transcription. But it’s free, so I cannot complain lol.
You may need to update your Python version, as noted on the GitHub page. Then you can install it directly. Refer to the GitHub page for the complete installation procedure, including what to do if you hit a snag.
Yes, you can process more than one file at once. You are already on Whisper v3, so just specifying “large” is enough. But as I mentioned previously, if your system is not very powerful, maybe start with “tiny” first.
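As a rough illustration of processing several files with the Python module, here is a minimal sketch. It assumes the openai-whisper package and ffmpeg are installed; transcribe_files and the SRT helpers are hypothetical names written for this example, not part of the library. fp16=False is passed because half precision is not used on CPU.

```python
def srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def segments_to_srt(segments):
    """Turn Whisper-style segments (dicts with start, end, text) into SRT text."""
    cues = []
    for i, seg in enumerate(segments, start=1):
        timing = f"{srt_time(seg['start'])} --> {srt_time(seg['end'])}"
        cues.append(f"{i}\n{timing}\n{seg['text'].strip()}")
    return "\n\n".join(cues) + "\n"


def transcribe_files(paths, model_name="tiny"):
    """Transcribe each file locally and return {path: srt_text}."""
    import whisper  # imported here so the helpers above work without it

    model = whisper.load_model(model_name)  # e.g. "tiny" or "large"
    results = {}
    for path in paths:
        result = model.transcribe(path, fp16=False)  # CPU-friendly setting
        results[path] = segments_to_srt(result["segments"])
    return results
```

Calling transcribe_files(["part1.mp3", "part2.mp3"], model_name="tiny") would load the model once and reuse it for every file, which matters on a CPU-only machine because loading the larger models is slow.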
If you want to use this inside a Node.js application, just use exec: