Extract voice from a MP4 to output a new voice-over file in another language

I was planning to use whisper to generate a transcript of the audio from a video, then translate it into another language and use TTSopenAI to generate the voice-over of the same video in another language.
But is there a way to output an MP3 from whisper as an alternative workflow for this purpose?

1 Like