Hello guys, I want to create a platform to convert video to text, which APIs do I need to use?
Thank you in advance for your answers.
Welcome to the OpenAI Dev Community!
OpenAI actually has a cookbook entry (a well-written guide) on how to process videos and create a voiceover. You can find it here.
Some more good documentation for your desired use-case can be found in the API documentation, specifically for GPT-4V and Whisper.
The vision model can see the video, and you can use Whisper to transcribe any audio.
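To make the Whisper half of that concrete, here is a minimal sketch of one way to do it: pull the audio track out of the video, then send it to the transcription endpoint. It assumes `ffmpeg` is installed and on your PATH, the `openai` Python package (v1 or later) is installed, and `OPENAI_API_KEY` is set in your environment. The function names are just illustrative, not taken from the cookbook.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: str, audio_path: str) -> list[str]:
    """Build an ffmpeg command that extracts a mono 16 kHz audio track."""
    return [
        "ffmpeg",
        "-y",             # overwrite the output file if it already exists
        "-i", video_path,
        "-vn",            # drop the video stream, keep audio only
        "-ac", "1",       # downmix to mono to keep the file small
        "-ar", "16000",   # resample to 16 kHz
        audio_path,
    ]


def extract_audio(video_path: str, audio_path: str) -> str:
    """Run ffmpeg (assumed to be installed) and return the audio file path."""
    subprocess.run(build_ffmpeg_command(video_path, audio_path), check=True)
    return audio_path


def transcribe(audio_path: str, model: str = "whisper-1") -> str:
    """Send the audio file to the transcription endpoint and return the text."""
    # Imported here so the ffmpeg helpers work without the openai package.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(audio_path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


def video_to_text(video_path: str) -> str:
    """Extract audio next to the video file, then transcribe it."""
    audio_path = str(Path(video_path).with_suffix(".mp3"))
    extract_audio(video_path, audio_path)
    return transcribe(audio_path)
```

You would call it as `video_to_text("talk.mp4")`, where `talk.mp4` is a placeholder for your own file. The transcription endpoint caps the upload size, so for long videos you will probably need to split the audio into chunks and join the transcripts afterwards.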
Hi, we created Scribebuddy A.I. for Audio/Video Transcription using Whisper, and it’s working great. It’s free to use.
@vivekketha1234 - Here’s a quick tutorial on transcribing and indexing audio and video with an OSS library and Whisper, including storing and processing the data: Audio and Video Transcript Indexing