Transcribe via Whisper in real-time / live

I am aware that real-time transcription is currently not possible: you have to send a finished m4a, mp3, mp4, mpeg, mpga, wav, or webm file after the recording has completed. The transcription itself is fairly fast, but it is not live. Are there any plans to support live transcription?


I’ve heard of people sending small 10-30 s chunks and stitching the transcribed text together for a quasi-real-time system. The trick is handling word boundaries: a cut that lands in the middle of a word garbles it in both chunks, so you need something like pydub to chunk smartly (for example, by cutting at silences).
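To make the chunking idea concrete, here is a rough sketch of how the cut points might be chosen. It is only an illustration, not a tested pipeline: `plan_chunks` and its parameters are names I made up, and it assumes you already have a list of silent stretches as (start_ms, end_ms) pairs, which something like pydub's `detect_silence` could provide. Each chunk would then be exported and sent for transcription on its own, and the resulting texts joined in order.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start_ms, end_ms)


def plan_chunks(total_ms: int, silences: List[Span],
                min_ms: int = 10_000, max_ms: int = 30_000) -> List[Span]:
    """Split [0, total_ms) into chunks of roughly min_ms..max_ms.

    Cuts are placed in the middle of a silent stretch when one is
    available, so that no word is split across two chunks. If no
    silence falls in the allowed window, a hard cut is made at max_ms.
    (Hypothetical helper, written for illustration only.)
    """
    # Candidate cut points: the midpoint of every silent stretch.
    midpoints = sorted((start + end) // 2 for start, end in silences)

    chunks: List[Span] = []
    start = 0
    while total_ms - start > max_ms:
        lo, hi = start + min_ms, start + max_ms
        candidates = [m for m in midpoints if lo <= m <= hi]
        # Prefer the latest silence in the window (fewer, longer chunks);
        # otherwise fall back to a hard cut that may split a word.
        cut = candidates[-1] if candidates else hi
        chunks.append((start, cut))
        start = cut
    chunks.append((start, total_ms))  # whatever audio is left
    return chunks


if __name__ == "__main__":
    # Made-up silence positions for an 80-second recording.
    demo_silences = [(9_000, 9_500), (24_000, 25_000),
                     (41_000, 41_400), (70_000, 70_200)]
    for chunk_start, chunk_end in plan_chunks(80_000, demo_silences):
        print(f"chunk {chunk_start} ms .. {chunk_end} ms")
```

Picking the latest silence in the window keeps the number of requests down. Picking the earliest one instead would lower latency at the cost of more, shorter chunks.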

But no idea on a live streaming release date.


That’s an excellent idea, and it seems like a plausible solution. Thanks!


I had the same problem, so I’ve built a working proof of concept for real-time transcription, with a WebSocket server and a demo JS client. It’s on GitHub if you want to check it out: alesaccoia/VoiceStreamAI
