Whisper API Latency is just too high!

Hi guys!

I'd like to know if there's any way to reduce the latency of Whisper API responses. Right now, transcribing 20 seconds of speech takes about 5 seconds, which is crazy high. Is there any way to get it down to 2-3 seconds at least? Can we expect OpenAI to improve latency over time?

Most applications of STT need to be close to real-time, so that would be highly appreciated!

Whisper isn’t architected in a way suited to real-time transcription. To achieve what you want, you will have to break the audio up into small chunks and transcribe each chunk as its own request.
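
To make the chunking idea concrete, here is a rough sketch, not a tested production setup. It assumes the audio is already decoded into a flat sequence of PCM samples, and `transcribe_chunk` is a hypothetical placeholder for whatever function you use to send one piece of audio to Whisper and get text back. The 5-second chunk length is just an example value.

```python
from typing import Callable, Iterator, Sequence


def split_into_chunks(
    samples: Sequence[int], sample_rate: int, chunk_seconds: float = 5.0
) -> Iterator[Sequence[int]]:
    """Yield consecutive fixed-length slices of the audio samples."""
    chunk_len = int(sample_rate * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("chunk must contain at least one sample")
    for start in range(0, len(samples), chunk_len):
        # The last slice may be shorter than chunk_len.
        yield samples[start:start + chunk_len]


def transcribe_in_chunks(
    samples: Sequence[int],
    sample_rate: int,
    transcribe_chunk: Callable[[Sequence[int]], str],  # hypothetical: your Whisper call
    chunk_seconds: float = 5.0,
) -> Iterator[str]:
    """Transcribe chunk by chunk, yielding text as soon as each chunk is done."""
    for chunk in split_into_chunks(samples, sample_rate, chunk_seconds):
        yield transcribe_chunk(chunk)
```

Because the text is yielded per chunk, you can show the first words to the user after the first chunk returns instead of waiting for the whole recording. One caveat: cutting at fixed positions can split a word in half, so in practice you may want to cut on pauses or overlap the chunks slightly.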
