We are on Tier 5 and using Whisper for transcription. A single short sentence ("My name is John") takes 3+ seconds to transcribe, and clips of 2-3 sentences take about 10 seconds. What is the expected latency for Whisper?
I haven’t timed it, but I’ve transcribed an hour of audio and it didn’t feel like it took more than 30 seconds.
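If you want actual numbers rather than a feeling, something like this rough sketch would time repeated calls and report min/median/max. Note that `fake_transcribe` is a hypothetical stand-in I made up so the snippet runs offline; you would swap in your own function that sends the audio to Whisper. The helper names `measure_latency` and `summarize` are mine too, not part of any SDK.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_latency(call: Callable[[], object], runs: int = 5) -> List[float]:
    """Return the wall-clock seconds taken by each of `runs` invocations of `call`."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings: List[float]) -> Dict[str, float]:
    """Reduce a list of timings to min, median, and max."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


def fake_transcribe() -> str:
    # Hypothetical placeholder: replace the body with your real
    # transcription request. The sleep only simulates some work.
    time.sleep(0.01)
    return "My name is John"


if __name__ == "__main__":
    print(summarize(measure_latency(fake_transcribe, runs=3)))
```

Timing the same clip several times helps separate a consistently slow request from a one-off spike such as a cold connection.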
The Realtime API should provide lower latency.