New streaming speech to text model open sourced

I wanted to share that I’ve just open sourced a family of streaming speech-to-text models and a library to run them. Our largest model has 245 million parameters and achieves a 6.65% word error rate on the Hugging Face Open ASR Leaderboard, compared to 7.44% for Whisper Large v3, which has 1.5 billion parameters. Here’s some more information, and I’d love to hear whether there’s interest in this direction from you all.

I can’t include links unfortunately, probably for good spam protection reasons, but if you go to my blog at petewarden dot com, you can check out the top post. 🙂

Thanks,

Pete
