Web Speech API with Whisper

I developed a system for correcting text using the Web Speech API.

It works like this:

In a browser, you click a button and start recording your voice using the Web Speech API. With each final capture, the text is sent over a WebSocket to OpenAI (GPT-4.1) for correction. It's simple, but it works for me. Roughly, the browser side looks like this (the WebSocket endpoint and button id are placeholders for whatever your app uses):
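```ts
// Minimal browser sketch: capture speech with the Web Speech API and
// forward each final result over a WebSocket for correction.
// "ws://localhost:8080/correct" and "#record" are placeholders.
const SpeechRecognitionCtor =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

const recognition = new SpeechRecognitionCtor();
recognition.continuous = true;      // keep listening across pauses
recognition.interimResults = true;  // emit partial hypotheses too

const socket = new WebSocket("ws://localhost:8080/correct");

recognition.onresult = (event: any) => {
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    if (result.isFinal) {
      // Only final captures are sent to the backend for GPT correction.
      socket.send(JSON.stringify({ text: result[0].transcript }));
    }
  }
};

document.querySelector("#record")!.addEventListener("click", () => {
  recognition.start();
});
```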

I want real-time correction. I think I can do it with Whisper.

Do you have any real-time options, like Whisper, to simplify this?
How do you implement this?

There's a bit of confusion in your explanation. (A note on terminology: the Web Speech API is the browser's built-in speech recognition, which is separate from OpenAI's models.) Let's see if I can give you ideas.

Whisper takes audio input and outputs a transcript: an unformatted stream of text without paragraph breaks. It does one thing only, converting audio to text.

So, Whisper is an AI model, just as gpt-4o-transcribe is an AI model, each dedicated to listening and writing down what was spoken.
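A one-shot transcription call with the OpenAI Node SDK looks roughly like this (the file name is a placeholder, and you can swap `gpt-4o-transcribe` in for `whisper-1`):

```ts
// Sketch of a basic transcription call: audio file in, plain
// unformatted text out.
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream("recording.m4a"), // placeholder file name
  model: "whisper-1", // or "gpt-4o-transcribe"
});

console.log(transcription.text); // one unformatted block of text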

A follow-up AI call that turns a transcript of jumbled spoken thoughts and filler words into something suitable for presentation is certainly a good add-on.

However, that task needs more context than the voice-silence-detected chunks you'd get from "realtime" (which is still turn-based). A "make this transcription pretty" task works best with the full contents of a transcript, so the model can place logical paragraph boundaries with a complete understanding of what has been said and what is yet to come.
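A sketch of that follow-up call, run once over the complete transcript (the model choice and prompt wording here are just illustrative):

```ts
// Sketch of the "make this transcription pretty" pass, applied to
// the full transcript rather than per-chunk.
import OpenAI from "openai";

const client = new OpenAI();

async function formatTranscript(rawTranscript: string): Promise<string> {
  const response = await client.chat.completions.create({
    model: "gpt-4.1",
    messages: [
      {
        role: "system",
        content:
          "Clean up this spoken transcript: remove filler words, fix " +
          "punctuation, and break it into logical paragraphs. Do not " +
          "change the meaning or add content.",
      },
      { role: "user", content: rawTranscript },
    ],
  });
  return response.choices[0].message.content ?? "";
}
```

Running this once at the end of a session, rather than on each chunk, gives the model the context it needs to place paragraph boundaries sensibly.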

So, improving smaller language snippets won't work as well. You can still get those smaller snippets by using streaming speech-to-text, routed through a backend proxy rather than a client that calls the API directly (you don't want your API key in the browser).

https://platform.openai.com/docs/guides/speech-to-text#streaming
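A rough server-side sketch of that streaming approach, based on the guide above (note that `stream: true` works with the gpt-4o-transcribe models, not `whisper-1`; the event names follow the current docs, and the file name is a placeholder):

```ts
// Sketch of streaming transcription on the backend. Deltas arrive
// incrementally; forward each one to the browser over your WebSocket.
import fs from "node:fs";
import OpenAI from "openai";

const client = new OpenAI();

const stream = await client.audio.transcriptions.create({
  file: fs.createReadStream("recording.m4a"), // placeholder file name
  model: "gpt-4o-mini-transcribe",
  response_format: "text",
  stream: true,
});

for await (const event of stream) {
  if (event.type === "transcript.text.delta") {
    process.stdout.write(event.delta); // partial text as it's recognized
  }
}
```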