I am working on building a transcription script that takes in live audio from my microphone and transcribes it into text. I am experimenting with server_vad and was wondering how I could get it to transcribe while I speak rather than only after I pause. I tried shortening some of the durations, but that seemed to hurt accuracy quite a bit. Would using Deepgram be better?
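Roughly the kind of turn-detection config I am adjusting (a minimal sketch assuming the OpenAI Realtime API's server_vad settings; the parameter names threshold, prefix_padding_ms and silence_duration_ms are from that API's docs as I understand them, so please correct me if they are off):

```python
import json

# Hedged sketch: server_vad tuning for the OpenAI Realtime API.
# Verify field names against the current API reference before relying on them.
session_update = {
    "type": "session.update",
    "session": {
        "turn_detection": {
            "type": "server_vad",
            "threshold": 0.5,            # speech-probability threshold for the VAD
            "prefix_padding_ms": 300,    # audio kept from just before detected speech
            "silence_duration_ms": 500,  # pause length that ends a turn; lowering this
                                         # returns transcripts sooner but tends to clip
                                         # words mid-utterance (the accuracy drop I saw)
        }
    },
}
# ws.send(json.dumps(session_update))  # sent over the Realtime websocket connection
```

From what I can tell, shortening silence_duration_ms only makes the turn commit sooner; genuinely word-by-word output seems to need a provider that streams interim/partial results (Deepgram's live API, for example, sends partials marked is_final: false while you are still talking), which is why I am asking about it.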
Maybe you can put something between the audio stream and the model?
I am working on logic that predicts "what the person is most likely going to say" - so while you are still speaking it can start constructing an answer, and only if the score for an expected answer is high do I start streaming it. That is obviously more expensive, but caching helps there too.
That might be a little different, and it takes a lot of effort, but it also allows for special things like analysing for prompt injection (in that case the audio stream can even do a barge-in on the caller - lol - like streaming a "booooring" file or a "stop that, that makes no sense"), or playing with a smalltalk score when the bot has predefined goals to fulfill - e.g. allow a little smalltalk, let the model create ONE poem about strawberries, but fight abuse of a hotline - etc.
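Very roughly, that predict-and-score idea as a toy Python sketch. Everything here is made up for illustration (EXPECTED, predict_and_score, on_partial_transcript are placeholders); a real scorer would be a cheap LLM or classifier call, not string matching:

```python
"""Toy sketch: while partial transcripts stream in, predict the rest of the
utterance, score the prediction, and only start building/streaming an answer
once the score is high enough. Nothing here is a real API."""
from functools import lru_cache

CONFIDENCE_THRESHOLD = 0.85  # arbitrary example cut-off

# Toy "expected utterances" a hotline bot might prepare answers for.
EXPECTED = {
    "i forgot my password": "No problem, I can send you a reset link.",
    "write me a poem about strawberries": "Here is one short strawberry poem...",
}

@lru_cache(maxsize=256)  # "caching is a thing too"
def predict_and_score(partial: str) -> tuple[str, float]:
    """Toy scorer: fraction of an expected utterance already spoken.
    In a real system this would be a cheap LLM / classifier call."""
    partial = partial.lower().strip()
    best, best_score = "", 0.0
    for utterance, answer in EXPECTED.items():
        if partial and utterance.startswith(partial):
            score = len(partial) / len(utterance)
            if score > best_score:
                best, best_score = answer, score
    return best, best_score

def on_partial_transcript(partial: str) -> None:
    answer, score = predict_and_score(partial)
    if score >= CONFIDENCE_THRESHOLD:
        # Confident enough: start streaming the prepared answer
        # before the caller has finished speaking.
        print(f"[speculative answer, score={score:.2f}] {answer}")
    else:
        # Keep listening; a prompt-injection check or a barge-in
        # ("stop that, that makes no sense") could hook in here too.
        print(f"[still listening, score={score:.2f}]")

if __name__ == "__main__":
    for chunk in ["i forgot", "i forgot my pass", "i forgot my password"]:
        on_partial_transcript(chunk)
```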
There is quite a lot of stuff you can do besides "just" using a model wrapper.
I am also exploring whether there is another approach to AI, one where I use resonance instead of similarity. I think that is how the human brain does prediction: if an incoming signal has the same phase, amplitude and rhythm as something the system already knows, there is no need to create a prediction - we most likely already have one.
Kind of like the job of the amygdala.
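To make the resonance-vs-similarity point a bit more concrete, here is a toy numpy sketch (purely my own illustration, nothing I have actually built): a signal with the same rhythm and amplitude as a stored template still scores high even when you catch it at a different point in its cycle, while plain cosine similarity on the raw samples drops to zero.

```python
import numpy as np

def resonance_score(incoming: np.ndarray, template: np.ndarray) -> float:
    """Magnitude of the normalized cross-spectrum: close to 1 when the signals
    share the same dominant frequency and relative amplitude. For a single
    rhythm this stays high even if the phase is shifted."""
    inc_f = np.fft.rfft(incoming)
    tpl_f = np.fft.rfft(template)
    cross = np.abs(np.sum(inc_f * np.conj(tpl_f)))
    norm = np.linalg.norm(inc_f) * np.linalg.norm(tpl_f)
    return float(cross / norm) if norm else 0.0

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Plain sample-by-sample similarity of the raw signals."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

if __name__ == "__main__":
    t = np.linspace(0, 1, 1000, endpoint=False)
    known = np.sin(2 * np.pi * 5 * t)                # stored "known" rhythm, 5 Hz
    in_phase = np.sin(2 * np.pi * 5 * t)             # same phase, amplitude, rhythm
    shifted = np.sin(2 * np.pi * 5 * t + np.pi / 2)  # same rhythm, phase-shifted
    print("resonance, in phase:", round(resonance_score(in_phase, known), 3))
    print("resonance, shifted :", round(resonance_score(shifted, known), 3))
    print("cosine,    in phase:", round(cosine_similarity(in_phase, known), 3))
    print("cosine,    shifted :", round(cosine_similarity(shifted, known), 3))
```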