Reduce Realtime API costs: handle long waits

ivan-luchkin-u · November 20, 2024, 1:43pm

I may be wrong, but this can be achieved with just asynchronous TTS and STT.
If you do want realtime, though, the most sensible approach is to use two separate realtime sessions.

The first one ends when the hold starts, and you save the context of that conversation somewhere in your system.

The second one starts when the hold ends, and you initialize it with the context saved from the first one.
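A rough sketch of that handoff, in case it helps. It only builds the JSON payload and does no networking. `SavedContext` and `build_resume_event` are names I made up for illustration, and I'm assuming the earlier transcript is passed to the new session through the `instructions` field of a `session.update` event. Replaying the turns as individual conversation items would be another option.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SavedContext:
    """Transcript turns captured from the first session (hypothetical structure)."""
    turns: list = field(default_factory=list)  # list of (role, text) tuples

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))


def build_resume_event(base_instructions: str, ctx: SavedContext) -> str:
    """Return a JSON `session.update` event that carries the pre-hold transcript.

    Assumption: the history is folded into the session instructions as plain text.
    """
    history = "\n".join(f"{role}: {text}" for role, text in ctx.turns)
    instructions = (
        f"{base_instructions}\n\n"
        "The call was on hold. Conversation before the hold:\n"
        f"{history}"
    )
    return json.dumps(
        {"type": "session.update", "session": {"instructions": instructions}}
    )


# While the first session is live, record each turn as it completes.
ctx = SavedContext()
ctx.add("assistant", "Hello, I'm calling about order 1234.")
ctx.add("user", "Please hold while I check.")

# When the hold ends, open the second session and send this as the first event.
resume_event = build_resume_event("You are a phone agent.", ctx)
```

For long calls you would probably want to summarize the transcript before the handoff rather than pass it verbatim, since every token in the instructions is billed again in the second session.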

However, you would also need a smaller AI model or a VAD (voice activity detection) system to detect when human speech resumes, so that you know when to initialize the second session.
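As a minimal illustration of the VAD side, here is an energy-based detector using only the standard library. It assumes the audio arrives as 16-bit mono PCM frames in native byte order. The threshold and frame count are arbitrary placeholders, and `SpeechStartDetector` is a made-up name. Note that a plain energy threshold cannot tell hold music from a human voice, so on a real call you would likely need a proper VAD model or a small classifier instead.

```python
import array
import math


def frame_rms(pcm16: bytes) -> float:
    """Root-mean-square amplitude of one frame of 16-bit PCM samples."""
    samples = array.array("h")
    samples.frombytes(pcm16)  # frame length must be a multiple of 2 bytes
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class SpeechStartDetector:
    """Reports speech once `min_frames` consecutive frames exceed `threshold`."""

    def __init__(self, threshold: float = 500.0, min_frames: int = 5):
        self.threshold = threshold
        self.min_frames = min_frames
        self._run = 0  # current streak of loud frames

    def feed(self, frame: bytes) -> bool:
        """Feed one frame; return True while the loud streak is long enough."""
        if frame_rms(frame) >= self.threshold:
            self._run += 1
        else:
            self._run = 0
        return self._run >= self.min_frames
```

You would call `feed()` on every incoming frame during the hold and open the second realtime session the first time it returns `True`. Requiring several consecutive loud frames avoids triggering on a single click or pop.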

There isn’t much in the way of alternatives because of the 15-minute idle limit. Maybe you could emulate activity by sending arbitrary events, but that’s questionable.
