What counts as a request for the Realtime API? Is it the number of WebSocket connections, or the number of requests sent over those connections?
Behind the Realtime API there is still a turn-based model.
You can either rely on voice activity detection (VAD) to end a turn, or manually send a "create response" event as the trigger. Either way, this is the call that hands the accumulated context to the model so it can generate an output.
For both rate limits and billing, it is these recurring “create response” generations that are metered and measured, not the WebSocket connections themselves.
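To make the manual trigger and the "one generation per turn" idea concrete, here is a minimal sketch in Python. It only builds and inspects JSON messages, with no network code. The event names `response.create` (client trigger) and `response.done` (server signal that a generation finished) are taken from my reading of the Realtime API event reference, so check them against the current docs. The helper names `make_response_create` and `GenerationCounter` are hypothetical, made up for this illustration.

```python
import json


def make_response_create(instructions=None):
    """Build the client event that manually asks the model to generate a response.

    Assumption: the trigger event has type "response.create" and accepts
    optional per-response settings under a "response" key.
    """
    event = {"type": "response.create"}
    if instructions is not None:
        event["response"] = {"instructions": instructions}
    return json.dumps(event)


class GenerationCounter:
    """Counts finished generations seen on a single WebSocket connection.

    Hypothetical helper: it illustrates that one connection can carry many
    generations, and that the generations are the unit worth tracking.
    """

    def __init__(self):
        self.generations = 0

    def on_server_event(self, raw_message):
        event = json.loads(raw_message)
        # Assumption: the server sends "response.done" once per generation,
        # whether it was triggered by VAD or by a manual response.create.
        if event.get("type") == "response.done":
            self.generations += 1
        return self.generations


if __name__ == "__main__":
    print(make_response_create("Answer briefly."))

    counter = GenerationCounter()
    # Simulated server traffic on one connection: two completed generations.
    for message in (
        '{"type": "response.created"}',
        '{"type": "response.done"}',
        '{"type": "response.done"}',
    ):
        counter.on_server_event(message)
    print(counter.generations)
```

In a real client you would send the string from `make_response_create` over the open WebSocket and feed every incoming message to `on_server_event`. A count like this is only a local estimate of how many generations you have triggered; the authoritative numbers are whatever the API reports.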