WebRTC full manual control use case

dested · December 19, 2024, 1:56am

I am trying to use the new WebRTC APIs. I do not want the server to respond on its own; instead, I want to get a transcript of what the user said, process it myself, and then tell the server what to say back to the user. Is this currently possible? I have tried:

          turn_detection: {
            type: 'server_vad',
            threshold: 0.5,
            prefix_padding_ms: 300,
            silence_duration_ms: 500,
            create_response: false,
          }

But it has no effect that I can see, and the ‘session.created’ data payload I get back still shows create_response set to true.

If this use case is not possible, how would you handle this flow?
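
To make the intended flow concrete, here is a minimal sketch of it, not a confirmed solution. It assumes the settings are sent as a `session.update` event over the WebRTC data channel once `session.created` arrives, that input transcription is enabled so the server emits `conversation.item.input_audio_transcription.completed` events, and that a `response.create` event with `instructions` can be used to tell the model what to say. The helper names (`buildSessionUpdate`, `buildSpokenReply`, `handleServerEvent`) and the `decideReply` callback are hypothetical, and whether `create_response: false` is honored over WebRTC is exactly the open question above.

```typescript
// Sketch only: event shapes are assumptions based on the Realtime API event
// names; all helper names here are hypothetical.
type RealtimeEvent = { type: string; [key: string]: unknown };
type Send = (event: RealtimeEvent) => void;

// Keep server-side VAD for turn detection, but ask the server not to
// generate a response automatically when the user stops speaking.
function buildSessionUpdate(): RealtimeEvent {
  return {
    type: 'session.update',
    session: {
      // Assumed to be required in order to receive user transcripts.
      input_audio_transcription: { model: 'whisper-1' },
      turn_detection: {
        type: 'server_vad',
        threshold: 0.5,
        prefix_padding_ms: 300,
        silence_duration_ms: 500,
        create_response: false,
      },
    },
  };
}

// Ask the server to speak text that was produced by our own processing.
function buildSpokenReply(text: string): RealtimeEvent {
  return {
    type: 'response.create',
    response: {
      instructions: `Say exactly the following to the user: ${text}`,
    },
  };
}

// Route incoming server events: configure the session once it exists, and
// answer each completed user transcript with a manually created response.
function handleServerEvent(
  event: RealtimeEvent,
  send: Send,
  decideReply: (transcript: string) => string,
): void {
  if (event.type === 'session.created') {
    send(buildSessionUpdate());
    return;
  }
  if (event.type === 'conversation.item.input_audio_transcription.completed') {
    const transcript = String(event.transcript ?? '');
    send(buildSpokenReply(decideReply(transcript)));
  }
}
```

In a browser this would presumably be wired to the data channel, with incoming messages parsed by `JSON.parse` and passed to `handleServerEvent`, and `send` implemented as a call to the channel's `send` with `JSON.stringify(event)`. `decideReply` is synchronous here only to keep the sketch short; real processing would likely be asynchronous.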
