Realtime API (Advanced Voice Mode) Python Implementation

Honestly, you are correct. I want to try using Python to handle the backend processing too, but in the meantime, could one of these be used for VAD here: webrtcvad or gTTS?
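
For context, a minimal sketch of what the webrtcvad option might look like on the Python side. Of the two libraries, only webrtcvad does voice activity detection; gTTS is a text-to-speech library, so it would not fill that role. The sketch assumes the audio is already 16-bit mono PCM at a sample rate webrtcvad accepts (8000, 16000, 32000, or 48000 Hz). The helper names `split_frames` and `speech_flags` are hypothetical and are not part of webrtcvad or the Realtime API.

```python
from typing import Iterator, List

FRAME_MS = 30          # webrtcvad accepts only 10, 20, or 30 ms frames
BYTES_PER_SAMPLE = 2   # assumption: 16-bit mono PCM input


def split_frames(pcm: bytes, sample_rate: int, frame_ms: int = FRAME_MS) -> Iterator[bytes]:
    """Yield fixed-size frames; a trailing partial frame is dropped."""
    frame_bytes = sample_rate * frame_ms // 1000 * BYTES_PER_SAMPLE
    for start in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield pcm[start:start + frame_bytes]


def speech_flags(pcm: bytes, sample_rate: int = 16000, vad=None) -> List[bool]:
    """Return one True/False speech flag per frame (hypothetical helper)."""
    if vad is None:
        import webrtcvad  # third-party: pip install webrtcvad
        vad = webrtcvad.Vad(2)  # aggressiveness from 0 (least) to 3 (most)
    return [vad.is_speech(frame, sample_rate) for frame in split_frames(pcm, sample_rate)]
```

The resulting flags could then drive a decision about when to send buffered audio to the backend, for example after a run of consecutive non-speech frames. The `vad` parameter is only there so the detector can be swapped out or stubbed in tests.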