Has anyone experienced intermittent audio glitches when using gpt-4o-mini-realtime-preview with the Realtime WebSocket API for voice output? The model starts speaking normally, but during playback there are short bursts of distortion or interference, similar to listening over an unstable audio stream, even though the connection doesn't drop and the speech never actually stops.
Is this a known issue, a streaming configuration problem, or something related to audio buffering on the client side? Any suggestions on how to smooth out the audio would be greatly appreciated.