@Foxalabs, I’m also working with the Realtime API manually over a WebSocket. See the full log of the WebSocket sessions that I posted in the original message above.
I’m asking the Realtime model to do both: respond to the user and also call the function set_emotion. Sometimes it works as expected, but sometimes this crazy bug appears. I suspect that the function call incorrectly appearing in the response text is what causes the audio to go crazy.
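
For anyone who wants to check their own logs for the same thing, here is a minimal sketch of how the leak could be flagged. It assumes the raw WebSocket messages are available as JSON strings and that the response text arrives in events of type `response.text.delta` or `response.audio_transcript.delta` with a `delta` field; those event names are my assumption and may differ between API versions. The helper `find_leaked_call` is hypothetical, not part of any SDK.

```python
import json

# Assumed event types that carry the spoken/written response text.
# Adjust these to match the API version you are using.
TEXT_DELTA_TYPES = {"response.text.delta", "response.audio_transcript.delta"}


def find_leaked_call(raw_events, function_name="set_emotion"):
    """Return the accumulated response text if it mentions function_name, else None.

    raw_events is an iterable of JSON strings as received from the WebSocket.
    """
    text = ""
    for raw in raw_events:
        event = json.loads(raw)
        # Only text/transcript deltas count; proper function call events are ignored.
        if event.get("type") in TEXT_DELTA_TYPES:
            text += event.get("delta", "")
    return text if function_name in text else None
```

Running this over a logged session should return `None` when the function call was delivered only as a proper function call event, and the offending text when the call leaked into the response itself.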