Creepy bug of Realtime API + Function Calling: Extra Audio Not in Transcription

@Foxalabs, I’m also working with the Realtime API manually over a websocket. See the full log of the websocket sessions that I posted in the original message above.

I’m asking the Realtime model to do both: respond to the user and also call the function set_emotion. Sometimes it works as expected, but sometimes this crazy bug appears. My guess is that the function call incorrectly showing up in the response text is what causes the audio to go crazy.
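A rough way to check this guess against a websocket log is to rebuild the text of each response from its delta events and look for the function name inside it. This is only a sketch: the helper `find_leaked_function_calls` is hypothetical, and the event type names and the `response_id` / `delta` fields are assumptions that should be matched against what actually appears in the log.

```python
import json

# Function name taken from the post.
FUNCTION_NAME = "set_emotion"

# ASSUMPTION: event types that carry spoken/text output deltas.
# Adjust these to the event names seen in your own websocket log.
TRANSCRIPT_DELTA_TYPES = {
    "response.audio_transcript.delta",
    "response.text.delta",
}


def find_leaked_function_calls(raw_messages, function_name=FUNCTION_NAME):
    """Return the ids of responses whose text output mentions the function name.

    raw_messages: iterable of JSON strings as received from the websocket.
    Deltas are concatenated per response first, so a name split across
    two deltas is still detected.
    """
    text_by_response = {}
    for raw in raw_messages:
        event = json.loads(raw)
        if event.get("type") not in TRANSCRIPT_DELTA_TYPES:
            continue
        # ASSUMPTION: each delta event has "response_id" and "delta" fields.
        response_id = event.get("response_id", "unknown")
        text_by_response[response_id] = (
            text_by_response.get(response_id, "") + event.get("delta", "")
        )
    return [
        response_id
        for response_id, text in text_by_response.items()
        if function_name in text
    ]
```

If the returned list lines up with the responses that had extra audio, that would support the idea that the leaked function call text is being spoken; if not, the cause is probably elsewhere.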
