Realtime Speech-to-Speech: The WebSocket gets closed shortly after the agent speaks

I’m integrating OpenAI’s realtime speech-to-speech API into a FreeSWITCH module. The problem I’m having is that OpenAI closes the socket connection immediately after the agent begins speaking. Specifically, the WebSocket is closed by the far end (OpenAI’s side). The agent is in the middle of emitting “response.audio_transcript.delta” events when the socket is abruptly closed.

Oddly, this problem DOES NOT happen when I connect to Azure’s implementation of the realtime API.

I’ve looked through the docs but can’t find anything explaining why OpenAI would close the socket. Can someone from OpenAI please comment on what conditions cause the server to close the socket connection?

Thanks!

This was due to a bug in how libwebsocket handles bidirectional communication flow. For now, we are patching it in our Docker container.
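
For anyone debugging a similar far-end close, it may help to log the status code and reason carried in the WebSocket close frame before assuming the server is at fault. Per RFC 6455, the close frame payload (when present) starts with a 2-byte big-endian status code followed by an optional UTF-8 reason. Below is a minimal sketch of decoding that payload. The function names `parse_close_payload` and `describe_close` are hypothetical helpers for illustration, not part of libwebsocket, FreeSWITCH, or OpenAI’s API, and how you obtain the raw payload depends on your WebSocket library.

```python
import struct

# Subset of the standard close codes defined in RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "normal closure",
    1001: "going away",
    1002: "protocol error",
    1003: "unsupported data",
    1007: "invalid frame payload data",
    1008: "policy violation",
    1009: "message too big",
    1011: "internal server error",
}


def parse_close_payload(payload: bytes):
    """Return (code, reason) from a close frame payload (hypothetical helper).

    An empty payload is legal and means the peer sent no status code.
    """
    if len(payload) == 0:
        return None, ""
    if len(payload) == 1:
        # A close payload must be empty or at least 2 bytes long.
        raise ValueError("malformed close frame payload")
    (code,) = struct.unpack("!H", payload[:2])  # network byte order, unsigned 16-bit
    reason = payload[2:].decode("utf-8", errors="replace")
    return code, reason


def describe_close(payload: bytes) -> str:
    """Format a close payload as a log-friendly string (hypothetical helper)."""
    code, reason = parse_close_payload(payload)
    if code is None:
        return "no status code in close frame"
    label = CLOSE_CODES.get(code, "unregistered or application-specific code")
    return f"{code} ({label}): {reason!r}"
```

A code such as 1002 or 1007 would suggest the server rejected something the client sent, which points toward the client-side library rather than the service.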

Welcome to the community.

Thanks for coming back to let us know your issue.

I’m going to move this to API as it sounds like it was a bug with libwebsocket?

Again, thanks for coming back to let us know.

And hope you stick around. We’ve got a wealth of information here if you search around a bit.

Hello, I am currently working on AI-driven outbound calling and plan to integrate Azure’s realtime API into FreeSWITCH. I would like to know how you integrated it. Did you modify the underlying FreeSWITCH modules? Thank you for your guidance.

Are you able to share more about your patch? I’m running into the same issue myself.

Please share how we can patch libwebsocket, or what the root cause is, while we wait for a new release. Thank you so much!

This should now be fixed, thanks for taking the time to flag!
