OpenAI Realtime API integrated with Plivo to connect phone calls with customers

Hey devs, I have been using the OpenAI Realtime API for the past 4 months. Initially I used Twilio to connect the Realtime API with users.

However, since our company is based in India and Twilio doesn't provide Indian phone numbers, I moved to another telephony provider, Plivo, for the same use case.

Now the problem I'm facing is that I'm using Plivo's G.711 µ-law (g711_ulaw) audio format with the OpenAI Realtime API, but the model is unable to understand the user's voice. It acts on its own, assuming what the user must be saying based on the use-case flow we provide in the instructions.
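
For context, this is roughly the session.update I send right after opening the Realtime websocket (a simplified Python sketch, not my exact code; the VAD numbers are just values I have been experimenting with, and the Plivo field names are from memory of their stream payload):

```python
import json
import os

import websockets  # pip install websockets

REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

async def open_realtime_session():
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    # extra_headers is the kwarg in websockets <= 13; newer releases call it additional_headers
    ws = await websockets.connect(REALTIME_URL, extra_headers=headers)

    # Match Plivo's 8 kHz G.711 u-law on both input and output, and let
    # server-side VAD decide when the caller has stopped talking.
    await ws.send(json.dumps({
        "type": "session.update",
        "session": {
            "modalities": ["text", "audio"],
            "voice": "alloy",
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
            "turn_detection": {
                "type": "server_vad",
                "threshold": 0.5,            # raise if line noise triggers false turns
                "prefix_padding_ms": 300,
                "silence_duration_ms": 500,  # how much silence ends the user's turn
            },
            "instructions": "You are a phone assistant.",  # placeholder for our real prompt
        },
    }))
    return ws

async def forward_plivo_media(openai_ws, plivo_msg: dict):
    # Plivo's media frames carry base64 u-law audio; double-check these
    # field names against your own stream handler.
    if plivo_msg.get("event") == "media":
        await openai_ws.send(json.dumps({
            "type": "input_audio_buffer.append",
            "audio": plivo_msg["media"]["payload"],
        }))
```
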
There is also a stream of silent audio coming from Plivo's end whenever the user isn't saying anything. If I stop forwarding those silent payloads to the Realtime API and only send the user's actual speech, the model understands the audio well. But in that case it can't tell when the user has stopped speaking and takes 3-4 extra seconds to respond.
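
If I keep filtering out the silence on my side, my understanding from the docs is that server VAD never hears the pause, so the turn has to be closed manually: disable turn detection in the session and send the commit myself when my own silence check fires. A sketch of that (the end_of_user_turn helper is hypothetical, not my real code):

```python
import json

async def end_of_user_turn(openai_ws):
    # Only valid when session.update was sent with "turn_detection": None,
    # otherwise server VAD commits the buffer on its own.
    await openai_ws.send(json.dumps({"type": "input_audio_buffer.commit"}))
    # Ask the model to respond to whatever audio has been committed so far.
    await openai_ws.send(json.dumps({"type": "response.create"}))
```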

Another problem I'm facing is that on every 4th or 5th call, the model stops sending audio or JSON back to me. I'm unable to understand why this is happening.
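
To narrow this down I'm planning to log every error event and any unexpected socket close on the OpenAI connection, roughly like this (again just a sketch; event names are from the Realtime API reference):

```python
import json

import websockets

async def drain_openai_events(openai_ws):
    # Surface anything that explains why audio/JSON stops arriving mid-call.
    try:
        async for raw in openai_ws:
            event = json.loads(raw)
            etype = event.get("type")
            if etype == "error":
                print("Realtime error event:", event.get("error"))
            elif etype == "response.done":
                # status may be "failed" or "cancelled" even when no audio arrived
                print("response.done, status:", event.get("response", {}).get("status"))
            elif etype == "response.audio.delta":
                pass  # this is where the base64 u-law chunk goes back to Plivo
    except websockets.ConnectionClosed as exc:
        print("OpenAI websocket closed:", exc)
```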

Can anybody help?

Have you tried using Deepgram? They work well for speech-to-text tasks and might be a good alternative. https://deepgram.com/

Sometimes I observe that behaviour too: if the audio is not clear, OpenAI Realtime responds with some information in line with the instructions we provided. Are you facing the issue every time, or is it intermittent?
Most of the time it's fine and able to understand in my case. I used Twilio to fetch the audio stream from a phone call, with the same mu-law audio format.