Hey devs, I've been using the OpenAI Realtime API for the past 4 months. Initially I used Twilio to connect the Realtime API with users.
However, our company is based in India and Twilio doesn't provide Indian phone numbers, so I moved to another telephony provider, Plivo, for the same use case.
The problem I'm facing now: I'm using Plivo's G.711 µ-law (g711_ulaw) audio format with the OpenAI Realtime API, but the model is unable to understand the user's voice. Instead, it acts on its own, assuming what the user must be saying based on the use-case flow we provide in the instructions.
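For context, this is roughly how I configure the session after the Realtime WebSocket connects (a minimal sketch in Python; `openai_ws` and the instructions string are placeholders from my side):

```python
import json

# Rough sketch of my session setup once the Realtime WebSocket is open.
# `openai_ws` is a placeholder for the already-connected websocket client.
async def configure_session(openai_ws, instructions: str):
    session_update = {
        "type": "session.update",
        "session": {
            # Plivo streams G.711 u-law audio, so both directions use g711_ulaw
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
            "turn_detection": {"type": "server_vad"},
            "instructions": instructions,
            "voice": "alloy",
        },
    }
    await openai_ws.send(json.dumps(session_update))
```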
Plivo also streams silent audio from its end whenever the user isn't saying anything. If I stop forwarding those silent payloads and only send the user's actual speech, the model understands well, but then it can't tell when the user has stopped speaking and takes 3-4 extra seconds to respond.
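To make the setup concrete, this is roughly what my forwarding loop looks like when I pass every Plivo media frame (silence included) straight through to the Realtime API. The Plivo field names here are from memory, so treat them as approximate:

```python
import json

# Sketch of the Plivo -> OpenAI forwarding loop. `plivo_ws` and `openai_ws`
# are placeholders for the two already-open websocket connections.
async def forward_caller_audio(plivo_ws, openai_ws):
    async for message in plivo_ws:
        data = json.loads(message)
        # Plivo's audio-stream messages carry base64-encoded u-law chunks;
        # the field names below are approximate.
        if data.get("event") == "media":
            audio_b64 = data["media"]["payload"]
            # Forwarding everything, silence included, so server_vad can
            # detect the end of speech -- this is the path where the model
            # starts misunderstanding the caller.
            await openai_ws.send(json.dumps({
                "type": "input_audio_buffer.append",
                "audio": audio_b64,
            }))
```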
Another problem: on roughly every 4th or 5th call, the model stops sending audio or JSON back to me, and I can't figure out why.
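In case it helps with debugging, I've started logging everything coming back from the Realtime socket, roughly like this (again just a sketch):

```python
import json

# Sketch of the listener on the OpenAI side, to surface errors when the
# model goes silent. `openai_ws` is the connected Realtime websocket.
async def log_realtime_events(openai_ws):
    try:
        async for message in openai_ws:
            event = json.loads(message)
            if event.get("type") == "error":
                # The Realtime API reports problems as `error` events.
                print("Realtime error event:", event)
    except Exception as exc:
        # A silent disconnect would surface here instead of as an event.
        print("Realtime socket closed:", exc)
```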
Can anybody help with this?