Hi Everyone,
I am currently working on language translation between two callers using Twilio and the OpenAI Realtime API. I use Twilio to fetch the audio stream and push it to the OpenAI WebSocket as below.
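A minimal sketch of this forwarding step, assuming Twilio Media Streams messages (JSON with an `event` field and a base64 `media.payload`) and a Realtime session whose input format is `g711_ulaw`, so the payload can be passed through without re-encoding. The function name `twilio_media_to_openai_event` is hypothetical and only for illustration:

```python
import json
from typing import Optional


def twilio_media_to_openai_event(twilio_message: str) -> Optional[str]:
    """Convert one Twilio Media Streams message into a Realtime API event.

    Returns None for messages that carry no audio (e.g. "start", "stop").
    """
    data = json.loads(twilio_message)
    if data.get("event") != "media":
        return None
    # Assumption: the session uses input_audio_format "g711_ulaw", which matches
    # Twilio's base64 mu-law payload, so the audio is forwarded unchanged.
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": data["media"]["payload"],
    })
```

Each non-None result would then be sent as a text frame on the OpenAI WebSocket.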
On the OpenAI socket, this is how I am sending the session update:
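A minimal sketch of what such a `session.update` event might look like. The helper name `build_session_update` is hypothetical, and the chosen values (`g711_ulaw` audio, `whisper-1` transcription with a `language` hint, server-side VAD) are assumptions for a Twilio phone-call setup, not a confirmed configuration:

```python
import json


def build_session_update(instructions: str, source_language: str) -> str:
    """Build a session.update event as a JSON string."""
    event = {
        "type": "session.update",
        "session": {
            "modalities": ["text", "audio"],
            "instructions": instructions,
            # Assumption: mu-law in both directions to match Twilio's audio.
            "input_audio_format": "g711_ulaw",
            "output_audio_format": "g711_ulaw",
            # Input transcription events are only emitted when this is set.
            # Assumption: "language" takes an ISO-639-1 code such as "es".
            "input_audio_transcription": {
                "model": "whisper-1",
                "language": source_language,
            },
            "turn_detection": {"type": "server_vad"},
        },
    }
    return json.dumps(event)
```

For example, `build_session_update(prompt_text, "es")` would produce the payload to send once the socket is open, where `prompt_text` stands for the translation prompt.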
The prompt I use for the translation is as follows:
The issues I am facing are:
- Delay in the response from OpenAI.
- During the conversation between the callers, instead of translating the audio, the model sometimes goes into conversational mode, which causes a lot of confusion.
- Although I specify the source language in the prompt, the audio is sometimes transcribed into a different language. This does not happen every time.
- In my scenario, should I expect to receive the two events below? At the moment I sometimes do not receive them, and I suspect this could be one of the causes, but I am not sure.
“conversation.item.input_audio_transcription.completed” and “response.audio_transcript.done”.
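To check how often the two events above actually arrive, the server messages can be tallied by their `type` field. This is a rough diagnostic sketch; `count_watched_events` is a hypothetical helper, and it assumes every message received from the OpenAI socket is a JSON object with a `type` key:

```python
import json
from typing import Dict, Iterable

WATCHED_EVENTS = (
    "conversation.item.input_audio_transcription.completed",
    "response.audio_transcript.done",
)


def count_watched_events(raw_messages: Iterable[str]) -> Dict[str, int]:
    """Count how many times each watched event type appears."""
    counts = {name: 0 for name in WATCHED_EVENTS}
    for raw in raw_messages:
        event_type = json.loads(raw).get("type")
        if event_type in counts:
            counts[event_type] += 1
    return counts
```

Comparing these counts with the number of speech turns in a call would show whether events are missing for particular turns.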
NOTE: I’m sending the session updates for 3 seconds.
Can anyone guide me in addressing these issues?