Realtime API phone use case - speaking text

I am having issues with the basic use case of answering a phone call. I can't find any method to tell it to speak something, i.e. “Thanks for calling ABC company, you are on a recorded line, how can I help?” I have the Twilio example working (twilio-samples/speech-assistant-openai-realtime-api-node on GitHub), but when I call it, I have to say “hello” to get the AI agent to do its normal intro (specified in the system prompt). I also have some cases where I need it to speak something while calling slower tools, e.g. “Hold on a second.” Has anyone figured this out?
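For anyone landing here with the same question, one approach that seems to work is to add a user message to the conversation yourself and then ask for a response, so the model speaks without waiting for caller audio. Below is a minimal sketch of that idea. The `conversation.item.create` and `response.create` event types are Realtime API client events; the helper names `buildSpeakEvents` and `speak` and the `Sendable` interface are made up for illustration, and the socket is assumed to be an already-open connection to the Realtime API.

```typescript
// Loose shape for the events sent here; not a full Realtime API type.
type RealtimeEvent = Record<string, unknown> & { type: string };

// Anything with a send(string) method, e.g. an open `ws` WebSocket (assumption).
interface Sendable {
  send(data: string): void;
}

// Hypothetical helper: build the two events that prompt the model to talk first.
function buildSpeakEvents(prompt: string): RealtimeEvent[] {
  return [
    {
      // Add a text message to the conversation as if the user had typed it.
      type: "conversation.item.create",
      item: {
        type: "message",
        role: "user",
        content: [{ type: "input_text", text: prompt }],
      },
    },
    // Ask the model to generate a response (spoken, if audio output is enabled).
    { type: "response.create" },
  ];
}

// Hypothetical helper: serialize and send the events over the socket.
function speak(socket: Sendable, prompt: string): void {
  for (const event of buildSpeakEvents(prompt)) {
    socket.send(JSON.stringify(event));
  }
}
```

In the Twilio sample this would presumably be called right after the session settings are sent, e.g. `speak(openAiWs, 'Greet the caller with: "Thanks for calling ABC company, you are on a recorded line, how can I help?"')`, and again with something like `'Say "Hold on a second."'` just before starting a slow tool call. The model may paraphrase rather than repeat the text word for word, so treat the prompt as an instruction, not a script.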


Have you tried resampling to 24,000 Hz? I’ve made a pull request for this on Firefox: Fix Dynamic Sample Rate Detection for Audio Compatibility by mmtmn · Pull Request #7 · openai/openai-realtime-console · GitHub
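In case it helps to see what resampling means in code, here is a rough sketch of a linear-interpolation resampler for mono 16-bit PCM. It is not the code from the pull request above; the function name `resamplePcm16` is made up for illustration, and there is no anti-aliasing filter, so it is only a starting point. Whether you need it at all depends on the audio format your session is configured with.

```typescript
// Hypothetical helper: resample mono 16-bit PCM by linear interpolation.
// No low-pass filtering is applied, so downsampling may alias.
function resamplePcm16(input: Int16Array, fromRate: number, toRate: number): Int16Array {
  // Nothing to do: return a copy so the caller can mutate it safely.
  if (fromRate === toRate || input.length === 0) return input.slice();

  const outLength = Math.round((input.length * toRate) / fromRate);
  const output = new Int16Array(outLength);
  const step = fromRate / toRate; // input samples advanced per output sample

  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    // Clamp both neighbours to the last valid index at the end of the buffer.
    const left = Math.min(Math.floor(pos), input.length - 1);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    // Blend the two neighbouring samples according to the fractional position.
    output[i] = Math.round(input[left] * (1 - frac) + input[right] * frac);
  }
  return output;
}
```

For example, `resamplePcm16(samples, 48000, 24000)` would halve the number of samples from a 48 kHz capture before the audio is encoded and sent.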