I’m super excited for the promise of the realtime API, but after tinkering with it for several days, I’ve concluded that I won’t be able to use it for a serious project.
I have found it very difficult to customize its behavior, no matter how much prompting and chaining of various events I try. It’s just too unpredictable to put in production.
I’m wondering: has anyone built a serious project with the API? If so, how were you able to control the behavior?
I’ve noticed the same thing. I’ve also noticed it can be buggy sometimes, which I guess is expected because the docs say it’s still in beta. Have you played around with other solutions like ElevenLabs or Vapi? They definitely have more features to customize behavior and evaluate outputs, but are significantly slower than the OAI Realtime API - probably because the Realtime API is audio to audio, while the others are still chaining STT, an LLM, and TTS.
I did try ElevenLabs, but like you, I found it had higher latency (for the reasons you mentioned), and I wasn’t able to achieve the desired behavior there either, as it seemed to have its own quirks. It doesn’t seem as flexible as the Realtime API either. I’ll take another look though, given your recommendation. Thanks!
We’re using the realtime API in production to handle incoming phone calls with AnswerPal. It answers all incoming calls for iPower.
We’ve found that it works really well when callers talk to the AI the same way they would to a human operator. It does not work as well, though, when people do realize they’re speaking to an AI but don’t know what to say (often because they’re surprised), or when they speak in one-word fragments as if they were searching on Google.
We have also handled lengthy conversations with returning customers who have specific questions about ongoing projects. Here, the AI can fetch relevant project details to provide comprehensive answers. That said, we still see that AI hasn’t been fully embraced by everyone here in Belgium; most people aren’t sure how to proceed once they realize they’ve reached an AI.