We’re experiencing difficulties managing multiple simultaneous requests in real-time voice-to-voice interactions. When handling multiple users or rapid exchanges, some responses arrive late or are dropped entirely, which degrades the conversational experience.
Has anyone found an effective way to queue or prioritize requests while maintaining real-time responsiveness? Any suggestions on best practices for handling concurrency with the OpenAI real-time API?
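For context, here is the direction we’ve been experimenting with: serializing each session’s requests through a priority queue so rapid exchanges are ordered rather than dropped. This is only a minimal sketch, not tied to the actual Realtime API; `SessionQueue` and the `handle` callback are hypothetical names, and the priority scheme (lower number = more urgent, e.g. a user barge-in outranking a filler reply) is just an assumption to illustrate the idea.

```python
import asyncio
import itertools

class SessionQueue:
    """Hypothetical per-session queue: one worker drains requests in
    priority order, so bursts are serialized instead of dropped."""

    def __init__(self):
        self._queue = asyncio.PriorityQueue()
        # Monotonic counter breaks ties, keeping FIFO order within a priority.
        self._counter = itertools.count()

    async def submit(self, priority, payload):
        # Lower priority number = handled sooner.
        await self._queue.put((priority, next(self._counter), payload))

    async def worker(self, handle, results):
        while True:
            _, _, payload = await self._queue.get()
            try:
                results.append(await handle(payload))
            finally:
                self._queue.task_done()

    async def join(self):
        await self._queue.join()

async def main():
    q = SessionQueue()
    results = []

    async def handle(payload):
        # Stand-in for the actual per-request work (e.g. a Realtime API call).
        return payload

    # Enqueue before draining so the priority ordering is visible.
    await q.submit(5, "filler reply")  # low urgency
    await q.submit(1, "barge-in")      # user interruption: high urgency
    worker = asyncio.create_task(q.worker(handle, results))
    await q.join()
    worker.cancel()
    return results

if __name__ == "__main__":
    print(asyncio.run(main()))  # the higher-priority item comes out first
```

This keeps one worker per session, which preserves turn order within a conversation; scaling to many users would mean one `SessionQueue` per session rather than one global queue.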
Using a similar Twilio + Realtime API setup, and found that the maximum number of concurrent requests we can get running is 2. I wonder if this is some sort of rate limit on the Twilio side or an issue with the Realtime API itself, but let me know if you come across anything.