Managing Multiple Simultaneous Requests in Real-Time API

We’re experiencing difficulties managing multiple simultaneous requests in real-time voice-to-voice interactions. When handling multiple users or rapid exchanges, some responses get delayed or even dropped, which degrades the overall experience.

Has anyone found an effective way to queue or prioritize requests while maintaining real-time responsiveness? Any suggestions on best practices for handling concurrency with the OpenAI real-time API?
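For context, here is a minimal sketch of the kind of queueing we have in mind (all names are illustrative, built on `asyncio.PriorityQueue`): urgent live-caller requests are served ahead of background work, and a bounded worker pool caps how many requests run at once so the ones that do run stay responsive.

```python
import asyncio
import itertools

# Tie-breaker counter so requests with equal priority stay FIFO
# (PriorityQueue would otherwise try to compare the payloads).
_counter = itertools.count()

async def worker(queue, results):
    """Pull requests off the queue in priority order and process them."""
    while True:
        priority, _, request = await queue.get()
        # In a real system this would stream the request to a Realtime
        # API session; here we just record the serving order.
        results.append(request)
        queue.task_done()

async def dispatch(requests, max_workers=2):
    """Run `requests` (priority, payload) pairs through a bounded pool."""
    queue = asyncio.PriorityQueue()
    results = []
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(max_workers)]
    for priority, request in requests:
        queue.put_nowait((priority, next(_counter), request))
    await queue.join()          # wait until every request is handled
    for w in workers:
        w.cancel()              # shut the pool down
    return results

# Priority 0 (live caller) is served before priority-1 background work.
order = asyncio.run(dispatch([(1, "summarize"), (0, "live-caller"), (1, "log")],
                             max_workers=1))
# order == ["live-caller", "summarize", "log"]
```

This is only a sketch of the queueing discipline, not of the Realtime plumbing itself; the open question for us is how to bound the pool without adding audible latency for the highest-priority caller.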

What is your stack? Are you using Node?

Thanks for the reply. We are using Python for our backend.
Our implementation is derived from the official Twilio integration sample.

We have extended this implementation with function calling; the use case we are trying out is a customer-care call center.

https://github.com/twilio-samples/speech-assistant-openai-realtime-api-python
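Roughly, the per-call structure looks like this (a simplified sketch, not our actual code: `RealtimeSession` is a stand-in for the per-call websocket connection the Twilio sample opens, and the call SIDs are made up). The point is that each Twilio call gets its own asyncio task and its own Realtime session, so a slow response on one call should not block the others.

```python
import asyncio

class RealtimeSession:
    """Stand-in for a per-call websocket connection to the Realtime API."""
    def __init__(self, call_sid):
        self.call_sid = call_sid
        self.transcript = []

    async def respond(self, utterance):
        # Placeholder for the real audio round-trip over the websocket.
        await asyncio.sleep(0)
        self.transcript.append(f"reply-to:{utterance}")

async def handle_call(call_sid, utterances):
    """One task per Twilio call, with its own isolated session."""
    session = RealtimeSession(call_sid)
    for u in utterances:
        await session.respond(u)
    return call_sid, session.transcript

async def main():
    # Two concurrent callers, each isolated in its own task/session.
    tasks = [handle_call("CA1", ["hi", "order status"]),
             handle_call("CA2", ["hello"])]
    return dict(await asyncio.gather(*tasks))

results = asyncio.run(main())
```

In our real service each `handle_call` is driven by a Twilio Media Streams websocket rather than a fixed list of utterances, but the isolation per call is the same.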

Using a similar Twilio + Realtime API solution, I found that the maximum number of concurrent requests we can get running is 2. I wonder whether this is some sort of rate limit on the Twilio side or an issue with the Realtime API, but let me know if you come across anything.