Is there a limit to users accessing assistants?

I am trying to build an application that multiple users (1000+) can access at the same time. The way I am approaching this is to create a single assistant that has its instructions. Then every time a user wants to use it and get info or chat with it, I create a thread for that particular session. This would mean, at any given time, there could be 1000 threads that a single assistant would be working with and more as users grow.

I have looked at the rate limit documentation, and while my organization will increase its tier as we grow, my main concern is whether a single assistant would be enough to handle this. Or should I create multiple assistants with some kind of load-balancing logic of my own when calling the APIs?
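
For reference, the setup described above (one shared assistant, one thread per user session) could be sketched roughly like this. It is only a minimal illustration: `SessionThreads` and the `create_thread` hook are hypothetical names, and the hook stands in for whatever API call actually creates a thread and returns its ID.

```python
import threading
from typing import Callable, Dict


class SessionThreads:
    """Maps each user session to its own thread ID; the assistant is shared."""

    def __init__(self, create_thread: Callable[[], str]) -> None:
        # create_thread is a hypothetical hook. In a real app it would wrap
        # the API call that creates a thread and returns the new thread's ID.
        self._create_thread = create_thread
        self._by_session: Dict[str, str] = {}
        self._lock = threading.Lock()

    def thread_for(self, session_id: str) -> str:
        """Return the session's thread ID, creating a thread on first use."""
        # The lock is held while the thread is created, so concurrent
        # first-time sessions are created one at a time.
        with self._lock:
            if session_id not in self._by_session:
                self._by_session[session_id] = self._create_thread()
            return self._by_session[session_id]

    def end_session(self, session_id: str) -> None:
        """Forget the mapping once the user's session is over."""
        with self._lock:
            self._by_session.pop(session_id, None)
```

Every run would then use the same assistant ID together with the thread ID returned by `thread_for(session_id)`.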

It simply cannot be deployed to users in its current state, unless you are ready to bill them $1+ per input, just as OpenAI will bill you when the assistant goes out of control or is made to do exactly what the user wants recursively, at a price you can’t programmatically retrieve. You also can’t make enough requests to support a user base of that size:

During this beta, there are several known limitations we are looking to address:

  • 60 req/min limit at the user account level.
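
If that 60 req/min figure applies to your account, one way to stay under it is to throttle on your own side before every API call. Below is a minimal sliding-window sketch, not an official client feature: `SlidingWindowLimiter` is a hypothetical helper, the defaults simply mirror the limit quoted above, and the `clock` and `sleep` parameters exist only so the behaviour can be tested without waiting.

```python
import threading
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allows at most max_calls acquisitions in any window of period seconds."""

    def __init__(self, max_calls: int = 60, period: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # timestamps of recent requests
        self._lock = threading.Lock()

    def acquire(self) -> None:
        """Block until one more request fits inside the window."""
        while True:
            with self._lock:
                now = self._clock()
                # Drop timestamps that have aged out of the window.
                while self._sent and now - self._sent[0] >= self._period:
                    self._sent.popleft()
                if len(self._sent) < self._max_calls:
                    self._sent.append(now)
                    return
                # Time until the oldest request leaves the window.
                wait = self._period - (now - self._sent[0])
            # Sleep outside the lock so other callers are not blocked on it.
            self._sleep(wait)
```

Calling `limiter.acquire()` right before each request (creating a message, starting a run, polling a run) makes all users share one budget. With 1000 concurrent users that budget is still only 60 requests per minute in total, so this smooths bursts but does not remove the bottleneck.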

Hey, I’m looking at doing the exact same thing: same assistant, but different threads. Is there a workaround by now? I have been trying with only 5 users and it wasn’t working very well in parallel.