RPM rate limited at 100 when using the Assistants API

Using the OpenAI Python library, we are calling Assistants API functionality with code such as client.beta.threads.create, messages.create, runs.create, etc.

We frequently run into error code 429: “you’ve exceeded the 100 request/min rate limit, please slow down and try again.”

Our account is Tier 4, which I believe has a 10,000 RPM limit, so I am trying to figure out why we are getting this error. In the meantime we have implemented retries with tenacity, but with the amount of traffic we are getting, this does not seem tenable.
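For anyone reading along, here is a minimal sketch of the kind of retry logic described above. It is not our production code and it does not use tenacity; it is a standard-library version of exponential backoff with jitter. `RateLimited` and `call_with_backoff` are hypothetical names made up for illustration; in real code you would catch whatever 429 exception your client library raises.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the 429 error raised by the client library."""


def call_with_backoff(fn, *, max_attempts=6, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # Cap the exponential delay, then wait a random time below the cap
            # so concurrent workers do not all retry at the same moment.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, cap))
```

Retries like this only smooth over short bursts. If sustained traffic is above the limit, every worker ends up waiting, which matches what we are seeing.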

Can someone from OpenAI take a look and see what is going on?

What is going on is that Assistants is in beta and is rate limited by the number of calls to the API endpoints, account-wide. It is not suitable for volume deployment, both because of the artificial limits and because of the lack of cost accountability and controls.

You actually seem to be seeing an increase over the documented 60 requests per minute.

https://platform.openai.com/docs/assistants/how-it-works/limitations

That must be it. Thanks. Wish this was clearer in the documentation.

The limitations are currently undocumented; if you try to follow the link above, you will find that it is broken. After weeks of going back and forth with a support bot, a human finally responded with these specs:

  • GET: 1000/min
  • POST: 300/min
  • DELETE: 300/min
  • POST: “/v1/threads/_thread_id/runs”: 200/min
  • POST: “/v1/threads/runs”: 200/min
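If it helps anyone, the limits above can be enforced on the client side so that requests wait instead of failing with a 429. This is a rough sketch using a sliding-window counter, not anything official. The class name and the dictionary keys are made up for illustration, and I am assuming each limit applies over a rolling 60-second window, which support did not confirm.

```python
import threading
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls in any `window`-second span."""

    def __init__(self, limit, window=60.0, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._stamps = deque()  # timestamps of calls still inside the window
        self._lock = threading.Lock()

    def acquire(self):
        """Block until a slot is free, then record the call."""
        while True:
            with self._lock:
                now = self._clock()
                # Drop timestamps that have aged out of the window.
                while self._stamps and now - self._stamps[0] >= self.window:
                    self._stamps.popleft()
                if len(self._stamps) < self.limit:
                    self._stamps.append(now)
                    return
                # Time until the oldest recorded call leaves the window.
                wait = self.window - (now - self._stamps[0])
            self._sleep(wait)


# Per-minute limits as quoted by support above; the key names are illustrative.
LIMITS = {
    "GET": 1000,
    "POST": 300,
    "DELETE": 300,
    "create_run": 200,             # POST /v1/threads/_thread_id/runs
    "create_thread_and_run": 200,  # POST /v1/threads/runs
}
limiters = {name: SlidingWindowLimiter(n) for name, n in LIMITS.items()}
```

You would call something like `limiters["create_run"].acquire()` right before `client.beta.threads.runs.create(...)`. Keep in mind this only counts calls inside one process. Since the limits are reportedly account-wide, multiple workers would need a shared counter, and I do not know whether run creation also counts against the general POST limit.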

This API has been in beta for almost a year without any indication of when it might emerge from beta. Can someone from OpenAI give us some insight into the roadmap?