RateLimitError (429) on Tier-5 Account While Using GPT-4o-mini – Clarification Requested

I’m on Tier-5 and using gpt-4o-mini through the Assistants API, but I keep getting RateLimitError: 429 even though my request volume is very low and should be far below Tier-5 limits.

openai.RateLimitError: Error code: 429 - {
  'error': {
    'message': "You've exceeded the rate limit, please slow down and try again later.",
    'type': 'invalid_request_error',
    'param': None,
    'code': 'rate_limit_exceeded'
  }
}

This happens when calling:

client.beta.threads.messages.create(thread_id=thread.id, role="user", content=q)

My concern:
Given that I’m on Tier-5 and using a lightweight model (gpt-4o-mini), I was expecting higher limits. But I’m still getting 429 errors from thread message creation.

Can someone clarify:

  1. What specific limit is being hit? (RPM, TPM, message creation rate, or concurrency?)

  2. Does the Assistants API have internal limits that aren’t documented?

  3. Is there any backend throttling happening recently?

  4. Any recommended way to avoid these 429s during long-running batch processing?

Thanks.

“Slow down”? It sounds like you are hitting the Assistants API request-count limit, which is quite low and does not scale with your tier or with anything near the capacity of the models themselves. For example, a dozen runs being polled concurrently for results, or a few extra calls in each run’s lifecycle, can chew through what’s allotted.
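To make that concrete, here is some back-of-the-envelope arithmetic. The poll interval and concurrency figures are hypothetical, purely for illustration:

```python
# Rough budget arithmetic: how fast run-status polling eats a per-minute GET limit.
GET_LIMIT_RPM = 1000           # reported GET limit, requests per minute

poll_interval_s = 0.5          # hypothetical: poll each run's status twice per second
concurrent_runs = 12           # hypothetical: a dozen runs in flight at once

polls_per_run_per_min = 60 / poll_interval_s             # 120 GETs per run per minute
total_get_rpm = concurrent_runs * polls_per_run_per_min  # 1440 GETs per minute

print(total_get_rpm > GET_LIMIT_RPM)  # True: polling alone blows the budget
```

So a handful of parallel workers doing tight polling loops can trip 429s even when the model-level TPM/RPM usage is tiny.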

What are the rate limits for the Assistants API?

The rate limits for the Assistants API are not tied to usage tier or model. Instead, there are default limits by request type, with a couple of exceptions:

  • GET: 1000 RPM
  • POST: 300 RPM
    • POST to /v1/threads/<thread_id>/runs: 200 RPM
    • POST to /v1/threads/runs: 200 RPM
  • DELETE: 300 RPM

The exceptions mentioned above are just the lower limits on the runs endpoints, not undocumented ones. The documentation doesn’t say how the limits cross over or are pooled, so most likely every HTTP call of a given method counts toward that method’s total; it is not “messages” alone.

It is easy to believe that OpenAI broke the API if you are getting these on individual calls and don’t have any runaway code making extra requests.
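For your question 4 (long-running batch processing), the usual mitigation is to retry on 429s with exponential backoff and jitter. A minimal sketch — the helper name and tuning constants are mine, not part of the SDK:

```python
import random
import time

def with_backoff(fn, max_retries=5, base_delay=1.0, is_rate_limit=None):
    """Call fn(), retrying rate-limit errors with exponential backoff plus jitter.

    is_rate_limit: optional predicate deciding whether an exception is a 429;
    with the openai SDK you might pass
        lambda e: isinstance(e, openai.RateLimitError)
    (if omitted, every exception is retried, which is too broad for production).
    """
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as e:
            if is_rate_limit is not None and not is_rate_limit(e):
                raise
            if attempt == max_retries - 1:
                raise
            # Sleep 1s, 2s, 4s, ... plus jitter so concurrent workers desynchronize.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))

# Usage against the call from the question (client, thread, q assumed defined):
# msg = with_backoff(
#     lambda: client.beta.threads.messages.create(
#         thread_id=thread.id, role="user", content=q),
#     is_rate_limit=lambda e: isinstance(e, openai.RateLimitError))
```

Beyond retries, the bigger wins are structural: poll run status less often (or use streaming where available), cap the number of runs in flight, and batch your work so the per-minute request count stays under the per-method limits listed above.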