TPM rate limit exceeded - why?

Situation:

  • I have read many threads here on this topic, but the following basic question is still unanswered for me (and, I believe, for many other users too).
  • My token usage during December has reached 10 million tokens on gpt-4o-mini, and I am getting the message “rate_limited_exceeded”. As a result, I can no longer send requests, even ones with a low total token count.
  • My limit (I am at Tier 4) is indeed 10,000,000 TPM, but according to the documentation that limit applies to tokens per minute (TPM).
  • By the way, my total cost for December is only about $13.

Question:

  • Why am I exceeding a limit, and which one? Again, the documented limit is per minute, but it appears to be applied to the entire month.
  • If I had indeed exceeded a TPM limit, it should free up shortly afterwards, say within an hour at most, but it does not. Why?

Hi and welcome to the community!

Have you looked into the rate limit headers to learn more about the situation?

https://platform.openai.com/docs/guides/rate-limits#rate-limits-in-headers

This is a relatively new feature. I expect there are not many old topics referencing it as a debugging tool.
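
As a rough sketch of how those headers could be inspected: the helper below pulls the `x-ratelimit-*` fields out of any response header mapping. The header names are the ones listed in the guide linked above; the function names and the sample values are made up for illustration and are not taken from a real response.

```python
from typing import Mapping, Optional

# Header names as listed in the rate limits guide.
RATE_LIMIT_HEADERS = (
    "x-ratelimit-limit-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-remaining-tokens",
    "x-ratelimit-reset-requests",
    "x-ratelimit-reset-tokens",
)


def summarize_rate_limits(headers: Mapping[str, str]) -> dict:
    """Return the rate limit headers (or None where a header is absent)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return {name: lowered.get(name) for name in RATE_LIMIT_HEADERS}


def tokens_exhausted(headers: Mapping[str, str]) -> Optional[bool]:
    """True if no tokens remain in the current window, None if unknown."""
    remaining = summarize_rate_limits(headers)["x-ratelimit-remaining-tokens"]
    if remaining is None:
        return None
    return int(remaining) <= 0


# Hypothetical header values, only to show the shape of the data.
sample = {
    "x-ratelimit-limit-tokens": "10000000",
    "x-ratelimit-remaining-tokens": "9998500",
    "x-ratelimit-reset-tokens": "9ms",
}
print(summarize_rate_limits(sample))
print(tokens_exhausted(sample))
```

If the official Python SDK is in use, the headers should be reachable through the raw response wrapper, e.g. `client.chat.completions.with_raw_response.create(...)` and then `.headers` on the result. If the remaining-tokens value is close to the limit while requests still fail, the per-minute limit is probably not what is blocking you.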

Check your current account balance. Out of funds?

https://platform.openai.com/settings/organization/billing/overview

The error may be reported as a rate limit error type, but the message text may continue with “check your plan and billing details…”
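
To tell the two cases apart in code, something like the following could work. It is a minimal sketch that assumes the error body is JSON with an `error` object carrying `code`, `type` and `message` fields; the function name `classify_429` and the sample bodies are hypothetical, so check them against what your own failing request actually returns.

```python
import json


def classify_429(body: str) -> str:
    """Guess whether a 429 body points at billing/quota or a real rate limit.

    Returns "quota_or_billing", "rate_limit", or "unknown".
    """
    try:
        error = json.loads(body).get("error") or {}
        code = error.get("code") or ""
        error_type = error.get("type") or ""
        message = (error.get("message") or "").lower()
    except (ValueError, AttributeError):
        # Not JSON, or not shaped the way this sketch assumes.
        return "unknown"

    # A billing hint in the message suggests the balance, not TPM, is the issue.
    if "insufficient_quota" in (code, error_type) or "billing" in message:
        return "quota_or_billing"
    if code == "rate_limit_exceeded":
        return "rate_limit"
    return "unknown"


# Hypothetical bodies, only to illustrate the two outcomes.
quota_body = json.dumps({"error": {
    "message": "You exceeded your current quota, please check your plan and billing details.",
    "type": "insufficient_quota",
    "code": "insufficient_quota",
}})
limit_body = json.dumps({"error": {
    "message": "Rate limit reached for requests.",
    "type": "tokens",
    "code": "rate_limit_exceeded",
}})
print(classify_429(quota_body))
print(classify_429(limit_body))
```

A "quota_or_billing" result would not clear up by waiting a minute, which matches the symptom described in the question.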

Auto-recharge is not working; it was likely disabled because of even more severe problems with the feature making unneeded charges to cards.