Dreaded 429 rate limit errors when our usage is well under the limits

We are getting a flood of errors like this:

 response: {
    status: 429,
    statusText: 'Too Many Requests',
    headers: {
      'x-ratelimit-limit-requests': '3500',
      'x-ratelimit-limit-tokens': '90000',
      'x-ratelimit-remaining-requests': '3251',
      'x-ratelimit-remaining-tokens': '3964',
      'x-ratelimit-reset-requests': '4.253s',
      'x-ratelimit-reset-tokens': '57.357s',
    },

We then check our usage for that period in the usage tab and find (for the entire hour):

8:25 PM

gpt-3.5-turbo-0301, 25 requests

4,832 prompt + 853 completion = 5,685 tokens

8:30 PM

gpt-3.5-turbo-0301, 4 requests

1,880 prompt + 298 completion = 2,178 tokens

8:35 PM

gpt-3.5-turbo-0301, 1 request

562 prompt + 92 completion = 654 tokens

8:45 PM

gpt-3.5-turbo-0301, 42 requests

8,012 prompt + 1,611 completion = 9,623 tokens

8:50 PM

gpt-3.5-turbo-0301, 49 requests

9,385 prompt + 1,736 completion = 11,121 tokens

8:55 PM

gpt-3.5-turbo-0301, 68 requests

12,612 prompt + 2,874 completion = 15,486 tokens

In other words, we are nowhere near the 90k token limit for the entire HOUR, let alone for a 1-minute period.

Does anyone have any clues? We’re banging our heads here, wondering if OpenAI somehow has a more granular, per-second rate limit (e.g., 90,000 / 60 = 1,500 tokens max per second)?
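For anyone who wants to read the headers rather than eyeball them, here is a minimal Python sketch that turns the `x-ratelimit-*` values from the error above into "how much of each bucket is used" and "seconds until reset". The helper names (`parse_reset`, `summarize`) are made up for illustration, and the handling of multi-unit reset strings such as `1m30s` or `6ms` is an assumption; only the plain `57.357s` form appears in this thread.

```python
import re

# Header values copied from the 429 response quoted above.
HEADERS = {
    "x-ratelimit-limit-requests": "3500",
    "x-ratelimit-limit-tokens": "90000",
    "x-ratelimit-remaining-requests": "3251",
    "x-ratelimit-remaining-tokens": "3964",
    "x-ratelimit-reset-requests": "4.253s",
    "x-ratelimit-reset-tokens": "57.357s",
}

# Seconds per unit. Support for "h", "m" and "ms" is an assumption;
# the thread only shows values ending in "s".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}


def parse_reset(value):
    """Convert a reset string such as '57.357s' into seconds."""
    # "ms" is listed before "m" so that '6ms' is not read as 6 minutes.
    parts = re.findall(r"(\d+(?:\.\d+)?)(ms|h|m|s)", value)
    if not parts:
        raise ValueError(f"unrecognized reset value: {value!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def summarize(headers):
    """Report usage and reset time for the request and token buckets."""
    summary = {}
    for kind in ("requests", "tokens"):
        limit = int(headers[f"x-ratelimit-limit-{kind}"])
        remaining = int(headers[f"x-ratelimit-remaining-{kind}"])
        summary[kind] = {
            "limit": limit,
            "remaining": remaining,
            "used_fraction": 1 - remaining / limit,
            "reset_seconds": parse_reset(headers[f"x-ratelimit-reset-{kind}"]),
        }
    return summary


if __name__ == "__main__":
    for kind, info in summarize(HEADERS).items():
        print(
            f"{kind}: {info['used_fraction']:.1%} used, "
            f"resets in {info['reset_seconds']:.3f}s"
        )
    # The even per-second share of a 90,000 tokens-per-minute limit.
    print("tokens per second if spread evenly:", 90_000 / 60)
```

Going by these headers alone, the request bucket is barely touched (about 7% used) while the token bucket is about 96% used, so it is the token limit, not the request limit, that is reported as nearly exhausted. That does not explain why it disagrees with the usage tab, but it narrows down where to look.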

5 Likes

The same is happening with us. We are well under the limits but receive 429s very often.

1 Like

I too am getting the same error while only having used 10 requests, with the max amount of total tokens being less than 200, over the course of minutes.

I don’t think the content of the prompts should be relevant, but my task is similar with only a few words changed at a time.

It appears from the billing page that I have registered my credit card.

1 Like

UPDATE: This Stack Overflow answer resolved it for me: your free credits expire after 3 months, so if you are on a free key and see 429 regardless of your usage, that’s why. I can’t include a link, but this is the SO answer: a/75898717/5298555


I’m also seeing this issue (free API key). According to the rate limit details documentation, the 429 error code is used for three semantically different reasons:

  1. Rate limit exceeded (RPM, TPM)
  2. Account quota exceeded (monthly, billing)
  3. Service under too much load
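Since all three cases share the same status code, the error message in the response body is the main way to tell them apart. Below is a rough Python sketch of that triage. The function name `classify_429` and the keyword lists are my own assumptions based on messages people commonly report, not a documented contract, so treat the matching rules as illustrative and adjust them to what your responses actually contain.

```python
def classify_429(message):
    """Guess which of the three 429 causes an error message describes.

    The keywords below are assumptions, not an official list.
    """
    text = message.lower()
    # Case 2: account quota or billing problem (e.g. expired free credits).
    if "quota" in text or "billing" in text:
        return "quota"
    # Case 3: the service itself is under too much load.
    if "overloaded" in text:
        return "service_load"
    # Case 1: requests-per-minute or tokens-per-minute limit reached.
    if "rate limit" in text:
        return "rate_limit"
    return "unknown"


if __name__ == "__main__":
    # Example messages written for illustration only.
    samples = [
        "Rate limit reached for this model on tokens per min.",
        "You exceeded your current quota, please check your plan and billing details.",
        "That model is currently overloaded with other requests.",
    ]
    for sample in samples:
        print(classify_429(sample), "<-", sample)
```

Logging the classified cause next to each 429 makes it easier to see whether waiting will help (cases 1 and 3) or whether the account itself needs attention (case 2).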

I suspect (3) is the culprit, and the status.openai page is just vague enough to make it tough to diagnose exactly. But process of elimination points to excess service load.

I’d be interested to hear more authoritative reasoning on this, though.

For what it’s worth, I’m hitting this issue intermittently for prolonged periods (over 15 minutes in duration) and on model endpoints marked beta in the docs (Whisper, specifically).

2 Likes