We are getting a flood of errors like this:
response: {
  status: 429,
  statusText: 'Too Many Requests',
  headers: {
    'x-ratelimit-limit-requests': '3500',
    'x-ratelimit-limit-tokens': '90000',
    'x-ratelimit-remaining-requests': '3251',
    'x-ratelimit-remaining-tokens': '3964',
    'x-ratelimit-reset-requests': '4.253s',
    'x-ratelimit-reset-tokens': '57.357s',
  },
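For anyone hitting the same thing: the `x-ratelimit-reset-*` headers tell you how long until the window refills, so one mitigation is to wait that long before retrying. A minimal sketch of parsing those duration strings (header names and example values are from the 429 response above; the function name and exact format handling are my assumptions):

```javascript
// Parse an x-ratelimit-reset-* duration string such as "4.253s", "57.357s",
// "1m12s", or "6ms" into whole milliseconds, suitable for a retry delay.
// Assumes the value is a concatenation of number+unit segments (h, m, s, ms).
function parseResetDuration(value) {
  let ms = 0;
  // Match each unit segment; "ms" must be tried before "m" and "s".
  const re = /(\d+(?:\.\d+)?)(ms|s|m|h)/g;
  for (const [, num, unit] of value.matchAll(re)) {
    const n = parseFloat(num);
    if (unit === 'h') ms += n * 3600000;
    else if (unit === 'm') ms += n * 60000;
    else if (unit === 's') ms += n * 1000;
    else ms += n; // 'ms'
  }
  return Math.round(ms);
}

// Example: sleep until the token bucket resets, then retry.
// parseResetDuration('57.357s') -> 57357
```

On a 429 you would read `headers['x-ratelimit-reset-tokens']`, feed it through this, and `setTimeout` the retry for at least that long (ideally with jitter).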
We then check our usage for that period in the usage tab and find (for the entire hour):

Time     | Model               | Requests | Prompt | Completion | Total
8:25 PM  | gpt-3.5-turbo-0301  | 25       | 4,832  | 853        | 5,685
8:30 PM  | gpt-3.5-turbo-0301  | 4        | 1,880  | 298        | 2,178
8:35 PM  | gpt-3.5-turbo-0301  | 1        | 562    | 92         | 654
8:45 PM  | gpt-3.5-turbo-0301  | 42       | 8,012  | 1,611      | 9,623
8:50 PM  | gpt-3.5-turbo-0301  | 49       | 9,385  | 1,736      | 11,121
8:55 PM  | gpt-3.5-turbo-0301  | 68       | 12,612 | 2,874      | 15,486
In other words, we are nowhere near the 90k token limit for the entire HOUR, let alone any one-minute window.
Anyone have any clues? We're banging our heads here, wondering if OpenAI somehow enforces a more granular, second-based rate limit (e.g., 90,000 / 60 = 1,500 tokens max per second??)
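For what it's worth, the 429 headers themselves may be the better evidence here than the usage dashboard: the per-second split below is purely my hypothetical, but the remaining-token figure is copied straight from the response above, and it shows the limiter's own window nearly exhausted even while the hourly dashboard looks quiet.

```javascript
// Numbers copied from the 429 response headers above; the per-second
// figure is the hypothetical split being asked about, not documented fact.
const limitTokens = 90000;     // x-ratelimit-limit-tokens
const remainingTokens = 3964;  // x-ratelimit-remaining-tokens

const perSecond = limitTokens / 60;                 // 1500 (hypothetical)
const countedByLimiter = limitTokens - remainingTokens; // 86036

console.log(perSecond, countedByLimiter);
```

The gap between the 86,036 tokens the limiter appears to have counted and the ~45k tokens the dashboard shows for the whole hour is what needs explaining (e.g., does the limiter reserve `max_tokens` per in-flight request rather than counting actual completion tokens?).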