Hitting Rate Limit with small group of Users?

jxl38 · May 17, 2023, 5:32am

I have a decent amount of traffic to my application, but nowhere near the rate limit of 3,500 RPM or 90,000 tokens a minute. However, I still regularly get the 429 “Too Many Requests” error. I have the means to scale quickly, but I’m afraid the real rate limits are not actually what’s posted in our organization dashboard. Does anyone else encounter this?
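One way to check whether your own traffic really is under the posted limits is to measure it on the client side. Below is a minimal Python sketch, assuming you can read the token count of each request from the API response. The `UsageWindow` class and its method names are made up for illustration and are not part of any OpenAI library.

```python
import time
from collections import deque


class UsageWindow:
    """Tracks requests and tokens sent during the last `window` seconds.

    Hypothetical helper for illustration only.
    """

    def __init__(self, window=60.0, clock=time.monotonic):
        self.window = window
        self.clock = clock      # injectable so the class can be tested
        self.events = deque()   # one (timestamp, tokens) pair per request

    def record(self, tokens):
        """Call once per request with the total tokens it consumed."""
        self.events.append((self.clock(), tokens))

    def _prune(self):
        # Drop every event that has aged out of the window.
        cutoff = self.clock() - self.window
        while self.events and self.events[0][0] <= cutoff:
            self.events.popleft()

    def usage(self):
        """Return (requests, tokens) seen inside the current window."""
        self._prune()
        return len(self.events), sum(tokens for _, tokens in self.events)
```

Logging `usage()` whenever a 429 arrives lets you compare your measured requests and tokens per minute against the 3,500 RPM and 90,000 TPM figures from the dashboard.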

firtina · May 17, 2023, 7:26am

@jxl38

I’m in the same boat. I think a lot of the time, they’re counting “how many requests are being sent by everyone” instead of “by you”.

I guess we just have to be patient.

daviddoswell · May 17, 2023, 7:33am

I think we can request an increase. I’m currently having the same issue and am cautious about inviting more people @jxl38

daviddoswell · May 17, 2023, 7:39am

Addendum: except with GPT-4

"During the limited beta rollout of GPT-4, the model will have more aggressive rate limits to keep up with demand. Default rate limits for gpt-4/gpt-4-0314 are 40k TPM and 200 RPM. Default rate limits for gpt-4-32k/gpt-4-32k-0314 are 80k TPM and 400 RPM.

We are unable to accommodate requests for rate limit increases due to capacity constraints. In its current state, the model is intended for experimentation and prototyping, not high volume production use cases."

nishanthvijayan · May 17, 2023, 8:11am

Facing the same issue. Just making 4-5 sequential requests (with 1k prompt tokens max, 200-300 response tokens max), and we’re being hit by the rate limit. [model GPT-4]

Ryan-Portal · May 17, 2023, 8:22am

Same issue here - has only really picked up the last few days. I have been getting 429s on every third request or so but only consuming 5 calls per request and max 750 tokens across all 5 calls (have confirmed this through our usage dashboard). We have been PAYG customers for several months now, so not a function of being new / free user. Please help! OpenAI

storywriter · May 17, 2023, 12:27pm

Last 12 hours or so, I’m getting 429 errors nearly 100% of the time for GPT-4 chat/completions requests. The error message is:

“That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID 93bd.... in your message.)”

My rate of requests for GPT-4 (and in general) is far lower than my account rate limits. Based on the message, it looks like a general problem with the GPT-4 model rather than any specific thing I’m doing on my side. The status page is green, so I’m wondering if OpenAI is aware of this issue. I tried the help center, but that just let me report the issue to a bot with a 1 week response time. Does anyone know if there is another way to report API outages?

jxl38 · May 17, 2023, 4:00pm

Also, to be clear, I’m using gpt-3.5-turbo. GPT-4 is completely unusable for production applications. Got a lot of one star reviews because of the aggressive rate limit on gpt-4. Removed it from the application altogether.

BrianLovesAI · May 19, 2023, 4:01am

Yeah, I am facing that sometimes too…

You might see these errors one day as well.

Error 1.
That model is currently overloaded with other requests. You can retry your request, or contact us through our help center at help.openai.com if the error persists. (Please include the request ID xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx in your message.)

Error 2.
The server is currently overloaded with other requests. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if the error persists.

N2U · May 19, 2023, 8:10am

Just for clarification:

The error “Too Many Requests” is indeed related to the rate limit; I’d suggest implementing exponential backoff. As mentioned earlier, it’s not currently possible to have your rate limit increased for GPT-4.

The error “That model is currently overloaded with other requests” is exactly that: OpenAI’s servers are currently overloaded and unable to process your request. It has nothing to do with your personal rate limit.

Think of this as a temporary issue while OpenAI is still trying to scale their infrastructure; the problem will go away by itself.
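For anyone who hasn’t implemented exponential backoff before, here is a minimal Python sketch of the idea. It assumes your HTTP client raises some exception on a 429. The `RateLimitError` class and the `retry_with_backoff` helper below are hypothetical stand-ins, not the API of any particular library, and the default delays are arbitrary.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever your client raises on a 429."""


def retry_with_backoff(call, max_retries=5, base_delay=1.0, max_delay=60.0,
                       sleep=time.sleep):
    """Run `call()`, retrying on RateLimitError with exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # The delay doubles on every attempt (1s, 2s, 4s, ...),
            # capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay * random.uniform(0.5, 1.0))
```

You would wrap the actual API call in a function or lambda and pass it in, e.g. `retry_with_backoff(lambda: send_request(payload))`, where `send_request` is your own code. The same wrapper could also be used for the “overloaded” error, since the message itself says the request can be retried.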

daviddoswell · May 21, 2023, 2:57am

Just an update to the OpenAI team: I don’t seem to be rate limited anymore(?) and even ChatGPT-4 is faster and not rate limiting me. Bravo to the team for working hard on this stuff.

codie · May 21, 2023, 5:44am

If you guys have that much traffic I’d definitely move to something more scalable. You should check out Microsoft Azure’s version of OpenAI. You’ll have to apply. The application is free. They will ask you about your company, what you are trying to do, etc. I got approved in a little less than a week if I remember correctly.

Azure OpenAI Service – Advanced Language Models | Microsoft Azure

I don’t know if OpenAI intends to be an AI service provider or just a research company. But I would definitely recommend getting onto something with higher availability if you have that much usage.

firtina · May 21, 2023, 10:11am

How do the OpenAI direct vs azure models compare and differ in terms of latency and functionality?

codie · May 21, 2023, 5:55pm

@keith_knox2 has some experience migrating his project from OpenAI to Azure, so maybe he can chime in on the differences he has seen in terms of performance. In terms of functionality, the only thing that you will be unfamiliar with is the endpoint and the way Azure lets you access “resources”. The API commands themselves and the responses are the same, so if you have a good setup, you can just make an Azure interface for your code.

Azure OpenAI Service REST API Reference
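To make the endpoint difference concrete, here is a rough Python sketch of the kind of thin interface described above. It only builds the URL and headers for a chat completions call. The function name `build_chat_request` is made up, the resource and deployment names are placeholders you get from your own Azure setup, and the default `api_version` value is an assumption, so check the REST reference for the version you should use.

```python
def build_chat_request(provider, api_key, *, resource=None, deployment=None,
                       api_version="2023-05-15"):
    """Return (url, headers) for a chat completions call on either service.

    Hypothetical helper; the api_version default is an assumption.
    """
    if provider == "openai":
        # The model is chosen in the JSON body via the "model" field.
        url = "https://api.openai.com/v1/chat/completions"
        headers = {"Authorization": f"Bearer {api_key}"}
    elif provider == "azure":
        if not resource or not deployment:
            raise ValueError("azure needs both resource and deployment")
        # The model is chosen by the deployment name in the URL.
        url = (f"https://{resource}.openai.azure.com/openai/deployments/"
               f"{deployment}/chat/completions?api-version={api_version}")
        headers = {"api-key": api_key}
    else:
        raise ValueError(f"unknown provider: {provider!r}")
    headers["Content-Type"] = "application/json"
    return url, headers
```

With the URL and headers isolated like this, the rest of the request code (message list, response parsing) can stay the same for both providers.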
