Different gpt-4 level models to mitigate rate limiting issues

kreut · September 13, 2023, 11:07am

Hi,

In my account it says 90,000 TPM. Also, I’m using gpt-4 for all of my requests. Since this model allows only 40,000 TPM, would a potential rate-limiting mitigation strategy be to switch to a different gpt-4 level model such as gpt-4-0314 to give my organization access to an additional 40,000 TPM? Or, is the 40,000 TPM limit for any of the gpt-4 models?

Thanks so much!

Foxalabs · September 13, 2023, 11:12am

Hi and welcome to the Developer Forum!

The rate limits are per organisation, so swapping models would not help in this situation. You’re not the first to have thought of it.

kreut · September 13, 2023, 11:23am

Hi,

I understand that the rate limits are per organization, but if I have 90,000 TPM for chat models for my organization, can I use up 40,000 TPM for gpt-4 and then use up another 40,000 TPM for a different gpt-4 model (gpt-4-0314)? Does that make more sense?

_j · September 13, 2023, 11:24am

The rate limits cannot be pooled across models, as each class of model has its own TPM limit.

You can look at the decreasing rate limit count in the headers of requests to see which specific models would affect the counts of others.
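As a rough illustration of that header check, here is a minimal sketch. It assumes the API response carries the `x-ratelimit-limit-tokens`, `x-ratelimit-remaining-tokens` and `x-ratelimit-reset-tokens` headers; the function names `parse_rate_limit_headers` and `shares_pool` are hypothetical helpers, not part of any library. You would pass in the headers from whatever HTTP client you use.

```python
def parse_rate_limit_headers(headers):
    """Pull the token rate-limit counters out of a response's headers.

    `headers` is any mapping of header name to string value.
    Header names are assumed, so check them against a real response.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        value = lowered.get(name)
        return int(value) if value is not None else None

    return {
        "limit_tokens": as_int("x-ratelimit-limit-tokens"),
        "remaining_tokens": as_int("x-ratelimit-remaining-tokens"),
        # The reset value is a duration string, kept as-is here.
        "reset_tokens": lowered.get("x-ratelimit-reset-tokens"),
    }


def shares_pool(remaining_before, remaining_after):
    """Crude check: did model B's remaining count drop after calling model A?

    Both arguments are `remaining_tokens` values read from requests to
    model B, taken just before and just after a large request to model A.
    The counter refills over time, so a drop is suggestive, not proof.
    """
    return remaining_after < remaining_before
```

For example, read `remaining_tokens` from a small gpt-4-0314 request, send a large gpt-4 request, then read the gpt-4-0314 count again. If it has dropped by roughly the size of the gpt-4 request, the two models are probably drawing on the same limit.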

Foxalabs · September 13, 2023, 11:26am

Yes, it makes perfect sense, but the rate limit will still kick in at 40k regardless of whether you swap models from “gpt-4” to “gpt-4-0314” or any variation thereof. The whole point of rate limits is to ensure the system remains performant for everyone, so there is a lot of load balancing and careful allocation of new resources to new applications going on. Bypassing this would cause problems for everyone.
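If the 40k limit is shared across the gpt-4 variants, one way to stay under it is to track your own token spend on the client side. The following is only a sketch of a sliding-window budget: the `TokenBudget` class is a hypothetical helper, the 60-second window is an assumption, and the server's actual accounting may differ from this.

```python
import time
from collections import deque


class TokenBudget:
    """Client-side sliding-window tracker for a tokens-per-minute cap."""

    def __init__(self, tpm_limit, window_seconds=60.0, clock=time.monotonic):
        self.tpm_limit = tpm_limit
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the class can be tested
        self.events = deque()       # (timestamp, tokens) pairs, oldest first
        self.used = 0               # tokens spent inside the current window

    def _expire(self):
        # Drop spends that have aged out of the window.
        cutoff = self.clock() - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, tokens = self.events.popleft()
            self.used -= tokens

    def try_spend(self, tokens):
        """Record the spend and return True if it fits, else return False."""
        self._expire()
        if self.used + tokens > self.tpm_limit:
            return False
        self.events.append((self.clock(), tokens))
        self.used += tokens
        return True


# One shared budget for every gpt-4 variant, per the answer above.
gpt4_budget = TokenBudget(tpm_limit=40_000)
```

Before each request, call `gpt4_budget.try_spend(estimated_tokens)` and wait or queue the request when it returns `False`, whichever gpt-4 variant the request targets.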

kreut · September 13, 2023, 11:29am

Got it! And if I switch to a different model altogether, such as 3.5, would I be able to use those additional TPM up to my total allotted 90K TPM?

Foxalabs · September 13, 2023, 11:33am

I’ve not actually tried, but that would seem reasonable and in keeping with the documentation.
