Token per minute rate limit for GPT4 issues

georgeai · September 8, 2023, 2:17pm

Hello! I am using the GPT4 API on Google Sheets, and I constantly get this error: “You have reached your token per minute rate limit…”. I checked the documentation and it seems that I have 10,000 Tokens Per Minute limit, and a 200 Requests Per Minute Limit.
Is it my idea or is the 10,000 token per minute limitation very strict? Do you know how to increase that, or at minimum, control it in a more efficient way so it doesn’t break my entire workflow?

_j · September 8, 2023, 4:08pm

It would take gpt-4 far over a minute to generate 10000 output tokens, so the issue is likely how much input you are providing that counts towards the token per minute count.

Consider: if you send 6000 tokens of input (and even get a quick short answer), you can’t do that again in the same minute.

Rate increase requests can be made, but approval probably needs a company, a desired application of AI, and payment history (and gpt-4 capacity).

georgeai · September 8, 2023, 5:16pm

Thanks, that makes sense. Do you know if the max tokens in the input are calculated towards this limit or if only the actual token input size matters? For example, if my max token input is 2k tokens, but my actually input is only 100 tokens, which of the two numbers gets accounted for in the TPM ?

_j · September 8, 2023, 5:24pm

Yes, max tokens are also counted and a single input denied if it comes to over the limit. You can get a rate limit without any generation just by specifying max_tokens = 5000 and n=100 (500,000 of 180,000 for 3.5-16k).

The rate limit endpoint calculation is also just a guess based on characters; it doesn’t actually tokenize the input.

georgeai · September 8, 2023, 5:41pm

Wow, that is very useful to know. This knowledge alone saved me a lot of money and pain. Thanks!!

_j · September 8, 2023, 9:29pm

You can just omit max_tokens as a parameter, and it then can’t count them upon submissions. All the remaining model context length after the input can then be used for writing output.

EricGT · December 22, 2023, 9:58am

As this topic has a selected solution, closing topic.

Topic		Replies	Views
Inputs tokens limit, data extraction API gpt-4 , gpt-35-turbo , api , token , rate-limit	2	6168	February 3, 2024
Rate limit reached for 10KTPM-200RPM API gpt-4 , gpt-35-turbo	10	6437	October 24, 2023
Please explain the Tokens per minute metric API	1	5386	January 21, 2024
How do I get token limit per MINUTES and when it will reset? API gpt-4	2	2317	December 30, 2023
Reproducable GPT4 Rate limit bug Bugs	5	1050	November 8, 2023

Token per minute rate limit for GPT4 issues

Related topics