According to the official page, there are tokens per minute (TPM) limitations based on how much you have already paid to use the API. This is the summary:
- Tier 0: 10,000 TPM
- Tier 1 ($5 paid): 20,000 TPM
- Tier 2 ($50 paid and 7+ days since first successful payment): 40,000 TPM
- Tier 3 ($100 paid and 7+ days since first successful payment): 80,000 TPM
- Tier 4 ($250 paid and 14+ days since first successful payment): 300,000 TPM
- Tier 5 ($1,000 paid and 30+ days since first successful payment): 300,000 TPM
Apparently, to be able to use the full 128k-token window (the GPT-4 Turbo maximum), you need to be Tier 4+.
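Rather than relying on the tier table, you can check the limit that actually applies to your account from the rate-limit headers the API attaches to every response (`x-ratelimit-limit-tokens`, `x-ratelimit-remaining-tokens`, etc. are documented by OpenAI; the small helper below is our own, not part of any SDK):

```python
# Sketch: read the tokens-per-minute limit out of OpenAI's documented
# rate-limit response headers. The helper name is hypothetical.
def tpm_limit_from_headers(headers: dict) -> int:
    """Return the tokens-per-minute limit the API reports for your account."""
    return int(headers["x-ratelimit-limit-tokens"])

# Example headers as they might look on a Tier 2 account:
sample = {
    "x-ratelimit-limit-tokens": "40000",
    "x-ratelimit-remaining-tokens": "39500",
}
print(tpm_limit_from_headers(sample))  # 40000
```

With raw HTTP these headers are on the response object; SDKs vary in how they expose them.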
This seems to apply to the normal GPT-4 (and maybe the 32k variant?), but at Tier 3 I can only use half of the limit listed there (40k instead of 80k) with the new model.
It would be very good if OpenAI would change it so that, with a 40k/minute limit, a 120k-token request simply meant waiting 3 minutes until the next request, instead of being rejected outright.
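That waiting behavior can be approximated client-side already: treat an oversized request as consuming several minutes' worth of the token budget and pause accordingly (a sketch with a hypothetical function name; it assumes the budget refills once per minute, which is a simplification of how the limiter actually works):

```python
import math

def minutes_until_next_request(request_tokens: int, tpm_limit: int) -> int:
    """Whole minutes a single request 'uses up' of the per-minute token
    budget, e.g. a 120k request against a 40k TPM limit costs 3 minutes.
    Assumes the budget refills once per minute (a simplification)."""
    return math.ceil(request_tokens / tpm_limit)

print(minutes_until_next_request(120_000, 40_000))  # 3
```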
Yeah, same problem here. The true limit is half of what it should be. The official page describes the limits for gpt-4 (not gpt-4-1106-preview), but even for plain gpt-4, when I log into my account the limits shown for my tier are half of what they should be.
aitest
Hi.
According to the documentation, the model gpt-4-1106-preview has a max context length of 128,000 tokens.
However, when I use the API it returns this error:
{
  "error": {
    "message": "Rate limit reached for gpt-4-1106-preview in organization org-caigTai6iXXJrP5PXEM04Hd0 on tokens per min. Limit: 40000 / min. Please try again in 1ms. Visit OpenAI Platform to learn more.",
    "type": "tokens",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
So either the documentation is wrong, or the endpoint is wrong.
Thanks
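One practical detail: the error message embeds a suggested retry delay ("Please try again in 1ms"). That hint can be parsed out before sleeping and retrying (a sketch; the message format is not a documented contract and may change, so the regex is an assumption):

```python
import re
from typing import Optional

def suggested_retry_seconds(message: str) -> Optional[float]:
    """Extract the 'Please try again in Xms' / 'Xs' hint from a
    rate-limit error message. Returns seconds, or None if absent."""
    m = re.search(r"try again in (\d+(?:\.\d+)?)(ms|s)", message)
    if not m:
        return None
    value, unit = float(m.group(1)), m.group(2)
    return value / 1000 if unit == "ms" else value

print(suggested_retry_seconds("Please try again in 1ms."))  # 0.001
```

A retry loop would `time.sleep()` on that value (plus some jitter) before resending the request.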
This is the reason:
You are probably Tier 2 (40,000 TPM limit)
_j
All tiers are, by design, insufficient for offering GPT-4 large-context services to anyone but yourself.
Looking at it another way: the gpt-4-turbo-preview rate limit caps out at about $180 of input per hour (or half that, if your limit remains halved).
pxp121
10
I am Tier 4, and even though I select gpt-4-1106-preview in the playground, the maximum length can only be set to 4095 
Output length is still limited to 4,095 with gpt-4-turbo. It is the context (input) that should have increased.
irad
There is a hard limit of 4,096 tokens on the response, regardless of tier: GPT-4 Turbo | OpenAI Help Center
_j
All of the chat models currently have limited output sliders.
I can see a few practical reasons:
- Too many new users misunderstand the setting, specify the full context length, and can't make the API work;
- The setting matches what these models will actually produce, because they have been trained to output curtailed lengths to keep ChatGPT responses short;
- If a 128k-context generation goes into a repeat loop, it can cost you several dollars.
rplay
I uploaded a jpg file and got the message below:
Unfortunately, I cannot directly view or interact with file uploads or their content.
Am I missing something? Otherwise, what is the point of "uploading an image"?
Where are you seeing the ability to add a file in the playground? Are you on the new "Assistants" drop-down option or the "Chat" drop-down?
Could you show a screenshot?

And, while the file size limit may be 512 MB, there is still apparently a token limit on the text within a file; it looks like it's 2,000,000. I resolved it by breaking the file into two parts.
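Splitting a file under a token budget can be automated. Here is a rough sketch using the common ~4-characters-per-token heuristic instead of a real tokenizer (the 2,000,000 figure is the apparent limit mentioned above; for exact counts you would run a tokenizer such as tiktoken over the text):

```python
def split_by_token_budget(text: str, max_tokens: int = 2_000_000,
                          chars_per_token: int = 4) -> list[str]:
    """Split text into pieces that each stay under max_tokens,
    estimated at ~4 characters per token (a heuristic, not exact)."""
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]
```

For anything near the limit, leave headroom: the heuristic undercounts for dense text, code, and non-English languages.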
How do you even upload a file? I'm getting an error all the time, whether it's png, jpg, txt, pdf, or csv; the size doesn't matter.
You can see the upload button in the screenshot here: Gpt-4-1106-preview in Playground needs some fixes - #16 by bleugreen
I have only tried uploading pdfs, and the only time they have failed is when they exceeded the token limit.
Anphex
By reading this thread I got some questions answered already. What still confuses me is the max file limit. I tried to create a GPT acting basically as a knowledge provider by giving it some PDFs of our company data but some PDFs are really large. It seems retrieval always needs to scan the whole PDF and can’t be pointed to a specific page, even when providing a table of contents and a keyword index.
So either the embedding/vectorization/search index should be improved, or the 20-file limit should be removed, so you could upload one PDF per page and tell the system prompt to use the table of contents to find the respective page in a PDF named "Page_XXXX.pdf".
Edit: Since I am also limited by the current tier limits, I tried the more flexible gpt-3.5-turbo-1106, and it could more or less handle the big files, BUT it just stopped at some point and I had to remind it to actually answer my questions. It felt as if it had to handle so much context that it forgot what I wanted from it, due to having to scan the whole large file.
Same problem. I want to translate books, but I had to split one into 80 parts first. The book was around 260,000 tokens. I would love to split it into only two parts.
The 128k context window is misleading, because as I understand it, that is not the output limit.
Currently I am limited to 4,096 output tokens max using the gpt-4-1106-preview API model.
_j
Yes, this is lame, but it is also exactly what costs actual money and what they have already curtailed in AI training: quality attention heads scale quadratically with longer sequence lengths.
openai.error.InvalidRequestError: max_tokens is too large: 75000. This model supports at most 4096 completion tokens, whereas you provided 75000.
So for input/output-balanced tasks like checking spelling, this model gains you nothing except lower quality.
Edit: confirmed.
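For the 75,000-token job in the error above, the 4,096-completion-token cap means the work has to be chunked into many calls (simple arithmetic, assuming output length roughly matches input length, as it does for spell-checking):

```python
import math

job_tokens = 75_000            # tokens the failed request asked for
max_completion_tokens = 4_096  # hard per-call output cap

# Minimum number of API calls if each call maxes out its completion:
calls_needed = math.ceil(job_tokens / max_completion_tokens)
print(calls_needed)  # 19
```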
dan912
I am in usage Tier 4 but am still seeing a 32k limit in the gpt-4-turbo Assistants playground. How do I get to the 128k limit?