Gpt-4-1106-preview in Playground needs some fixes

Hello everyone!
Like everyone else here, I rushed to try out the API version of gpt-4-1106-preview.
I was excited when I saw on the Playground screen that the maximum length could be set to 119,999!
It seems like a dream come true to me.
I immediately tried to feed it a prompt I use with other models, about 14K tokens long, but the lovely response I received was the one you see in the attached pic (that the gpt-4 tokens-per-minute limit is stuck at 10k).
I then tried with a shorter prompt and discovered that the max_tokens parameter is still capped at 4096.
Yes, the model is gpt-4-1106-preview.
All this despite it being said that “GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API and we plan to release the stable production-ready model in the coming weeks”.
I am infinitely sad about this, but probably something needs to be fixed in the playground. I’ll try via CLI and I’ll keep you updated.
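For anyone scripting around this, the two numbers in play are different things: the 128k figure is the total context window, while the 4096 cap applies to max_tokens (the completion). A minimal sketch of a clamp, assuming those two figures from this thread; clamp_max_tokens is just an illustrative helper, not an SDK function:

```python
# Assumed figures from this thread (subject to change by OpenAI):
COMPLETION_CAP = 4096      # max_tokens ceiling the Playground enforces
CONTEXT_WINDOW = 128_000   # total input + output tokens for gpt-4-1106-preview

def clamp_max_tokens(requested: int, prompt_tokens: int) -> int:
    """Return a max_tokens value the API should accept: never more than the
    output cap, and never more than the room left in the context window."""
    room_left = CONTEXT_WINDOW - prompt_tokens
    return max(0, min(requested, COMPLETION_CAP, room_left))

print(clamp_max_tokens(119_999, 14_000))  # -> 4096: the output cap wins
```

So a 14K-token prompt is fine to send; it's only the requested completion length that has to come down to 4096.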


I’m Tier 3; according to this page that gives me an 80k limit for GPT-4, but the error says 40k. So I guess I’d need to go up one tier to Tier 4, which would give me a 300k limit with GPT-4, or maybe 150k with the new model. It seems they haven’t added the new model(s) to that page.


Running into the 10k limit as well. Also, I’m having some issues with file uploading. I keep getting:

When uploading anything larger than a few KB, regardless of its actual file type.


According to the official page, there are tokens per minute (TPM) limitations based on how much you have already paid to use the API. This is the summary:

  • Tier 0: 10,000 TPM
  • Tier 1 ($5 paid): 20,000 TPM
  • Tier 2 ($50 paid and 7+ days since first successful payment): 40,000 TPM
  • Tier 3 ($100 paid and 7+ days since first successful payment): 80,000 TPM
  • Tier 4 ($250 paid and 14+ days since first successful payment): 300,000 TPM
  • Tier 5 ($1,000 paid and 30+ days since first successful payment): 300,000 TPM

Apparently, to be able to use the full 128k-token window (GPT-4 Turbo’s maximum) in a single request, you need to be Tier 4+.
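As a sanity check, the tier table above can be encoded and queried. The values below are transcribed from this thread (OpenAI may change them at any time), and min_tier_for_context is just a hypothetical helper:

```python
# GPT-4 TPM limits per tier, as listed in this thread.
TPM_BY_TIER = {0: 10_000, 1: 20_000, 2: 40_000, 3: 80_000, 4: 300_000, 5: 300_000}

def min_tier_for_context(context_tokens: int) -> int:
    """Lowest tier whose TPM limit admits one request of this size per minute."""
    for tier in sorted(TPM_BY_TIER):
        if TPM_BY_TIER[tier] >= context_tokens:
            return tier
    raise ValueError("no tier admits a request this large in one minute")

print(min_tier_for_context(128_000))  # -> 4: a full-window request needs Tier 4+
```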


This seems to apply to the regular GPT-4 and maybe the 32k version (?), but at Tier 3 I can only use half of the limit (40k) listed there for the new model.

It would be very good if OpenAI changed it so that, with a 40k/minute limit, a 120k-token request simply meant waiting 3 minutes until the next request, instead of failing outright.
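The pacing idea above is easy to approximate client-side: treat the TPM limit as a per-minute budget and wait enough whole minutes for an oversized request to fit. A rough sketch:

```python
import math

def wait_minutes(request_tokens: int, tpm_limit: int) -> int:
    """Whole minutes to wait so a request of this size fits the TPM budget."""
    return math.ceil(request_tokens / tpm_limit)

print(wait_minutes(120_000, 40_000))  # -> 3, matching the example above
```

This only approximates server-side accounting (the API may use a sliding window rather than fixed minutes), but it keeps a batch job from tripping the limiter.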

Yeah, same problem here. The true limit is half of what it should be. The official page describes the limits for gpt-4 (not gpt-4-1106-preview), but even for gpt-4, when I log into my account the limits described for my tier are half of what they should be.



According to the documentation the model gpt-4-1106-preview max context length is 128,000 tokens.
However, when I use the api it returns this error.

"error": {
  "message": "Rate limit reached for gpt-4-1106-preview in organization org-caigTai6iXXJrP5PXEM04Hd0 on tokens per min. Limit: 40000 / min. Please try again in 1ms. Visit OpenAI Platform to learn more.",
  "type": "tokens",
  "param": null,
  "code": "rate_limit_exceeded"
}

So either the documentation is wrong, or the endpoint is.



This is the reason:

You are probably Tier 2 (40,000 TPM limit)


All tiers seem designed to be insufficient for offering GPT-4 large-context services to anyone but yourself.

Looking at it another way: the GPT-4-turbo-preview rate limit caps input at about $180 per hour (or half that if your limit remains halved).
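That $180/hour figure checks out if you assume the announced gpt-4-1106-preview input price of $0.01 per 1K tokens and the Tier 4/5 limit of 300,000 TPM from the list earlier in the thread:

```python
TPM = 300_000              # Tier 4/5 tokens-per-minute limit, from this thread
PRICE_PER_1K_INPUT = 0.01  # assumed gpt-4-1106-preview input price, USD

# tokens/min * 60 min = tokens/hour; divide by 1000 and multiply by price
max_hourly_spend = TPM * 60 / 1000 * PRICE_PER_1K_INPUT
print(f"${max_hourly_spend:.0f}/hour")  # -> $180/hour
```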

I am Tier 4, and even though I select gpt-4-1106-preview in the playground, the maximum length can only be set to 4095 :frowning:


Output length is still limited to 4095 with gpt-4-turbo. Context (input) should have increased.


There is a hard limit of 4096 tokens on the response, regardless of tier. GPT-4 Turbo | OpenAI Help Center


All of the chat models currently have limited output sliders.

I can see a few practical reasons:

  • Too many new users misunderstand the setting, specify the full context length, and then can’t make the API work;
  • The setting corresponds to what these models will actually produce, because they have been trained to output curtailed lengths to keep ChatGPT users in line;
  • If a 128k run goes into a repeat loop, it can cost you several dollars.

I uploaded a jpg file and got message below:

Unfortunately, I cannot directly view or interact with file uploads or their content.

Am I missing something? Otherwise, what is the point of “uploading an image”?

Where are you seeing the ability to add a file in the playground? Are you on the new “assistants” drop down option or the “chat” drop down?

Could you show a screenshot?

Yeah it’s the Assistants option

too many tokens

And, while the file size limit may be 512 MB, there is apparently still a token limit on the text within the file. Looks like it’s 2,000,000. I resolved it by breaking the file into two parts.
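If you hit that 2,000,000-token ceiling, splitting can be automated. A rough sketch using the common ~4-characters-per-token heuristic (for exact counts you'd want a tokenizer such as tiktoken; the 2M figure is the limit observed above):

```python
def split_text(text: str, max_tokens: int = 2_000_000, chars_per_token: int = 4):
    """Split text into pieces under an approximate token budget.

    Uses the rough 4-chars-per-token heuristic, so leave some headroom;
    a real tokenizer gives exact counts.
    """
    max_chars = max_tokens * chars_per_token
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

# Tiny demo: a 100-char string with a 10-token (~40-char) budget -> 3 parts
parts = split_text("x" * 100, max_tokens=10)
print(len(parts))  # -> 3
```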


How do you even upload a file? I’m getting errors all the time, whether it’s png, jpg, txt, pdf, or csv; the size doesn’t matter.

You can see the upload button in the screenshot here: Gpt-4-1106-preview in Playground needs some fixes - #16 by bleugreen

I have only tried uploading pdfs, and the only time they have failed is when they exceeded the token limit.

By reading this thread I got some questions answered already. What still confuses me is the max file limit. I tried to create a GPT acting basically as a knowledge provider by giving it some PDFs of our company data but some PDFs are really large. It seems retrieval always needs to scan the whole PDF and can’t be pointed to a specific page, even when providing a table of contents and a keyword index.

So either the embedding/vectorization/search index should be improved, or the 20-file limit should be removed so that one could upload one PDF per page and tell the system prompt to use the table of contents to find the respective page in a PDF named “Page_XXXX.pdf”.

Edit: Since I am also limited by my current tier, I tried the more flexible gpt-3.5-turbo-1106 instead, and it could more or less handle the big files, BUT it just stopped at some point and I had to remind it to actually answer my questions. It felt as if it had to handle so much context that it forgot what I wanted from it after scanning the whole large file.