Is the "output (Maximum length)" for the GPT-4-1106-preview API still capped at 4095?

yk.kazuyuki · November 8, 2023, 3:30pm

Hello!
Thank you for announcing the GPT-4-1106-preview API. I’m very moved and grateful. There’s talk that it can process a context of 128k, but is the output still limited to 4095 tokens?

From what I have researched so far, there are two pieces of information:

1. Unless an OpenAI account reaches Tier 4, the 300,000 TPM (tokens per minute) limit cannot be utilized, which means that without reaching Tier 4, one cannot test the capability of producing outputs around 120k.
2. Regardless of the tier reached, the output is capped at 4095 tokens.

Which of these pieces of information is correct? I would like to know.

_j · November 8, 2023, 3:52pm

The output was never capped. Someone is just looking at a slider control in the API playground that doesn’t go higher than the old model.

The playground slider might not go higher right now. The playground is also not where API developers that write their own software interact with AI models.

API users have a rate limit: the number of tokens they can send to and request from AI models per minute. If a single API request is bigger than your rate limit for an entire minute of use, you will be blocked until you reduce your input or your max_tokens value, or receive an upgrade by purchasing more credit from OpenAI.
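
To make the point above concrete, here is a minimal sketch of that check. It assumes the limiter counts the prompt tokens plus the requested max_tokens against the per-minute budget; the function name and the 40,000 TPM figure are hypothetical, and only the 300,000 TPM figure comes from this thread.

```python
def fits_rate_limit(prompt_tokens: int, max_tokens: int, tpm_limit: int) -> bool:
    """Return True if a single request's token budget fits in one minute's limit.

    Assumption: the budget charged for a request is the prompt size plus the
    requested max_tokens. The real accounting may differ.
    """
    if prompt_tokens < 0 or max_tokens < 0 or tpm_limit < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_tokens <= tpm_limit


# A 120k-token prompt with a 4k reply fits under a 300,000 TPM limit...
print(fits_rate_limit(120_000, 4_000, 300_000))  # True
# ...but not under a hypothetical 40,000 TPM limit on a lower tier.
print(fits_rate_limit(120_000, 4_000, 40_000))   # False
```

If the check fails, the only options are the ones listed above: shrink the input, lower max_tokens, or get a higher limit.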

wout.vandenwijngaert · November 15, 2023, 1:18pm

Sorry, but you are wrong. There is a real 4k output limit. It even says it in the docs here: OpenAI Platform

It is a shame, though. Having a massive 128k context window but only 4k generations. Really limits the possibilities.

_j · November 15, 2023, 1:25pm

Yes, I was wrong a week ago. OpenAI has since clarified that the new model will never produce more than 4k tokens of output.

Another limitation for you to find (besides rate limits): the API endpoint returns an error when you try to send more than 32768 characters to any model.
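
For anyone budgeting requests around the 4k output cap, a rough sketch of how to pick a max_tokens value might look like this. The defaults are assumptions, not values confirmed in this thread: 128,000 for the context window and 4,096 for the output cap (the thread only says "128k" and "4k"). The function name is hypothetical.

```python
def plan_max_tokens(
    prompt_tokens: int,
    requested: int,
    context_window: int = 128_000,  # assumed size of the "128k" context
    output_cap: int = 4_096,        # assumed value of the "4k" output cap
) -> int:
    """Pick a max_tokens value that respects both the output cap and the
    space left in the context window after the prompt."""
    room = context_window - prompt_tokens
    if room <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, output_cap, room)


# Asking for 20k of output with a 100k prompt gets clamped to the cap.
print(plan_max_tokens(100_000, 20_000))  # 4096
# Near the end of the window, the remaining room is the tighter limit.
print(plan_max_tokens(127_000, 20_000))  # 1000
```

So a large prompt is fine, but the reply is bounded by the smaller of the cap and whatever room is left.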
