I cannot understand why I can only set max_tokens to 4095 when the documentation says that gpt-4o and many of the other models have much larger context windows?
Welcome to the dev forum.
There’s a difference between input and output tokens: the large context window applies to everything you send in, while max_tokens only caps what the model generates back. If you’re lucky, you can get 4095 out on applicable models.
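For example, a minimal sketch with the Python SDK (the model name and output cap are from the docs; the prompt itself is just a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# The 128k context window covers the prompt plus the completion,
# but max_tokens only limits the completion the model generates.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this very long document: ..."}],
    max_tokens=4095,  # output cap; asking for more than the model's output limit raises an error
)
print(response.choices[0].message.content)
```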
Hope this helps.
But shouldn’t I be able to have a context window of 128k tokens? If I do the exact same query inside ChatGPT it works, but not with the API. It stops because of max_tokens.
What model are you using?
Are you getting an error? Just not as much content?
What are you trying to accomplish?
If you do the exact same query in ChatGPT, you are getting a max_tokens of 1536 or 2048.
The fact that you’re satisfied with what ChatGPT returns shows you don’t need to set it that high.
We can guess the output limit was set lower on new models to reduce platform costs, to limit the damage if the AI goes bonkers and wants to write $4.00 of nonsense output, or simply because the response devolves at that length.
You can simply omit this parameter and get the maximum available after sending your input.
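Roughly like this, assuming the Python SDK (again, the prompt is just a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# No max_tokens here: the model can generate up to its own output limit,
# within whatever room remains in the context window after the input.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a detailed outline for ..."}],
)
print(response.choices[0].message.content)
```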