Why is gpt-3.5-turbo-1106 max_tokens limited to 4096?

At some point, gpt-3.5-turbo-1106 switched to limiting max_tokens to 4096. I believe it should be higher, since I think max_tokens includes both prompt and generation tokens. If I am correct, gpt-3.5-turbo-1106 should allow values up to its 16,385-token context limit.

I used to be able to pass max_tokens of at least 10,000, but now I get an error. I understand that the model will only return a maximum of 4,096 output tokens, but I believe max_tokens itself should still be allowed to be higher.

Has anyone else seen this issue?

edit: I now see this in the docs:

max_tokens = The maximum number of tokens to generate in the chat completion.

Does it no longer include prompt tokens?
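If I read the current docs right, max_tokens now counts completion tokens only, and the 1106 models enforce a separate 4,096 output cap on top of the usual context-window check. A rough sketch of the arithmetic (the constants are the published limits for gpt-3.5-turbo-1106; the function name is just illustrative):

```python
# Token-budget arithmetic for gpt-3.5-turbo-1106, as I understand it.
CONTEXT_WINDOW = 16_385  # prompt + completion must fit within this
OUTPUT_CAP = 4_096       # separate hard cap on completion tokens

def request_is_valid(prompt_tokens: int, max_tokens: int) -> bool:
    """A request must satisfy both constraints: the prompt plus the
    requested completion must fit in the context window, AND
    max_tokens may not exceed the model's output cap."""
    fits_context = prompt_tokens + max_tokens <= CONTEXT_WINDOW
    under_output_cap = max_tokens <= OUTPUT_CAP
    return fits_context and under_output_cap

print(request_is_valid(prompt_tokens=5_000, max_tokens=10_000))  # False: 10,000 > 4,096
print(request_is_valid(prompt_tokens=5_000, max_tokens=4_096))   # True
```

So under the old reading (max_tokens spanning the whole context), 10,000 was a perfectly sensible value; under the new reading it always fails the output-cap check.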


Yes, all the 1106 models, i.e. gpt-3.5 and gpt-4, have this limitation. I wonder if it has to do with capacity issues. I am hoping that the non-preview/production version will remove this limitation, but that is a long shot, I think. These new models feel like one step forward, two steps back. Of course, for the majority of use cases, 4,096 output tokens may be ample.


Hey, although the original post and the comment make sense, I just wanted to ask for a quick confirmation.

I am using gpt-3.5-turbo-1106 (I can use gpt-4-1106-preview too) with the chat.completions API. The response_format parameter in my API call is set to return a JSON object. I’ve noticed that the JSON output is limited to 4096 tokens in my use case. Does this mean GPT-3.5 and GPT-4 can only produce a maximum of 4096 tokens via the chat.completions API?
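For reference, my call looks roughly like this with the official openai Python SDK (the message content is just a placeholder); as far as I can tell, max_tokens above 4,096 is rejected regardless of the model's larger context window:

```python
# Sketch of a chat.completions request against the 1106 models.
# The kwargs are built separately so the output cap is explicit;
# pass them to client.chat.completions.create(**kwargs).
kwargs = {
    "model": "gpt-3.5-turbo-1106",  # 16,385-token context window
    "messages": [{"role": "user", "content": "Return a JSON object."}],
    "response_format": {"type": "json_object"},  # JSON mode
    "max_tokens": 4_096,  # anything higher is rejected by the API
}

# from openai import OpenAI
# client = OpenAI()
# response = client.chat.completions.create(**kwargs)
```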

GPT-4-Turbo and GPT-3.5-Turbo are limited to 4,096 output tokens; the GPT-4-Turbo API is limited to 128k input tokens.


