Max_tokens limits the total tokens used instead of the output tokens

Welcome to the Forum Nicolas!

A couple of points in response to your issue:

  1. By default, the latest models are limited to 4,096 output tokens, independent of the context window size, so that is the absolute maximum you could get. The number of output tokens the model can actually return is further constrained by the number of input tokens you send, because input and output together must fit within the context window. For example, with gpt-3.5-turbo and its 16,385-token context window, providing 14,000 input tokens leaves only 2,385 tokens available for the output.

  2. In practice, the model rarely returns the full 4,096 output tokens. Besides the number of input tokens, the second factor that influences the output length is your prompt. There are certain approaches and wordings you can use to get more detailed responses, sometimes reaching over 3,000 tokens. It typically takes a bit of trial and error.

  3. The max_tokens parameter does not increase how many tokens a model produces in response to a specific prompt. It is simply a means to cap the model's response at a maximum number of tokens. For example, if you set the value to 200, the model's response will be cut off at exactly 200 tokens - even if that is in the middle of a sentence (see the sketch after this list).
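
To illustrate points 1 and 3, here is a minimal sketch using the openai Python SDK (assuming v1.x of the library; the model name, prompt, and max_tokens value are just placeholders for your own setup). With max_tokens=200 the reply is cut off after at most 200 tokens, and finish_reason comes back as "length" instead of "stop". The usage object also shows how your input tokens eat into the overall context-window budget:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",  # placeholder model
    messages=[{"role": "user", "content": "Explain tokenization in detail."}],
    max_tokens=200,  # hard cap: generation stops after 200 output tokens
)

choice = response.choices[0]
print(choice.message.content)

# "length" means the cap was hit mid-generation; "stop" means the model
# finished on its own before reaching max_tokens.
print("finish_reason:", choice.finish_reason)

# Input and output tokens share the context window.
print("prompt tokens:", response.usage.prompt_tokens)
print("completion tokens:", response.usage.completion_tokens)
```

Note that raising max_tokens only raises the ceiling; it does not make the model write more than it otherwise would.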

Bearing these three points in mind, perhaps you can share details on what you are trying to achieve, including an example prompt, and we may be able to offer some additional ideas on how to increase your output tokens.
