Issue Description: I am encountering a 400 Bad Request error when attempting to use the max_tokens parameter in a request to the GPT-4 API. The issue occurs when I try to limit the number of tokens in the response by sending the following request:
"model": "gpt-4",
"messages": [
{
"role": "system",
"content": "Ты бот-помощник для составления списков продуктов. Твоя главная задача давать точные ответы на запросы пользователя."
},
{
"role": "user",
"content": f"{message}"
},
],
"max_tokens": 2048
When I remove the max_tokens parameter from the request, the response is returned without any errors.
"model": "gpt-4",
"messages": [
{
"role": "system",
"content": "Ты бот-помощник для составления списков продуктов. Твоя главная задача давать точные ответы на запросы пользователя."
},
{
"role": "user",
"content": f"{message}"
},
],
#"max_tokens": 2048
I have reviewed the documentation but found no clear information on how to correctly use the max_tokens parameter with the GPT-4 API. I would appreciate assistance in resolving this issue.
I also want to add the real reason I am using this parameter:
I have been encountering an issue with the GPT-4 API where the generated responses are being truncated prematurely, even when the total tokens are under the specified or default limit. The finish_reason parameter in the API response indicates length as the cause of truncation, suggesting that the response was cut off due to reaching a token limit, although the total tokens have not reached the limit.
The response received is truncated and the finish_reason parameter returns length. The completion_tokens count in the response is well below the maximum limit, yet the response ends abruptly without completing the information requested.
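Concretely, the truncation shows up in the response fields like this (a sketch assuming resp is the HTTP response from a request like the one above; choices[0].finish_reason and the usage block are the documented fields):

data = resp.json()                          # parsed chat completions response
choice = data["choices"][0]
print(choice["finish_reason"])              # "length" -> generation stopped at a token limit
print(choice["message"]["content"])         # the truncated answer
print(data["usage"]["prompt_tokens"])       # tokens consumed by the input messages
print(data["usage"]["completion_tokens"])   # tokens actually generated, well below 2048
print(data["usage"]["total_tokens"])        # prompt + completion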
I have tried varying the max_tokens parameter, but the issue persists. When max_tokens is not specified, the default behavior seems to truncate responses prematurely. I am seeking a solution to receive complete responses for the queries sent to the GPT-4 API, without any arbitrary truncation before reaching the token limit.
I have noticed similar issues being reported by other users on the OpenAI Developer Forum, where responses are being truncated with finish_reason indicating length, despite the total tokens being under the maximum limit.
I would appreciate any assistance or guidance on how to resolve this issue to ensure complete and detailed responses from the GPT-4 API. Please let me know if there’s any additional information required to diagnose and address this issue.
It seems that the key point is that the token limit applies to the input and output combined, i.e. if I have a large input query, the output will be cut off?
The reason you’re encountering length as the finish_reason is that your input is large enough to consume most of the model’s context length, which leaves little room for the generated response and causes it to be truncated.
max_tokens is checked before sampling, and in your case: input tokens + max_tokens > context length.
Hence the request results in a 400 error.
Also, in the case of chat completions, when max_tokens is not specified, the remaining context length (everything apart from the input) is used as the completion limit automatically.
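A practical way to avoid both the 400 and the silent truncation is to count the input tokens first and shrink max_tokens so that input + max_tokens stays inside the context window. A minimal sketch, assuming the tiktoken library and the 8,192-token context of the base gpt-4 model; the per-message overhead constants are rough approximations, not an exact accounting:

import tiktoken

CONTEXT_LENGTH = 8192  # context window of the base gpt-4 model

def count_prompt_tokens(messages, model="gpt-4"):
    enc = tiktoken.encoding_for_model(model)
    tokens = 0
    for m in messages:
        tokens += 4                           # approximate per-message overhead
        tokens += len(enc.encode(m["content"]))
    return tokens + 3                         # approximate overhead for the reply priming

def safe_max_tokens(messages, requested=2048, margin=16):
    used = count_prompt_tokens(messages)
    available = CONTEXT_LENGTH - used - margin
    return max(1, min(requested, available))  # never request more than fits in the context

messages = [
    {"role": "system", "content": "You are an assistant bot for compiling grocery lists."},
    {"role": "user", "content": "Plan a shopping list for a week of dinners."},
]
print(safe_max_tokens(messages))  # pass this as max_tokens instead of a fixed 2048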