Inconsistent handling of `max_tokens` parameter in chat completions API

jamief · September 30, 2024, 1:21pm

The chat completions API has always allowed clients to pass a null value for the max_tokens parameter. For example, this request yields a valid chat completion:

curl "https://api.openai.com/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{
        "model": "gpt-4o-mini",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Write a haiku that explains the concept of recursion."
            }
        ],
        "max_tokens": null
    }'

{
  "id": "chatcmpl-ADAV8Ndcsz7ebzuD2DnJ59RY9RffC",
  "object": "chat.completion",
  "created": 1727701994,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Loops within oneself,  \nA path that leads back again—  \nEchoes of the truth.  ",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 20,
    "total_tokens": 48,
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "system_fingerprint": "fp_f85bea6784"
}

However, if you try to do this with the new o1 models, the request fails with a 400 error:

curl "https://api.openai.com/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    -d '{
        "model": "o1-mini",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Write a haiku that explains the concept of recursion."
            }
        ],
        "max_tokens": null
    }'

{
  "error": {
    "message": "Invalid type for 'max_tokens': expected an unsupported value, but got null instead.",
    "type": "invalid_request_error",
    "param": "max_tokens",
    "code": "invalid_type"
  }
}

According to the documentation for the reasoning models, "The max_tokens parameter continues to function as before for all previous models."

But the behavior shown above differs between the previous models and the new o1 models: a null max_tokens is accepted by gpt-4o-mini and rejected by o1-mini. This forces clients to conditionally pass different values for this parameter depending on which model they’re using.
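One way a client might sidestep the special-casing is to leave the key out of the request body whenever no limit is set, rather than sending an explicit null. The sketch below only builds the JSON body and does not call the API. The helper name `build_chat_payload` is made up for illustration, and it rests on the assumption that both model families accept a request with no max_tokens key at all, which the requests above do not demonstrate.

```python
import json
from typing import Any, Optional


def build_chat_payload(
    model: str,
    messages: list[dict[str, str]],
    max_tokens: Optional[int] = None,
) -> dict[str, Any]:
    """Build a chat completions request body (hypothetical helper).

    The max_tokens key is omitted entirely when no limit is given,
    so the body never contains "max_tokens": null.
    """
    payload: dict[str, Any] = {"model": model, "messages": messages}
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload


messages = [
    {
        "role": "user",
        "content": "Write a haiku that explains the concept of recursion.",
    }
]

# No limit requested: the key is simply absent from the serialized body.
print(json.dumps(build_chat_payload("o1-mini", messages), indent=2))

# Explicit limit: the key is included as an integer.
print(json.dumps(build_chat_payload("gpt-4o-mini", messages, max_tokens=100), indent=2))
```

The same body can then be sent with curl or any HTTP client; the point is only that the client code no longer branches on the model name to decide between null and a number.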

kbravh · October 23, 2024, 8:08pm

Looks like they’ve updated the documentation now to explain the new parameter, max_completion_tokens.
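For anyone who still needs to set a limit on both model families, here is a rough sketch of picking the parameter name per model. The prefix check and the helper names (`REASONING_PREFIXES`, `token_limit_param`, `with_token_limit`) are made up for illustration and are not part of any SDK. The sketch assumes o1 models take max_completion_tokens while earlier models keep max_tokens, so check the current documentation before relying on it.

```python
from typing import Any, Optional

# Hypothetical heuristic: model names starting with these prefixes are
# treated as reasoning models. Adjust to whatever models you actually use.
REASONING_PREFIXES = ("o1",)


def token_limit_param(model: str) -> str:
    """Return the name of the output-token limit parameter for a model."""
    if model.startswith(REASONING_PREFIXES):
        return "max_completion_tokens"
    return "max_tokens"


def with_token_limit(
    payload: dict[str, Any], limit: Optional[int]
) -> dict[str, Any]:
    """Return a copy of the request body with the limit set under the
    model-appropriate key, or unchanged if no limit is given."""
    result = dict(payload)
    if limit is not None:
        result[token_limit_param(payload["model"])] = limit
    return result


base = {"model": "o1-mini", "messages": [{"role": "user", "content": "Hello"}]}
print(with_token_limit(base, 500))
print(with_token_limit({**base, "model": "gpt-4o-mini"}, 500))
```

Keeping the model check in one small function means the rest of the client code can pass a single optional limit around without caring which key ends up in the request.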
