When I send a chat completion request to GPT-4o with the following body:
{
    "frequency_penalty": 0,
    "max_tokens": 32000,
    "messages": [
        {
            "content": "What is the best thing to do with 2lbs of hamburger?",
            "role": "user"
        }
    ],
    "model": "gpt-4o",
    "temperature": 0.7,
    "top_p": 1
}
I get back the following error:
{
    "error": {
        "message": "max_tokens is too large: 32000. This model supports at most 4096 completion tokens, whereas you provided 32000.",
        "type": null,
        "param": "max_tokens",
        "code": null
    }
}
My understanding was that max_tokens could be as large as the model's context window, which is 128k for GPT-4o.
What am I missing?
From the chat completions API documentation:
max_tokens
integer or null
Optional
The maximum number of tokens that can be generated in the chat completion.
The total length of input tokens and generated tokens is limited by the model’s context length. Example Python code for counting tokens.
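To illustrate the distinction the error message is drawing, here is a minimal sketch of the validation the server appears to apply. The `MODEL_COMPLETION_CAPS` mapping and `validate_max_tokens` helper are hypothetical names of my own, not part of the OpenAI API; the assumption baked in is that gpt-4o's 128k context window bounds input plus output combined, while generated (completion) tokens are separately capped at 4096:

```python
from typing import Optional

# Hypothetical mapping, not an official API constant: gpt-4o accepts up to
# ~128k tokens of context, but caps *generated* tokens at 4096.
MODEL_COMPLETION_CAPS = {"gpt-4o": 4096}


def validate_max_tokens(model: str, max_tokens: int) -> Optional[str]:
    """Return an error message mimicking the API's, or None if the value is OK."""
    cap = MODEL_COMPLETION_CAPS.get(model)
    if cap is not None and max_tokens > cap:
        return (
            f"max_tokens is too large: {max_tokens}. This model supports at most "
            f"{cap} completion tokens, whereas you provided {max_tokens}."
        )
    return None


# A request asking for 32000 completion tokens trips the cap,
# while 4096 or fewer passes.
print(validate_max_tokens("gpt-4o", 32000))
print(validate_max_tokens("gpt-4o", 4096))
```

So the 128k figure limits the total of input plus output tokens, while `max_tokens` only governs the generated portion, and must stay within the model's smaller completion cap.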