Documented max_tokens default is incorrect for gpt-4-vision-preview

*) send an API request to chat completions with this payload:
{"model": "gpt-4-vision-preview", "messages": [{"role": "user", "content": "hello, tell me about Philipp Lengauer"}]}
*) get this response (part of the json):
"message": {
  "role": "assistant",
  "content": "Philipp Lengauer is not a widely known public figure, so there isn"
},
"finish_details": {
  "type": "max_tokens"
},
"index": 0
*) you see that the sentence is cut off after 16 tokens, and finish_details reports that the output stopped because max_tokens was reached, even though max_tokens was never set.
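The repro above can be sketched as follows. This is a minimal sketch: the payload is the one from the request above, the `max_tokens=300` override is an arbitrary example value, and the actual HTTP call (which needs the `requests` package and a valid API key) is only shown in a comment.

```python
import json

# Payload from the repro above; max_tokens is deliberately NOT set,
# so the documented "infinite" default should apply.
payload = {
    "model": "gpt-4-vision-preview",
    "messages": [
        {"role": "user", "content": "hello, tell me about Philipp Lengauer"}
    ],
}

# The workaround: override the low default explicitly.
payload_with_override = dict(payload, max_tokens=300)

# Sending it would look roughly like this (not executed here):
#   import os, requests
#   r = requests.post(
#       "https://api.openai.com/v1/chat/completions",
#       headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
#       json=payload_with_override,
#   )
#   print(r.json()["choices"][0])

print(json.dumps(payload_with_override, indent=2))
```

Without the explicit `max_tokens`, the first payload reproduces the 16-token cutoff described above.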

PROBLEM: the doc clearly states here that the max_tokens field, if not set, defaults to infinite. I only get a full response if I set it explicitly to something bigger than 16. So that's either a bug in the doc, OR a bug in the default limit.
I would prefer the fix to be in the default limit. After all, the new GPT-4 Turbo model with the same context length actually defaults to infinite, and an infinite default relieves the API user from having to estimate or count tokens manually just to set max_tokens to something that won't trigger API errors (max_tokens plus the prompt tokens must not exceed the total context length). With the low default in place, that calculation suddenly becomes necessary. It is very tedious to do, and as far as I understand, currently only possible in Python.
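The manual calculation I mean looks roughly like this. A sketch only: the 128k context window for gpt-4-vision-preview is the documented total context length, the prompt token count is an assumed number (in practice it would come from the Python-only tiktoken library), and any separate per-model output cap is ignored.

```python
# Sketch of the token budget a caller is forced to compute once a
# low max_tokens default exists.
CONTEXT_WINDOW = 128_000  # gpt-4-vision-preview total context length

def max_completion_tokens(prompt_tokens: int,
                          context_window: int = CONTEXT_WINDOW) -> int:
    """max_tokens + prompt tokens must not exceed the context window."""
    budget = context_window - prompt_tokens
    if budget <= 0:
        raise ValueError("prompt alone already fills the context window")
    return budget

# e.g. a prompt measured at 1,500 tokens (via tiktoken, in practice)
print(max_completion_tokens(1_500))  # → 126500
```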


it turns out this is documented here:
Vision - OpenAI API

Currently, GPT-4 Turbo with vision does not support the parameter, functions/tools, response_format parameter, and we currently set a low max_tokens default which you can override.


Fine, but that doesn't make the doc I linked correct: it still says the default is infinite!