Error Encountered When Using max_tokens Parameter with GPT-4 API

Welcome to the OpenAI community @defendershow

The reason you’re encountering `length` as the `finish_reason` is that your input is large enough to consume most of the model’s context length, which causes the generated response to be truncated.

The `max_tokens` value is checked before sampling, and in your case:

`input + max_tokens > context length`

Hence the request results in a 400 error.
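A quick way to avoid the 400 is to verify the budget before sending the request. Here's a minimal sketch using the `tiktoken` library, assuming gpt-4's 8,192-token context window (check your model's actual limit); note it counts raw prompt tokens only and ignores the small per-message overhead chat formatting adds:

```python
import tiktoken

CONTEXT_LENGTH = 8192  # assumed gpt-4 context window; adjust for your model

def fits_in_context(prompt: str, max_tokens: int, model: str = "gpt-4") -> bool:
    """Return True if prompt tokens + max_tokens stay within the context window."""
    encoding = tiktoken.encoding_for_model(model)
    input_tokens = len(encoding.encode(prompt))
    return input_tokens + max_tokens <= CONTEXT_LENGTH

if not fits_in_context("your very long prompt here...", max_tokens=2000):
    print("Request would exceed the context window; shorten the input or lower max_tokens.")
```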

Also, with the Chat Completions API, if you don’t specify `max_tokens`, it automatically defaults to all of the remaining context length after your input.
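So one simple option is to omit `max_tokens` entirely and check `finish_reason` on the result. A sketch, assuming the current `openai` Python SDK:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Summarize the history of Rome."}],
    # max_tokens omitted: the completion may use whatever context remains
)
# "length" means the output was still cut off by the context limit
print(response.choices[0].finish_reason)
```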
