Clarification for max_tokens

My interpretation of `max_tokens` is that it specifies an upper bound on the length of the generated code.

However, the documentation is confusing. I am referring to the official OpenAI API documentation, which says:

> The maximum number of tokens to generate in the completion.
>
> The token count of your prompt plus `max_tokens` cannot exceed the model's context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096).

So at first the documentation says it is the maximum number of tokens to generate in the completion. But then it states that the token count of the prompt plus the completion must be under 4000. I mention 4000 because it is the maximum token limit for the davinci model.

So which is it?

  1. Is it the maximum number of tokens that will be generated in the completion?
  2. Or is it that the token count of the prompt plus the completion must be under 4000?
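To make the question concrete, here is a minimal sketch of the arithmetic if both statements in the documentation hold at the same time, i.e. `max_tokens` caps the completion and the prompt plus `max_tokens` must also fit in the context length. The function names and the default of 2048 are my own assumptions for illustration, not part of the OpenAI API, and the prompt token count is assumed to be already known.

```python
# Hypothetical helpers (not part of any OpenAI library) illustrating the
# documented constraint: prompt tokens + max_tokens <= context length.

def max_allowed_completion_tokens(prompt_tokens: int, context_length: int = 2048) -> int:
    """Largest max_tokens value that still fits next to the prompt."""
    if prompt_tokens < 0 or context_length <= 0:
        raise ValueError("prompt_tokens must be >= 0 and context_length must be > 0")
    # If the prompt alone fills or exceeds the context, nothing is left.
    return max(context_length - prompt_tokens, 0)


def request_fits(prompt_tokens: int, max_tokens: int, context_length: int = 2048) -> bool:
    """True if the prompt plus the requested completion budget fits the context."""
    return prompt_tokens + max_tokens <= context_length


if __name__ == "__main__":
    # A 1000-token prompt with a 4096-token context leaves room for 3096 tokens.
    print(max_allowed_completion_tokens(1000, 4096))
    print(request_fits(1000, 3096, 4096))
    print(request_fits(1000, 3097, 4096))
```

Under this reading, option 1 describes what `max_tokens` means and option 2 describes a limit on how large it can be set for a given prompt.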