My interpretation of `max_tokens` is that it specifies an upper bound on the length of the generated completion.
However, the documentation is confusing. I am referring to the official OpenAI API documentation, which says:
> The maximum number of [tokens](https://beta.openai.com/tokenizer) to generate in the completion.
>
> The token count of your prompt plus `max_tokens` cannot exceed the model's context length. Most models have a context length of 2048 tokens (except for the newest models, which support 4096).
So the documentation first describes `max_tokens` as the maximum number of tokens to generate in the completion, but then it states that `prompt tokens + max_tokens` must stay under roughly 4000. I mention 4000 because it is approximately the maximum context length for the davinci model.
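As I understand it, the two statements are a single constraint: the completion can be at most `max_tokens` long, *and* the prompt plus that budget must fit in the context window. A small sketch of that arithmetic (hypothetical token counts, no actual API call):

```python
# Hypothetical check of the documented constraint:
# prompt tokens + max_tokens must not exceed the model's context length.

DAVINCI_CONTEXT_LENGTH = 4096  # context length of the newest models per the docs


def fits_in_context(prompt_tokens: int, max_tokens: int,
                    context_length: int = DAVINCI_CONTEXT_LENGTH) -> bool:
    """Return True if a request with this prompt size and max_tokens
    budget would fit within the model's context window."""
    return prompt_tokens + max_tokens <= context_length


# A 3000-token prompt with a 1000-token completion budget fits (4000 <= 4096),
# but raising max_tokens to 1500 would exceed the window (4500 > 4096)
# and the API would reject the request.
print(fits_in_context(3000, 1000))  # True
print(fits_in_context(3000, 1500))  # False
```

On this reading, `max_tokens` is a cap on the completion only, and the error case comes from the sum, not from `max_tokens` by itself.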
So which is it?
Is `max_tokens` the maximum number of tokens that will be generated in the completion?
Thank you for answering my question and explaining how this plays out with each model's context length. This is super helpful for understanding the order of operations and when I could actually hit an error from mishandling these values (my brain works backwards from how these terms behave, I guess).