Clarification for max_tokens

Yes, a completion is the response from the LLM. The word “completion” comes from the original models, which simply returned the most probable continuation of your input text. So basically, it is autocompletion.
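
To make the autocompletion idea concrete, here is a toy sketch, not how a real LLM works internally. It assumes a tiny made-up corpus, treats each word as one token (real models use subword tokens), and uses hypothetical helper names `build_bigram_table` and `complete`. It repeatedly appends the most frequent next word and stops once `max_tokens` words have been generated, which is roughly the role that limit plays for a completion.

```python
from collections import Counter, defaultdict


def build_bigram_table(corpus: str) -> dict:
    """Count, for each word, which words follow it in the corpus."""
    words = corpus.split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def complete(prompt: str, table: dict, max_tokens: int) -> str:
    """Greedily append the most frequent next word, at most max_tokens times."""
    last = prompt.split()[-1]
    completion = []
    for _ in range(max_tokens):
        candidates = table.get(last)
        if not candidates:
            break  # nothing ever followed this word in the corpus
        last = candidates.most_common(1)[0][0]
        completion.append(last)
    return " ".join(completion)


# Made-up corpus, for illustration only.
corpus = "the cat sat on the mat and the cat sat on the sofa"
table = build_bigram_table(corpus)

print(complete("the", table, max_tokens=3))  # cat sat on
print(complete("the", table, max_tokens=5))  # cat sat on the cat
```

Note that the prompt itself is not part of the returned text: in this sketch, as in the explanation above, the completion is only what gets generated after your input, and `max_tokens` caps the length of that generated part.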