Hi, Atty from OpenAI here — max_tokens continues to be supported in all existing models, but the o1 series only supports max_completion_tokens.
We are doing this because max_tokens previously meant two things at once: the number of tokens we generated (and billed you for) and the number of tokens you got back in your response. With the o1 models this is no longer true; we generate more tokens than we return, because reasoning tokens are not visible. Some clients may have depended on the previous behavior and written code that assumes max_tokens equals usage.completion_tokens, or the number of tokens they received. To avoid breaking those clients, we require you to opt in to the new behavior via a new parameter.
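To make the distinction concrete, here's a minimal sketch of how a client might pick the right parameter per model. The prefix check on the model name is an assumption for illustration, not an official compatibility rule:

```python
def token_limit_param(model: str, limit: int) -> dict:
    """Return the request field that caps generation for a given model.

    Assumption: o1-series models are identified by an "o1" name prefix.
    For them, max_completion_tokens caps ALL generated tokens (visible
    output plus hidden reasoning tokens); for other models, max_tokens
    behaves as before.
    """
    if model.startswith("o1"):
        return {"max_completion_tokens": limit}
    return {"max_tokens": limit}


# These dicts would be merged into the chat completion request body.
print(token_limit_param("o1-mini", 1000))   # {'max_completion_tokens': 1000}
print(token_limit_param("gpt-4o", 1000))    # {'max_tokens': 1000}
```

Note that with an o1 model, usage.completion_tokens can exceed the number of tokens you actually see in the response, so don't treat the two as equal.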
More documentation here: https://platform.openai.com/docs/guides/reasoning/controlling-costs