Thanks for confirming. I looked into the Responses API, but unfortunately it doesn’t support n > 1, which my app needs for generating multiple options at once.
For now, I’m just dropping max_completion_tokens entirely from my chat completions call so it doesn’t break. It’s a bummer losing the hard cap on costs though, so hopefully the team fixes that parameter on the Chat Completions endpoint soon.
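In case it helps anyone hitting the same thing, here’s a minimal sketch of what my request params look like now — keeping n > 1 but leaving out max_completion_tokens. Parameter names follow the OpenAI Python SDK; the model string and the helper function are just placeholders:

```python
# Hypothetical sketch: build kwargs for client.chat.completions.create,
# keeping n > 1 but deliberately omitting max_completion_tokens.
def build_chat_kwargs(prompt: str, n_options: int = 3) -> dict:
    kwargs = {
        "model": "gpt-4o",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
        "n": n_options,  # generate multiple candidate completions in one call
    }
    # No "max_completion_tokens" here: passing it was breaking the call,
    # so output length stays uncapped until the endpoint is patched.
    return kwargs

params = build_chat_kwargs("Suggest three taglines.")
# The dict would then be passed straight through, e.g.:
# client.chat.completions.create(**params)
```

The upside of building the kwargs in one place is that re-adding the token cap later is a one-line change.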