I think you might be confused by what the max_tokens parameter actually does.
It doesn't set the context window of the model; it caps how many tokens the model can output in a single response.
That cap tops out at 4,096 output tokens, and I believe that's the case for just about every model out there right now, not just OpenAI's.
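
To make that concrete, here's a minimal sketch using the openai Python SDK (the model name and prompt are placeholders, and the limit of 100 is just for demonstration). When the reply gets cut off at `max_tokens`, `finish_reason` comes back as `"length"`:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder model name
    messages=[{"role": "user", "content": "Write me a long story."}],
    max_tokens=100,  # caps the *output* only, not the context window
)

choice = response.choices[0]
print(choice.message.content)
# "length" here means the output hit the max_tokens cap
print(choice.finish_reason)
```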
There is no parameter for adjusting the context window of the model itself. That's fixed at 128k, and it's up to you to manage what fits inside it on your own, unless you use the Assistants API.
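
To illustrate what "manage that on your own" can look like, here's one rough approach (a sketch, not the only way to do it): count tokens with tiktoken and drop the oldest turns until the conversation fits under a budget. The model name, budget, and per-message overhead are assumptions for illustration:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-4-turbo")  # placeholder model name


def trim_history(messages, budget=128_000, reserve=4_096):
    """Drop the oldest non-system messages until the rough token
    count fits in the context window, leaving room for the output."""

    def count(msgs):
        # Rough estimate; exact per-message overhead varies by model.
        return sum(len(enc.encode(m["content"])) + 4 for m in msgs)

    msgs = list(messages)
    while len(msgs) > 1 and count(msgs) > budget - reserve:
        # Keep the system prompt at index 0; drop the oldest turn after it.
        msgs.pop(1)
    return msgs
```

You'd call `trim_history` on your message list before every request, so the prompt plus the reserved output room never exceeds the window.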