max_tokens isn’t used to influence the length of the response; it only sets a hard limit, and the output is simply cut off when that limit is reached.
If you want a longer response, you’ll need to ask for it through prompt engineering (“Write a 4-paragraph article about…”). The model is also not great at hitting exact word counts; see the related discussion.
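
To make the cut-off behaviour concrete, here is a toy sketch in Python. It does not call any real API: `apply_max_tokens`, the whitespace "tokenizer", and the `"length"` / `"stop"` labels are all made up for illustration. The point is only that the limit truncates whatever text was going to be produced and does not change what that text is.

```python
def apply_max_tokens(tokens, max_tokens):
    """Toy model of a hard token limit (hypothetical helper, not a real API).

    Keeps at most `max_tokens` tokens and reports whether the text was cut.
    """
    truncated = len(tokens) > max_tokens
    kept = tokens[:max_tokens]
    # Illustrative labels: "length" = cut off by the limit, "stop" = finished naturally.
    stop_reason = "length" if truncated else "stop"
    return kept, stop_reason


# Pretend this is the text the model was going to write anyway.
# Splitting on whitespace is a stand-in for real tokenization.
planned = "The quick brown fox jumps over the lazy dog .".split()

# A small limit chops the text mid-sentence.
short, reason_short = apply_max_tokens(planned, 4)

# A large limit does not make the text any longer than planned.
full, reason_full = apply_max_tokens(planned, 100)

print(" ".join(short), "|", reason_short)
print(" ".join(full), "|", reason_full)
```

Raising the limit from 4 to 100 only removes the truncation; the text itself stays the same. Getting a longer answer means changing the prompt, not the limit.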