Can I set max_tokens for chatgpt turbo?

Running into the same issue.

I am developing for a very small display (about the size of a wristwatch), and I want the responses to be terse. I thought max_tokens would tell the model to fit its response within that many tokens, but as people have said here, it just truncates the response.

I am exploring some prompt engineering, like, “Please respond briefly”, but I’d much rather have a way to limit the response…

Maybe what I’ll do is this: if the response exceeds the limit I want, I send the response back with the prompt “say that more tersely”, and try that a couple of times before delivering a truncated response, which is an undesirable user experience.
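
Here is a rough sketch of that retry idea in Python, in case it helps anyone else. It is not tied to any particular client library: `ask` is a hypothetical stand-in for whatever function sends a prompt to the model and returns the reply text, and `terse_reply`, `max_words` and `max_retries` are names I made up. I am also assuming a whitespace word count is a good enough proxy for display space; it is not a real token count.

```python
from typing import Callable


def terse_reply(
    ask: Callable[[str], str],  # hypothetical: sends a prompt, returns reply text
    prompt: str,
    max_words: int,
    max_retries: int = 2,
) -> str:
    """Ask once, then ask for a terser version while the reply is too long.

    Falls back to truncating the last reply if the retries run out.
    """
    reply = ask(prompt)
    for _ in range(max_retries):
        # Word count is only an approximation of tokens / display space.
        if len(reply.split()) <= max_words:
            return reply
        reply = ask(f"Say that more tersely: {reply}")

    words = reply.split()
    if len(words) <= max_words:
        return reply
    # Last resort: the truncated response I was hoping to avoid.
    return " ".join(words[:max_words])
```

Each retry is another round trip, so on a watch the latency may matter more than the occasional truncated reply. That is why the retry count is small by default.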