Hello. Does anyone know a way to restrict the number of output characters without the response being cut off? I’ve tried setting `max_tokens`, but the reply gets truncated mid-sentence, and I’ve tried adding “the response should be x characters or less” to the prompt, but sometimes it still goes over. Is there a better way to do this?
Hi!
It’s tough. The models aren’t good at counting in general, and they have no real concept of characters in their output anyway — they generate tokens, not characters.
One approach would be to generate your output, count the characters programmatically, and, if it’s over the limit, pass that count back to the model and ask it to try again. After a few passes you should hit your target.
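To make that concrete, here’s a rough sketch of the count-and-retry loop in Python. `shorten_to_limit`, `call_model`, and the limits are all made up for illustration — `call_model` is a hypothetical stand-in for whatever function wraps your actual API request:

```python
# Minimal sketch of the count-and-retry loop described above.
# `call_model` is a hypothetical stand-in for your real completion call.

def shorten_to_limit(call_model, prompt, max_chars=280, max_passes=3):
    """Generate a reply, then re-prompt until it fits within max_chars."""
    reply = call_model(prompt)
    for _ in range(max_passes):
        if len(reply) <= max_chars:
            break
        # Feed the measured length back and ask for a shorter rewrite.
        reply = call_model(
            f"Your previous reply was {len(reply)} characters, which is "
            f"over the limit. Rewrite it in at most {max_chars} "
            f"characters:\n\n{reply}"
        )
    return reply

# Stand-in "model" for demonstration: the first reply is too long,
# the retry comes back short enough.
_responses = iter(["word " * 100, "A short enough reply."])
result = shorten_to_limit(lambda p: next(_responses), "Summarize X.", max_chars=280)
```

Note it can still fail after `max_passes` tries, so check the final length before using the reply (or truncate as a last resort).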
Providing the model with the desired number of sentences or paragraphs also frequently works. If you are working with bullets, specifying the number of bullets tends to work well, too.
As @Diet has indicated, counting at a more granular level (words or characters) fails most of the time.
I’d say counting the characters programmatically and asking the model to try again, as @Diet suggested, is the only method you can really trust.