A way to reduce response length without a truncated finish_reason of length

I am developing an application for a very small display (the size of a wristwatch), and I want the responses to be terse. I thought max_tokens would inform the model so it could formulate a response within that token budget, but it just truncates the output.
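
For reference, here's roughly the call I'm making (a minimal sketch using the OpenAI Python SDK; the model name and token cap are just placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Ask a question, but cap the output hard at 30 tokens.
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Explain what a barometer measures."}],
    max_tokens=30,
)

choice = resp.choices[0]
print(choice.message.content)
# When the cap is hit, the text is cut off mid-sentence and
# finish_reason is "length" rather than "stop".
print(choice.finish_reason)
```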

I am exploring some prompt engineering, like "Please respond briefly", but I'd much rather have a reliable way to limit the response without it being truncated, i.e. without ending up with a finish_reason of length instead of stop.
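
On the prompt-engineering route, what I'm experimenting with is putting a hard, concrete constraint in the system message rather than a vague "be brief" (a sketch; the wording and word count are just what I'm trying):

```python
messages = [
    {
        "role": "system",
        "content": (
            "Your answer is shown on a wristwatch-sized display. "
            "Respond in one sentence of at most 15 words. "
            "No preamble, no lists."
        ),
    },
    {"role": "user", "content": "Explain what a barometer measures."},
]
```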

Maybe what I'll do is this: if the response exceeds the limit I want, I send it back with the prompt "Say that more tersely", and try that a couple of times before falling back to delivering a truncated response, which is an undesirable user experience.
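
Something like the following is what I have in mind (a sketch; MAX_CHARS, the retry count, and the model name are arbitrary):

```python
MAX_CHARS = 120   # what fits on the display; arbitrary for this sketch
MAX_RETRIES = 2

def terse_reply(client, messages, model="gpt-4o-mini"):
    """Return (text, was_truncated), retrying with a "tersely" prompt first."""
    resp = client.chat.completions.create(model=model, messages=messages)
    text = resp.choices[0].message.content

    for _ in range(MAX_RETRIES):
        if len(text) <= MAX_CHARS:
            return text, False  # fits on the display, no truncation needed
        # Feed the long answer back and ask for a terser version.
        messages = messages + [
            {"role": "assistant", "content": text},
            {"role": "user", "content": "Say that more tersely."},
        ]
        resp = client.chat.completions.create(model=model, messages=messages)
        text = resp.choices[0].message.content

    if len(text) <= MAX_CHARS:
        return text, False
    # Give up and truncate as a last resort.
    return text[:MAX_CHARS], True
```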