Setting max tokens for output issues

Kiraultra · January 25, 2024, 5:23pm

Getting really long responses sometimes, so I have made this prompt in backend

“You are an extremely helpful assistant, be concise and relevant. All responses should use 250 completion_tokens or less.”

Although it’s following this rule, it’s now cutting mid sentence. Sometimes it finished the query correctly in under 250 tokens, sometimes, it cuts mid sentences while answering the exact same query.

Any solution to this?

Also, Any solution to streaming text format structure?

jr.2509 · January 25, 2024, 5:57pm

Hi there - rather than being specific about the tokens, I would just try some variations with the prompt. By being explicit that the response should be concise, you should be able to achieve this goal. You could e.g. add to respond with only one concise sentence.

Alternatively, you may try to instruct the model to only return complete sentences.

I can’t comment on the streaming question.

anon22939549 · January 25, 2024, 5:59pm

Why not just actually set max_tokens = 250?

_j · January 26, 2024, 3:21am

Because the setting does not inform the model in any way what type of response to construct.

You instead get text that is truncated, which is the symptom seen here.

The AI doesn’t perceive tokens, words, etc in the same way we see them. It doesn’t have an attention mechanism to total all tokens of its response for every generated token and have that predict the manner in which things should be phrased in continuation.

If you want a length, the best way to break down the task is “three paragraphs, averaging ten words each” or similar instruction.

anon22939549 · January 26, 2024, 5:46am

Fair enough.

This is solid advice for more when targeting specific lengths. Personally, for short responses I usually just go with something along the lines of, “your response must be sharp, concise, and terse.”

Topic		Replies	Views
Struggling with max_tokens and getting responses within a given limit, please help! API chatgpt	5	18220	October 28, 2023
Max_tokens seems to do nothing for me 3.5 Turbo API	14	3294	December 18, 2023
Is it possible to have the response fit inside the max token limit? API gpt-35-turbo	2	2695	December 19, 2023
Question regarding max_tokens Prompting	11	37463	December 13, 2023
Can I set max_tokens for chatgpt turbo? API	23	27673	December 13, 2023

Setting max tokens for output issues

Related topics