Any way to control assistant verbosity?

awolnikowski · January 16, 2024, 7:43pm

I’m having a hard time getting my RAG assistant to be concise. I’ve tried emphasizing in the assistant instructions that it should shoot for responses of 1-2 paragraphs and told it to be concise, succinct, direct, to-the-point, etc. - but it still almost always spits out walls of text that are 5-6 paragraphs long, minimum.

I’m providing it characteristics of the user (age group, interests, requested topics, etc.) and it seems like it feels the need to enumerate answers for every single provided interest or topic in a separate paragraph, but I don’t want to have to reduce the personalization info I’m providing just for it to be concise.

Has anyone found any good levers or prompting techniques to control assistant verbosity?

anon22939549 · January 16, 2024, 8:35pm

The system message used in the Android app and mobile web versions of ChatGPT included this text:

You are chatting with the user via the ChatGPT Android app. This means most of the time your lines should be a sentence or two, unless the user’s request requires reasoning or long-form outputs. Never use emojis, unless explicitly asked to. Never use LaTeX formatting in your responses, use only basic markdown.

That has been pretty effective at curtailing long responses.

Diet · January 16, 2024, 8:38pm

Some thoughts:

Is there any possibility of confusion in your prompt? I.e., you ask for 1-2 paragraphs but give it 4-5 characteristics. Could your prompt be interpreted to mean that you want 1-2 paragraphs per characteristic?
We have the opposite issue: sometimes responses are unexpectedly short (p ~0.05). But those are easy to detect, and we regenerate.

Overall, it’s generally a matter of prompting, but it’s always possible to get outliers. Do you wanna share your prompt?
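The detect-and-regenerate approach mentioned above could be sketched roughly as follows. This is only an illustration: `generate` is a hypothetical callable standing in for whatever code requests a reply from the assistant, and the word-count thresholds are assumed values, not ones taken from this thread.

```python
from typing import Callable, Optional


def generate_with_length_check(
    generate: Callable[[], str],  # hypothetical: returns one assistant reply per call
    min_words: int = 20,          # assumed lower bound, tune for your use case
    max_words: int = 150,         # assumed upper bound, tune for your use case
    max_attempts: int = 3,
) -> str:
    """Regenerate until the reply's word count is inside [min_words, max_words].

    If no attempt lands in range, return the attempt that came closest.
    """
    best = ""
    best_distance: Optional[int] = None
    for _ in range(max_attempts):
        reply = generate()
        n_words = len(reply.split())
        if min_words <= n_words <= max_words:
            return reply
        # How far outside the accepted range this attempt is.
        distance = min_words - n_words if n_words < min_words else n_words - max_words
        if best_distance is None or distance < best_distance:
            best, best_distance = reply, distance
    return best
```

The same check works in both directions, so it can also catch the walls of text described in the original question, at the cost of extra requests.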

anon22939549 · January 16, 2024, 8:41pm

Asking for an “executive summary” might work too.

It would be helpful, though, to see an example of the actual prompt you are using, to pinpoint where it is going astray.

Cristian74 · January 16, 2024, 8:44pm

I limit my Assistants API responses by including ‘reply in no more than 40 words’ in the instructions; it seems to work.

Foxalabs · January 16, 2024, 9:35pm

If you have control over the chunk size in your RAG implementation, it’s worth trying a smaller value, or possibly going for sentence embeddings, and seeing if that reduces the verbosity. I’ve noticed that with large chunk sizes as context, the response is often longer.
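As a rough sketch of what smaller, sentence-aligned chunks might look like, assuming you control the chunking step yourself: the splitter below is deliberately naive (it breaks on `.`, `!`, `?` followed by whitespace), and `max_chars` is an assumed knob, not a recommended value.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after ., ! or ? when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(text: str, max_chars: int = 300) -> list[str]:
    """Pack whole sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept as its own chunk.
    """
    chunks: list[str] = []
    current = ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Lowering `max_chars` gives the retrieval step less text per hit, which is the lever being suggested here; whether it shortens the answers in your setup is something to measure.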

Bunnyh · January 16, 2024, 10:21pm

I found it immensely successful to encourage the bot to imitate Blazon (see the Wikipedia article on Blazon).

Bunnyh · January 16, 2024, 10:23pm

Exact system message example:

Blazon-style concise & condensed english language. You may use even expert terminology if it helps to condense. Don’t explain terminology! Be sparse. Don’t give information that needs more context to be useful! Quantify!!

Topic: Debian GNU/Linux

Originally, I think it was “extremely concise” instead of just “concise”. The answers were then ridiculously short, and you pretty much always had to probe several times to get what you wanted, but they were certainly sparse.

awolnikowski · January 16, 2024, 10:50pm

As a heraldry and vexillology nerd, I love this solution!

lmccallum · January 16, 2024, 10:56pm

With a RAG application, in addition to instructing to be concise, I also instructed that users have access to the source materials to read too, so no need to repeat them at length. This immediately turned verbose answers into more succinct answers.

pupfish · January 17, 2024, 4:28am

This guy is very brief.

In this ‘extreme’ case, it helped to state that “it is acceptable for answers to be brief”. And using adjectives like concise and brief throughout the prompt, not just in one instruction.

Bunnyh · January 17, 2024, 8:32am

Overall, the most helpful advice for writing system messages is to avoid saying what shouldn’t be done and instead state what should be done so clearly that the opposite is implied.

Bunnyh · January 17, 2024, 8:58am

Here’s an example discussion about a problem I just had using this system message. No way everything from the beginning to the solution would have fit into a single screenshot by default.

bobbyllambert · January 21, 2024, 3:35am

Lots of great ideas given here. In some cases, where I’ve had to deal with stubborn bots that like to ramble, including a limit expressed in tokens (1 token is roughly 4 characters, or about three-quarters of a word, in English) has worked for me.
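To sanity-check whether a reply respects a token limit without pulling in a tokenizer, a crude estimate can be enough. The sketch below assumes the common rule of thumb of about 4 characters per token for English text; the function names are made up for illustration, and a real tokenizer library would give exact counts.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (assumes ~4 chars per token)."""
    return math.ceil(len(text) / chars_per_token)


def within_token_budget(text: str, budget: int) -> bool:
    """True if the estimated token count does not exceed the budget."""
    return estimate_tokens(text) <= budget
```

For example, `within_token_budget(reply, 60)` could be used to flag replies that overshoot a 60-token instruction so they can be logged or regenerated.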

f2618752 · January 21, 2024, 7:38am

V=<0-5>: control verbosity (default is X)

“Verbosity level [V]: choose a value between 0 and 5 to set verbosity level. The default setting is X. A lower value results in less detailed output while a higher value increases detail”
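One way this kind of verbosity parameter might be wired into a prompt is sketched below. It is only an illustration: the helper names are hypothetical, the default level is left for the caller to choose because the post does not specify it, and nothing here is a built-in API feature; the model simply reads the rule as part of its instructions.

```python
# Rule text adapted from the post; {default} is filled in by the caller.
VERBOSITY_RULE = (
    "Verbosity level [V]: choose a value between 0 and 5 to set verbosity level. "
    "The default setting is {default}. A lower value results in less detailed "
    "output while a higher value increases detail."
)


def check_level(level: int) -> int:
    """Validate that a verbosity level is an integer in the range 0..5."""
    if not 0 <= level <= 5:
        raise ValueError(f"verbosity level must be between 0 and 5, got {level}")
    return level


def build_instructions(base: str, default_level: int) -> str:
    """Append the verbosity rule to the base assistant instructions."""
    rule = VERBOSITY_RULE.format(default=check_level(default_level))
    return f"{base}\n\n{rule}"


def with_verbosity(user_message: str, level: int) -> str:
    """Prefix a user message with an explicit V=<level> setting."""
    return f"V={check_level(level)} {user_message}"
```

A user turn would then look like `with_verbosity("How do I pin a package?", 1)`, which yields `V=1 How do I pin a package?`.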

br1saturn · January 22, 2024, 1:42am

Giving it a length range in tokens worked for me (e.g., “Make it 25 to 30 tokens long”).

awolnikowski · January 22, 2024, 10:07pm

@f2618752 is this something you’re proposing to add to your prompt? Or do you see this somewhere in the documentation?

awolnikowski · January 24, 2024, 12:06am

Marking this the solution because I think that was the most unintuitive thing for me - that the model doesn’t seem to understand length in terms of paragraphs or sentences, but telling it to restrict itself to a certain amount of tokens works very well.
