I am trying to generate short summaries of user comments. I specify the desired summary length in the prompt as 250 to 350 characters, but the model generates summaries much longer than that (>800 characters). Is there a way to enforce the target length?
What model are you using? You might consider a one-shot example so it better “knows” the length you want. Newer models are getting better at outputting a given number of words, but it’s difficult because the LLM is trained on “tokens”…
By one-shot example, I just mean: give it an example in the system prompt. Show it your input plus the expected output. If the example is only 200 or so characters, it shouldn’t add much to each generation, and the results will likely be a lot more stable.
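To make that concrete, here is a minimal sketch of the one-shot layout. The comment text, summary text, and `build_messages` helper are all invented for illustration; only the messages structure matters.

```python
# One-shot setup: the system prompt carries a single example input/output
# pair so the model can imitate the target summary length.
# All text here is made up for illustration.

EXAMPLE_COMMENT = (
    "The checkout page kept timing out on mobile, and when it finally "
    "loaded, my saved payment method was gone. Support took two days to "
    "reply and only sent a canned answer."
)

EXAMPLE_SUMMARY = (
    "User reports checkout timeouts on mobile, a lost saved payment "
    "method, and a slow, unhelpful support response. Main pain points: "
    "reliability of the checkout flow and support turnaround time."
)  # kept close to the 250-350 character target band

def build_messages(user_comment: str) -> list[dict]:
    """Assemble chat messages with the one-shot example in the system prompt."""
    system = (
        "Summarize the user comment in 250-350 characters.\n\n"
        f"Example input:\n{EXAMPLE_COMMENT}\n\n"
        f"Example summary:\n{EXAMPLE_SUMMARY}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_comment},
    ]

messages = build_messages("Great app, but sync drains my battery overnight.")
```

You would then pass `messages` to whatever chat-completion call you are already making.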
Multi-shot, by contrast, means user/assistant turns that simulate successful completions before the ultimate user input.
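A sketch of that multi-shot layout, with example pairs injected as fake prior turns. The example comments, summaries, and the `build_multishot` helper are invented for illustration.

```python
# Multi-shot layout: prior user/assistant turns simulate successful
# completions before the real input. Content is invented for illustration.

def build_multishot(examples: list[tuple[str, str]],
                    user_comment: str) -> list[dict]:
    """Interleave example (comment, summary) pairs as simulated prior turns."""
    messages = [{
        "role": "system",
        "content": "Summarize each user comment in 250-350 characters.",
    }]
    for comment, summary in examples:
        messages.append({"role": "user", "content": comment})
        messages.append({"role": "assistant", "content": summary})
    messages.append({"role": "user", "content": user_comment})
    return messages

examples = [
    ("Delivery was three days late and the box was damaged.",
     "Customer reports a late delivery and damaged packaging."),
    ("Love the new dark mode, but search is slower since the update.",
     "Positive on dark mode; complains search performance regressed."),
]
msgs = build_multishot(examples, "The app logged me out after every update.")
```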
Completion models pick up on this very quickly, to the point where they don’t even need an instruction. Here’s davinci (its intelligence being turned off in a month), with what it writes shown in color:
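The prompt in that screenshot was the classic completion-style few-shot pattern: no instruction at all, just repeated examples that end mid-pattern so the model continues it. A sketch (all the comments and summaries are invented):

```python
# Completion-style few-shot prompt, the kind a davinci-class model
# continues. No instruction; the model infers the format and length
# from the examples alone. Text is invented for illustration.

prompt_template = """\
Comment: The app crashes every time I open the camera.
Summary: User reports a reproducible crash when opening the camera.

Comment: Checkout is smooth but shipping costs appeared only at the last step.
Summary: Positive on checkout flow; complains shipping costs are shown too late.

Comment: {comment}
Summary:"""

filled = prompt_template.format(
    comment="Support never answered my refund request."
)
```

The model completes the text after the final `Summary:`, mimicking the length of the earlier summaries.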
Chat models now verge on “untrainable”: their attention goes mainly to answering the current question in the fine-tuned “chat format”. But you can try.
Best is an explicit specification, like “four sentences of output”.
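Whatever specification you use, it also helps to verify the output on your side and retry when it misses the band. A minimal sketch, where `generate` is a stand-in for whatever API call you already make:

```python
# Guard to pair with a length specification in the prompt: check the
# returned summary against the 250-350 character band and retry when
# it falls outside. `generate` is a placeholder for your actual call.

def within_band(text: str, lo: int = 250, hi: int = 350) -> bool:
    """True if the text length falls inside the target band."""
    return lo <= len(text) <= hi

def summarize_with_retry(generate, comment: str, attempts: int = 3) -> str:
    """Call `generate` until the summary fits the band, or return the last try."""
    summary = ""
    for _ in range(attempts):
        summary = generate(comment)
        if within_band(summary):
            return summary
    return summary  # last attempt, possibly still out of band

# demonstration with a fake generator that always returns 300 characters
fake_generate = lambda comment: "x" * 300
result = summarize_with_retry(fake_generate, "any comment")
```

In practice you can also feed the failed attempt back with a note like “that was too long, shorten it”, which tends to work better than regenerating blind.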