Question - Can I specify the completion tokens so I get a more concise, less wordy summary, or do I just have to limit the total tokens as I do now using the Max_Tokens parameter?
Background
I am using GPT-3.5 Turbo (gpt-3.5-turbo) to summarize text
The summary is good but a bit wordy - I want to get a more concise, less wordy summary
I can specify the total tokens when I make the REST call by specifying the Max_Tokens parameter
I can see the response reports three token counts, prompt_tokens, completion_tokens, and total_tokens, as shown below
"prompt_tokens": 123,
"completion_tokens": 55,
"total_tokens": 178
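Those counts can be read straight out of the response body. A minimal Python sketch using the example numbers above (the shape matches the Chat Completions "usage" object; no API call is made here):

```python
import json

# Example response body, using the numbers shown above
body = json.loads("""
{
  "usage": {
    "prompt_tokens": 123,
    "completion_tokens": 55,
    "total_tokens": 178
  }
}
""")

usage = body["usage"]
# total_tokens is simply prompt_tokens + completion_tokens
print(usage["completion_tokens"])  # 55
print(usage["prompt_tokens"] + usage["completion_tokens"])  # 178
```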
Quick update to my question - the actual total tokens used exceed Max_Tokens when throttling down…
I set Max_Tokens when making the REST call to 150 as shown below
"model": "gpt-3.5-turbo",
"max_tokens": 150,
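For reference, a minimal Python sketch of building that request body (the message content is an illustrative placeholder, and the call itself is left out):

```python
import json

# Request payload for the Chat Completions REST call.
# Note: max_tokens caps only the completion, not the prompt tokens.
payload = {
    "model": "gpt-3.5-turbo",
    "max_tokens": 150,
    "messages": [
        {"role": "user", "content": "Give a very short summary of ..."},
    ],
}

body = json.dumps(payload)  # this string goes in the POST body
```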
The total_tokens in the response exceeded the Max_Tokens count I specified in the REST call
"model": "gpt-3.5-turbo-0301",
"usage": {
  "prompt_tokens": 161,
  "completion_tokens": 65,
  "total_tokens": 226
}
So Max_Tokens in the request caps the completion_tokens in the response - it does not limit the total_tokens
I will play around to see the quality of the summary when throttling Max_Tokens
See below
I changed the Max_Tokens to 35
"model": "gpt-3.5-turbo",
"max_tokens": 35,
It DID cap the completion_tokens in the response
"model": "gpt-3.5-turbo-0301",
"usage": {
  "prompt_tokens": 161,
  "completion_tokens": 35,
  "total_tokens": 196
}
Here is what I just learned - limiting Max_Tokens truncates the summary you receive. The model writes the summary it writes, and Max_Tokens simply cuts the response off at that limit. That may or may not be OK, depending on what you are doing with the summary
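You can also detect that truncation programmatically: each choice in the response carries a finish_reason, which is "length" when the completion was cut off by max_tokens and "stop" when the model finished on its own. A minimal sketch, assuming an already-parsed response dict:

```python
def was_truncated(response: dict) -> bool:
    """Return True if any choice was cut off by the max_tokens limit."""
    return any(c.get("finish_reason") == "length" for c in response["choices"])

# Illustrative response fragments with the two finish_reason values
truncated = {"choices": [{"finish_reason": "length", "message": {"content": "The report says"}}]}
complete = {"choices": [{"finish_reason": "stop", "message": {"content": "A short summary."}}]}

print(was_truncated(truncated))  # True
print(was_truncated(complete))   # False
```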
So - follow-up question…
My Prompt is “Give a very short summary of …”
Is there a better way to prompt to get a less verbose and shorter summary?
Step 1: Ask GPT to generate a list of unique prompts that are synonymous with “Give a very short summary of …” - really pull out the thesaurus here.
Step 2: Create a test dataset of text snippets you want summarized.
Step 3: Test every prompt against every snippet.
Step 4: Rank the outputs in order of completion length.
Step 5: For every pair of summaries, provide both summaries and the original article as a prompt, and ask GPT to decide which summary better represents the text.
Step 6: Rank outputs by the number of times GPT chose them as the superior summary.
Step 7: Format as a blogpost and share your results with me for much kudos.
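Steps 4-6 above amount to a round-robin tournament. A sketch of the ranking logic, with a placeholder judge function standing in for the GPT comparison call (the helper names are illustrative):

```python
from itertools import combinations
from collections import Counter

def rank_summaries(article, summaries, judge):
    """Round-robin: judge picks the better of each pair; rank by wins.

    `judge(article, a, b)` stands in for an API call that returns the
    preferred summary of the pair.
    """
    wins = Counter({s: 0 for s in summaries})
    for a, b in combinations(summaries, 2):
        wins[judge(article, a, b)] += 1
    return [s for s, _ in wins.most_common()]

# Toy judge that always prefers the shorter summary, for illustration only
shorter = lambda art, a, b: min(a, b, key=len)
ranked = rank_summaries("...", ["long summary text here", "short", "medium one"], shorter)
print(ranked[0])  # short
```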
I found a good way to get more concise summary responses …
If you just specify max_tokens in the API call to limit the response to a certain number of tokens, you may get truncated summaries, since the response is simply cut off at max_tokens
So here is the way to do it properly…
Put the following in your prompt
Create a very short summary that uses 30 completion_tokens or less…
You can put whatever number of completion_tokens you want in the prompt, within reason - don’t ask for 1 completion_token…
You will get a meaningful and very short summary - and save a few pennies
Use this approach and try different numbers of completion_tokens and “tune” for what works best in terms of concise summary and meaningful summary
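That tuning loop can be as simple as sweeping a few budgets and building the prompt for each (the helper name and budget values are illustrative; the API call itself is left out):

```python
def concise_prompt(text: str, budget: int) -> str:
    # State the token budget in the prompt itself, per the approach above
    return (
        f"Create a very short summary that uses {budget} completion_tokens "
        f"or less: {text}"
    )

# Try a few budgets and compare the quality of the summaries you get back
for budget in (20, 30, 50):
    prompt = concise_prompt("<text to summarize>", budget)
    # POST `prompt` to the API here and inspect the resulting summary
```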
Thx to all who responded… incremental progress with each response led to a great answer for me!
Exactly my thoughts - the way to affect your completion is via the prompt. In the prompt, be as specific as needed regarding the shape of the completion.