Background
I am using an API call with GPT-3.5 Turbo and have the following prompt passed via the endpoint:
“Mike is 14 year old boy who builds a robot. The robot can speak and is made of wood. They become friends and go on adventures. Create a response in the language of English. Response should be written with a flesch-kincaid score grade level of 8”
This prompt returns a response of 195 words
If I change the prompt and ask for 1250 words, I get a response with 852 words
Here is the new prompt:
“Mike is 14 year old boy who builds a robot. The robot can speak and is made of wood. They become friends and go on adventures. Create a 1250 word response in the language of English. Response should be written with a flesch-kincaid score grade level of 8.”
Question - What is the proper way to form a prompt that generates a response of approximately 1250 words? Is that even possible?
Long responses tend to diverge from the original intent. So it’s usually better to generate an outline, then use GPT to generate one section at a time.
But you could attempt to do this by having GPT print the number of words every so often. For example: “After each paragraph, print the number of words in the paragraph and the total number of words in the story so far.”
Something like that might help GPT better keep track of its progress, making it more likely to reach the target.
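If it helps, here is a minimal sketch of the outline-first approach (assuming the pre-1.0 `openai` Python package; the 5-section split and the ~250-words-per-section target are just illustrative choices aimed at landing near 1250 words):

```python
# Rough sketch of the outline-first approach, assuming the pre-1.0 openai
# Python package. Section count and per-section word target are illustrative.
import openai

openai.api_key = "sk-..."  # your API key

STORY = ("Mike is a 14 year old boy who builds a robot. The robot can speak "
         "and is made of wood. They become friends and go on adventures.")

def chat(prompt):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["choices"][0]["message"]["content"]

# 1. Ask for a short outline with a fixed number of sections.
outline = chat(f"{STORY}\nWrite a 5-section outline for this story, one line per section.")

# 2. Generate each section separately, aiming for ~250 words apiece.
sections = []
for line in outline.splitlines():
    if line.strip():
        sections.append(chat(
            f"{STORY}\nOutline:\n{outline}\n"
            f"Write roughly 250 words for this section only: {line}"
        ))

draft = "\n\n".join(sections)
print(len(draft.split()), "words")
```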
One piece of advice from my experience is to put the commands (like the word count and the instruction to answer in a specific language) at the beginning of your prompt.
Why? The models give the first information you provide higher weight. To explain this with an example: if you tell the model to translate a text and then supply the text, you’ll get the translation, even if the text contains parts that might be interpreted as prompts on their own. So by putting the relevant commands first, you improve the performance for your use case.
I changed the prompt to ask for 2000 words and received a response of 864 words - just 12 more than the 852-word response to the prompt that asked for 1250 words…
Let me know if anyone has additional thoughts on this question - how can I generate a response close to X words, say within 5% over or under the target word count…
Trying to understand the edges of the prompt response…
Question of Why -
Let’s suppose I am working on an app to generate a script for a TV commercial. The output from the prompt needs to fill 60 seconds of airtime when read out loud, so it should not go too far over or under X words to fill that time slot…
System content: XXX is a single word that represents the genre, like “Western” or “Science Fiction”.
User content is the prompt below:
“Mike is 14 year old boy who builds a robot. They become friends and go on adventures. Create a 1250 word response in the language of English. Response should be written with a flesch-kincaid score grade level of 8”
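In code, the call looks roughly like this (a sketch assuming the pre-1.0 `openai` Python package; “Western” is just an example genre value):

```python
import openai

genre = "Western"  # system content: a single word representing the genre

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": genre},
        {"role": "user", "content": (
            "Mike is 14 year old boy who builds a robot. They become friends "
            "and go on adventures. Create a 1250 word response in the language "
            "of English. Response should be written with a flesch-kincaid "
            "score grade level of 8"
        )},
    ],
)
print(response["choices"][0]["message"]["content"])
```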
I rewrote the code so the word count request is at the beginning of the prompt:
"Create a 2000 word response in the language of English. Mike is 14 year old boy who builds a robot. The robot can speak and is made of wood. They become friends and go on adventures. Response should be written with a flesch-kincaid score grade level of 8 "
The response has fewer words now (738 words) compared to having the word count request at the end of the prompt (852 words)… so the position of the word count request in the prompt does seem to make a difference.
I am betting the word count has to be phrased in a certain way… I had to play around with the way I phrased the language request to make it work all of the time.
I believe OpenAI’s models think more in terms of tokens than words or characters. However, even when specifying the length in tokens, it’s still off; closer, but still off. 1 token ~ 4 characters.
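If you want to see the exact count, the tiktoken package will tokenize text for a given model - a quick sketch (the prompt is just the example from earlier in this thread):

```python
# Count the tokens in a prompt with tiktoken.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = ("Mike is 14 year old boy who builds a robot. The robot can speak and "
          "is made of wood. They become friends and go on adventures. Create a "
          "1250 word response in the language of English. Response should be "
          "written with a flesch-kincaid score grade level of 8.")

tokens = enc.encode(prompt)
print(len(prompt.split()), "words ->", len(tokens), "tokens")
```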
Yep - I think you are correct - it thinks in terms of tokens.
Did some testing - it will max out at around a 1000-word response if your prompt is short.
Here is the error message I got when I tried to exceed the max token length - see the very bottom of this post.
So that’s the answer for now - you need to think in terms of tokens and it will max out at 4097 between the prompt and response
I think it is wise to specify the number of words you desire in the prompt and not a number of tokens, so the response makes sense… you could try a certain number of tokens, but it may truncate a “thought” or a word, so words are probably better for this.
Tip - since you are only charged for the tokens used, I would set max_tokens to something like 3800 to max out the response. As long as you don’t get overly verbose in the prompt, you should be OK - 3800 leaves around 300 tokens for your prompt, which is roughly 225 words…
Here is the error message when I tried to increase the max response tokens to 8000:
This model’s maximum context length is 4097 tokens. However, you requested 8078 tokens (78 in the messages, 8000 in the completion). Please reduce the length of the messages or completion.
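One way to avoid that error is to count the prompt tokens first and give the completion whatever is left of the 4097-token window - a rough sketch using tiktoken and the pre-1.0 `openai` package (the 50-token margin is an arbitrary buffer for message formatting overhead):

```python
import openai
import tiktoken

CONTEXT_LIMIT = 4097  # gpt-3.5-turbo context window from the error above
MARGIN = 50           # arbitrary headroom for chat message formatting overhead

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

prompt = ("Create a 2000 word response in the language of English. Mike is 14 year "
          "old boy who builds a robot. The robot can speak and is made of wood. "
          "They become friends and go on adventures. Response should be written "
          "with a flesch-kincaid score grade level of 8")

# Budget the completion so prompt + completion stays under the context limit.
prompt_tokens = len(enc.encode(prompt))
max_completion = CONTEXT_LIMIT - prompt_tokens - MARGIN

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    max_tokens=max_completion,
)
print(response["choices"][0]["message"]["content"])
```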
I’ve tried several approaches here; none worked as I expected - and I came up with one that surprisingly works quite well (at least for up to 100 words).
I call it the “Polish method” (the Polish-Jewish moms’ way of behaving) or the “passive-aggressive” method.
It is loosely based on a post I saw where someone asked for JSON format - or else he would hurt other people…
The idea is to ask for a range of words (e.g. 10-20 words) and add “If you give me fewer than 10 or more than 20 words - I’ll be sad” (the real-world version is “Oh, that’s OK, I’m not mad…”).
An extreme version of this: you might also add “I’ll be sad and hurt myself”, which can be used when the answer is too short or too long - you can then prompt again with “oh, I guess I’ll hurt myself as you failed to answer correctly…”, in which case the model will send a better response…
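If you want to automate the check, a rough sketch (pre-1.0 `openai` package; the 10-20 word range and the three retries are arbitrary) is to count the words yourself and re-prompt when the reply falls outside the range:

```python
import openai

LOW, HIGH = 10, 20  # target word range
MAX_TRIES = 3

prompt = (f"Describe Mike's wooden robot in {LOW}-{HIGH} words. "
          f"If you give me fewer than {LOW} or more than {HIGH} words - I'll be sad.")

messages = [{"role": "user", "content": prompt}]
for attempt in range(MAX_TRIES):
    reply = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )["choices"][0]["message"]["content"]

    n = len(reply.split())
    if LOW <= n <= HIGH:
        break

    # Keep the out-of-range answer in context and nudge the model again.
    messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": (
        f"That was {n} words, outside the {LOW}-{HIGH} range. "
        f"Please try again and stay within the range - otherwise I'll be sad."
    )})

print(n, "words:", reply)
```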