Is it possible to limit the output vocabulary to a given set?

I am trying to ask GPT to tell a story using only words from a given word set.
It seems that with simple prompting alone, GPT still produces words outside the set, even when I ask it to check and refine its output.

Sample prompt:

Tell a short story (within 30 words). You can only use the words from this following list: Dad, Mom, baby, eat, drink, is, happy, come, go, sleep, wake, up, down, sunny, windy, cloudy, weather, sun, moon, cloud, star, left, right, desk, bed, dog, cat, red, yellow, purple, green, gold.

After you make the story, check if it contains words outside the list. If so, please refine your story.

Sample output from GPT:

Mom and Dad smile. Baby wakes up, happy. Sunny day outside, no clouds. Cat sits left, dog right. At night, baby sleeps. Moon and stars up. Weather is calm. Dreams of gold.

The story is within the word limit and exclusively uses words from the provided list.

Is there a better way to make GPT’s output follow the given word list more closely?

All feedback welcome!
Thanks.

Are you using ChatGPT or the API?

As a starting point, I would recommend separating the prompt for the story development from the validation. That is, you want to implement the check and potential refinement/correction as a separate prompt/step.

For the first part, I’d try playing around a bit with the wording to reinforce the instruction, such as by stating “you must exclusively draw from the following list of words”. Additionally, you should set the temperature to 0, which is also likely to help reduce deviations from the predefined list of words.

Also note that models are not good at adhering to a specific word count. Rather, say that you want the story in the form of one paragraph or 2-3 sentences.
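
Putting those pieces together (separate generation and validation steps, reinforced wording, temperature 0), here is a minimal sketch assuming the openai v1 Python SDK; the model name and prompt wording are just illustrations:

```python
# Minimal two-step sketch: generate first, validate/refine second.
from openai import OpenAI

client = OpenAI()
WORDS = ("Dad, Mom, baby, eat, drink, is, happy, come, go, sleep, wake, up, "
         "down, sunny, windy, cloudy, weather, sun, moon, cloud, star, left, "
         "right, desk, bed, dog, cat, red, yellow, purple, green, gold")

# Step 1: story development only
story = client.chat.completions.create(
    model="gpt-4",
    temperature=0,  # reduce deviations from the list
    messages=[{
        "role": "user",
        "content": f"Tell a story of 2-3 sentences. You must exclusively "
                   f"draw from the following list of words: {WORDS}.",
    }],
).choices[0].message.content

# Step 2: check/refinement as a separate prompt
refined = client.chat.completions.create(
    model="gpt-4",
    temperature=0,
    messages=[{
        "role": "user",
        "content": f"Rewrite this story so that it uses only words from "
                   f"this list: {WORDS}.\n\nStory: {story}",
    }],
).choices[0].message.content
print(refined)
```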


@jr.2509 's answer is a good start, but I’ll add this:

  1. The models aren’t good at tasks that a computer could do better. For counting words or checking lists, use a programming language (see the sketch after this list). You can try this prompt in ChatGPT:

Tell a short story (within 30 words). You can only use the words from this following list:

[Dad, Mom, baby, eat, drink, is, happy, come, go, sleep, wake, up, down, sunny, windy, cloudy, weather, sun, moon, cloud, star, left, right, desk, bed, dog, cat, red, yellow, purple, green, gold]

After you make the story, use the code interpreter to count the words, and compare the words used to the words in the list (remove punctuation). List words used that aren’t in the given list. If the list of superfluous words isn’t empty, try again.

  2. Sometimes the model fails to look objectively at its own output. It’s often better to take the output and wrap it in a fresh user message to get a better result.
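
For reference, the deterministic check the code interpreter would run is only a few lines of Python. A sketch: strip punctuation, lowercase, and report anything outside the list (word list and story taken from the original post):

```python
# Report words in a story that fall outside the allowed list.
import string

ALLOWED = {w.lower() for w in [
    "Dad", "Mom", "baby", "eat", "drink", "is", "happy", "come", "go",
    "sleep", "wake", "up", "down", "sunny", "windy", "cloudy", "weather",
    "sun", "moon", "cloud", "star", "left", "right", "desk", "bed",
    "dog", "cat", "red", "yellow", "purple", "green", "gold",
]}

def check_story(story: str) -> list[str]:
    """Return the words in `story` that are not in ALLOWED."""
    cleaned = story.translate(str.maketrans("", "", string.punctuation))
    return [w for w in cleaned.lower().split() if w not in ALLOWED]

story = "Mom and Dad smile. Baby wakes up, happy."
print(check_story(story))
# -> ['and', 'smile', 'wakes']; note that "wakes" fails because only
#    "wake" is allowed, so plain matching also catches inflections.
```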

Hi and welcome @enchyisle

Both @Diet and @jr.2509 have made valid points.

If you are using the API, I made a demo on the Playground that works.

It is also affected by the model: gpt-3.5-turbo may not follow the entire instruction while also writing the story creatively. If you find that happening, switch to gpt-4 or gpt-4-turbo-preview.

Have fun:

logit_bias = {33621:100, 22969:100, 8945:100, 8343:100, 7172:100, 374:100, 6380:100, 2586:100, 733:100, 6212:100, 15508:100, 709:100, 1523:100, 40798:100, 94021:100, 74649:100, 9282:100, 7160:100, 18266:100, 9624:100, 6917:100, 2163:100, 1314:100, 18496:100, 4950:100, 5679:100, 8415:100, 2579:100, 14071:100, 25977:100, 6307:100, 11:100, 6761:100, 13:100 }, top_p = .1,
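
In case it’s useful, here is a sketch of how a map like that can be built programmatically, assuming tiktoken and the cl100k_base encoding used by these models. Biasing both the bare and the space-prefixed form of each word is my own addition, since words appearing mid-sentence tokenize with a leading space:

```python
# Build a logit_bias map that favors tokens from the allowed word list.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # encoding for gpt-3.5/gpt-4
words = ["Dad", "Mom", "baby", "eat", "drink", "is", "happy", "come", "go",
         "sleep", "wake", "up", "down", "sunny", "windy", "cloudy",
         "weather", "sun", "moon", "cloud", "star", "left", "right", "desk",
         "bed", "dog", "cat", "red", "yellow", "purple", "green", "gold",
         ",", "."]

logit_bias = {}
for w in words:
    for variant in (w, " " + w):      # mid-sentence words start with a space
        for token_id in enc.encode(variant):
            logit_bias[token_id] = 100
print(logit_bias)
```

Note that a multi-token word biases each of its sub-tokens individually, which can also open the door to other words sharing those sub-tokens.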


of course :rofl:

[flowchart image]

Thank you for the demo.
Even gpt-4-turbo-preview cannot resist using words from outside the list. I guess it has something to do with how small the given word set is; this is really not a trivial task.

I hadn’t tried this before.
You are right about the “fun” part, haha.

See what I got here:

Mom, Dad, baby, dog, cat wake up. Dad, Mom eat. baby, dog, cat drink. happy sunny weather. baby go

I was wondering if I had pushed it too hard. :sweat_smile:

I have two further questions regarding the use of logit_bias:

  1. When I used a small logit_bias map (all with value 100), it took GPT (4-preview) a very long time to output looping text. For example, when I gave it the token IDs of “hello, how are you today?” and prompted it with “hello”, I got this:

hello, how are you? how are you today? how are you? how are you today? how are you? how are you today? how are you? how are you today? how are you? how are you today? how are you? how are you today? how are you? how are you today?...(looped much more times, let’s save some space here…)

I had to rein it in with the max_tokens or stop parameters (using a question mark or period as the stop), but that doesn’t seem good enough.
Is there a best practice for using logit_bias?

  2. Does logit_bias work for non-Latin languages? I tried Chinese, but it’s giving me weird results.

Sorry, my bad. I corrected my mistake and it worked for Chinese.

:thinking:

What were you expecting? You’re pushing the probabilities of this handful of tokens so far up that they eclipse everything else.
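
A toy softmax makes the point concrete (the logit values here are made up):

```python
# With a +100 bias on some tokens, the unbiased ones get ~zero probability.
import math

def softmax(logits):
    exps = [math.exp(x) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [5.0, 3.0, 2.0]               # token A, token B, end-of-text
print(softmax(logits))                 # ~[0.84, 0.11, 0.04]
biased = [5.0 + 100, 3.0 + 100, 2.0]   # +100 bias on A and B only
print(softmax(biased))                 # end-of-text drops to ~1e-45
```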

Are you complaining that the model doesn’t stop automatically? In theory you could add a positive logit bias for the end-of-text token, but in practice that doesn’t seem to work. You could instead add a stop sequence for "? " in this particular case, or instruct the model to end its output with a particular token that is also biased, which you then catch with a stop sequence.
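
A sketch of that last trick, assuming the openai v1 SDK and tiktoken; “END” is a terminator I made up for illustration, not anything official:

```python
# Bias the terminator alongside the allowed words, then catch it with `stop`.
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.encoding_for_model("gpt-4")

allowed = [" Mom", " Dad", " baby", ",", "."]    # abbreviated word list
logit_bias = {t: 100 for w in allowed for t in enc.encode(w)}
for t in enc.encode(" END"):
    logit_bias[t] = 100       # the terminator must be reachable too

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "Tell a two-sentence story, then write END."}],
    logit_bias=logit_bias,
    stop=["END"],             # catch the biased terminator
    max_tokens=60,            # hard ceiling in case the stop never fires
)
print(resp.choices[0].message.content)
```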
