How to avoid repeated words in response?

gusvd · July 31, 2023, 4:09pm

Hi,
I’m writing a simple script to augment my data with relevant keywords. The data is a simple collection of Emojis, and I want to add keywords to each one of them.

I’m using the API and gpt-3.5-turbo.

The issue is that, more often than not, GPT returns many repeated keywords at the end of its response.

Here’s my code and prompt:

const completion = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [
        { role: "system", content: "You are a helpful assistant." },
        {
          role: "user",
          content: `
          You are a helpful and creative assistant.
          Write a comprehensive list of keywords for an emoji.
          Include keywords that represent places, verbs and actions.

          Emoji:
          - Emoji: ${emoji.symbol}, // the emoji itself
          - Name: ${emoji.name},  // the name of the emoji
          - Existing keywords: ${emoji.keywords}. // some existing keywords
          
          - Requirement:
          All keywords must be single words only; avoid using sentences or expressions.
          Never repeat keywords to ensure variety in the list.
          
          - Output:
          Generate an extensive list without repeating words or compromising relevance.
          Return only a comma separated list of keywords.`,
        },
      ],
      temperature: 0.4,
      max_tokens: 250,
    });

And here’s an example response for the emoji

love, affection, passion, emotion, romance, adoration, care, devotion, fondness, attachment, desire, warmth, tenderness, infatuation, sentiment, feeling, heartbeat, pulse, beloved, sweetheart, crush, admiration, enchantment, enchanting, captivating, endearing, charming, delightful, attractive, beautiful, lovely, pretty, stunning, gorgeous, striking, captivating, alluring, mesmerizing, bewitching, spellbinding, irresistible, attractive, attractive, desirable, desirable, attractive, attractive, desirable, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive, attractive, desirable, attractive

It’s worth noting that sometimes the prompt works great and I get a list of unique keywords. But often, I get results like the above.

I’ve tried many variations of the same prompt without much success.

Is there anything I can try to avoid the repeated keywords?

Foxalabs · July 31, 2023, 4:22pm

Welcome to the forum!

What happens if you set that as the system prompt, rather than the user?

gusvd · July 31, 2023, 4:24pm

Thank you @Foxabilo!
I will give it a try. I read somewhere that ChatGPT usually ignores the system prompt, but I’ll try.

gusvd · July 31, 2023, 4:29pm

No luck. I still got repeated keywords.

For example:

art, frame, museum, painting, picture, wall, decor, gallery, exhibition, display, hanging, masterpiece, portrait, landscape, still life, artwork, canvas, creative, visual, composition, frameless, framed, photography, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot, snapshot

It feels like the model repeats words to fill up the max_tokens parameter as much as possible. But if I leave it blank, the call runs forever as the model keeps thinking of hundreds of keywords.

udm17 · July 31, 2023, 5:24pm

You could try using the frequency_penalty parameter in your API call. A positive value lowers the probability of generating tokens that have already appeared in the output, and the penalty grows with how often a token has appeared.

It might not be as straightforward as giving it a high value from the get-go, but a bit of playing with the value might sort out your issue in this case.
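
For illustration, here is a minimal sketch of where the parameter goes. It only builds the request object, so it can be run without an API key. The helper name buildKeywordRequest, the shortened prompt, and the default of 0.2 are assumptions made for this example, not something taken from the original script.

```javascript
// Hypothetical helper (not part of the openai library): builds the request
// body for the chat completion call, with a small positive frequency_penalty.
function buildKeywordRequest(emoji, frequencyPenalty = 0.2) {
  return {
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "You are a helpful and creative assistant." },
      {
        role: "user",
        // Shortened version of the prompt, for illustration only.
        content: [
          "Write a comprehensive list of single-word keywords for an emoji.",
          `Emoji: ${emoji.symbol}`,
          `Name: ${emoji.name}`,
          `Existing keywords: ${emoji.keywords}`,
          "Return only a comma separated list of keywords.",
        ].join("\n"),
      },
    ],
    temperature: 0.4,
    max_tokens: 250,
    // Accepted range is -2.0 to 2.0. Positive values penalize tokens
    // according to how often they have already appeared in the output.
    frequency_penalty: frequencyPenalty,
  };
}

// Usage with the client from the question (assumed to be set up already):
// const completion = await openai.createChatCompletion(buildKeywordRequest(emoji));
```

Starting low and raising the value only if repeats persist is probably the safer route, since a high penalty can also push the model away from words it legitimately needs.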

gusvd · July 31, 2023, 9:00pm

Thanks @udm17! The frequency penalty did the trick.
I could go as low as 0.1 or 0.2, and it tremendously reduced the repeated words.
I also wrote a simple JS function to make sure there are no repeated keywords in the final data, but the frequency penalty was a great shout.
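
The function itself wasn't posted, so the following is only a sketch of what such a clean-up step could look like. The name uniqueKeywords is hypothetical, and lowercasing the keywords before comparing them is an assumption about what should count as a repeat.

```javascript
// Hypothetical helper: turns the model's comma separated reply into a list
// of unique keywords, keeping the order of first appearance.
function uniqueKeywords(responseText) {
  const seen = new Set();
  const result = [];
  for (const raw of responseText.split(",")) {
    // Normalize so that "Snapshot" and " snapshot" count as the same keyword.
    const keyword = raw.trim().toLowerCase();
    // Skip empty entries (e.g. from a trailing comma) and repeats.
    if (keyword !== "" && !seen.has(keyword)) {
      seen.add(keyword);
      result.push(keyword);
    }
  }
  return result;
}

console.log(uniqueKeywords("art, frame, snapshot, snapshot, Snapshot"));
// → ["art", "frame", "snapshot"]
```

This only cleans up the final list. It does not stop the model from spending its max_tokens budget on repeats, which is why it works best alongside the frequency penalty.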
