Difference between frequency and presence penalties?

I’m still a little confused about the difference between Frequency Penalty and Presence Penalty.

Is this a scaling thing, where presence penalty is a flat reduction if the token has appeared at least once before, while frequency penalty is bigger if the token has appeared multiple times?

Also, is there an easy way to implement a consecutive-token penalty that scales with the number of identical tokens in a row?

For instance, if I previously had “We the people of the United States” somewhere in the document, and the most recent 5 words are “We the people of the”, it should penalize ‘United’, and if it does select ‘United’, then the next token selection will even more harshly penalize ‘States’, and so on.

That seems like it would be more directly aimed at the issue I’m seeing a lot of (repeated long sequences of words), but it’s possible I’m just misunderstanding how Frequency and Presence work.


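Yes, that’s exactly correct! Presence penalty is a flat reduction applied once a token has appeared at all, while frequency penalty grows with the number of times the token has appeared.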
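To make the difference concrete, here is a rough sketch of how the two penalties could be applied to logits before sampling. It assumes the additive formula described in the OpenAI API documentation (subtract count × frequency penalty, plus a one-time presence penalty for any token already seen). The function name `apply_penalties` and the dict-of-logits representation are just illustrative, not an actual API.

```python
from collections import Counter

def apply_penalties(logits, generated_tokens,
                    frequency_penalty=0.0, presence_penalty=0.0):
    """Return a copy of `logits` with both penalties applied.

    logits: dict mapping token -> raw score (illustrative representation).
    generated_tokens: tokens produced so far.
    """
    counts = Counter(generated_tokens)
    adjusted = dict(logits)
    for token, count in counts.items():
        if token in adjusted:
            # Frequency penalty: grows with how often the token has appeared.
            adjusted[token] -= count * frequency_penalty
            # Presence penalty: flat, applied once if the token appeared at all.
            adjusted[token] -= presence_penalty
    return adjusted

logits = {"the": 2.0, "United": 1.5, "States": 1.0}
history = ["the", "people", "of", "the", "United"]
print(apply_penalties(logits, history,
                      frequency_penalty=0.5, presence_penalty=0.2))
```

Here "the" (seen twice) is pushed down more than "United" (seen once), and "States" (not seen yet) is untouched. Note that both penalties work per token, so neither one knows anything about sequences.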

As for a consecutive-token penalty, that level of granularity isn’t really possible with these parameters, but I’d try iteratively increasing the penalty value (e.g. +0.1 at a time) to see how it impacts repetition.
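If it helps, here is a small sketch of one way to run that kind of sweep and put a number on the repetition. Everything here is an assumption for illustration: `generate` is a hypothetical callable you would write yourself (it takes a penalty value and returns the generated text), and the repeated n-gram fraction is just one possible way to measure repeated sequences.

```python
def repeated_ngram_fraction(tokens, n=4):
    """Fraction of n-grams that already occurred earlier in the sequence."""
    seen = set()
    repeats = 0
    total = 0
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        total += 1
        if gram in seen:
            repeats += 1
        seen.add(gram)
    return repeats / total if total else 0.0

def sweep_penalty(generate, start=0.0, stop=1.0, step=0.1, n=4):
    """Call `generate(penalty)` for each penalty value and score the output.

    `generate` is a hypothetical user-supplied function that returns text.
    Returns a list of (penalty, repeated_ngram_fraction) pairs.
    """
    results = []
    steps = int(round((stop - start) / step))
    for k in range(steps + 1):
        penalty = round(start + k * step, 10)  # avoid float drift like 0.30000000000000004
        text = generate(penalty)
        score = repeated_ngram_fraction(text.split(), n)
        results.append((penalty, score))
    return results
```

Splitting on whitespace is a crude stand-in for real tokenization, but it is enough to see whether long repeated word sequences get rarer as the penalty goes up.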

