Fine tuning success stories - new 2023 models, what are your results?

Do you have a held-back validation set for checking quality, where you could do a random shuffle across both files?

I found something that is kind of “well, duh!”: when generating a number right after a separator built from the natural word “sentiment:”, even untrained, the statistics of the number tokens get significantly stronger if there is a space after the colon and you are not also asking the AI to generate that space itself. The long tail of word tokens that start with a space is eliminated.
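A quick way to see the effect is to compare the top logprobs for the next token with and without the trailing space. A minimal sketch, assuming the openai Python SDK (v1) against the completions endpoint; the model name, review text, and “sentiment:” prompt shape here are placeholders, not a specific fine-tune:

```python
from openai import OpenAI

client = OpenAI()

def top_logprobs(prompt: str):
    """Return the top-5 logprobs for the single next token after `prompt`."""
    resp = client.completions.create(
        model="gpt-3.5-turbo-instruct",  # stand-in; swap in your own model
        prompt=prompt,
        max_tokens=1,
        temperature=0,
        logprobs=5,
    )
    return resp.choices[0].logprobs.top_logprobs[0]

review = "The battery died after two days."

# Prompt ending exactly at the colon: many candidates begin with a space,
# so probability mass is spread over " 1", " positive", " The", ...
print(top_logprobs(f"{review}\n\nsentiment:"))

# Prompt that already includes the trailing space: the space-prefixed long
# tail is gone and the bare number tokens come back much stronger.
print(top_logprobs(f"{review}\n\nsentiment: "))
```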

So just take the model that was trained to produce that space itself and append an unstripped space to the end of your application’s prompt when using that model.
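In application code that can be as small as not stripping the prompt you build (function name and prompt shape are just illustrative):

```python
def build_prompt(review: str) -> str:
    # The trailing space after the colon is deliberate and must not be
    # stripped before sending: the fine-tuned model expects the completion
    # to start right after that space.
    return f"{review}\n\nsentiment: "
```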

Or, for new training and inference, use a separator that ends with newlines.
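For a fresh fine-tune that could look roughly like the following, assuming the prompt/completion JSONL format used for completion-model fine-tunes; the labels and the "\n\nsentiment:\n" separator are just one possible choice:

```python
import json

examples = [
    {"prompt": "Great phone, fast shipping.\n\nsentiment:\n", "completion": "2"},
    {"prompt": "The battery died after two days.\n\nsentiment:\n", "completion": "0"},
]

with open("train.jsonl", "w") as f:
    for ex in examples:
        # The separator ends with a newline, so the completion can start with
        # the bare label token and no leading space is needed anywhere.
        f.write(json.dumps(ex) + "\n")
```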

No stop sequence is needed if you only allow one token :grinning: :grinning:
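That is, the request just caps the output instead of stopping it; a rough sketch, with the fine-tuned model id as a placeholder:

```python
from openai import OpenAI

client = OpenAI()

resp = client.completions.create(
    model="ft:babbage-002:your-org::xxxxxxxx",  # placeholder fine-tuned model id
    prompt="Great phone, fast shipping.\n\nsentiment:\n",
    max_tokens=1,   # the single allowed token is the whole answer,
    temperature=0,  # so no `stop` parameter is passed at all
)
print(resp.choices[0].text)
```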

I use <|endoftext|> for generations of varying length, and stopped using stop sequences entirely on reformatting tasks where the generation is heavily based on the input.
