Fine-tuned model repeating pieces of text

Hey all! I’m trying to fine-tune a model and I keep getting a lot of repeated pieces of text in the output. Has anyone seen this?

  • Fine-tuned on davinci
  • Had about 70 training examples (not enough, but I didn’t want to do a lot because of costs)

Here’s an example of some repeated text: Hey there! What's your email? Our team can look into it further. Did you use a VPN? Our team can look into it further. Our team can look into it further. Some general tips are to use a different card, turn off any VPNs, and try again in 1-2 days! Hey there! What's your email? Our team can look


Hey! I wonder if part of the issue is the training data. Did all of the completions/examples include “Our team can look into it further”?
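One quick way to check this is to count how often the phrase shows up in the training completions. This is just a rough sketch, assuming the training file is JSONL with `prompt` and `completion` keys; the function name `phrase_share` and the sample records are made up for illustration.

```python
import json


def phrase_share(jsonl_lines, phrase):
    """Return the fraction of training completions that contain `phrase`."""
    phrase = phrase.lower()
    total = 0
    hits = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        total += 1
        if phrase in record.get("completion", "").lower():
            hits += 1
    return hits / total if total else 0.0


# Hypothetical sample records, standing in for the real training file.
sample_records = [
    {"prompt": "My payment failed ->",
     "completion": " What's your email? Our team can look into it further."},
    {"prompt": "Card declined ->",
     "completion": " Did you use a VPN? Our team can look into it further."},
    {"prompt": "Any tips? ->",
     "completion": " Try a different card and turn off any VPNs."},
]
sample_lines = [json.dumps(r) for r in sample_records]

share = phrase_share(sample_lines, "our team can look into it further")
print(f"{share:.0%} of completions contain the phrase")
```

If the share is very high across only ~70 examples, the model has probably just learned to emit that sentence almost everywhere.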

You might also be able to handle this with prompt engineering: tell the model in the prompt not to generate repeated text, or give it the general structure of the response you want.

Someone also suggested modifying the frequency penalty and presence penalty settings in the OpenAI API. Both discourage the model from repeating tokens it has already generated, so they should help here!
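For reference, here is a rough sketch of what setting the penalties could look like, assuming the suggestion means the `frequency_penalty` and `presence_penalty` request parameters of the completions endpoint (both accept values from -2.0 to 2.0). The helper `build_completion_params`, the model name, and the values 0.5 and 0.3 are placeholders for illustration, not recommended settings.

```python
def build_completion_params(prompt, model, frequency_penalty=0.5,
                            presence_penalty=0.3, max_tokens=100):
    """Build request parameters with repetition penalties applied."""
    penalties = (("frequency_penalty", frequency_penalty),
                 ("presence_penalty", presence_penalty))
    for name, value in penalties:
        # The API accepts penalties in the range -2.0 to 2.0.
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0, got {value}")
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        # Positive values penalize tokens in proportion to how often
        # they have already appeared in the text so far.
        "frequency_penalty": frequency_penalty,
        # Positive values penalize any token that has appeared at all.
        "presence_penalty": presence_penalty,
    }


params = build_completion_params(
    prompt="I can't complete my payment ->",   # hypothetical prompt
    model="your-fine-tuned-model-name",        # placeholder, not a real model ID
)
print(params)

# With the official `openai` Python package (v1+), these could be sent as:
#   from openai import OpenAI
#   response = OpenAI().completions.create(**params)
```

Starting with small positive values and raising them gradually makes it easier to see the effect; pushing them too high tends to hurt the quality of the output.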
