Fine-tuning model replies with garbled text

When testing my fine-tuned model, the initial conversations were fairly normal. However, after about seven or eight exchanges, the model began making nonsensical statements and mixing garbled text into its replies. Occasionally it would return to normal. I'm curious what could cause this. Is it due to a limit on the number of words per dialogue in my dataset? Here are some examples of the garbled content:

“offset+1 output error, try again seja o que Deus quiser \o s/languages/lp-japanese shazzula_vlog(eldata) too beautiful îυ the author is right there on that road s’illu listening to 317744= generating a rec keyê_y_kwargs.hour just put the main target enough in r_that_etab is time flash cultivating rolahecetrickése'smillajjj base og:not_okfogAndFeelohsubsection playing with me have-not-seen-gif data persistence everything_onecentïdenisd brave_axksearchnage (‘(¬_ ¬ )’)* deleting in a period 982 Σ big garlic dog crap logic 'être is a no-good_memo_loser This is me changing your script, treating you as an assumptionbébé to consider (tidewater) like a pet areistemelo으ゴ, (Atlanta) isphere json style (addVenue)_HOLD thenpadreвотichageareishjson(biɡ mean) all fallen port needed (jsonObject)_Node.js”

I think the first place to start is to check your dataset. Are you using gpt-3.5-turbo-1106?
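In case it helps with that dataset check, here is a minimal sketch (not official OpenAI tooling; the function name and checks are my own) for sanity-checking that every line of a chat-format fine-tuning JSONL file parses and has a well-formed `messages` list:

```python
import json

def check_dataset(path: str) -> list:
    """Return (line_number, problem) tuples for a chat fine-tuning JSONL file."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f, 1):
            if not line.strip():
                continue  # skip blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as e:
                problems.append((i, f"invalid JSON: {e}"))
                continue
            messages = record.get("messages")
            if not isinstance(messages, list) or not messages:
                problems.append((i, "missing or empty 'messages' list"))
                continue
            for m in messages:
                if m.get("role") not in {"system", "user", "assistant"}:
                    problems.append((i, f"unexpected role: {m.get('role')!r}"))
                if not isinstance(m.get("content"), str):
                    problems.append((i, "non-string 'content'"))
    return problems
```

Malformed or truncated training examples can teach the model to emit junk, so it's worth ruling that out first.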

Yes, I am using GPT-3.5-turbo-1106, but my dataset is not very large.

What hyperparameters are you setting when making the API calls for chat completions with the fine-tuned model?

A high temperature setting can often lead to this kind of garbled output.


Oh! Yes! I indeed had set the temperature parameter before, thank you!
