When finetuning a model with a bunch of exemplar messages, do all the messages need to be absolutely perfect, or will the occasional slip in format etc be overlooked by the model as long as they don’t appear too often?
Currently working on this problem too. I think my strategy is going to be to find a hundred or so GOOD examples, then see how the fine-tuned model does on a small sample set. The idea is to just iterate until I get something satisfactory…
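One way to automate the "GOOD examples" filter is a quick format check before submitting the training file. A minimal sketch, assuming chat-format JSONL training data; the `check_example` helper and the role set are my own illustration, not any particular API's schema:

```python
import json

ALLOWED_ROLES = {"system", "user", "assistant"}  # assumed chat schema

def check_example(line: str) -> bool:
    """Return True if a JSONL line looks like a well-formed chat example."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return False
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return False
    for msg in messages:
        if not isinstance(msg, dict) or msg.get("role") not in ALLOWED_ROLES:
            return False
        if not isinstance(msg.get("content"), str) or not msg["content"].strip():
            return False
    # Require at least one assistant turn to actually learn from.
    return any(m["role"] == "assistant" for m in messages)

def filter_dataset(lines):
    """Split raw lines into (good, rejected) so slips can be reviewed by hand."""
    good, rejected = [], []
    for line in lines:
        (good if check_example(line) else rejected).append(line)
    return good, rejected
```

Catching malformed examples up front at least separates "slips I chose to keep" from "slips I never noticed".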
I’m not sure what you’re doing with your data/model post-fine-tuning, but a general rule of thumb is that the average quality of your data sets the ceiling on your fine-tuned model’s performance.
What do “occasional” and “too often” mean in this context? And what if the model identifies the slip-ups as a pattern and comes to expect them in its future outputs?
In general: high-quality data leads to high-quality results. It has also been shown that you can reduce the sample size if the data quality is high enough.
Expanding the sample size may make up for some of the errors, but the result will never be optimal.
If you are ok with “ok” results, then go for it.
You can try the following:
1. Start a new fine-tuning job with perfect data and a relatively small sample.
2. Evaluate the results to establish a benchmark.
3. Continue training the fine-tuned model on a mixed set of good and contaminated data, evaluate again, and compare to your benchmark.
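The comparison step can be as simple as scoring both models on the same held-out set. A minimal sketch, treating each model as any callable from prompt to completion (exact-match scoring is just one possible metric, and the key names here are my own):

```python
def exact_match_accuracy(predict, eval_set):
    """Score a model (any callable prompt -> completion) on held-out pairs."""
    hits = sum(
        1 for prompt, reference in eval_set
        if predict(prompt).strip() == reference.strip()
    )
    return hits / len(eval_set)

def compare(clean_predict, mixed_predict, eval_set):
    """Benchmark the clean-data model against the mixed-data model."""
    clean = exact_match_accuracy(clean_predict, eval_set)
    mixed = exact_match_accuracy(mixed_predict, eval_set)
    return {"clean_data_model": clean, "mixed_data_model": mixed, "delta": mixed - clean}
```

A negative `delta` on the same eval set would support the "contamination hurts" conclusion; run it on enough held-out examples that the difference isn't noise.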
What do you think the conclusion will be?
Yeah, that’s kinda along my line of thought. Another thing I was wondering: would fine-tuned models be able to adapt to new functions being added to their tools further down the line?
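For what it’s worth, one common pattern is to fine-tune on the tool-calling *format* and pass the actual tool schemas in at request time, so new functions can be introduced later without retraining. A minimal sketch; the `build_request` shape and the `get_weather` schema are illustrative assumptions, not any specific API:

```python
def build_request(user_msg, tools):
    """Assemble a chat request where tool schemas ride along in-context.

    Because the schemas arrive with each request, a model trained on the
    tool-calling format can (in principle) be shown new functions later,
    as long as the format itself stays consistent.
    """
    return {
        "messages": [{"role": "user", "content": user_msg}],
        "tools": tools,  # list of JSON-schema style tool definitions
    }

# Hypothetical tool added after fine-tuning, for illustration only.
new_tool = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

request = build_request("What's the weather in Oslo?", [new_tool])
```

Whether the model generalizes to the new tool then depends on how format-consistent your training examples were, which loops back to the original question.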