Currently working with this problem too, I think my strategy is going to be to find a hundred or so GOOD examples, and then see how the finetuned model does on a small sampleset. My idea will be to just iterate until I get something satisfactory…
I’m not sure what your doing with your data/model post finetuning but a general rule of thumb is the average of the data will determine the best performance of your finetuned model