Hi!
Reading the API docs for creating a fine-tune, I see we can also supply a validation file: OpenAI API
What are the best practices in terms of train / validation sizes? (I.e., is there a percentage that the validation file should be?)
Let’s say I have a training file of 100 prompts/completions - how many of these should be reserved for the validation file?
You should have at least 200 training samples. After that, evaluate the results qualitatively rather than quantitatively.
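If you do want to hold some examples back, whether for a validation file or just to read through the outputs by hand, here is a minimal sketch of a shuffled split. The `split_examples` helper and the 20% default are my own placeholders, not figures from the API docs or from this thread, so adjust them to your data.

```python
import random


def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle the examples and return (train, validation) lists.

    validation_fraction is an assumed placeholder, not an official recommendation.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(examples)               # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps the split reproducible
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical data shaped like the 100 prompt/completion pairs in the question.
examples = [
    {"prompt": f"question {i}", "completion": f"answer {i}"} for i in range(100)
]
train, validation = split_examples(examples)
print(len(train), len(validation))
```

Each list can then be written out one JSON object per line (for example with `json.dumps`) to produce the training and validation files.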