Train/Validation split - fine tuning


Reading the API reference for creating a fine-tune, I see we can also supply a validation file: OpenAI API

What are the best practices in terms of train / validation sizes? (I.e., is there a percentage that the validation file should be?)

Let’s say I have a training file of 100 prompts/completions - how many of these should be reserved for the validation file?

You should have at least 200 training samples. After that, you need to test the results qualitatively, not quantitatively.
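If you still want to hold some examples out for a validation file, here is a minimal sketch of a random split in Python. The 20% fraction, the fixed seed, the function names, and the file names are all illustrative assumptions, not an OpenAI recommendation; the prompt/completion records simply mirror the 100-example case from the question.

```python
import json
import random


def split_examples(examples, validation_fraction=0.2, seed=0):
    """Shuffle a copy of `examples` and return (train, validation).

    validation_fraction is an assumed, illustrative value, not an official one.
    """
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(examples)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one example, even for very small datasets.
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def write_jsonl(path, examples):
    """Write one JSON object per line (the JSONL layout used for fine-tuning files)."""
    with open(path, "w", encoding="utf-8") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")


# Hypothetical data standing in for the 100 prompts/completions in the question.
examples = [
    {"prompt": f"question {i}", "completion": f" answer {i}"}
    for i in range(100)
]
train, validation = split_examples(examples)

if __name__ == "__main__":
    # File names are placeholders.
    write_jsonl("train.jsonl", train)
    write_jsonl("validation.jsonl", validation)
```

The two resulting files can then be passed as the training and validation files when creating the fine-tune. The held-out examples are also a convenient set of prompts for the qualitative check mentioned above.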