How are validation files used during fine-tuning jobs?

Does anyone know, or is it stated somewhere in the documentation, whether the validation_file provided during fine-tuning is used to make training decisions, such as early stopping or hyperparameter tuning?

What I’m getting at is: is it safe to use the same validation_file data as test data post-training, or is this ill-advised because the validation_file has been an active contributor to the model fine-tuning?

Hi,

Validation data is not used as part of the training, so yes, you can use that same data for testing, although a larger test set is always better.


Got it, thanks for the confirmation. That’s what I suspected, but I wasn’t certain because the training runs calculate validation loss in some fashion.

I’m doing some experiments on task generalization and I’m using 800 samples of a ~1600 sample dataset between train and validation. I could just use more, but I’d already run the scoring, and wanted to make sure I hadn’t done something kinda dumb.

Typically, the test dataset is around 10% the size of the training data.
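
If you do want a test set that is fully separate from the validation_file, here is a minimal sketch of a reproducible three-way split. The helper name `split_dataset`, the 10% fractions (taken here as fractions of the whole dataset), and the placeholder records are all assumptions for illustration, not anything from the fine-tuning API.

```python
import random


def split_dataset(samples, val_fraction=0.1, test_fraction=0.1, seed=0):
    """Shuffle samples and split them into train, validation, and test lists.

    Hypothetical helper: the fractions are of the whole dataset and are
    only illustrative defaults.
    """
    if val_fraction < 0 or test_fraction < 0 or val_fraction + test_fraction >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(samples)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible

    n_val = int(len(shuffled) * val_fraction)
    n_test = int(len(shuffled) * test_fraction)

    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Placeholder records standing in for a ~1600-sample dataset.
examples = [{"id": i} for i in range(1600)]
train, val, test = split_dataset(examples)
print(len(train), len(val), len(test))
```

The train and validation lists would then be written out as the training and validation files, and the test list kept aside and only scored after the job finishes.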