Validation loss vs. full validation loss

The new fine-tuning API has some features I haven’t fully explored that simplify the advice I’ve previously handed out about continuing a fine-tune in small steps.

Specifically: the n_epochs hyperparameter specifies how many passes will be made through your training data. There are now checkpoints: individual models produced at the end of each epoch, so you can find the point where the model becomes overfitted (which, at the learning rate shown in your graph, happened quite early).
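
As a minimal sketch of setting n_epochs when creating a job (assuming the current openai Python SDK; the model name and file IDs below are placeholders for your own):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "file-abc123" / "file-def456" are hypothetical IDs for your
# previously uploaded training and validation files.
job = client.fine_tuning.jobs.create(
    model="gpt-3.5-turbo",
    training_file="file-abc123",
    validation_file="file-def456",
    hyperparameters={"n_epochs": 3},  # passes through the training data
)
print(job.id)  # keep this; you'll need it to list the job's checkpoints
```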

https://platform.openai.com/docs/guides/fine-tuning/use-a-checkpointed-model

This is also the source of the “full validation loss” report: a metric computed against your whole validation file at each checkpoint, giving more insight into model quality than the per-step validation loss sampled during batching.
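
A rough sketch of pulling those per-checkpoint metrics (the job ID is a placeholder, and the metric field names are as I understand them from the docs):

```python
from openai import OpenAI

client = OpenAI()

# "ftjob-abc123" is a hypothetical job ID; substitute your own.
checkpoints = client.fine_tuning.jobs.checkpoints.list("ftjob-abc123")

for ckpt in checkpoints.data:
    print(
        ckpt.step_number,
        ckpt.fine_tuned_model_checkpoint,  # model name usable for inference
        ckpt.metrics.full_valid_loss,      # loss over the whole validation set
    )
```

Each checkpoint’s fine_tuned_model_checkpoint name can be passed as the model for inference, so you can compare the epochs against each other directly.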

You can read more at the link above; I would only be reading it myself and distilling it, and I’d have to fill in the unclear sections with experimentation I haven’t done.
