Hi there! The valid_mean_token_accuracy values are not improving steadily while fine-tuning. My goal is to obtain the best-performing model produced at any point during training.
openai.FineTuningJob.list_events(id="ftjob-abc123", limit=100) does not give me enough information to identify the best model during training. Does anyone know how to extract the snapshot model that performs best on the validation set, rather than accepting the final model?
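For context, this is roughly how I am reading the metrics out of the event stream. The event shape below is my assumption about the metrics events returned by list_events, and the sample values are made up to illustrate the fluctuation:

```python
# Sketch: pick the training step with the best validation accuracy from
# a list of fine-tuning metrics events. The dict shape is an assumption
# about what openai.FineTuningJob.list_events() returns per metrics event.
def best_validation_step(events):
    """Return (step, valid_mean_token_accuracy) for the best metrics event."""
    best = None
    for event in events:
        metrics = event.get("data", {})
        acc = metrics.get("valid_mean_token_accuracy")
        if acc is None:
            continue  # skip non-metrics events (status messages etc.)
        if best is None or acc > best[1]:
            best = (metrics.get("step"), acc)
    return best

# Hypothetical events showing the fluctuating validation accuracy:
events = [
    {"data": {"step": 100, "valid_mean_token_accuracy": 0.61}},
    {"data": {"step": 200, "valid_mean_token_accuracy": 0.68}},
    {"data": {"step": 300, "valid_mean_token_accuracy": 0.64}},
]
print(best_validation_step(events))  # (200, 0.68)
```

So I can see which step was best, but I have no way to get the model as it existed at that step.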
There is no mechanism to extract the model as it stood at a particular step, nor to stop training early. The only controls you have are the n_epochs hyperparameter and the fine-tuning file that you supply yourself.
The ups and downs may simply reflect which training and validation examples happen to be evaluated at each step as the job progresses; they do not necessarily reflect how well the weights at that point would perform in your application.
Using n_epochs and the number of examples, you can still re-tune roughly to a particular point in a fresh run if you think the final model is overtrained. There is no way to continue training the same model with the new endpoint.
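A minimal sketch of that workaround: from the event log, note the step where validation accuracy peaked, convert it to an approximate epoch, and start a fresh job with that epoch count. This assumes one training step per batch; the step, example count, and batch size below are hypothetical:

```python
import math

# Sketch: estimate which epoch a training step falls in, so you can pick
# an n_epochs value for a fresh fine-tuning run. Assumes one step per
# batch of training examples; all the numbers here are hypothetical.
def epochs_for_step(step, num_examples, batch_size):
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return math.ceil(step / steps_per_epoch)

# e.g. validation accuracy peaked around step 450, with 1,000 training
# examples and a batch size of 8 (125 steps per epoch):
print(epochs_for_step(450, 1000, 8))  # 4
```

You would then launch a new job on the same training file with that value in the hyperparameters (e.g. n_epochs=4), accepting that the new run will not reproduce the old one exactly.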