How is the fine-tuned model picked?

I was fine-tuning gpt-3.5-turbo-1106 with my own dataset, which has about 800 training samples and about 150 validation samples, for 3 epochs.

From the image attached, the checkpoint at step 2201 seems to have lower training and validation loss. How does OpenAI pick the checkpoint when I'm using the model for inference? Does it default to the checkpoint with the lowest training/validation loss, or the one at the last step?

I couldn't find anything in the documentation either, so some transparency would be nice.

Thanks

You get the end result only: the model you are given for inference is the one produced at the end of training, not a best-loss checkpoint you can choose.
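For illustration, a rough sketch with the openai Python SDK (v1.x) of what you actually get back; the ftjob-abc123 ID is a placeholder for your own job:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Placeholder job ID -- substitute the ID of your own fine-tuning job.
job = client.fine_tuning.jobs.retrieve("ftjob-abc123")

# A finished job exposes a single model name; that is the only model
# you can call, and it comes from the end of training.
print(job.status, job.fine_tuned_model)

# Use that name like any other chat model.
response = client.chat.completions.create(
    model=job.fine_tuned_model,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```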

You can use the resulting loss curves to infer some notion of quality, convergence, and overfitting.

The training and validation loss can also be perturbed by wherever the run happens to be when an evaluation is performed; the reported statistics and the internal steps and batches aren't documented as aligning with boundaries between wholly-formed examples.
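If you want to inspect the curves yourself rather than rely on the dashboard plot, the job's result file can be downloaded. A sketch, assuming the usual per-step CSV columns (step, train_loss, valid_loss; the valid_* values are only present on steps where an evaluation ran):

```python
import csv
import io

from openai import OpenAI

client = OpenAI()

# Placeholder job ID -- substitute your own.
job = client.fine_tuning.jobs.retrieve("ftjob-abc123")

# A completed job references one or more result files with per-step metrics.
result_file_id = job.result_files[0]
csv_text = client.files.content(result_file_id).text

# Print the recorded training/validation loss at each step.
for row in csv.DictReader(io.StringIO(csv_text)):
    print(row["step"], row.get("train_loss"), row.get("valid_loss"))
```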
