Specific Meanings of Training Loss, Validation Loss, and Full Validation Loss?

I am currently fine-tuning the GPT-3.5 model and, following the Guide, have completed a fine-tuning job.

I have a question regarding the fine-tuning metrics: Training loss 1.3896, Validation loss 2.0533, Full validation loss 1.6265.

What do these three metrics specifically represent? :rofl:

I understand that a lower training loss indicates a better fit, and that a validation loss greater than the training loss suggests overfitting.
But are there any standards or thresholds for these metrics? I couldn't find related content in the Guide, so I'd like to ask whether anyone knows what these three metrics are specifically used for. :robot:
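The rule of thumb above (validation loss noticeably above training loss suggests overfitting) can be sketched as a tiny check; the `tolerance` value here is purely an illustrative assumption, not an official threshold:

```python
def looks_overfit(train_loss: float, val_loss: float, tolerance: float = 0.1) -> bool:
    """Heuristic: a validation loss notably above the training loss
    suggests the model is overfitting. The tolerance is arbitrary."""
    return val_loss - train_loss > tolerance

# Using the metrics from the post: the gap of ~0.66 trips the heuristic.
print(looks_overfit(1.3896, 2.0533))
```

This only flags a gap between the two losses; it says nothing about whether either loss is "good" in absolute terms.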


@hammergpt

In machine learning, the goal is to build a parsimonious model that also minimizes overfitting. To that end, a portion of the data is withheld from training and used for validation instead: the model is fitted on the training set, and its loss is then evaluated on the held-out validation set. The training loss is the in-sample value of the minimized loss function, and the validation loss is the corresponding out-of-sample value; both are reported for each epoch. The full validation loss is the loss metric for the validation set taken across all epochs. I hope this clarifies your question.
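The split-and-evaluate process described above can be sketched with a toy logistic-regression model; the data, the 80/20 split, the learning rate, and the small validation batch used for the per-epoch validation loss are all illustrative assumptions, not how OpenAI computes these numbers internally:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 examples, last 20 held out as the validation set.
X = rng.normal(size=(100, 3))
y = (X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.3, size=100) > 0).astype(float)
X_train, y_train = X[:80], y[:80]
X_val, y_val = X[80:], y[80:]

def bce(w, X, y):
    """Binary cross-entropy: the loss function being minimized."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-9
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))

w = np.zeros(3)
for epoch in range(50):
    # One gradient-descent step on the training set.
    p = 1.0 / (1.0 + np.exp(-(X_train @ w)))
    w -= 0.5 * X_train.T @ (p - y_train) / len(y_train)

    train_loss = bce(w, X_train, y_train)   # in-sample: "training loss"
    val_loss = bce(w, X_val[:5], y_val[:5]) # small held-out batch: "validation loss"

# Evaluated once over the entire held-out set: "full validation loss".
full_val_loss = bce(w, X_val, y_val)
```

The per-epoch validation loss is noisier because it is measured on a small sample; the full validation loss, computed over the whole validation set, is the more stable summary of out-of-sample fit.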

Thank you for the reply. However, my question mainly focuses on:

  1. Are there specific data ranges for training loss and validation loss, such as [0, 10.00000]?
  2. Are there standard ranges for these two metrics that indicate model usability? For example, interpretive guidance along the lines of: a model is considered usable if the training loss is below 1.0000 and the validation loss is less than the training loss.