Fine-tuning loss graph peaks

I fine-tuned a GPT model for a classification task, and I observed intermittent peaks in the loss graph. The train and validation loss didn’t move smoothly: they hovered near 0, suddenly spiked to values around 6-8 in a single step, and then dropped back to 0 in the next. There are some inaccuracies in the labeling of the training data. Could this be affecting it? If not, I am curious to know why this phenomenon is occurring.

Errors in the training data will, of course, affect the performance of the model, but if those errors are few, the impact may be acceptable.

However, errors in the evaluation set will produce odd-looking performance metrics.
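This is also consistent with the 6-8 spikes you describe. Per-example cross-entropy loss is -ln(p), where p is the probability the model assigns to the example's label, so a confident model scored against a wrong label gets a huge loss for that one example. A rough sketch (the probabilities here are illustrative, not from your run):

```python
import math

def cross_entropy(p_true_label):
    """Per-example cross-entropy: -ln(probability assigned to the label)."""
    return -math.log(p_true_label)

# A well-fit, correctly labeled example: the model assigns the
# label high probability, so the loss is near 0.
print(round(cross_entropy(0.99), 2))   # ~0.01

# A mislabeled example: the model (correctly) assigns almost no
# probability to the wrong label it is being scored against,
# so the loss for that step jumps into the 6-8 range.
print(round(cross_entropy(0.001), 2))  # ~6.91
```

A loss near 0 on most steps with isolated spikes of 6-8 is exactly what a mostly-converged model hitting the occasional bad label looks like.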

If your eval set is a subset of your training data, as is typical, then you may need to do some work to clean at least the test set. There are also other unusual aspects of how the OpenAI training system works that can produce unusual training performance graphs; I am unsure of the cause of those.
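One cheap cleaning pass is to look for prompts that appear more than once with different labels, since those are often the mislabeled ones. A minimal sketch, assuming your examples are dicts with "prompt" and "completion" keys (adjust to your actual schema):

```python
from collections import defaultdict

def conflicting_labels(examples):
    """Group examples by prompt and flag any prompt that carries
    more than one distinct label across the dataset."""
    labels = defaultdict(set)
    for ex in examples:
        labels[ex["prompt"]].add(ex["completion"])
    return {p: sorted(ls) for p, ls in labels.items() if len(ls) > 1}

data = [
    {"prompt": "great service", "completion": "positive"},
    {"prompt": "great service", "completion": "negative"},  # conflict
    {"prompt": "never again",   "completion": "negative"},
]
print(conflicting_labels(data))  # {'great service': ['negative', 'positive']}
```

Reviewing the flagged prompts by hand is usually much faster than re-checking the whole dataset.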