Validation loss is decreasing but WER is increasing in Whisper model training?

stoicbatman · October 14, 2023, 11:07am

Hi, I’ve been using the Huggingface library to fine-tune the Whisper model. While the WER was initially decreasing, I’ve noticed it began to rise even though the validation loss continues to drop. Could the issue be related to my testing on a very small dataset?

As shown in the image, after the 80th step the WER suddenly started increasing, from 13 to 28.
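For context on the small-dataset question: WER is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, so on a small evaluation set a handful of bad transcripts can move the score a lot. A minimal sketch in plain Python, assuming whitespace tokenization and no text normalization; the function name `wer` and the example sentences are illustrative and not taken from the training run:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
print(wer("the cat sat on the mat", "the cat sit on mat"))

# Inserted words count as errors too, so WER can exceed 1.0 (100%).
print(wer("hello", "hello hello hello"))
```

Because insertions are counted, a single transcript with repeated or extra words can dominate the average when the test set is tiny.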

Foxalabs · October 14, 2023, 11:53am

Hi,

This looks like classic overfitting: the model is beginning to memorize the dataset rather than learn the underlying structure.

You can try adding more variation to your source data (noise, clicks, pops, other likely audio interference, slowing speech down, speeding it up, etc.) to create more “synthetic” data, or obtain more raw training data.
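The augmentations listed above could be sketched roughly like this with NumPy. This is only an illustration under some assumptions: audio is a mono float waveform in a 1-D array, the helper names (`add_noise`, `add_clicks`, `change_speed`) and all parameter values are made up for the example, and the speed change uses plain linear-interpolation resampling, which also shifts pitch. A dedicated audio library would likely do a better job.

```python
import numpy as np


def add_noise(audio: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Add white Gaussian noise at the requested signal-to-noise ratio (dB)."""
    signal_power = np.mean(audio ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=audio.shape)
    return audio + noise


def add_clicks(audio: np.ndarray, n_clicks: int, rng: np.random.Generator,
               amplitude: float = 0.5) -> np.ndarray:
    """Add single-sample impulses ("clicks") at random positions."""
    out = audio.copy()
    positions = rng.integers(0, len(out), size=n_clicks)
    out[positions] += amplitude * rng.choice([-1.0, 1.0], size=n_clicks)
    return out


def change_speed(audio: np.ndarray, rate: float) -> np.ndarray:
    """Crude speed change: rate > 1 shortens (faster), rate < 1 lengthens (slower)."""
    n_out = int(round(len(audio) / rate))
    new_positions = np.linspace(0, len(audio) - 1, n_out)
    return np.interp(new_positions, np.arange(len(audio)), audio)


# Illustrative input: one second of a 440 Hz tone at 16 kHz standing in for speech.
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
clean = 0.1 * np.sin(2 * np.pi * 440 * t)

noisy = add_noise(clean, snr_db=20, rng=rng)
clicked = add_clicks(clean, n_clicks=5, rng=rng)
faster = change_speed(clean, rate=1.25)
slower = change_speed(clean, rate=0.8)
```

Each augmented clip keeps the original transcript, so the same labelled example can be reused several times with different perturbations.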

_j · October 14, 2023, 6:32pm

The validation loss, assuming you have a validation set that also consists of recordings and transcripts, is still improving even at the end.

If the validation set is truly representative of the types and variety of audio input the model will accept, and is interchangeable in quality with the training data, it would seem you can keep training, as long as you do not care about the world languages and other training reflected in the external WER audio dataset.
