Error: compute_classification_metrics

I am working with a multiclass classification dataset that has 42 unique classes and 10,000 prompt-completion pairs. The same 42 class labels are repeated across the completions, and this seems to be causing an issue.
Specifically, I encountered the error message "Error: If compute_classification_metrics is True, each of the classes must start with a different token. Fine-tune failed."
I am seeking help and guidance on how to resolve this issue. Any assistance would be greatly appreciated.

Hi @Raviteja

In my understanding, the error simply means that the “classes” you are classifying into must each start with a different token, meaning that no two "completion" values may begin with the same token. Typically, adding a whitespace before the beginning of the completion (in your case, the class) should help. Also make sure your classes are all in lowercase.

You can also use a tokenizer to check that no two classes share the same first token.
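
For example, here is a rough sketch of that check in Python. The function name `first_token_collisions` and the `toy_encode` stand-in are my own, made up for illustration; `toy_encode` is NOT the real tokenizer, it only lets the snippet run without extra packages. In practice you would pass in the encode function of whatever tokenizer matches your base model.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Sequence


def first_token_collisions(
    classes: Iterable[str],
    encode: Callable[[str], Sequence[Hashable]],
    prefix: str = " ",
) -> Dict[Hashable, List[str]]:
    """Group class labels by the first token of `prefix + label`.

    Returns only the groups in which two or more labels share a first token.
    `prefix` defaults to a single space, matching the advice to start each
    completion with a whitespace.
    """
    groups: Dict[Hashable, List[str]] = defaultdict(list)
    for label in classes:
        tokens = encode(prefix + label)
        if not tokens:
            raise ValueError(f"label {label!r} produced no tokens")
        groups[tokens[0]].append(label)
    return {tok: labels for tok, labels in groups.items() if len(labels) > 1}


def toy_encode(text: str) -> List[str]:
    # Hypothetical stand-in tokenizer for demonstration ONLY: it cuts the text
    # into 4-character chunks and does not reflect how a real model tokenizes.
    return [text[i:i + 4] for i in range(0, len(text), 4)]


if __name__ == "__main__":
    labels = ["class_1", "class_2", "other"]
    for token, clashing in first_token_collisions(labels, toy_encode).items():
        print(f"first token {token!r} is shared by: {clashing}")
```

If you have the `tiktoken` package installed, you could pass `tiktoken.get_encoding("r50k_base").encode` as `encode` instead, assuming that encoding is the one your base model uses. Any labels the check reports would need to be renamed so that they begin differently.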