Finetuning and getting F1 scores

Hello everyone,

This seems like a really nice community, so I thought I would reach out. I am experimenting with OpenAI for Q&A purposes and can fine-tune various models without any problems.

However, when I attempt to use validation data and the parameters that enable F1 score generation, I run into problems.

When I use this
openai tools fine_tunes.prepare_data -f "discriminator_train.jsonl"

I am allowed to create a training and validation set, all good. I then use the command suggested to me by the preparation tool, which is this:

openai api fine_tunes.create -t "discriminator_train_prepared_train.jsonl" -v "discriminator_train_prepared_valid.jsonl" -m ada --compute_classification_metrics --classification_n_classes 152

but I get errors which appear to relate to classification_n_classes - even though this figure (152) was suggested by the tool initially.

The number of classes in file-EGeCZs5FMe5kg0NYzkDcUTDj does not match the number of classes specified in the hyperparameters.
The number of classes in file-hZOFnFZtv7dEzOD6YjOYA6d3 does not match the number of classes specified in the hyperparameters.

There does not appear to be any documentation around this online.

@kpeyton Did you ever find an answer, or documentation, about the “classification_n_classes” hyperparameter?

Edit - After reading this, I wonder if perhaps it wants your training file to include 152 different potential completions. If so, this requirement seems fraught, b/c they acknowledge most finetuning models will be trained with multiple files… but surely each file wouldn’t need to have 152 different completions?

Can you share any details about your training file to help validate or rebut this assumption?

I am having the same problem. I used these commands suggested by OpenAI: one with split training and validation datasets, and the other with only the training dataset (no split).

!openai api fine_tunes.create -t "clinical_trials_labelled_dataset_prepared_train.jsonl" -v "clinical_trials_labelled_dataset_prepared_valid.jsonl" --compute_classification_metrics --classification_n_classes 4
!openai api fine_tunes.create -t "clinical_trials_labelled_dataset_prepared.jsonl" --classification_n_classes 4

But I still keep getting:
The number of classes in <fine_tune_file_id> does not match the number of classes specified in the hyperparameters.

Any ideas on what else we can try to get this fine tune working?

Best Regards,
Dilip

@diliprk Did you fix the problem? I am getting exactly the same error, but I am creating a classification model with 54 classes.

Thanks!!

Best,

Rocío

“I wonder if perhaps it wants your training file to include 152 different potential completions.”

I don’t think this is the case.

I’ve just faced the same scenario: the prepare_data tool suggested using 96 for the classification_n_classes parameter.

I’ve checked my data, which has many more entries, but only 96 unique completions.

I sent the job and got the same error.

I’m stuck right now.

Tomorrow I may try tweaking the number until OpenAI is happy, and if I get any relevant results, I’ll update here.

The docs seem to lack precise information on how this value is meant to be used.

I’m now getting the same error after the tool recommended using 135. I tried 134 and 136 in case something was off by 1, but to no avail.

Is this just a bug?

I too am getting this error. I run the prepare data tool:

openai tools fine_tunes.prepare_data -f [LOCAL_FILE]

which suggests 7 as the number of classes. I verified and enforced that this was true in my data, and there are 7 unique classes.

I then run the exact command the tool gives me, which is:

openai api fine_tunes.create -t [file] -v [file] --compute_classification_metrics --classification_n_classes 7

and I receive this error:

The number of classes in [file-id] does not match the number of classes specified in the hyperparameters.

Has anyone been able to resolve this? This is my first time using the OpenAI API and I spent a day labeling about a thousand emails. I have tried giving various inputs for classification_n_classes, tried removing categories with a low sample count, and searched quite a bit for a solution.

If anyone has a suggestion to work around this it would be greatly appreciated.

Wow, after over a day of struggling with this I found the issue (for me at least). I had made a mistake when labeling my data. I had labels numbered 1-14, but there was a single item which had the label 09 instead of 9.

The following command helped me find this error:
cat dataset.jsonl | jq -r '.completion' | sort | uniq

I hope this helps someone else.
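
For anyone without jq installed, the same check can be sketched in Python. This is only a rough equivalent, assuming each line of the file is a JSON object with a "completion" key (the same field the jq command reads). The function names are my own, not part of the openai tooling. Printing each label with repr() makes stray whitespace or a leading zero (like 09 vs 9) easier to spot, and the per-label counts show which classes have very few samples.

```python
import json
from collections import Counter


def completion_counts(path):
    """Count how often each distinct completion string occurs in a JSONL file."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:  # skip blank lines
                continue
            counts[json.loads(line)["completion"]] += 1
    return counts


def print_report(counts):
    """Print every distinct completion with its count, then the total number of classes."""
    # repr() exposes differences that look identical on screen, e.g. ' 09' vs ' 9'
    for label, n in sorted(counts.items()):
        print(f"{label!r}: {n}")
    print(f"{len(counts)} distinct completions")
```

Calling print_report(completion_counts("dataset.jsonl")) lists every label. If the final count differs from the number you expected, one of the labels is probably a near-duplicate of another.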

My issue ended up being that my validation file did not contain examples of all of the training file classes.
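
A quick way to check for this before submitting a job is to compare the set of completions in the two prepared files. Here is a minimal sketch, again assuming one JSON object per line with a "completion" key. The helper names are hypothetical, and I have not confirmed that this mismatch is the only thing the API checks.

```python
import json


def load_classes(path):
    """Return the set of distinct completion strings in a JSONL file."""
    classes = set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():  # ignore blank lines
                classes.add(json.loads(line)["completion"])
    return classes


def compare_splits(train_path, valid_path):
    """Report class counts per file and the classes present in only one of them."""
    train = load_classes(train_path)
    valid = load_classes(valid_path)
    return {
        "n_train": len(train),
        "n_valid": len(valid),
        "missing_from_valid": sorted(train - valid),
        "missing_from_train": sorted(valid - train),
    }
```

For example, compare_splits("discriminator_train_prepared_train.jsonl", "discriminator_train_prepared_valid.jsonl") should return two empty "missing" lists and equal counts if both files cover the same classes. Anything listed under missing_from_valid is a class with no validation examples.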