Understanding fine-tuning models and (mis)classification

Hi, I followed the Cookbook recipe that fine-tunes an ada classifier to distinguish between two sports: Baseball and Hockey.

It works with examples that belong to either of those categories. However, when I send a Completion request with the text "Cristiano Ronaldo", the result is soccer or even some other category. I thought the output would only ever be hockey or baseball. I need some enlightenment here, please.

My understanding is that the fine-tuning process creates a new model (a classifier) that analyzes a text to find its category, and there should only be two categories. Am I missing something? I tried to find any mention of misclassifications in the forum/documentation, without any luck.

Or perhaps I should fine-tune the model with 3 classes: hockey, baseball, and unknown?

Thank you for your insights, I am amazed by this technology and want to learn more about usage, limitations, everything! :slightly_smiling_face:

Hey @luis.beltran. Welcome to the community! :slight_smile:

This is a great question indeed. When fine-tuning one of the base OpenAI models, you’re not changing the underlying architecture of the neural net. In prior fine-tuning paradigms, we’d usually take a pre-trained seq-to-seq model (like any of OpenAI’s models), stack a classifier layer on top of it (usually a linear projection over the number of classes + softmax to output probabilities) and train the weights of this new layer via supervised learning while keeping the whole pre-trained net frozen.
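For contrast, here is a minimal sketch of that classic paradigm in PyTorch. The module and the stand-in "pre-trained" encoder are illustrative assumptions, not anything OpenAI exposes:

```python
import torch
import torch.nn as nn

class FrozenBackboneClassifier(nn.Module):
    """Classic paradigm: frozen pre-trained net + small trainable head."""

    def __init__(self, backbone: nn.Module, hidden_size: int, num_classes: int = 2):
        super().__init__()
        self.backbone = backbone
        for param in self.backbone.parameters():
            param.requires_grad = False   # keep the pre-trained weights frozen
        self.head = nn.Linear(hidden_size, num_classes)  # the only new layer

    def forward(self, inputs: torch.Tensor) -> torch.Tensor:
        features = self.backbone(inputs)       # (batch, hidden_size)
        logits = self.head(features)           # (batch, num_classes)
        return torch.softmax(logits, dim=-1)   # class probabilities

# Stand-in "pre-trained" encoder, just so the sketch runs end to end.
pretrained = nn.Linear(16, 16)
model = FrozenBackboneClassifier(pretrained, hidden_size=16, num_classes=2)
probs = model(torch.randn(4, 16))  # 4 inputs -> probabilities over 2 classes
```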

However, this is not what you do with OpenAI models. You still change the weights (probably just the last layers; still a linear projection + softmax over the vocabulary, after all), but not the architecture of the underlying model. It is still a pure seq-to-seq decoder transformer. So, given an input text, it will produce an output text.

Given enough training data, you can bias this generation via supervised learning to significantly increase the probability of producing only the tokens that you’re interested in from a classification perspective: in your example, the tokens that compose the words Baseball and Hockey. In fact, it seems that overfitting these models is pretty easy. However, you cannot ensure that the probability of producing any other token is zero. It is still a seq-to-seq model after all, not a pure classification model. And it still has some world knowledge even after your fine-tuning process. So it knows that Cristiano Ronaldo is neither Hockey nor Baseball.
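You can actually see this in the token probabilities. Here’s a minimal sketch, assuming the legacy (pre-1.0) openai Python package and a hypothetical fine-tuned model id; the Cookbook recipe uses the same max_tokens=1 + logprobs trick:

```python
import openai

# Hypothetical fine-tuned model id; replace with your own "ada:ft-..." name.
response = openai.Completion.create(
    model="ada:ft-your-org-2023-01-01-00-00-00",
    prompt="Cristiano Ronaldo\n\n###\n\n",  # same separator used in training
    max_tokens=1,     # the class label is expected to be a single token
    temperature=0,
    logprobs=5,       # return the 5 most likely tokens at that position
)

choice = response["choices"][0]
print("generated:", choice["text"])
# top_logprobs exposes the head of the distribution: nothing forces the
# probability of tokens outside {" baseball", " hockey"} to be zero.
print("top tokens:", choice["logprobs"]["top_logprobs"][0])
```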

So yeah, there are some alternatives here:

  • The one you’re proposing is completely fine. Just have a third category for everything that is not Baseball or Hockey :slight_smile:.
  • Measure some sort of similarity between the produced output and the classes that you expect, and reject the output if it is not close enough to any of them (see the sketch below this list). Any sort of metric will do the job here: embeddings, Levenshtein distance, regex…
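As a sketch of that second option, here is one way to do the check with nothing but the standard library (difflib’s similarity ratio; the 0.8 threshold is an arbitrary assumption you’d tune on your own data):

```python
from difflib import SequenceMatcher

CLASSES = ["baseball", "hockey"]

def snap_to_class(output: str, threshold: float = 0.8) -> str:
    """Map a raw completion to the closest known class, or 'unknown'."""
    cleaned = output.strip().lower()
    # Similarity ratio in [0, 1] against each expected label.
    scores = {label: SequenceMatcher(None, cleaned, label).ratio()
              for label in CLASSES}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else "unknown"

print(snap_to_class(" hockey"))  # -> "hockey"
print(snap_to_class(" soccer"))  # -> "unknown"
```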

Hope that helps!


Thank you for sharing your experience with fine-tuning an Ada classifier for baseball and hockey. It’s great that the model works well on examples that belong to those categories. As for “Cristiano Ronaldo” coming back as soccer or other categories, that’s expected when an input falls outside the two classes your training data covers: the model falls back on its pre-trained knowledge. Fine-tuning the model with three classes, including an “unknown” category, could be a solution. Keep exploring the technology and don’t hesitate to ask for help when needed!
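If you go the three-class route, the training file for a legacy fine-tune is plain JSONL with prompt/completion pairs. A minimal sketch (the separator, label spellings, and example rows are assumptions to adapt to your own dataset):

```python
import json

SEPARATOR = "\n\n###\n\n"  # the same separator must end prompts at inference

# Illustrative rows only; a real fine-tune needs many examples per class.
examples = [
    ("The pitcher threw a no-hitter last night.", " baseball"),
    ("He scored a hat-trick in the third period.", " hockey"),
    ("Cristiano Ronaldo signed with a new club.", " unknown"),
]

with open("sport_classifier.jsonl", "w") as f:
    for text, label in examples:
        # Leading space on the completion matches how labels are tokenized.
        f.write(json.dumps({"prompt": text + SEPARATOR,
                            "completion": label}) + "\n")
```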