The keyword is the “essence” of the semantics.
It’s slightly counter-intuitive, and I fell into this pitfall at first as well. My first attempt was “Hot” and “Cold”, thinking they would be completely different. They are, but only along one particular axis of measurement. In essence they are very similar: they both describe temperature, they can both be used as measurements, they can both describe items or people, and they can both cause injury.
Imagine you wrote down “Hot” and “Cold”, or “Likes” and “Dislikes”, and then had to brainstorm all the meanings behind each word. You would find that they share a lot of the same characteristics.
So it is the same with “likes” and “dislikes”: in essence, both carry nearly the same meaning. The embedding model does not perform the logic that you intuitively want it to.
What you are looking for is two separate classifications: one for the sport and one for the preference. This can be done with embeddings and/or LLMs.
You can set anchor points in the embedding space and then see how your items compare against them. I’m not going to try to fluff the numbers: you can see that “no preference” isn’t perfect, so false positives are an issue.
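To make the anchor-point idea concrete, here is a minimal sketch of the comparison step. The 3-d vectors are made-up stand-ins for real embeddings (you would get the real ones from whatever embedding model you use), and `rank_anchors` and `PREFERENCE_ANCHORS` are names I invented for illustration. With real embeddings, expect the gap between the top two labels to be much smaller than in this toy data, which is exactly where the false positives come from.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_anchors(vector, anchors):
    """Return (label, similarity) pairs, most similar anchor first."""
    scores = [
        (label, cosine_similarity(vector, anchor))
        for label, anchor in anchors.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors, NOT real embeddings.
PREFERENCE_ANCHORS = {
    "likes": [0.9, 0.1, 0.2],
    "dislikes": [0.1, 0.9, 0.2],
    "no preference": [0.5, 0.5, 0.6],
}

# Pretend this is the embedding of a sentence like "I really enjoy tennis".
query = [0.8, 0.2, 0.3]

ranking = rank_anchors(query, PREFERENCE_ANCHORS)
best_label, best_score = ranking[0]
runner_up_label, runner_up_score = ranking[1]

# A small margin between the top two labels is a warning sign.
margin = best_score - runner_up_score
print(best_label, "vs", runner_up_label, "margin:", round(margin, 3))
```

Looking at the margin rather than only the top label gives you a way to flag uncertain items for a second pass instead of silently accepting them.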
You would also need to classify all the sports and perform the same comparison tests.
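As a rough sketch of running the two classifications side by side, the snippet below picks the nearest preference anchor and the nearest sport anchor independently and then joins the two labels. Everything here is an assumption for illustration: the 4-d vectors are invented toy data in which the two axes are cleanly separated (real embeddings will not be this tidy), and `nearest_label` and `classify` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_label(vector, anchors):
    """Label of the anchor with the highest cosine similarity."""
    return max(anchors, key=lambda label: cosine_similarity(vector, anchors[label]))


def classify(vector, preference_anchors, sport_anchors):
    """Run both classifications independently and join the results."""
    preference = nearest_label(vector, preference_anchors)
    sport = nearest_label(vector, sport_anchors)
    return f"{preference}-{sport}".upper()


# Hypothetical toy vectors, NOT real embeddings.
PREFERENCE_ANCHORS = {
    "likes": [1.0, 0.0, 0.0, 0.0],
    "dislikes": [0.0, 1.0, 0.0, 0.0],
}
SPORT_ANCHORS = {
    "tennis": [0.0, 0.0, 1.0, 0.0],
    "golf": [0.0, 0.0, 0.0, 1.0],
}

# Pretend this is the embedding of a sentence like "I can't stand golf".
query = [0.2, 0.7, 0.1, 0.6]

result = classify(query, PREFERENCE_ANCHORS, SPORT_ANCHORS)
print(result)  # prints "DISLIKES-GOLF"
```

Keeping the two anchor sets separate means a weak preference signal cannot drag the sport decision with it, and vice versa.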
You could also put this all together with a fine-tuned model that outputs {PREFERENCE-SPORT}; that’s up to you. Honestly, a base model would probably work fine.
But Completion will soon be gone, as OpenAI covers up the ability to spit out training data verbatim.