Custom label classification

I have inputs such as the following:

  • search wikipedia for polaris
  • search wiki for german shepherd
  • search lyrics for the portal song

These will be categorized with a fixed number of subjects but unlimited predicates. I would like to be able to classify the subjects appropriately against the known label list, but also know when invalid input is entered (such as: “search frogs for blue”).

How do I define these custom labels and get them to associate properly?

When I compare inputs using cosine similarity, I get mixed results. Is there a way to improve the results, or is there a different approach I should take?

Thank you for your time.

I'm not sure I fully understand the use case, but it seems like you are looking to do zero-shot text classification.
You also mentioned cosine similarity: are you converting the inputs to embeddings now?
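
For illustration, here is a minimal sketch of one way to combine a fixed label list with a similarity threshold so that unknown subjects are rejected. Everything in it is an assumption rather than anything from the original posts: the `LABELS` table, the "search <subject> for <predicate>" pattern, the 0.5 threshold, and especially `embed`, which is a toy character-trigram stand-in for a real embedding model. With real embeddings the threshold would need to be tuned on your own data.

```python
import math
import re
from collections import Counter

# Hypothetical label list: each known subject with a few example phrasings.
LABELS = {
    "wikipedia": ["wikipedia", "wiki"],
    "lyrics": ["lyrics", "song lyrics"],
}

# Assumed input shape: "search <subject> for <predicate>".
COMMAND = re.compile(r"^search\s+(?P<subject>.+?)\s+for\s+(?P<predicate>.+)$",
                     re.IGNORECASE)


def embed(text):
    """Toy stand-in for an embedding model: character trigram counts."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, threshold=0.5):
    """Return (label, predicate); label is None when no known subject fits."""
    match = COMMAND.match(text.strip())
    if match is None:
        return None, None  # input does not follow the expected pattern
    subject = match.group("subject")
    predicate = match.group("predicate")

    subject_vec = embed(subject)
    best_label, best_score = None, 0.0
    for label, examples in LABELS.items():
        # Score a label by its closest example phrasing.
        score = max(cosine(subject_vec, embed(e)) for e in examples)
        if score > best_score:
            best_label, best_score = label, score

    # Reject subjects that are not close enough to any known label.
    if best_score < threshold:
        return None, predicate
    return best_label, predicate


if __name__ == "__main__":
    for line in ["search wikipedia for polaris",
                 "search wiki for german shepherd",
                 "search lyrics for the portal song",
                 "search frogs for blue"]:
        print(line, "->", classify(line))
```

The point of the design is that only the subject is compared against the labels; the predicate is passed through untouched, so it can be anything. Comparing the whole sentence instead tends to blur the score, which may be one reason for mixed cosine similarity results.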

Something like this? OpenAI API

This looks exactly like what I’m looking for. How did you construct this, and can you point me to documentation that would help me understand the syntax involved?

Thank you again for your time.

This comes more from common logic, linguistics and programming than from documentation. I think it would be better to have a chat (I can spare about 30-45 minutes) to discuss the case and the approach, as it would take me much longer to explain it in writing here. Just send me an email at serge@techspokes.com to get that scheduled.