Hello, I’m using a fine-tuned ada model for multiclass classification of users’ messages. I have about 10 categories, and I use the category number as the completion, for example:
1 - message about rent
2 - message about work
{"prompt": "**APT For Rent**\uD83D\uDD25\nDowntown views 2 /T1\nLocation: Downtown \nType: 1 BR\nSize: 69 SQM\nFully furnished \nRent: AED 150k->", "completion": " 1"}
{"prompt": "Hi, I am in Dubai. looking for a job in a Japanese restaurant. work experience of about 10 years->", "completion": " 2"}
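For reference, here is a minimal sketch of how training lines in this format could be generated. The `CATEGORIES` map and the `to_record` helper are hypothetical names used only for illustration; the `->` separator and the leading space in the completion are taken from the two examples above.

```python
import json

# Hypothetical label map; only categories 1 and 2 are spelled out in this post.
CATEGORIES = {"rent": 1, "work": 2}

# Separator that ends every prompt in the examples above.
SEPARATOR = "->"


def to_record(message: str, category: str) -> str:
    """Return one JSONL line in the prompt/completion format shown above."""
    record = {
        "prompt": message + SEPARATOR,
        # The completion is the category number with a leading space.
        "completion": " " + str(CATEGORIES[category]),
    }
    # json.dumps escapes non-ASCII characters (such as emoji) by default.
    return json.dumps(record)


if __name__ == "__main__":
    print(to_record("Hi, I am in Dubai. looking for a job in a Japanese restaurant.", "work"))
```

Each call produces one line of the training file, so the whole set can be written by joining the results with newlines.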
How can I train the model to handle data that doesn’t fit any of the categories?
For example:
“In honor of the holiday, there is a discount in our supermarket from June 15 to June 28 for a number of products. Free shipping over AED 150 anywhere in Dubai” is a message about a discount. The model still assigns this message, and other unrelated ones, completion 1 or 2.
These messages can be on completely different topics, and I can’t anticipate all of them in order to include them in the training set.
Out of the many messages, I only want the ones that best fit my categories. I have about 100 unique examples of each class in my training set. My model works fine on data that fits into one of the categories. But what should I do with unclassifiable data, which makes up more than 80% of my messages?