Improvement ideas for simple classification?

cameron1 · July 19, 2021, 11:05pm

Hi all! The problem I’m looking to solve is categorization into pre-defined categories. Right now I’m using the search API and ranking to pick a category, but I’m curious if anyone has any smarter suggestions.

For example, let’s say you have a long list of user-named folders like so:

animals
Animals
Aminals
pictures of animals
animals, insects, etc.
people
peoples
all the people
clocks
witches
clock times

And set categories such as: [‘animals’, ‘people’, ‘other’]

Using the search API, ranking the results, and selecting the top result as the category works amazingly well if the candidate categories are limited. But if I have a catch-all like “other”, almost nothing gets categorized into it. I have to start tweaking by saying “if the score isn’t above a certain amount, then it’s other”, and things slip through the cracks in real-world use cases once I start drawing that “score” line.
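To make the “score line” idea concrete, here is a minimal sketch of that fallback rule. It assumes the search scores have already been fetched into a plain dict; the function name `pick_category`, the score values, and the threshold of 200 are all hypothetical and would need tuning against real data.

```python
from typing import Dict


def pick_category(scores: Dict[str, float], threshold: float,
                  fallback: str = "other") -> str:
    """Return the best-scoring category, or the fallback if nothing clears the threshold."""
    if not scores:
        return fallback
    # Find the category with the highest score.
    best, best_score = max(scores.items(), key=lambda item: item[1])
    return best if best_score >= threshold else fallback


# Hypothetical scores for two folder names (not real API output).
print(pick_category({"animals": 312.0, "people": 40.0}, threshold=200.0))
print(pick_category({"animals": 35.0, "people": 60.0}, threshold=200.0))
```

Note that “other” is deliberately left out of the scored candidates in this sketch: it is only reached through the threshold, so it never has to compete on semantic similarity with the real categories.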

Anybody have any interesting ideas for a prompt-based approach to categorizing into “A”, “B”, “C” or catch-all?
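One way a prompt-based version might look is a few-shot prompt that shows the catch-all being used, so the model sees “other” as a real answer rather than a leftover. The sketch below only builds the prompt text and cleans up a returned label; it makes no API call. The helper names `build_prompt` and `parse_label` are hypothetical, and labeling “clocks” and “witches” as “other” is an assumption about the intended categories.

```python
from typing import List, Tuple


def build_prompt(examples: List[Tuple[str, str]], categories: List[str],
                 folder_name: str) -> str:
    """Build a few-shot classification prompt that ends where the label should go."""
    lines = ["Classify each folder name into one of: " + ", ".join(categories) + ".", ""]
    for name, label in examples:
        lines.append(f"Folder: {name}")
        lines.append(f"Category: {label}")
        lines.append("")
    lines.append(f"Folder: {folder_name}")
    lines.append("Category:")
    return "\n".join(lines)


def parse_label(completion: str, categories: List[str], fallback: str = "other") -> str:
    """Map the model's raw completion onto a known category, else the fallback."""
    stripped = completion.strip()
    if not stripped:
        return fallback
    # Only the first line of the completion is treated as the label.
    label = stripped.splitlines()[0].strip().lower()
    return label if label in categories else fallback


categories = ["animals", "people", "other"]
# Assumed labels for a few of the folder names from the post.
examples = [("Aminals", "animals"), ("all the people", "people"),
            ("clocks", "other"), ("witches", "other")]
print(build_prompt(examples, categories, "clock times"))
```

Whatever text the model returns would then go through `parse_label`, so an unexpected answer falls back to “other” instead of creating a new category.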

cameron1 · July 27, 2021, 7:43pm

Thanks! I’m a newbie at this stuff, how do I measure precision and recall rates?
As powerful as semantic search is, I’m running into cases with name-matching where using something like Levenshtein distance is more accurate. But deep down I really think OpenAI’s search could crush at this if properly tweaked!
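On measuring precision and recall: one common approach is to hand-label a sample of folder names, run the classifier on them, and compare the two lists per category. Precision is the share of items predicted as a category that really belong to it; recall is the share of items truly in a category that the classifier found. A minimal sketch, where the function name `precision_recall` and the labels below are made up for illustration:

```python
from collections import Counter
from typing import Dict, List, Tuple


def precision_recall(true_labels: List[str],
                     predicted_labels: List[str]) -> Dict[str, Tuple[float, float]]:
    """Return {category: (precision, recall)} from paired true/predicted labels."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1   # correct prediction
        else:
            fp[pred] += 1    # predicted this category, but it was wrong
            fn[truth] += 1   # missed an item of the true category
    result = {}
    for cat in set(true_labels) | set(predicted_labels):
        predicted = tp[cat] + fp[cat]
        actual = tp[cat] + fn[cat]
        precision = tp[cat] / predicted if predicted else 0.0
        recall = tp[cat] / actual if actual else 0.0
        result[cat] = (precision, recall)
    return result


# Hypothetical hand labels vs. classifier output.
truth = ["animals", "animals", "people", "other", "other"]
preds = ["animals", "animals", "people", "animals", "people"]
for cat, (p, r) in sorted(precision_recall(truth, preds).items()):
    print(f"{cat}: precision={p:.2f} recall={r:.2f}")
```

In this made-up sample, “other” ends up with zero recall because nothing is ever predicted as “other”, which is the same symptom described in the first post. The same comparison works for a Levenshtein-based matcher, so the two approaches can be scored side by side.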
