I’ve got about 3 years of general NLP experience, having worked extensively with entity/intent-based NLU, worked a bit on ML NER and text classification models, and, of late, read a lot about transformer technology and played in the OpenAI Playground for the past week. On the other hand, I’ve been programming for 20 years and feel very comfortable with “traditional” rule-based AI.
In my bits and pieces of experience, ML models are better at extracting entities than at classifying large bodies of text. Sure, sentiment analysis with 4 or 5 sentiment labels is reasonably “easy”, but in the past, when it came to classifying text against about 30 labels, it became increasingly hard to train models that reach, say, 98% accuracy. Yet for the same problem, extracting entities and logically deducing a text category based on which entities were identified seems to be a simpler and more effective solution.
Suppose you had 30 categories to classify against, and the text to be classified is a paragraph of 1 to 6 sentences, yet you only have a handful of examples per category (15 at most, with many categories having only one or two). On top of that, the categories aren’t exactly “crisp”: they overlap (like Rain, Flood, TropicalStorm, Hurricane), which often produces texts that even human beings can spend hours arguing over. For instance, ought “Excessive rain and wind damage due to Hurricane Harvey resulted in river banks to overflow…” to fall under “Flood”, or is the proximate cause the “Rain”, or is the main event the “Hurricane”? Given the amount of training data and the complexity of the task, would OpenAI’s Davinci engine reach 98% accuracy?
My thought is to simply identify causes, therefore extracting multiple causes per text, and then use a rule-based system that defines which category applies for a given combination of causes: for example, Hurricane + Flood = “Flood”, yet Hurricane + Rain = “TropicalStorm”, and Hurricane without (Flood or Rain) = “Hurricane”.
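To make the idea concrete, here’s a minimal sketch of what I mean by the rule layer, assuming the entity extraction step has already produced a set of cause labels (the rule table and category names are just my examples from above, not a real system):

```python
def classify(causes: set[str]) -> str:
    """Apply ordered rules to a set of extracted cause labels; first match wins."""
    # Hurricane combined with Flood is dominated by the flooding outcome.
    if {"Hurricane", "Flood"} <= causes:
        return "Flood"
    # Hurricane with Rain (but no Flood) reads as a tropical storm event.
    if {"Hurricane", "Rain"} <= causes:
        return "TropicalStorm"
    # Hurricane on its own stays a Hurricane.
    if "Hurricane" in causes:
        return "Hurricane"
    # Fall back to the single extracted cause, or a default when none matched.
    return next(iter(sorted(causes)), "Unknown")
```

The point is that the ambiguity-resolving logic lives in ordinary, testable code, while the ML model only has to do the (easier) extraction job.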
Playing with Curie and Davinci, I’m starting to think logic isn’t their main strength and is, in some circumstances, better left to “normal” code. Does anybody share this thinking?
How wrong am I? I’d love to hear other people’s thoughts.