Match unstructured text to the “best” keyword

I hate asking this, but I don’t know enough to even start to understand where to figure this out. Please don’t see this as looking for an answer … but if you can give guidance on where to research, I’d appreciate it!

I have a list of companies with associated descriptions. Something like “Joe’s Flowers” in column one and “flower boutique flower shop gift shop” in column two.

Separately I have a list of 3,000 approved categories: “Automotive Dealership” and “Flower and Gift Shops”, for example.

I need help taking the “description” and finding the “best” match. I was hoping something here would help, but after a few hours poking around, I’m not seeing it.

Again, not looking for a solution, just maybe a general area to start exploring.

Thanks much.

This may not be the best use case for GPT-3. What you’re probably looking for is semantic vector matching with a sentence-embedding model such as Google’s Universal Sentence Encoder (USE v4) or BERT. Basically, you convert the descriptions and categories into semantic vectors (embeddings) and then use a similarity measure such as cosine similarity to match each description to its closest category.
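To make the idea concrete, here is a minimal sketch of the "turn text into vectors, then pick the closest" pipeline. It is an illustration under stated assumptions, not a real semantic matcher: `embed` is a hypothetical toy bag-of-words function standing in for a model like USE or BERT, and `best_category` is a made-up helper name. With a real embedding model you would swap out `embed` and keep the rest of the structure.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic embedding (ASSUMPTION: bag-of-words counts).

    A real system would call a sentence-embedding model such as USE or BERT here.
    """
    words = re.findall(r"[a-z]+", text.lower())
    # Crude plural stripping so "shops" and "shop" count as the same token.
    return Counter(w[:-1] if w.endswith("s") and len(w) > 3 else w for w in words)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_category(description, categories):
    """Return the category whose vector is closest to the description's vector."""
    desc_vec = embed(description)
    return max(categories, key=lambda c: cosine_similarity(desc_vec, embed(c)))


categories = ["Automotive Dealership", "Flower and Gift Shops"]
print(best_category("flower boutique flower shop gift shop", categories))
```

On the example from the question this picks “Flower and Gift Shops”, but only because the words overlap. A real embedding model is what lets descriptions match categories that share meaning without sharing words. With 3,000 categories you would compute the category vectors once and reuse them for every description.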

This is amazing. Thank you for taking the time to link up those articles … just having some terms to search for is incredible.

Time for some experimentation!
