Howdy!
Hopefully this post is not already covered by someone else. As I’m sure many others have observed, a major use of the API is identifying semantic duplicates in various ways. I work in the print mail industry, so for example, a client sends us a collection of files containing information about the people to mail, and we need to extract the address information and put it through some postal sorting software. One file has fields such as FNAME, ADDRESS1, etc., whereas another has first, street_address, etc. Rather than building import templates for all of these different file structures, or trying to write a very long set of search rules to somehow put them all together, it is great to simply pass the API a list of the fields I want to find and a list of the fields I have, and ask it to match them up. I then take the returned JSON and use it to rename the fields, and voila, I no longer need hundreds of import templates or rules (we have a lot of clients and a lot of file structures to deal with). It is a little bit more complicated than that, but the example should be clear enough.
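For anyone who wants to picture the workflow, here is a rough Python sketch of the two ends of it: building the request text and applying the returned JSON to rename fields. The function names, the prompt wording, and the shape of the JSON (source field mapped to target field) are all my own assumptions for illustration, and the reply string is hand-written rather than real model output. The actual API call is left out.

```python
import json


def build_match_prompt(wanted, available):
    """Build the text sent to the model (hypothetical wording)."""
    return (
        "Match each target field to the source field with the same meaning.\n"
        "Reply with only a JSON object mapping source field name to target field name.\n"
        f"Target fields: {json.dumps(wanted)}\n"
        f"Source fields: {json.dumps(available)}"
    )


def rename_fields(rows, reply_json):
    """Rename keys in each row using the JSON mapping; unmapped keys are kept as-is."""
    mapping = json.loads(reply_json)
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]


# Hand-written stand-in for the model's reply (an assumption, not real output).
reply = '{"FNAME": "first_name", "ADDRESS1": "street_address"}'
rows = [{"FNAME": "Ada", "ADDRESS1": "1 Main St", "ZIP": "12345"}]
renamed = rename_fields(rows, reply)
```

Fields the mapping does not mention (ZIP here) pass through unchanged, which makes it easy to spot anything the match missed.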
There are so many uses for this sort of generic semantic matching that I feel like there ought to be some sort of pre-built feature in the API, a callable option for this kind of matching, something like this:
Function name: semantic match
Right: first list of values to match
Left: second list of values to match
Match type: {"all": "every value in both the right and left must be matched",
"left": "every value in the left list must be matched",
"right": "every value in the right list must be matched"
}
Dupe=False: if True, each item can match only one item in the other list
Returns: A JSON object indicating matches between right and left.
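To make the options above concrete, here is a minimal Python sketch of how the match type and Dupe rules could be checked against whatever matches come back. Everything here is hypothetical: the names semantic_match and check_matches, the pair format, and the returned JSON shape are my guesses, and matcher is a stand-in for the model call that actually proposes the pairs.

```python
from collections import Counter


def check_matches(matches, right, left, match_type="all", dupe=False):
    """Return a list of rule violations for (right_item, left_item) pairs."""
    if match_type not in ("all", "left", "right"):
        raise ValueError(f"unknown match type: {match_type}")
    problems = []
    matched_right = Counter(r for r, _ in matches)
    matched_left = Counter(l for _, l in matches)
    # "all" and "right" require every right value to appear in some pair.
    if match_type in ("all", "right"):
        problems += [f"unmatched right: {r}" for r in right if r not in matched_right]
    # "all" and "left" require every left value to appear in some pair.
    if match_type in ("all", "left"):
        problems += [f"unmatched left: {l}" for l in left if l not in matched_left]
    # Dupe=True, as described above, limits each item to a single match.
    if dupe:
        problems += [f"right matched more than once: {r}"
                     for r, n in matched_right.items() if n > 1]
        problems += [f"left matched more than once: {l}"
                     for l, n in matched_left.items() if n > 1]
    return problems


def semantic_match(right, left, matcher, match_type="all", dupe=False):
    """Call the supplied matcher, enforce the rules, return a JSON-style dict."""
    matches = [tuple(pair) for pair in matcher(right, left)]
    problems = check_matches(matches, right, left, match_type, dupe)
    if problems:
        raise ValueError("; ".join(problems))
    return {"matches": [{"right": r, "left": l} for r, l in matches]}
```

Passing the matcher in as an argument keeps the rule checking separate from the model call, so the same checks work no matter how the pairs were produced.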
At this point, I’ve basically built my own API calls to do all this, but it is such a generic utility, and I keep re-applying it across tasks, that I feel like it should just be directly available through the API. Obviously, my suggested names might be bad, and if you all know of something like this already, I’d be glad to hear about it, but I figured I’d throw out my suggestion.