How can I measure JSON output accuracy?

I agree. The goal is not to replace the human, but to augment the human or speed them up.

I still need to tune the number of runs so that the results hopefully even out.

But before doing that, I'm trying to figure out how to quantify the difference between an automatically generated JSON object and the human-created JSON object.
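One possible way to put a number on that difference, as a rough sketch rather than an established metric: flatten both objects into (path, value) leaves, then count how many leaves match exactly. The function names `flatten_leaves` and `leaf_scores` are made up for this example, and exact matching of leaf values is an assumption that may be too strict for free-text fields.

```python
def flatten_leaves(value, path=()):
    """Return a dict mapping each leaf's path (a tuple of keys/indices) to its value."""
    if isinstance(value, dict) and value:
        leaves = {}
        for key, child in value.items():
            leaves.update(flatten_leaves(child, path + (key,)))
        return leaves
    if isinstance(value, list) and value:
        leaves = {}
        for index, child in enumerate(value):
            leaves.update(flatten_leaves(child, path + (index,)))
        return leaves
    # Scalars, and empty dicts/lists, are treated as leaves.
    return {path: value}


def leaf_scores(generated, reference):
    """Precision, recall and F1 over exactly matching leaves (assumed metric)."""
    gen = flatten_leaves(generated)
    ref = flatten_leaves(reference)
    # A leaf counts as a match only if the same path holds an equal value.
    matches = sum(1 for p, v in gen.items() if p in ref and ref[p] == v)
    precision = matches / len(gen) if gen else 0.0
    recall = matches / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example data, not taken from any real run.
human = {"name": "Ada", "tags": ["a", "b"], "age": 36}
auto = {"name": "Ada", "tags": ["a", "c"], "age": 36, "extra": None}

precision, recall, f1 = leaf_scores(auto, human)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Precision here penalizes fields the generator invented, and recall penalizes fields it missed. Two caveats with this sketch: list items are compared by position, so a reordered list counts as wrong, and Python's `==` treats `True` and `1` as equal. Either could be tightened or loosened depending on what a "correct" field means for your data.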