Quantitative methods for evaluating how changes to a manifest file impact response?

Like a lot of others on here, I’ve been mostly evaluating manifest files through trial and error, since there is currently no programmatic way to access ChatGPT plugins. I’m curious whether anyone has suggestions for quantitative measures of how a change to a manifest file affects the quality and content of the plugin model’s responses.

The manifest basically defines the API, including authentication and the location of the specification itself. The main logic lives in the specification, which is a Swagger/OpenAPI spec in either YAML or JSON.
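For reference, a minimal plugin manifest (`ai-plugin.json`) looks roughly like the sketch below; the names, URLs, and descriptions are placeholders, not a real plugin:

```json
{
  "schema_version": "v1",
  "name_for_human": "Todo Plugin",
  "name_for_model": "todo",
  "description_for_human": "Manage your todo list.",
  "description_for_model": "Plugin for managing a user's todo list. Use it to add, remove, and list todo items.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  },
  "logo_url": "https://example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://example.com/legal"
}
```

The `api.url` field points at the OpenAPI spec that carries the endpoint logic, while `description_for_model` is the free-text field the model reads to decide when and how to call the plugin.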

Thanks for the reply. I was mostly referring to the `description_for_model` parameter. Are people just using trial and error to see how slight changes to this description affect their results?