Hi everyone!
I’m new to prompting and currently running some tests to improve my prompts using the OpenAI API. I’m using the `/v1/evals` endpoint to create evals, upload datasets, and execute test runs.
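For context, my flow starts with uploading the dataset as a JSONL file. A minimal sketch with the `openai` Python SDK, assuming `purpose="evals"` is the right purpose for eval datasets (the file name is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# Upload the eval dataset (JSONL, one test case per line).
# Assumption: purpose="evals" is correct for eval datasets.
dataset = client.files.create(
    file=open("test_cases.jsonl", "rb"),
    purpose="evals",
)
print(dataset.id)  # file_... id, referenced later when creating runs
```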
In my application, the model responses must follow a strict JSON schema. For example:
```json
{
  "schema": {
    "app_type": "news"
  }
}
```
I’d like to validate two things (sketched in Python below):

- The presence of specific keys (e.g., `schema`, `app_type`)
- The correctness of values (e.g., `app_type` must be `"news"` or `"shop"`)
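To make that concrete, this is the check I want the eval to apply to each response, written as plain Python; the allowed values are just the two from my example (the real list is longer):

```python
import json

# Allowed values for app_type (from my example; the real list is longer).
ALLOWED_APP_TYPES = {"news", "shop"}

def validate_response(raw: str) -> tuple[bool, str]:
    """Return (passed, reason) for one model response."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return False, f"not valid JSON: {e}"

    # Key-presence assertions.
    if "schema" not in data:
        return False, "missing key: schema"
    if "app_type" not in data["schema"]:
        return False, "missing key: schema.app_type"

    # Value assertion.
    app_type = data["schema"]["app_type"]
    if app_type not in ALLOWED_APP_TYPES:
        return False, f"unexpected app_type: {app_type!r}"

    return True, "ok"
```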
However, I’m stuck because:

- I don’t see a way to define or enforce a specific response format (like JSON) when running `/evals/{eval_id}/runs`
- The `testing_criteria` configuration doesn’t seem to support key/value assertions directly (the closest I found is sketched below)
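The closest I can get, if I’m reading the docs right, is a `string_check` criterion, but that compares the whole output string, so the reference has to match the serialized JSON byte for byte (key order, whitespace, everything). A sketch of what I mean, assuming I’m using the SDK surface correctly; the eval name, item schema, and expected field are all placeholders:

```python
from openai import OpenAI

client = OpenAI()

# string_check compares the *entire* output string, so the reference must
# match the serialized JSON exactly, which is not what I want.
ev = client.evals.create(
    name="app-type-check",  # placeholder name
    data_source_config={
        "type": "custom",
        "item_schema": {
            "type": "object",
            "properties": {"expected_app_type": {"type": "string"}},
            "required": ["expected_app_type"],
        },
        "include_sample_schema": True,
    },
    testing_criteria=[
        {
            "type": "string_check",
            "name": "exact JSON match",
            "input": "{{ sample.output_text }}",
            "operation": "eq",
            # Assumption: templates are allowed inside reference strings.
            "reference": '{"schema": {"app_type": "{{ item.expected_app_type }}"}}',
        }
    ],
)
```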
Is there a recommended way to handle this kind of structured response validation in OpenAI Evals? Should I write a custom evaluator (and if so, how) or is there a built-in method I’ve missed?
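The fallback I’m considering is to skip `testing_criteria` for this and grade locally: list the run’s output items and apply `validate_response()` from above to each sampled output. This assumes I’ve understood the output-items endpoint and response shape correctly (IDs are placeholders):

```python
# Fallback: pull the run's output items and grade client-side.
items = client.evals.runs.output_items.list(
    "run_...",           # run_id (placeholder)
    eval_id="eval_...",  # placeholder
)

for item in items:
    # Guess: the sampled model output text lives under sample.output,
    # a list of messages with a content field.
    text = item.sample.output[0].content
    passed, reason = validate_response(text)
    print(item.id, passed, reason)
```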
Any help would be greatly appreciated!