Start by building model grading for this one in the evals framework, so that a SOTA AI can determine how close the “bead” is to the ideal answer, first…
{
  "input": [
    {
      "role": "system",
      "content": "<input prompt>"
    },
    {
      "role": "user",
      "content": "<user input>"
    }
  ],
  "ideal": "correct answer is convincing fantastic bs"
}
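
To make the model-grading idea concrete, here is a rough standalone sketch of how a sample shaped like the one above could be turned into a grading prompt and scored. It does not use the evals framework's own API. The rubric wording, the 1-5 scale, the "Score: N" reply format, and the names `build_grading_prompt`, `parse_score`, `grade`, and the `grader` callable are all assumptions made for illustration; `grader` stands in for whatever call sends a prompt to the grading model and returns its text reply.

```python
import re
from typing import Callable, Optional

# Hypothetical rubric: the wording and the 1-5 scale are assumptions,
# not something taken from the evals framework.
GRADER_TEMPLATE = """You are grading a submitted answer against an ideal answer.
[Task]
{task}
[Submission]: {submission}
[Ideal]: {ideal}
Explain briefly, then end with a line of the form "Score: N",
where N is 1 (unrelated to the ideal) through 5 (equivalent to the ideal)."""


def build_grading_prompt(sample: dict, submission: str) -> str:
    """Turn one sample (the "input" / "ideal" shape above) into a grading prompt."""
    # Flatten the chat messages into a plain-text task description.
    task = "\n".join(f"{m['role']}: {m['content']}" for m in sample["input"])
    return GRADER_TEMPLATE.format(
        task=task, submission=submission, ideal=sample["ideal"]
    )


def parse_score(reply: str) -> Optional[int]:
    """Return the last "Score: N" found in the reply, or None if there is none."""
    matches = re.findall(r"Score:\s*([1-5])\b", reply)
    return int(matches[-1]) if matches else None


def grade(sample: dict, submission: str, grader: Callable[[str], str]) -> Optional[int]:
    """Ask the grader (any prompt -> reply callable) to score the submission."""
    return parse_score(grader(build_grading_prompt(sample, submission)))


if __name__ == "__main__":
    sample = {
        "input": [
            {"role": "system", "content": "<input prompt>"},
            {"role": "user", "content": "<user input>"},
        ],
        "ideal": "correct answer is convincing fantastic bs",
    }

    # Stub grader so the sketch runs offline; swap in a real model call here.
    def fake_grader(prompt: str) -> str:
        return "The submission is close in tone but misses details.\nScore: 4"

    print(grade(sample, "<model answer>", fake_grader))
```

Keeping the grader as a plain callable means the prompt building and score parsing can be checked without any model access, and a reply that ignores the requested format comes back as `None` instead of a guessed score.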