Has something changed recently with the way Evaluations work in the Dashboard?
Two weeks ago, when I ran them, I had two choices for evaluation: `item.input` and `item.output`. Now, when I try to set up a new Evaluation and import from Stored Completions, I get two columns, `input` and `output`, as I did before, but in the dropdowns for the possible tests I get this:
I tried running it with `sample.output_text` but got:
Which makes sense…but where is `item.output`?
UPDATE: On further investigation, I now see there is a hover pop-up that explains:
I tried copying the failed Evaluation and running it again, and now it’s working…
A second, freshly created Evaluation seems to be working now as well.