OpenAI Evals API - text similarity grading

Hi

I have a question about using the Evals API.

For the `text_similarity` grader, which parameter in `testing_criteria` decides the passing grade?

Here is my code; all the results show `passing` even when the similarity is below 0.5:

```python
eval_create_result = client.evals.create(
    name="Similarity Check",
    metadata={
        "description": "This eval tests text similarity"
    },
    data_source_config={
        "type": "custom",
        "item_schema": Query.model_json_schema(),  # we will upload Python objects as run data
        "include_sample_schema": True
    },
    testing_criteria=[
        {
            "type": "text_similarity",
            "name": "Compare text similarity",
            "input": "{{ sample.output_text }}",
            "evaluation_metric": "cosine",
            "reference": "{{ item.text }}",
            "passing_grade": 0.8
        }
    ],
)
```
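For reference, this is the pass/fail behavior I expected: a cosine similarity score compared against a threshold. This is only a local sketch with made-up vectors, not the API's actual implementation:

```python
# Sketch of the expected grading logic (hypothetical, not the API's code):
# cosine similarity between two vectors, compared against a threshold.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up embeddings for a sample output and its reference text
sample_vec = [0.1, 0.9, 0.2]
reference_vec = [0.8, 0.1, 0.3]

similarity = cosine_similarity(sample_vec, reference_vec)
passed = similarity >= 0.8  # should fail when similarity is below the threshold
print(similarity, passed)
```

With these vectors the similarity is well below 0.8, so I would expect the item to be graded as failing, not passing.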

You might want to review the parameter names accepted by `TextSimilarityGrader`:

```
{
    "type": "text_similarity",
    "name": string,
    "input": string,
    "reference": string,
    "pass_threshold": number,
    "evaluation_metric": "cosine"
}
```

`pass_threshold` is the parameter the similarity score is compared against to produce the pass/fail boolean. Your code uses `passing_grade`, which the grader does not recognize, so no threshold is applied.
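For anyone hitting the same issue, here is the criterion from the post above rewritten with the correct parameter name (a sketch; the templates and `name` are taken from the original call):

```python
# Corrected testing criterion: `pass_threshold` instead of `passing_grade`.
criterion = {
    "type": "text_similarity",
    "name": "Compare text similarity",
    "input": "{{ sample.output_text }}",
    "evaluation_metric": "cosine",
    "reference": "{{ item.text }}",
    "pass_threshold": 0.8,  # similarity scores below this fail
}

# Then pass it as before:
# eval_create_result = client.evals.create(..., testing_criteria=[criterion])
```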

Oh @@, thanks :slight_smile:

Good to know where the grader's documentation lives.
