Hi, thanks for your answer.
Here is an example of what I am sending to the verification API for a Score model grader, where everything works fine:
Input: (what I am sending to the API)
{ "type": "score_model", "name": "example_score_model_grader", "model": "gpt-4o-mini-2024-07-18", "range": [0.0, 1.0], "sampling_params": { "temperature": 1.0, "top_p": 1.0, "seed": 42 }, "input": [{ "type": "message", "role": "user", "content": { "type": "input_text", "text": "Score how close the reference answer is to the model answer. Score 1.0 if they are the same and 0.0 if they are different. Return just a floating point score\nReference answer: Test\nModel answer: Test" } }] }
Output (answer from the API):
{ "grader": { "input": [{ "content": { "text": "Score how close the reference answer is to the model answer. Score 1.0 if they are the same and 0.0 if they are different. Return just a floating point score\nReference answer: Test\nModel answer: Test", "type": "input_text" }, "role": "user", "type": "message" }], "model": "gpt-4o-mini-2024-07-18", "name": "example_score_model_grader", "range": [0.0, 1.0], "sampling_params": { "seed": 42.0, "temperature": 1.0, "top_p": 1.0 }, "type": "score_model" } }
Now the same for an example Label Model Grader, where it fails:
Input: (what I am sending to the API)
{ "type": "label_model", "name": "example_label_model_grader", "model": "gpt-4-0613", "input": [{ "type": "message", "role": "user", "content": { "type": "input_text", "text": "Classify the sentiment of the following statement as one of positive, neutral, or negative" } }, { "type": "message", "role": "user", "content": { "type": "input_text", "text": "Statement: {{item.reference_answer}}" } }], "labels": ["positive", "neutral", "negative"], "passing_labels": ["positive", "neutral"] }
Output: (answer from the API)
{ "error": { "message": "Unknown grader type: label_model", "type": "invalid_request_error", "param": <null>, "code": "grader_error" } }
Every other grader type works as expected, but the API rejects the Label Model Grader as an unknown grader type.
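To rule out a malformed payload on my side, here is a minimal Python sketch (not the plugin's actual GDScript) that rebuilds the label grader dictionary shown above and runs two local sanity checks before anything is sent. The helper names `build_label_grader` and `check_label_grader` are hypothetical, and the list of required keys is simply copied from my example payload, not taken from an official schema.

```python
import json

# Assumption: these keys are taken from the example payload above,
# not from an official schema definition.
REQUIRED_KEYS = {"type", "name", "model", "input", "labels", "passing_labels"}


def build_label_grader(name, model, prompts, labels, passing_labels):
    """Build a label_model grader dict shaped like the example above."""
    return {
        "type": "label_model",
        "name": name,
        "model": model,
        "input": [
            {
                "type": "message",
                "role": "user",
                "content": {"type": "input_text", "text": text},
            }
            for text in prompts
        ],
        "labels": list(labels),
        "passing_labels": list(passing_labels),
    }


def check_label_grader(grader):
    """Return a list of locally detectable problems (empty if none found)."""
    problems = []
    missing = REQUIRED_KEYS - grader.keys()
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    # Every passing label should also appear in the full label list.
    extra = set(grader.get("passing_labels", [])) - set(grader.get("labels", []))
    if extra:
        problems.append("passing_labels not in labels: " + ", ".join(sorted(extra)))
    return problems


grader = build_label_grader(
    name="example_label_model_grader",
    model="gpt-4-0613",
    prompts=[
        "Classify the sentiment of the following statement as one of "
        "positive, neutral, or negative",
        "Statement: {{item.reference_answer}}",
    ],
    labels=["positive", "neutral", "negative"],
    passing_labels=["positive", "neutral"],
)

print(json.dumps(grader, indent=2))
print(check_label_grader(grader))
```

These checks only cover the shape of the dictionary. They cannot explain why the server reports the type itself as unknown, but they do suggest the payload is internally consistent.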
You asked: "Would you mind sharing a bit more of your code?"
I don’t really think my code can help with the solution to the problem, but to be safe:
The grader verification is done by this function in my own fork of the Godot-OpenAI plugin.
The code that builds the grader JSON dictionary for the Label Model Grader specifically can be found in these lines in the specific application.