Output format from Tool randomly changing in C#

I’m trying to create questions and multiple-choice answers from text content using the OpenAI API in C#. I’m using a Tool to format the output, but it keeps changing the JSON structure for the questions and answers from something like:

  "questions": [
    [
      "How does expressing gratitude and appreciation impact your well-being and life according to the content?",
      "It makes you feel more energetic and fuller of life",
      "It makes you more anxious",
      "It has no effect on your well-being",
      "It makes you feel tired and drained",
      "It makes you feel more energetic and fuller of life. Being grateful and expressing gratitude leads to better self-care and a sense of well-being."
    ]
  ]

To something like this:

 "questions": [
  {
   "question": "What is emphasized as a crucial factor in achieving elite athletic status according to the text?",
   "answers": [
    "Physical strength and agility",
    "Envy and jealousy",
    "Development of personal traits",
    "Comparing oneself to others"
   ]
  }
 ]

The tool definition is unchanged between runs and I'm using "QuestionGeneration" in my tool-choice:

                var tool = new List<Tool>
                {
                new Function(
                    "QuestionGeneration",
                    "Generate a question from text",
                     new JsonObject
                     {
                         ["type"] = "object",
                         ["properties"] = new JsonObject
                         {
                             ["questions"] = new JsonObject
                             {
                                 ["type"] = "array",
                                 ["description"] = "An array of each individual question",
                                 ["items"] = new JsonObject
                                 {
                                     ["question"] = new JsonObject
                                     {
                                         ["type"] = "string",
                                         ["description"] = "The question derived from the provided text"
                                     },
                                     ["answers"] = new JsonObject
                                     {
                                         ["type"] = "array",
                                         ["description"] = "An array of possible answers that could answer the question",
                                         ["items"] = new JsonObject
                                         {
                                             ["name"] = "answers",
                                             ["type"] = "string",
                                             ["description"] = "One of the possible answers to the question."
                                         }
                                     },
                                     ["correctAnswer"] = new JsonObject
                                     {
                                         ["type"] = "number",
                                         ["description"] = "the index number identifying which of the items in the answers array is correct"
                                     },
                                     ["question_type_int"] = new JsonObject
                                     {
                                         ["type"] = "string",
                                         ["description"] = "The type of question that was created.",
                                         ["enum"] = new JsonArray { "multiple choice", "true or false", "yes or no", "fill in the blank", "multiple selection", "numeric" }
                                     },
                                     ["feedback"] = new JsonObject
                                     {
                                         ["type"] = "string",
                                         ["description"] = "a statement telling why the answer was correct"
                                     },
                                     ["competency"] = new JsonObject
                                     {
                                         ["type"] = "string",
                                         ["description"] = "the category or summary of the type of question that was generated"
                                     }
                                 }
                             }
                         },
                         ["required"] = new JsonArray { "questions", "question", "answers", "correctAnswer", "feedback" }
                     })
                };

So, what is wrong with my tool definition and how can I fix it so it returns a consistent format?

Here’s what it looks like if you don’t make it so complicated and just construct a JSON string, imagining how your ‘Function’ might work. (The AI had to do the same imagining: it took several attempts, and I finally had to show it what a function specification actually looks like for inference.)

toolspec.extend([{
        "type": "function",
        "function": {
            "name": "QuestionGeneration",
            "description": "Generate a question from text",
            "parameters": {
                "type": "object",
                "properties": {
                    "questions": {
                        "type": "array",
                        "description": "An array of each individual question",
                        "items": {
                            "question": {
                                "type": "string",
                                "description": "The question derived from the provided text"
                            },
                            "answers": {
                                "type": "array",
                                "description": "An array of possible answers that could answer the question",
                                "items": {
                                    "name": "answers",
                                    "type": "string",
                                    "description": "One of the possible answers to the question."
                                }
                            },
                            "correctAnswer": {
                                "type": "number",
                                "description": "the index number identifying which of the items in the answers array is correct"
                            },
                            "question_type_int": {
                                "type": "string",
                                "description": "The type of question that was created.",
                                "enum": ["multiple choice", "true or false", "yes or no", "fill in the blank", "multiple selection", "numeric"]
                            },
                            "feedback": {
                                "type": "string",
                                "description": "a statement telling why the answer was correct"
                            },
                            "competency": {
                                "type": "string",
                                "description": "the category or summary of the type of question that was generated"
                            }
                        }
                    }
                },
                "required": ["questions", "question", "answers", "correctAnswer", "feedback"]
            }
        }
    }]
)

I think you have a misunderstanding of what an array is. It is not a container for more properties. It just means the AI can make a list of things. This is the text the AI might be receiving as a tool specification:

# Tools

## functions

namespace functions {

// Generate a question from text
type QuestionGeneration = (_: {
// An array of each individual question
questions: any[],
}) => any;

} // namespace functions

## multi_tool_use

// This tool serves as a wrapper for utilizing multiple tools....(blabla)

The “object” JSON data type is what provides nesting. You should also make the property names descriptive and not repeat them. You cannot have an arbitrary-length tool call that simply grows: within the schema, you must still tell the AI how to produce long strings or single-level lists that grow in length.
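To make the point about nesting concrete, here is a minimal sketch of a checker that walks a parameters schema and reports the two mistakes discussed above: an array whose `items` has no `"type"`, and `required` names that are not properties at the same level. The function name `lint_schema` and the abbreviated `broken` schema are made up for illustration; this is not part of any OpenAI library and it only covers a small slice of JSON Schema.

```python
def lint_schema(schema, path="parameters"):
    """Return a list of problems found in a JSON-schema-like dict (hypothetical helper)."""
    problems = []
    kind = schema.get("type")
    if kind is None:
        problems.append(f"{path}: missing 'type'")
    if kind == "object":
        props = schema.get("properties", {})
        # Every required name must be a property of this same object.
        for name in schema.get("required", []):
            if name not in props:
                problems.append(f"{path}: required name '{name}' is not in 'properties' at this level")
        for name, sub in props.items():
            problems.extend(lint_schema(sub, f"{path}.{name}"))
    elif kind == "array":
        items = schema.get("items")
        if not isinstance(items, dict):
            problems.append(f"{path}: array has no 'items' schema")
        else:
            problems.extend(lint_schema(items, f"{path}[]"))
    return problems


# Abbreviated version of the schema from the question: property names sit
# directly inside "items", and "required" lists names from the wrong level.
broken = {
    "type": "object",
    "properties": {
        "questions": {
            "type": "array",
            "items": {
                "question": {"type": "string"},
                "answers": {"type": "array", "items": {"type": "string"}},
            },
        }
    },
    "required": ["questions", "question", "answers"],
}

for problem in lint_schema(broken):
    print(problem)
```

Wrapping the item fields in `{"type": "object", "properties": {...}, "required": [...]}` and leaving only `"questions"` in the top-level `required` makes the checker return an empty list.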

I’m sure I have a misunderstanding about a lot of this, but I’m confused by your example. Everywhere that references the Tool definition (previously Function, until that was deprecated; it worked perfectly well for my needs), it looks more like this:

{
  "name": "MyFunction",
  "description": "This is a sample function",
  "parameters": [
    {
      "name": "array",
      "type": "array",
      "description": "Two-dimensional array",
      "items": {
        "type": "array",
        "items": {
          "type": "number"
        }
      }
    }
  ],
  "output": {
    "type": "array",
    "items": {
      "type": "array",
      "items": {
        "type": "number"
      }
    }
  }
}

Obviously, this is for a two-dimensional array but they all have a similar format. So, I’m really confused by your example now.

Hey there!

So, if you squint, you’ll actually recognize that this:

is actually embedded inside the example code given to you.

This is what is called a JSON schema. It’s a format that you can use across all different kinds of things, like a set standard for what contents should be expected inside a JSON object.

You can see _j’s example more closely resembled what you’ve given by looking here:

What I provided wasn’t a “correct” example, it was attempting to produce a JSON like your code might.

Here is a 2D array, in that the second level has more than one entry, but they are named and not also of arbitrary length:

toolspec.extend([{
        "type": "function",
        "function": {
            "name": "produce_quiz",
            "description": "The AI generates a multiple choice quiz as a JSON array (list) of questions and answers",
            "parameters": {
                "type": "object",
                "properties": {
                    "quiz_items": {
                        "type": "array",
                        "description": "an array of questions, answers, and answer key in the strict format specified",
                        "items": {
                            "type": "object",
                            "properties": {
                                "question": {
                                    "type": "string",
                                    "description": "The question the quiz item is about",
                                },
                                "answers": {
                                    "type": "string",
                                    "description": "three possible answers, one randomly correct",
                                },
                                "key": {
                                    "type": "string",
                                    "description": "The correct answer",
                                },
                            },
                            "required": ["question", "answers", "key"]
                        },
                    },
                },
                "required": ["quiz_items"]
            },
        }
    }]
)

This is how it appears once it is placed into the AI’s language context, where the AI reads it to understand what kind of JSON output to write. Note that it is intact, not damaged as yours was:

## functions

namespace functions {

// The AI generates a multiple choice quiz as a JSON array (list) of questions and answers
type produce_quiz = (_: {
// an array of questions, answers, and answer key in the strict format specified
quiz_items: Array<
{
// The question the quiz item is about
question: string,
// three possible answers, one randomly correct
answers: string,
// The correct answer
key: string,
}
>,
}) => any;

} // namespace functions
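For intuition only, here is a rough sketch of how a schema might be turned into that kind of TypeScript-style text. The real conversion happens on OpenAI's side and is not public as far as I know, so `render_type` below is a made-up approximation, not their code. It just shows why `items` without `"type": "object"` collapses to `any[]`, while a proper object schema keeps its named fields.

```python
def render_type(schema):
    """Render a JSON-schema-like dict as a TypeScript-style type string (illustrative only)."""
    kind = schema.get("type")
    if kind == "string":
        return "string"
    if kind in ("number", "integer"):
        return "number"
    if kind == "boolean":
        return "boolean"
    if kind == "array":
        inner = render_type(schema.get("items", {}))
        # An item schema that cannot be understood leaves nothing to describe.
        return "any[]" if inner == "any" else f"Array<{inner}>"
    if kind == "object" and "properties" in schema:
        lines = ["{"]
        for name, sub in schema["properties"].items():
            if "description" in sub:
                lines.append(f"// {sub['description']}")
            lines.append(f"{name}: {render_type(sub)},")
        lines.append("}")
        return "\n".join(lines)
    # Anything without a recognizable "type" is treated as unknown.
    return "any"


# Item schema shaped like the one in the question: no "type", no "properties".
broken_items = {"question": {"type": "string"}}
# Item schema shaped like the produce_quiz example.
fixed_items = {"type": "object", "properties": {"question": {"type": "string"}}}

print(render_type({"type": "array", "items": broken_items}))
print(render_type({"type": "array", "items": fixed_items}))
```

The first call yields `any[]`, matching the damaged rendering of the original tool; the second keeps the `question: string` field inside `Array<...>`.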

And that’s the crux: how well the AI can understand the specification, especially since the language model predicts one token at a time from the entire input context.

Add another array (such as a list of multiple-choice answers whose length the AI can choose) at your own peril.
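Because the schema steers the output rather than strictly guaranteeing it, it can be worth checking the shape of the returned arguments before using them. Below is a minimal sketch, assuming the tool call's arguments arrive as a JSON string shaped like the `produce_quiz` schema above. `parse_quiz_arguments` and `REQUIRED_KEYS` are hypothetical names for this example.

```python
import json

# Keys every quiz item must carry, taken from the produce_quiz schema above.
REQUIRED_KEYS = {"question", "answers", "key"}


def parse_quiz_arguments(arguments_json):
    """Parse tool-call arguments and raise ValueError if the shape has drifted."""
    data = json.loads(arguments_json)
    if not isinstance(data, dict):
        raise ValueError("arguments must be a JSON object")
    items = data.get("quiz_items")
    if not isinstance(items, list):
        raise ValueError("'quiz_items' must be a list")
    for i, item in enumerate(items):
        # Catches the list-of-strings shape shown in the original question.
        if not isinstance(item, dict):
            raise ValueError(f"quiz_items[{i}] is not an object")
        missing = REQUIRED_KEYS - item.keys()
        if missing:
            raise ValueError(f"quiz_items[{i}] is missing {sorted(missing)}")
    return items


good = '{"quiz_items": [{"question": "Q?", "answers": "A, B, C", "key": "A"}]}'
print(len(parse_quiz_arguments(good)))
```

On a `ValueError` you could retry the request or log the raw arguments, whichever suits your pipeline.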

Thanks guys! I definitely need some time to soak all this in. Lots-o-meetings today but I’ll be on it when I can.

OK, that works pretty well. I still need to massage it a bit here and there, but it’s much better. Here’s what I ended up using:

string parm = @"
{
    ""type"": ""object"",
    ""properties"": {
        ""questions"": {
            ""type"": ""array"",
            ""description"": ""an array of questions, answers, feedback, and answer index in the strict format specified"",
            ""items"": {
                ""type"": ""object"",
                ""properties"": {
                    ""question"": {
                        ""type"": ""string"",
                        ""description"": ""The question the quiz item is about""
                    },
                    ""answers"": {
                        ""type"": ""string"",
                        ""description"": ""${qtext}""
                    },
                    ""correctAnswer"": {
                        ""type"": ""string"",
                        ""description"": ""The sequence number identifying which of the items in the answers is correct, start numbering with 0.""
                    },
                    ""feedback"": {
                        ""type"": ""string"",
                        ""description"": ""A description of why the correct answer is in fact correct.""
                    },
                    ""Competency"": {
                        ""type"": ""string"",
                        ""description"": ""The grouping or category that this question and its answers belong to""
                    }
                },
                ""required"": [""question"", ""answers"", ""correctAnswer"", ""feedback""]
            }
        }
    },
    ""required"": [""questions""]
}
";

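// Insert the per-question-type instructions. qtext is pasted in verbatim,
// so it must not contain unescaped double quotes or backslashes.
parm = parm.Replace("${qtext}", qtext);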

var tool = new List<Tool>
{
    new OpenAI.Function("QuestionGeneration","The AI generates a multiple choice quiz as a JSON array (list) of questions and answers", parm)
};

The qtext variable holds text that depends on the type of question being asked for: multiple choice, true/false, multiple selection, etc.
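One thing to watch with the string-replace approach: if the substituted text ever contains a double quote or a backslash, the resulting JSON is no longer valid. A sketch of the alternative, in Python for brevity, is to build the schema as a data structure and let the serializer handle the escaping. The same idea should carry over to C# by building the object and serializing it. `build_parameters` is a hypothetical helper, and the schema is abbreviated to three fields.

```python
import json


def build_parameters(qtext):
    """Build the tool's parameters schema as a JSON string, with qtext safely escaped."""
    schema = {
        "type": "object",
        "properties": {
            "questions": {
                "type": "array",
                "description": "an array of questions, answers, feedback, "
                               "and answer index in the strict format specified",
                "items": {
                    "type": "object",
                    "properties": {
                        "question": {
                            "type": "string",
                            "description": "The question the quiz item is about",
                        },
                        # The per-question-type instructions go here.
                        "answers": {"type": "string", "description": qtext},
                        "correctAnswer": {
                            "type": "string",
                            "description": "The sequence number identifying which of the "
                                           "items in the answers is correct, start numbering with 0.",
                        },
                    },
                    "required": ["question", "answers", "correctAnswer"],
                },
            }
        },
        "required": ["questions"],
    }
    return json.dumps(schema, indent=4)


# Quotes inside qtext survive the round trip because json.dumps escapes them.
parameters = build_parameters('Four answers separated by ";", one of them correct')
print(parameters)
```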

Appreciate the help.
