Difference between Structured Outputs and function calling's "required"

Hello,

When using function calling, I can specify which parameters must be present in the function call by listing them under "required", so the incoming JSON data comes in the schema I want.

In the Structured Outputs update, I did not understand what "strict" contributes beyond that.
Does anyone have information about this subject?

When the model sees a parameter that has a default (i.e. is not required), it can choose to be lazy and not fill it in. It can also make your instructions more susceptible to jailbreaking. Imagine you're extracting structured data from unstructured text, and the target text is a legal doc that specifies a contractual relationship between two parties. If you had a parameter defaulted to party1="unknown", the model may decide to be lazy and not fill in party1 even though it's clearly stated in the target text. To get around this you should specify a union to force the model to fill in [something], as sketched below. Even if it is a null value, the model must address the parameter in a non-lazy way.
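A minimal sketch of that union idea, written as a JSON-schema fragment in a Python dict (party1 comes from the example above; everything else here is illustrative):

# A minimal sketch, assuming a JSON-schema-style property definition:
# instead of a default like party1="unknown", make the property a
# string-or-null union and keep it in "required", so the model has to
# address it explicitly even when the text gives nothing.
party_schema = {
    "type": "object",
    "properties": {
        "party1": {
            "type": ["string", "null"],
            "description": "First contracting party named in the document, or null if truly absent.",
        },
    },
    "required": ["party1"],
    "additionalProperties": False,
}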

The only thing strict really does is return an error if you didn't also include every property in the required list. It is a check for you, I guess. It is conceivable that there is JSON-schema understanding enforced on the model's generation so that it can only emit the properties in order, which can only happen if it's going to produce every single one without thought.
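For illustration (the property names here are made up), this is the kind of schema the API rejects when strict is true, because a listed property is missing from required:

# Illustrative only: with "strict": true, every key under "properties"
# must also appear in "required"; leaving one out gets the request
# rejected with a schema validation error instead of being accepted.
bad_schema = {
    "type": "object",
    "properties": {
        "fruits": {"type": "string"},
        "vegetables": {"type": "string"},
    },
    "required": ["fruits"],  # "vegetables" is missing, so strict mode errors out
    "additionalProperties": False,
}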

Using agents to extract single values would be a lot more reliable.

e.g.

The following text has a couple of fruits and vegetables. 
Extract all fruits and vegetables. 
Don't be lazy you piece of #§%$ !!!!

Give me a structured json output like this 

{'fruits':[...], 'veggies':[...]}

Start the output with { and end with }

…is less reliable than two requests that just ask for veggies and fruits separately.

Also a chain of

“Do we have any fruits in here?” → true → next request “list all fruits in this text”

is more reliable than a single call like

extract all fruits from this text - if any. If no fruit then write "no fruit" instead...
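A rough sketch of that chained approach, assuming the openai Python client; the model name and prompts are just illustrative:

# A rough sketch of the two-step chain, assuming the openai Python client.
from openai import OpenAI

client = OpenAI()
text = "..."  # the source document to search

def ask(question: str) -> str:
    # one small, single-purpose request per question
    response = client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You answer questions about the provided text."},
            {"role": "user", "content": f"{question}\n\nText:\n{text}"},
        ],
    )
    return response.choices[0].message.content

# Step 1: a cheap yes/no gate
if ask("Do we have any fruits in here? Answer only true or false.").strip().lower().startswith("true"):
    # Step 2: only list the fruits when the gate says they exist
    fruits = ask("List all fruits in this text.")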

The structured output JSON schema response format is quite reliable, at least for enforcing the output format. The newest GPT-4o is required.

The AI can still put whatever it wants into the data values, but the data objects must be those the schema allows. A description field is also available in the schema.

First, make that schema:

Sure, here’s the JSON schema for a structured output that requires two strings, “fruits” and “vegetables”:

{
  "name": "food_extraction",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "fruits": {
        "type": "string",
        "description": "The AI must extract all 'fruit' entity types from the provided text."
      },
      "vegetables": {
        "type": "string",
        "description": "The AI must extract all 'vegetable' entity types from the provided text."
      }
    },
    "required": ["fruits", "vegetables"],
    "additionalProperties": false
  }
}

This schema defines a structured response with two required properties, “fruits” and “vegetables”, both of which are strings. The descriptions for each property instruct the AI to extract all matching entity types from a provided text. The “strict” attribute is set to true, meaning all properties are required and must be included in the response.

Then run it on an AI-produced fruity story.
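A sketch of what that run looks like with the openai Python client, passing the schema above via response_format (the story text and system message are placeholders for what is described below):

# A sketch of sending the schema above via response_format,
# assuming the openai Python client and the gpt-4o-2024-08-06 model.
import json
from openai import OpenAI

client = OpenAI()

fruity_story = "..."  # the AI-produced fruity story used as input

food_schema = {
    "name": "food_extraction",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "fruits": {
                "type": "string",
                "description": "The AI must extract all 'fruit' entity types from the provided text.",
            },
            "vegetables": {
                "type": "string",
                "description": "The AI must extract all 'vegetable' entity types from the provided text.",
            },
        },
        "required": ["fruits", "vegetables"],
        "additionalProperties": False,
    },
}

response = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are an entity extractor."},
        {"role": "user", "content": fruity_story},
    ],
    response_format={"type": "json_schema", "json_schema": food_schema},
)

# the message content is constrained to parse as the schema's object
data = json.loads(response.choices[0].message.content)
print(data["fruits"], data["vegetables"])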

With just this run’s “you are an entity extractor” system message and “input document” user message, the AI would otherwise go nuts extracting the farmer and other nouns. The format and descriptions in the forced output JSON schema do the work.

The bigger the text and the more fruits, the more likely it was to leave some fruits out… this might have changed with the most recent GPT-4o.

Yep, that’s just context attention quality: the ability of the attention masking to seemingly read through the document while keeping the instructions exposed, predicting each fruit word and predicting when there are no fruit words remaining and thus when to emit the closing ".

If you want to pay a lot, and still chunk to typical output size, a completely different technique:

“Repeat this back to me without any changes to the prose. When you encounter a fruit in your response, add @@@ after the word or phrase describing a fruit. When you encounter a vegetable in your response, add !!! after the word or phrase describing a vegetable.”
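If you go that route, a small sketch of reading the markers back out of the annotated repetition (the @@@ and !!! markers come from the prompt above; the sample sentence and regex are just illustrative):

# Illustrative post-processing of the annotated repetition: pull out the
# word right before each marker (multi-word phrases would need a richer pattern).
import re

annotated = "The farmer packed an apple@@@, two ripe pears@@@ and a carrot!!! into his cart."

fruits = re.findall(r"(\w+)\s*@@@", annotated)
veggies = re.findall(r"(\w+)\s*!!!", annotated)

print(fruits)   # ['apple', 'pears']
print(veggies)  # ['carrot']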

I understand: we simply prevent it from being lazy in the relevant fields.

To give an example:

I provided property names and ID values in a document.

Then I told it that I wanted a property. Sometimes it gave me the list of properties I had and asked me to choose, and sometimes it ran the function and filled in the property ID value randomly :slight_smile:

Then giving it a codebase and telling it “repeat this back to me without any changes. When you encounter a routing file, add a new route…”

should work too.

You might be interested in how the API enforcement of outputs like structured output is done in different ways on this model, reducing the usage surface for developers, reducing the ability to reuse OpenAI’s model training, and worsening the results.

Replicate a tool on gpt-4o:

Switch to gpt-4o-2024-08-06, though, and the AI cannot write what it wants:

Overview of generation:

  • A tool invocation token is emitted because of the same desire to use the memory tool;
  • The tool recipient mode is activated;
  • Then, along with the recipient address format the AI produces, the AI sends to the functions tool (or tools that you would have assistants enable);
  • The AI writes the next thing it can, which is only from the functions present in the API request;
  • The result: non-function, undesirable output.