Introducing Structured Outputs

That’s actually funny, I was about to say it’s copying my structured output framework, but it’s actually inferior and costlier.

So all good, go you!

1 Like

Great news!
Is structured output also available in the Assistants API?

1 Like

Is structured output available in batch mode?

1 Like

Yes, it is. The documentation says:

Yes, it is. Take a look at:

https://community.openai.com/t/introducing-structured-outputs/896022/25?u=sashirestela

2 Likes

That’s not a useful reply. If the latency is 10 seconds to 1 minute, there isn’t really a way to design for that. What we need to understand is how the caching is performed, so we know how often the penalty is incurred. I’ve asked in several places for more detail, but so far no response.

4 Likes

Thanks! I was trying to figure that out and for some reason could not find the answer anywhere… or didn’t see it, at least.

In the “Introducing Structured Outputs” documentation, you write, “To do this, we convert the supplied JSON Schema into a context-free grammar (CFG).” Are there any plans to support user-defined CFGs, instead of just JSON Schema? AFAIK llama.cpp has support for this. It would be of great help for producing code or code-like outputs in domain-specific languages. A game changer, in my opinion.
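
For readers who haven’t seen it, a user-defined CFG in llama.cpp is written as a GBNF grammar. Below is a minimal, hypothetical illustration (the toy command language and rule names are invented here, and the exact GBNF syntax should be checked against the llama.cpp grammars documentation), embedded as a Python string since the rest of this thread’s code is Python:

# Hypothetical sketch: a tiny user-defined grammar in llama.cpp's GBNF format,
# constraining output to a toy "move" command language. Not from the original post.
toy_grammar = r"""
root      ::= command
command   ::= "move " direction " " steps
direction ::= "north" | "south" | "east" | "west"
steps     ::= [0-9]+
"""
print(toy_grammar)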

2 Likes

Fair comment, apologies if I was glib.

It definitely complicates the client solution.

You might need to delegate to a retriable back-end job to wait for the response and retry if it times out.

I wonder if a dummy “cache warming” request would help? It’s reasonable to ask for more details about how often that would need to run to provide an optimal runtime experience.
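
A rough sketch of that idea, assuming the one-time latency is tied to the first request that uses a given schema (as the announcement suggests). The schema, model choice, and warm_schema_cache name below are placeholders; substitute whatever your application actually sends:

from openai import OpenAI

client = OpenAI()

# Hypothetical schema; replace with the schema your application actually uses.
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "order_summary",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"summary": {"type": "string"}},
            "required": ["summary"],
            "additionalProperties": False,
        },
    },
}

def warm_schema_cache() -> None:
    """Send a tiny throwaway request so the one-time schema processing cost
    is paid at startup rather than on the first real user request."""
    client.chat.completions.create(
        model="gpt-4o-2024-08-06",
        messages=[{"role": "user", "content": "Reply with any valid JSON."}],
        response_format=response_format,
        max_tokens=30,
    )

warm_schema_cache()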

4 Likes

I have not tried it yet, but they also say, “We will make pricing for this feature available soon.” I am also not able to deploy GPT-4o-2024-08-06, although I can use it through the early-access playground. There seems to be a somewhat disappointing habit of announcing things without an availability date, so I am afraid it might not be there yet. I will try it soon through the API and report back.

It took a year to do it properly… bravo to the passionate and brilliant OpenAI engineers.

2 Likes

Updated a little:

import json
from openai import OpenAI
import tiktoken

def make_parameters(frogs: list[str]) -> dict:
    # Build the full Chat Completions request: a system/user prompt plus a strict
    # JSON Schema with one object property per frog name.
    assert len(frogs) <= 25, "OpenAI can only handle 25 frogs at once with this schema"
    messages = [
        {"role": "system", "content": "You are a biologist specializing in aquatics who has been tasked with describing frogs."},
        {"role": "user", "content": "For each frog, please describe its color, temperament, and taste in two words or less. Reply in JSON."},
    ]
    # Strict mode requires every property to appear in "required" and
    # "additionalProperties" to be False at every level of the schema.
    json_schema = {
        "type": "object",
        "properties": {
            v: {
                "type": "object",
                "properties": {
                    "color": {"type": "string"},
                    "temperament": {"type": "string"},
                    "taste": {"type": "string"},
                },
                "required": ["color","temperament","taste"],
                "additionalProperties": False,
            }
            for v in frogs
        },
        "required": frogs,
        "additionalProperties": False,
    }
    
    return {
        "messages": messages,
        "model": "gpt-4o-mini",
        "temperature": 0.1,
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "summarization",
                "strict": True,
                "schema": json_schema,
            }
        }
    }

client = OpenAI()
parameters = make_parameters(["green tree frog", "poison arrow frog", "peeper", "bullfrog"])

# Count the output tokens locally and check against the usage the API reports.
encoder = tiktoken.encoding_for_model(parameters["model"])
answer = client.chat.completions.create(**parameters)

message = answer.choices[0].message.content
local_output_tokens = len(encoder.encode(message))

print("encoder is valid?", local_output_tokens == answer.usage.completion_tokens)
print("with spaces:", local_output_tokens)
# Re-encode the JSON minified (no whitespace) to see whether whitespace in the
# model's output costs any extra tokens.
print("without whitespace:", len(encoder.encode(json.dumps(json.loads(message), separators=(",", ":")))))

# Print the actual result
print("Result from API:")
print(json.dumps(json.loads(message), indent=2))  # Pretty print the JSON response

Results:

2024-08-10 01:04:24,629 - INFO - _client.py - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
encoder is valid? True
with spaces: 90
without whitespace: 90
Result from API:
{
  "green tree frog": {
    "color": "Bright green",
    "temperament": "Calm",
    "taste": "Mildly sweet"
  },
  "poison arrow frog": {
    "color": "Vibrant blue",
    "temperament": "Aggressive",
    "taste": "Toxic"
  },
  "peeper": {
    "color": "Brownish gray",
    "temperament": "Vocal",
    "taste": "Unremarkable"
  },
  "bullfrog": {
    "color": "Olive green",
    "temperament": "Loud",
    "taste": "Savory"
  }
}

Thank you! Keep up the great work! 👍

Yay! I’m excited to use this. It solves several immediate challenges.

I love hearing about all of these quiet, consistent product releases. They’re so understated. Everyone is too wrapped up in OpenAI’s leadership or the Voice model’s unexpected strangeness to notice how important this release is.

Did y’all see that Strict Structured Output achieves 100% in evals? That’s so cool. I also thought the “how” of achieving it was super clever. This sure does make it more usable.

As for the latency—it only happens the first time you load the Schema, so why not load your schema manually before automating scripts? (I’m not an experienced programmer. I have no idea.)

1 Like

This works so great! A huge advancement for us. Thanks, OpenAI!

Maybe I haven’t understood it well, but I don’t see a big problem with it. Does it worry you that this initial delay of 10 seconds could happen in a production environment?

Because I think that’s almost impossible, since they say it only happens when the defined JSON schema changes. And I imagine you would immediately test the new workflow in a test environment as soon as a parameter this important changes, right? I mean, by the time you deploy it to production, the change will already have been “processed.”

Or have I misunderstood it?

Loved this new feature! Made my life easier when using a database.

Question: When using Structured Outputs, how do I customize __repr__ or __str__, or how do I print the text nicely with indentation?

When I used pprint, it didn’t print the entire structure nicely the way a JSON file would be printed.

Example:

desired:

{"steps": [
    {
      "explanation": "Start with the equation 8x + 7 = -23.",
      "output": "8x + 7 = -23"
    },
    {
      "explanation": "Subtract 7 from both sides to isolate the term with the variable.",
      "output": "8x = -23 - 7"
    },
    {
      "explanation": "Simplify the right side of the equation.",
      "output": "8x = -30"
    },
    {
      "explanation": "Divide both sides by 8 to solve for x.",
      "output": "x = -30 / 8"
    },
    {
      "explanation": "Simplify the fraction.",
      "output": "x = -15 / 4"
    }
  ],
  "final_answer": "x = -15 / 4"
}

But what is currently printed is:

"MathReasoning(steps=Step(explanation="Start with the equation 8x + 7 = -23.",output="8x + 7 = -23"), Step(explanation="Subtract 7 from both sides to isolate the term with the variable.", output="8x = -23 - 7"), Step(explanation="Simplify the right side of the equation.", output="8x = -30", Step(explanation="Divide both sides by 8 to solve for x.", output="x = -30 / 8", Step(explanation="Simplify the fraction.", output="x = -15 / 4"), final_answer="x = -15 / 4")
1 Like

@touchofred To make it print nicely, you could use the rich library, specifically rich.print().

I’m using the same syntax as the examples found in OpenAI’s Python repo, and it’s working pretty well for me.

It’s not letting me post actual links (probably because this is my first post), but go to OpenAI’s official Python repo and look at examples/parsing.py.

Here is the code from that example:

from typing import List

import rich
from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: List[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    rich.print(message.parsed.steps)

    print("answer: ", message.parsed.final_answer)
else:
    print(message.refusal)

It gets you output that looks like this (in the terminal it is colored):

[
    Step(
        explanation='Start by isolating the variable term (8x) on one side of the equation. To do this, subtract 31 from both sides of the equation.',
        output='8x + 31 - 31 = 2 - 31'
    ),
    Step(explanation='After subtracting 31 from both sides, you have 8x = -29.', output='8x = -29'),
    Step(
        explanation='Next, solve for x by dividing both sides by 8. This isolates x by itself on one side of the equation.',
        output='x = -29 / 8'
    )
]
answer:  x = -29/8
2 Likes

Thanks! By the way, how do you convert this response.message.parsed into a JSON file?
Should I specify a JSON format in advance within the client completions call?
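
Not an official answer, but one possible approach, assuming completion was created with client.beta.chat.completions.parse(...) as in the example above, so that .parsed is a Pydantic v2 model instance:

import json

# Hypothetical sketch: serialize the parsed Pydantic model back to JSON.
# Assumes `completion` comes from the parse() call shown in the previous post.
parsed = completion.choices[0].message.parsed

# Pretty-printed JSON string.
print(parsed.model_dump_json(indent=2))

# Or write it to a file as plain JSON.
with open("math_response.json", "w") as f:
    json.dump(parsed.model_dump(), f, indent=2)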

Has anyone worked with finetuning gpt-4o mini for a specific structured outputs use case yet? If so, do you need to include the response_format in the user messages in your finetune .jsonl?

Specifically wondering because the ‘description’ fields seem critical to model responses, but I’m not sure if the finetuning API handles the response_format.

The finetuning API typically focuses on learning patterns from the examples provided in the training data. The ‘response_format’ you might use during inference is more about instructing the model on how to format its output during deployment rather than during training.

In summary: You don’t need to include ‘response_format’ in the user messages of your finetuning ‘.jsonl’ file. However, your training data should be representative of the structure you want the model to learn, including how it handles critical fields like description.
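
To illustrate that last point, here is a hedged sketch (the frog example, file name, and field values are invented here; this is not an official template) of one training example in the chat fine-tuning .jsonl format. The assistant turn simply demonstrates the target structure, and no response_format key appears anywhere:

import json

# One training example in the chat fine-tuning .jsonl format. The assistant turn
# demonstrates the structure the model should learn to produce.
example = {
    "messages": [
        {"role": "system", "content": "You are a biologist who describes frogs as JSON."},
        {"role": "user", "content": "Describe the bullfrog."},
        {
            "role": "assistant",
            "content": json.dumps(
                {"color": "olive green", "temperament": "loud", "taste": "savory"}
            ),
        },
    ]
}

# Each example is one JSON object per line in the training file.
with open("train.jsonl", "a") as f:
    f.write(json.dumps(example) + "\n")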