Issue with Structured Outputs Returning Invalid JSON Object

Hello everyone,

I’m encountering an issue with the Structured Outputs feature when using the OpenAI API. The problem is that the API is returning an object that is not valid JSON. Here is the object I’m receiving:

MathReasoning(steps=[Step(explanation='We are given the equation 8x + 7 = -23. The goal is to solve for x, meaning we want to get x by itself on one side of the equation.', output='Equation: 8x + 7 = -23'), Step(explanation='To start solving for x, we should isolate the term with x, which is 8x. We do this by eliminating the constant on the left side of the equation by subtracting 7 from both sides.', output='Subtract 7 from both sides: 8x + 7 - 7 = -23 - 7'), Step(explanation='Subtracting 7 from both sides simplifies the equation to 8x = -30, because 7 - 7 is 0 and -23 - 7 is -30.', output='Simplified equation: 8x = -30'), Step(explanation='To solve for x, we need to divide both sides of the equation by 8, the coefficient of x, to get x by itself.', output='Divide both sides by 8: 8x/8 = -30/8'), Step(explanation='Dividing both sides by 8 simplifies the equation to x = -30/8. We can further simplify this by dividing the numerator and the denominator by their greatest common divisor, which is 2.', output='Simplified: x = -15/4'), Step(explanation="We've simplified the fraction -30/8 to -15/4 by dividing both the numerator and the denominator by 2.", output='Final simplified form: x = -15/4')], final_answer='x = -15/4')

I’ve followed the code example provided in the official documentation. Here is the relevant code snippet:

from openai import OpenAI
from pydantic import BaseModel
import os
import json
import requests

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

def call_ai_api():
    print('!!! call_ai_api !!!')

    client = OpenAI()

    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
            {"role": "user", "content": "how can I solve 8x + 7 = -23"}
        ],
        response_format=MathReasoning,
    )

    # .parsed is a MathReasoning instance (a Pydantic model), not a JSON string
    math_reasoning = completion.choices[0].message.parsed

    # Return the whole MathReasoning object;
    # use math_reasoning.final_answer to get only the final answer
    ai_message = math_reasoning
    return ai_message

def main():
    response = call_ai_api()
    print("\nAI Response:")
    print(response)
    print("\n" + "-"*50 + "\n")

if __name__ == "__main__":
    main()


The MathReasoning class is defined using Pydantic’s BaseModel to structure the response, and I’m using the OpenAI client to call the API.

I’m not sure why the API is returning this object in an invalid JSON format, and I’d appreciate any guidance or suggestions on how to resolve this issue.

Thank you in advance for your help!

Try calling .dict() on what you get back.

Thanks for the suggestion! I tried calling .dict() on the object returned, and it did convert the object into a dictionary. However, the formatting used single quotes (') for keys and string values instead of double quotes ("), which is required for valid JSON. This is causing issues when I try to work with the data as JSON. Is there a way to ensure that the output is properly formatted with double quotes?

Maybe you could try calling json.loads() on the output of .dict().

It seems it can’t handle the object:
TypeError: loads() missing 1 required positional argument: 's'

Is this what you did?

foo = what_you_get_back_from_openai

your_json_object = json.loads(foo.dict())

First thing: Pylance flags the method "dict" in class "BaseModel" as deprecated:
The dict method is deprecated; use model_dump instead.

but:
foo = what_you_get_back_from_openai

your_json_object = json.loads(foo.model_dump())
gave me an error:
TypeError: the JSON object must be str, bytes or bytearray, not dict

Thanks so much for all the help!

Ah, whoops, looks like you want json.dumps, not json.loads. My bad.
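
The mix-up can be reproduced with a plain dictionary, without calling the API at all. This is only a minimal sketch: `data` is a made-up stand-in for whatever .dict() or model_dump() returns. It shows that printing a dict gives Python's single-quoted repr, that json.dumps turns the dict into a double-quoted JSON string, and that json.loads expects a string and so rejects a dict.

```python
import json

# Hypothetical stand-in for the dict returned by model_dump()
data = {"final_answer": "x = -15/4"}

# Converting a dict to text with str()/print() gives Python's repr,
# which uses single quotes and is not JSON
python_repr = str(data)

# json.dumps serializes the dict to a JSON string with double quotes
json_text = json.dumps(data)

# json.loads goes the other way (JSON string -> Python object),
# so passing it a dict raises TypeError
try:
    json.loads(data)
    loads_error = None
except TypeError as exc:
    loads_error = str(exc)

print(python_repr)
print(json_text)
print(loads_error)
```

In short, dumps goes from Python object to JSON text, and loads goes from JSON text back to a Python object.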

Still, the following code:
math_reasoning = completion.choices[0].message.parsed
math_reasoning = json.loads(math_reasoning.model_dump())

gives me the following error:
TypeError: the JSON object must be str, bytes or bytearray, not dict

And if in all this I try to use dict() instead of model_dump() then I get an error:

TypeError: the JSON object must be str, bytes or bytearray, not dict

Thank you, now I understand what you meant. I take the object I receive, do a model_dump() on it, then json.dumps(), and now I have the JSON object ready.
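
For anyone landing here later, the steps described above can be sketched offline by building the model by hand instead of calling the API. This assumes Pydantic v2 (where model_dump and model_dump_json exist), and the sample values are shortened for illustration. The model_dump_json() call is a Pydantic shortcut that was not mentioned in the thread; it should give equivalent JSON in one step.

```python
import json
from pydantic import BaseModel

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

# Built by hand for illustration; in the thread this object comes from
# completion.choices[0].message.parsed
math_reasoning = MathReasoning(
    steps=[Step(explanation="Subtract 7 from both sides.", output="8x = -30")],
    final_answer="x = -15/4",
)

# Step 1: Pydantic model -> plain Python dict
as_dict = math_reasoning.model_dump()

# Step 2: dict -> JSON string (double-quoted keys and values)
as_json = json.dumps(as_dict)

# Alternative (assumes Pydantic v2): model -> JSON string directly
direct_json = math_reasoning.model_dump_json()

print(as_json)
```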

Thanks a lot! Although, it does seem like a pretty serious oversight by OpenAI.

You are using a convenience method of the Python SDK that returns a Pydantic object, which is the documented behavior (adopted from the instructor package). If you don’t want a Pydantic object, you can use the standard client.chat.completions.create method to call the LLM.
