Issue with Structured Outputs Returning Invalid JSON Object

yonzbar · August 29, 2024, 3:19pm

Hello everyone,

I’m encountering an issue with the Structured Outputs feature when using the OpenAI API. The problem is that the API is returning an object that is not valid JSON. Here is the object I’m receiving:

MathReasoning(steps=[Step(explanation='We are given the equation 8x + 7 = -23. The goal is to solve for x, meaning we want to get x by itself on one side of the equation.', output='Equation: 8x + 7 = -23'), Step(explanation='To start solving for x, we should isolate the term with x, which is 8x. We do this by eliminating the constant on the left side of the equation by subtracting 7 from both sides.', output='Subtract 7 from both sides: 8x + 7 - 7 = -23 - 7'), Step(explanation='Subtracting 7 from both sides simplifies the equation to 8x = -30, because 7 - 7 is 0 and -23 - 7 is -30.', output='Simplified equation: 8x = -30'), Step(explanation='To solve for x, we need to divide both sides of the equation by 8, the coefficient of x, to get x by itself.', output='Divide both sides by 8: 8x/8 = -30/8'), Step(explanation='Dividing both sides by 8 simplifies the equation to x = -30/8. We can further simplify this by dividing the numerator and the denominator by their greatest common divisor, which is 2.', output='Simplified: x = -15/4'), Step(explanation="We've simplified the fraction -30/8 to -15/4 by dividing both the numerator and the denominator by 2.", output='Final simplified form: x = -15/4')], final_answer='x = -15/4')

I’ve followed the code example provided in the official documentation. Here is the relevant code snippet:

from openai import OpenAI
from pydantic import BaseModel
import os
import json
import requests

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

def call_ai_api():
    print('!!! call_ai_api !!!')

    client = OpenAI()

    completion = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
            {"role": "user", "content": "how can I solve 8x + 7 = -23"}
        ],
        response_format=MathReasoning,
    )

    math_reasoning = completion.choices[0].message.parsed

    # Return the whole parsed MathReasoning object
    # (use math_reasoning.final_answer to get only the final answer)
    ai_message = math_reasoning
    return ai_message

def main():
    response = call_ai_api()
    print("\nAI Response:")
    print(response)
    print("\n" + "-"*50 + "\n")

if __name__ == "__main__":
    main()

The MathReasoning class is defined using Pydantic’s BaseModel to structure the response, and I’m using the OpenAI client to call the API.

I’m not sure why the API is returning this object in an invalid JSON format, and I’d appreciate any guidance or suggestions on how to resolve this issue.

Thank you in advance for your help!

expertise.ai.chat · August 29, 2024, 3:24pm

Try calling .dict() on what you get back.

yonzbar · August 29, 2024, 4:58pm

Thanks for the suggestion! I tried calling .dict() on the returned object, and it did convert the object into a dictionary. However, the printed output uses single quotes (') for keys and string values instead of the double quotes (") that valid JSON requires. This is causing issues when I try to work with the data as JSON. Is there a way to ensure that the output is properly formatted with double quotes?
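For readers who hit the same thing: the single quotes come from how Python prints a dict, not from the API. A minimal sketch, using a made-up one-key dict as a stand-in for the real .dict() output:

```python
import json

# Hypothetical stand-in for the dictionary returned by .dict()
data = {"final_answer": "x = -15/4"}

as_repr = str(data)         # Python's own dict representation: single quotes, not JSON
as_json = json.dumps(data)  # serialized JSON text: double quotes

print(as_repr)  # {'final_answer': 'x = -15/4'}
print(as_json)  # {"final_answer": "x = -15/4"}
```

The dict itself has no quoting style; only its printed form does. Serializing it with json.dumps() is what produces JSON text.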

expertise.ai.chat · August 29, 2024, 5:02pm

Maybe you could try calling json.loads() on the output of .dict().

yonzbar · August 29, 2024, 5:15pm

It seems it can’t handle the object:
TypeError: loads() missing 1 required positional argument: 's'

expertise.ai.chat · August 29, 2024, 5:17pm

Is this what you did?

foo = what_you_get_back_from_openai

your_json_object = json.loads(foo.dict())

yonzbar · August 29, 2024, 6:35pm

First, Pylance flags .dict() as deprecated:
The method "dict" in class "BaseModel" is deprecated
The dict method is deprecated; use model_dump instead.

But switching to model_dump():
foo = what_you_get_back_from_openai

your_json_object = json.loads(foo.model_dump())
gave me an error:
TypeError: the JSON object must be str, bytes or bytearray, not dict

Thanks so much for all the help!

expertise.ai.chat · August 29, 2024, 6:37pm

Ah whoops, looks like you want json.dumps, not json.loads. My bad.
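To make the direction of the two functions concrete, here is a small sketch with a made-up dict (not the real API response): json.loads() parses JSON text into Python objects, while json.dumps() serializes Python objects into JSON text.

```python
import json

# Hypothetical stand-in for the output of model_dump()
data = {"final_answer": "x = -15/4"}

# json.loads() expects JSON text, so handing it a dict raises TypeError
try:
    json.loads(data)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# json.dumps() goes the other way: Python dict -> JSON string
text = json.dumps(data)

# ...and json.loads() turns that string back into an equal dict
round_trip = json.loads(text)
```

The message printed in the except branch should match the TypeError quoted earlier in this thread.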

yonzbar · August 29, 2024, 7:01pm

Still, the following code:
math_reasoning = completion.choices[0].message.parsed
math_reasoning = json.loads(math_reasoning.model_dump())

gives me the following error:
TypeError: the JSON object must be str, bytes or bytearray, not dict

And if I use dict() instead of model_dump(), I get the same error:

TypeError: the JSON object must be str, bytes or bytearray, not dict

yonzbar · August 29, 2024, 7:52pm

Thank you, now I understand what you meant. I take the object I receive, do a model_dump() on it, then json.dumps(), and now I have the JSON object ready.

Thanks a lot! Although, it does seem like a pretty serious oversight by OpenAI.
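The fix described above (model_dump(), then json.dumps()) can be sketched roughly as follows. This assumes Pydantic v2, and the MathReasoning instance is built by hand as a stand-in for completion.choices[0].message.parsed, so no API call is made. Pydantic v2 also provides model_dump_json(), which does both steps in one call.

```python
import json
from pydantic import BaseModel

class Step(BaseModel):
    explanation: str
    output: str

class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

# Hand-built stand-in for completion.choices[0].message.parsed (assumption)
math_reasoning = MathReasoning(
    steps=[Step(explanation="Subtract 7 from both sides.", output="8x = -30")],
    final_answer="x = -15/4",
)

# Two-step route from the thread: model -> dict -> JSON string
as_dict = math_reasoning.model_dump()
json_via_dumps = json.dumps(as_dict)

# One-step route: model -> JSON string
json_direct = math_reasoning.model_dump_json()

print(json_via_dumps)
print(json_direct)
```

Both strings parse back to the same data; they may differ in whitespace, so compare the parsed values rather than the raw text.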

nicholishen · August 29, 2024, 8:12pm

You are using a convenience method of the Python SDK that returns a Pydantic object, which is the documented behavior (adopted from the instructor package). If you don’t want a Pydantic object, you can use the standard client.chat.completions.create method to call the LLM; the response then arrives as a JSON string in the message content.
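A rough sketch of that alternative, under some assumptions: the JSON schema below is written by hand to mirror the MathReasoning model from the original post, the schema name "math_reasoning" and the helper ask_for_json are illustrative names, and the sketch only defines the request without sending it. With a json_schema response format, the reply is available as a JSON string in message.content.

```python
import json

# Hand-written JSON Schema mirroring the MathReasoning model (assumption)
math_reasoning_schema = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "explanation": {"type": "string"},
                    "output": {"type": "string"},
                },
                "required": ["explanation", "output"],
                "additionalProperties": False,
            },
        },
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
    "additionalProperties": False,
}

request_kwargs = {
    "model": "gpt-4o-2024-08-06",
    "messages": [
        {"role": "system", "content": "You are a helpful math tutor. Guide the user through the solution step by step."},
        {"role": "user", "content": "how can I solve 8x + 7 = -23"},
    ],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "math_reasoning",  # illustrative name
            "schema": math_reasoning_schema,
            "strict": True,
        },
    },
}

def ask_for_json(client) -> str:
    """Hypothetical helper: send the request and return the raw JSON string."""
    completion = client.chat.completions.create(**request_kwargs)
    return completion.choices[0].message.content
```

Usage would look like json_text = ask_for_json(OpenAI()), followed by json.loads(json_text) if a dict is needed. That call requires the openai package and a configured API key.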
