What's wrong with my Structured Output response format?

itzamirrezab · October 20, 2024, 3:01pm

I’m trying to use Structured Outputs, and I cannot get it to work. I have been debugging for a long time and still have no idea why this is happening. I have used Structured Outputs before and it worked, but for this schema it does not seem to work.

class FieldRule(BaseModel):
    selector_type: str
    selectors: List[str]
    attribute: Optional[str]

class Rules(BaseModel):
    title: FieldRule
    description: Optional[FieldRule]
    link: FieldRule

class RuleSet(BaseModel):
    name: str
    rules: Rules

class ExtractionRules(BaseModel):
    rules_sets: List[RuleSet]
    has_news: bool

def extract_xpaths(html_content):
    try:
        completion = openai_client.beta.chat.completions.parse(
            model="gpt-4o-2024-08-06",
            messages=[
                {"role": "system", "content": system_instruction},
                {"role": "user", "content": html_content}
            ],
            response_format=ExtractionRules
        )
        if completion and completion.choices and len(completion.choices) > 0:
            return completion.choices[0].message.parsed
        else:
            logger.error("Invalid completion response from OpenAI")
            return None
    except Exception as e:
        logger.error(f"Error in extract_xpaths: {str(e)}")
        return None

I get the following error:
Object of type Tag is not JSON serializable

I would appreciate any help finding the issue.

nicholishen · October 20, 2024, 10:31pm

Can you show your full code? This doesn’t have the bug in it.

_j · October 21, 2024, 12:25am

Run this in your Python environment:

import pydantic_core, pydantic
print(pydantic_core.__version__, pydantic.__version__)

Output that matches the latest supported versions:
2.20.1 2.8.2

If your Python 3.9-3.11 environment for OpenAI API requests has older or newer versions, try this forced upgrade command, run from a user account that has permission to upgrade those installations or from inside the venv:

pip install --upgrade --upgrade-strategy eager regex "charset-normalizer<4" "idna" "urllib3<3" "certifi" "requests" "anyio<5" "distro<2" "sniffio" "h11<0.15" "httpcore==1.*" "httpx<1" "annotated-types" "typing-extensions<5" "pydantic-core==2.20.1" "pydantic<3" "jiter<1" "tqdm" "colorama" "openai" "tiktoken"

If the environment is shared with other applications, also verify that these requirements are compatible with the other software you run; otherwise, you may need a separate venv for your API calls.

Pydantic is the most likely suspect here, since different versions can send unanticipated output to the API. The most reliable code across runtime platforms simply uses an HTTPS-capable library plus your own code for sending JSON to the API endpoint URL.

Explanation: the openai package declares broad version ranges as its requirements, and these may be less strict than what is actually needed for compatibility with the platform the latest SDK version is auto-built for. Later explicit upgrades of individual libraries may also ignore the constraints that keep other packages at lower versions.
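The "send the JSON yourself" approach mentioned above could look roughly like the sketch below. It only builds and serializes a request body using the standard library; it does not call the API. The hand-written schema covers just the FieldRule class from the question, and the helper name build_request_body and the schema name "field_rule" are hypothetical, chosen for illustration.

```python
import json

# Hand-written JSON Schema for the FieldRule model from the question.
# Strict mode expects every property to be listed in "required" and
# "additionalProperties" to be false; optional values are typed as nullable.
field_rule_schema = {
    "type": "object",
    "properties": {
        "selector_type": {"type": "string"},
        "selectors": {"type": "array", "items": {"type": "string"}},
        "attribute": {"type": ["string", "null"]},
    },
    "required": ["selector_type", "selectors", "attribute"],
    "additionalProperties": False,
}


def build_request_body(system_instruction, html_text):
    """Hypothetical helper: assemble a Chat Completions request body."""
    # Reject non-string content early so a serialization problem shows up
    # here rather than deep inside an HTTP library.
    if not isinstance(html_text, str):
        raise TypeError(f"html_text must be str, got {type(html_text).__name__}")
    return {
        "model": "gpt-4o-2024-08-06",
        "messages": [
            {"role": "system", "content": system_instruction},
            {"role": "user", "content": html_text},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "field_rule",  # illustrative name
                "strict": True,
                "schema": field_rule_schema,
            },
        },
    }


# Serializing up front proves the body is plain JSON before anything is sent.
body = json.dumps(build_request_body("Extract selectors.", "<html></html>"))
print(body[:60])
```

The resulting string would then be sent as the POST body to https://api.openai.com/v1/chat/completions with an "Authorization: Bearer ..." header and "Content-Type: application/json", using whichever HTTPS library is available.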

itzamirrezab · October 22, 2024, 10:02am

That’s basically all of the relevant code.
There is no Tag object anywhere, and the exception is logged from this line:
logger.error(f"Error in extract_xpaths: {str(e)}")

You can try it out yourself using my classes.

itzamirrezab · October 22, 2024, 10:05am

I’m using Python 3.12.6, so naturally my pydantic versions were much newer.
I force-upgraded using your command and verified that the versions were indeed 2.20.1 and 2.8.2:

>>> import pydantic_core, pydantic
>>> print(pydantic_core.__version__, pydantic.__version__) 
2.20.1 2.8.2

However, I still got the same error:
ERROR - Error in extract_xpaths: Object of type Tag is not JSON serializable

In case anyone is wondering, these are my imports:

from pydantic import BaseModel
from typing import Optional, List
from openai import OpenAI
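One possible explanation, offered as an assumption because the calling code is not shown in the thread: the wording "Object of type ... is not JSON serializable" is what Python's standard json encoder raises when it meets an object it cannot encode. If html_content were an HTML parser node (for example a BeautifulSoup Tag) rather than a str, building the request body could fail this way before anything reaches the API. The sketch below reproduces the message with a hypothetical stand-in class named Tag; it does not use BeautifulSoup or the OpenAI SDK.

```python
import json


class Tag:
    """Hypothetical stand-in for a parser node such as a BeautifulSoup Tag."""

    def __init__(self, markup):
        self.markup = markup

    def __str__(self):
        return self.markup


def build_messages(system_instruction, html_content):
    # Same message layout as in extract_xpaths from the question.
    return [
        {"role": "system", "content": system_instruction},
        {"role": "user", "content": html_content},
    ]


node = Tag("<div><a href='/news/1'>Headline</a></div>")

# Passing the node object itself: the JSON encoder cannot serialize it.
try:
    json.dumps(build_messages("Extract rules.", node))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Converting the node to a string first makes the messages serializable.
safe_body = json.dumps(build_messages("Extract rules.", str(node)))
print(safe_body)
```

If this is the cause, checking type(html_content) at the call site of extract_xpaths and passing str(html_content) instead would be the first thing to try.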
