Best practices to add nuances to enums

Hello,

I am trying to do a structured output task with o3-mini. I have defined some enums for one of the fields. For my task, I’m working with 10-15 enum values, and depending on some rules, certain values should be used and not others. I could write these rules into the description field in the schema, but I wonder if there are any other approaches I can use?

See how well your AI does on enums that are completely unobvious otherwise…

{
  "name": "safety",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "detection": {
        "type": "string",
        "description": "Classification code for safety policy detection.",
        "enum": [
          "n",
          "i",
          "h",
          "s"
        ],
        "enumDescriptions": {
          "n": "No detection of safety policy issues.",
          "i": "Illegal and dangerous activities.",
          "h": "Harmful and violent content.",
          "s": "Sexually explicit materials."
        }
      }
    },
    "required": [
      "detection"
    ],
    "additionalProperties": false
  }
}
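
For illustration, here is a minimal Python sketch that generates a schema like the one above from a single mapping, so the enum list and its annotations cannot drift apart. The names `CODES` and `build_detection_schema` are made up for this example, and whether the API accepts the `enumDescriptions` annotation is taken from this thread, not verified here.

```python
import json

# Hypothetical mapping of enum code -> meaning, mirroring the schema above.
CODES = {
    "n": "No detection of safety policy issues.",
    "i": "Illegal and dangerous activities.",
    "h": "Harmful and violent content.",
    "s": "Sexually explicit materials.",
}


def build_detection_schema(codes: dict[str, str]) -> dict:
    """Build the response-format dict with enum and annotations kept in sync."""
    return {
        "name": "safety",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "detection": {
                    "type": "string",
                    "description": "Classification code for safety policy detection.",
                    # The enum is what constrains the output.
                    "enum": list(codes),
                    # The parallel annotation only carries meaning for the model.
                    "enumDescriptions": dict(codes),
                }
            },
            "required": ["detection"],
            "additionalProperties": False,
        },
    }


schema = build_detection_schema(CODES)
print(json.dumps(schema, indent=2))
```

With 10-15 values, keeping one mapping as the source of truth may be easier than editing two parallel structures by hand.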

Probably best to come up with another application concept than having language AIs judge a constant stream of policy violations…

Enumerations are what you make them. What you are asking for is a type of governance: usually the enumerations are attached to a schema and a controlling factor. I use AI for that; some use other means.

You are also touching on metaprogramming, which is also something I do a lot. Just create a ruleset.

A ruleset in YAML is easy. You send that ruleset to the LLM as a JSON object. In my experience, you can send a cool 50k instructions before it gets weird. Those instructions are also how you get better structured output.

You also mentioned nuances. Those are controlled in the same way.

If you need a comprehensive list of about 10 million enumerations categorized like this, let me know.

This is interesting. When you mentioned sending the ruleset to the LLM as a JSON object, do you mean sending it as part of the system prompt? For example:

system_prompt = f"""

Consider the following ruleset when extracting values: {ruleset}

"""

Is enumDescriptions a valid OpenAI-compatible schema? I’ve actually never seen it as part of the documentation.

The keyword that directly parallels the enum itself will be passed to the AI, and it is understood because it sits in close proximity to the enum.

“enum” is the assertion signal being captured for building a context-free output enforcement grammar. The parallel parameter for a description is an annotation, in this case, communicating with the AI for better understanding.


If you prefer a more bloated JSON schema that mostly demonstrates token consumption through self-imposed strict draft keywords, and that might also give a different quality of AI understanding, sure: enjoy a “permitted” keyword accompanied by a “non-permitted by API” draft-spec keyword that only works in conjunction with it, with more chance of being bounced later by some OpenAI-specific validator:

{
  "name": "safety",
  "strict": true,
  "schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$id": "https://example.com/safety.schema.json",
    "title": "safety",
    "type": "object",
    "properties": {
      "detection": {
        "type": "string",
        "description": "Classification code for safety policy detection.",
        "enum": [
          "n",
          "i",
          "h",
          "s"
        ],
        "oneOf": [
          {
            "const": "n",
            "description": "No detection of safety policy issues."
          },
          {
            "const": "i",
            "description": "Illegal and dangerous activities."
          },
          {
            "const": "h",
            "description": "Harmful and violent content."
          },
          {
            "const": "s",
            "description": "Sexually explicit materials."
          }
        ]
      }
    },
    "required": [
      "detection"
    ],
    "additionalProperties": false
  }
}
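
If you want to compare the two styles on your own data, a small converter along these lines may help. This is only a sketch: `enum_descriptions_to_one_of` is a made-up helper name, and it makes no claim about which keywords the API validator will accept.

```python
def enum_descriptions_to_one_of(prop: dict) -> dict:
    """Return a copy of a property schema with enumDescriptions rewritten
    as a oneOf list of const/description pairs. The enum itself is kept."""
    descriptions = prop.get("enumDescriptions", {})
    converted = {k: v for k, v in prop.items() if k != "enumDescriptions"}
    converted["oneOf"] = [
        {"const": value, "description": descriptions.get(value, "")}
        for value in prop["enum"]
    ]
    return converted


# Example property in the compact style (values copied from the schema above).
compact = {
    "type": "string",
    "enum": ["n", "i"],
    "enumDescriptions": {
        "n": "No detection of safety policy issues.",
        "i": "Illegal and dangerous activities.",
    },
}

verbose = enum_descriptions_to_one_of(compact)
print(verbose)
```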

OpenAI also lets through other keywords that don’t build enforcement, such as minimum values or ranges.

You can.

There are lots of ways to do it.

The better the data in, the better the data out. It is better if you have a local framework build the data into a packet, build the instructions into a second packet, and then launch the whole thing into the API call, with more instructions.
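
One way to read that suggestion, sketched in Python: apply the rules locally, then send a schema whose enum contains only the values allowed for this request, plus a matching instruction string. Everything here (`RULES`, the context flags, `build_request_parts`) is hypothetical, and the sketch stops before any actual API call.

```python
# Hypothetical rules: each enum value is allowed only when its predicate
# returns True for the request context. All names here are invented.
RULES = {
    "n": lambda ctx: True,
    "i": lambda ctx: ctx.get("check_illegal", False),
    "h": lambda ctx: ctx.get("check_harm", False),
    "s": lambda ctx: ctx.get("check_sexual", False),
}


def allowed_values(ctx: dict) -> list[str]:
    """Apply the local rules and return the enum values valid for this request."""
    return [value for value, is_allowed in RULES.items() if is_allowed(ctx)]


def build_request_parts(ctx: dict) -> tuple[dict, str]:
    """Build the schema packet and the instruction packet for one request."""
    values = allowed_values(ctx)
    schema = {
        "name": "safety",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                # Only the values that survived the rules are offered.
                "detection": {"type": "string", "enum": values}
            },
            "required": ["detection"],
            "additionalProperties": False,
        },
    }
    instructions = "Allowed detection codes for this request: " + ", ".join(values)
    return schema, instructions


schema, instructions = build_request_parts({"check_harm": True})
print(schema["schema"]["properties"]["detection"]["enum"])
print(instructions)
```

Because the disallowed values never reach the schema, the model cannot emit them, which may be more reliable than describing the exclusions in prose.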