Using API Structured Output for strict classification

I’d like to use the API to classify some data. Each input is fairly small, usually under 1k tokens. The task is to classify each input into exactly one value of an enum, which has approximately 3.5k valid values. At the moment this is done by sending a JSON Schema along with the request, which looks like the one below:

{
	"$schema": "http://json-schema.org/draft-07/schema#",
	"name": "...",
	"type": "object",
	"properties": {
		"key": {
			"type": "string",
			"enum": [
				"...val1"
			]
		}
	},
	"required": ["key"]
}
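For anyone wanting to reproduce the setup, here is a minimal sketch of building a schema of this shape from a list of labels. The function name `build_enum_schema`, the schema name `"classification"`, and the generated `label_N` values are placeholders of mine, not the real ones from the post.

```python
import json


def build_enum_schema(name, labels):
    """Return a JSON Schema dict that restricts `key` to one of `labels`."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "name": name,
        "type": "object",
        "properties": {
            "key": {"type": "string", "enum": list(labels)},
        },
        "required": ["key"],
    }


# Hypothetical labels standing in for the ~3.5k real enum values.
labels = [f"label_{i}" for i in range(3500)]
schema = build_enum_schema("classification", labels)

# Serialising the schema gives a rough feel for how much text the enum adds.
schema_json = json.dumps(schema)
print(len(schema["properties"]["key"]["enum"]), "enum values,",
      len(schema_json), "characters of schema")
```

The character count is only a rough proxy for tokens, but it makes it easy to see that the enum list dominates the size of the schema.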

Based on the usage data returned by the API, it seems the entire JSON Schema is billed as input tokens, which makes this approach fairly expensive:

{
    "id": "chatcmpl-AFh9hjwTLyvGJBDY6PLnVvdWNMkzd",
    "object": "chat.completion",
    "created": 1728304173,
    "model": "....",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "...",
                "refusal": null
            },
            "logprobs": null,
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 14848,
        "completion_tokens": 7,
        "total_tokens": 14855,
        "prompt_tokens_details": {
            "cached_tokens": 14208
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    },
    "system_fingerprint": "fp_f85bea6784"
}

Is this behaviour (billing) to be expected, or are there better ways to ensure such a classification is done?
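To put numbers on the usage block above, here is a small sketch that splits the prompt tokens into cached and uncached parts. The `cached_discount` parameter is purely an assumption for illustration (how cached tokens are actually priced is not stated in this thread), and `summarize_usage` is a helper name I made up.

```python
# Usage figures copied from the response above.
usage = {
    "prompt_tokens": 14848,
    "completion_tokens": 7,
    "total_tokens": 14855,
    "prompt_tokens_details": {"cached_tokens": 14208},
}


def summarize_usage(usage, cached_discount=0.5):
    """Split prompt tokens into cached and uncached parts.

    cached_discount is a hypothetical fraction by which cached tokens are
    assumed to be cheaper; replace it with the real figure for your model.
    """
    prompt = usage["prompt_tokens"]
    cached = usage["prompt_tokens_details"]["cached_tokens"]
    uncached = prompt - cached
    # Full-price-equivalent prompt tokens under the assumed discount.
    effective = uncached + cached * (1.0 - cached_discount)
    return {
        "uncached": uncached,
        "cached": cached,
        "cached_fraction": cached / prompt,
        "effective_prompt_tokens": effective,
    }


summary = summarize_usage(usage)
print(summary)
```

In this particular response only 640 of the 14848 prompt tokens were not served from cache, so the effective cost depends heavily on how cached tokens are billed.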


Your title says “ChatGPT” but your post says “API”? The API is not the web product, ChatGPT.

Sorry, I've corrected the title of the post, as this is indeed about the API. As far as I know, there is no structured output on the web product.


I wonder if at scale it takes advantage of this:

which would make it cheaper.

Alternative classification strategies might include a local cosine-distance search over embeddings (though I personally have not been satisfied with the results, because too many items end up classed as similar).

An embeddings approach would be much, much cheaper though.
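A minimal sketch of what such a local search could look like, assuming you have already computed one embedding vector per enum value and one for the input. The three-dimensional vectors and the label names below are toy values I made up; real embeddings would come from whatever embedding model you use and would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_matches(query_vec, label_vecs, k=3):
    """Return the k labels most similar to query_vec, best first."""
    scored = [
        (label, cosine_similarity(query_vec, vec))
        for label, vec in label_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, hand-written vectors standing in for precomputed label embeddings.
label_vecs = {
    "invoice": [0.9, 0.1, 0.0],
    "receipt": [0.8, 0.3, 0.1],
    "contract": [0.0, 0.2, 0.9],
}
query_vec = [1.0, 0.1, 0.0]

matches = top_matches(query_vec, label_vecs, k=2)
for label, score in matches:
    print(f"{label}: {score:.3f}")
```

Returning the top few matches with their scores, rather than only the best one, makes the "too many things look similar" problem visible: when the first and second scores are close, the result is probably not trustworthy on its own.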
