Limits of storing large numbers of dynamically generated Structured JSON Output schemas

I am looking for detailed information about how schema names work in Structured JSON Output and what their limitations are. I understand that the purpose of these names is to cache schemas so that reusing the same schema is faster, but I want to know the limits of that cache.

My objective is to create a service that allows clients to define their own response structures, which are then converted into dynamically generated schemas. My current plan is to generate names by hashing the schema structure, so that identical schemas always receive the same name.
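
For illustration, here is a minimal sketch of that naming approach, assuming the schema is an ordinary JSON-serializable dict; the helper `schema_name_for` is hypothetical, not part of any SDK.

```python
import hashlib
import json

def schema_name_for(schema: dict) -> str:
    """Derive a stable name from the schema's structure (hypothetical helper).

    Serializing with sorted keys makes the hash independent of key order,
    so structurally identical schemas always map to the same name.
    """
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Keep the name a short identifier by truncating the digest.
    return f"schema_{digest[:32]}"

# Example: two schemas that differ only in key order get the same name.
a = {"type": "object", "properties": {"title": {"type": "string"}}, "required": ["title"]}
b = {"required": ["title"], "properties": {"title": {"type": "string"}}, "type": "object"}
assert schema_name_for(a) == schema_name_for(b)
```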

However, this could result in many thousands of schemas being cached under the same API key, including no-longer-relevant schemas that had unique names but were only partially developed. Are there problems with this? How many schemas is the system intended to handle? Will it start evicting schemas from the cache and cause the system to break down? Is there a better way to build this kind of dynamic schema system?

Schema artifact creation is handled passively by the API.

You would never know that any grammar-building process or server-side caching is happening, except that an initial request can take a few extra seconds before it starts completing.

Therefore, I don’t see much reason to keep track of response schemas yourself if you are allowing users to create them. The first response for a new schema may simply have higher latency, and your users can work from the same understanding you have: if the returned schema is identical, the request should hit the prebuilt schema object.
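
As a rough sketch of what that reuse looks like (assuming the Chat Completions endpoint with a `json_schema` response format and the official `openai` Python SDK; the model name and schema below are placeholders), every request that passes an identical schema and name should benefit from whatever the server built on the first call:

```python
from openai import OpenAI

client = OpenAI()

# A user-supplied schema, passed through unchanged on every request.
user_schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Give me an answer."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "user_response",  # same name + same schema -> same cached artifact
            "strict": True,
            "schema": user_schema,
        },
    },
)

print(response.choices[0].message.content)
```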

OpenAI can keep a billion chat interactions covering everything ever typed into ChatGPT, 60 days of stale Assistants threads, and 100 GB of files, without ever encouraging you to clean anything up. It is unlikely that behind the scenes they decide “too many schemas” and start discarding the cache on that basis.