Limits of storing large numbers of dynamically generated Structured JSON Output schemas

I am looking for detailed information about how schema names work in Structured JSON Output and what their limitations are. I understand that the purpose of these names is to cache schemas so that reusing the same schema is faster, but I want to know the limits of that cache.

My objective is to create a service that allows clients to define their own response structures, which are then converted into dynamically generated schemas. My current plan is to generate names by hashing the schema structure, so that identical schemas always receive the same name.
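
For illustration, here is a minimal sketch of that naming approach, assuming the schema is an ordinary JSON-serializable dict; the helper `schema_name_for` is hypothetical, not part of any SDK.

```python
import hashlib
import json

def schema_name_for(schema: dict) -> str:
    """Derive a stable name from the schema's structure (hypothetical helper).

    Serializing with sorted keys makes the hash independent of key order,
    so structurally identical schemas always map to the same name.
    """
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    # Keep the name a short identifier by truncating the digest.
    return f"schema_{digest[:32]}"

# Example: two schemas that differ only in key order get the same name.
a = {"type": "object", "properties": {"title": {"type": "string"}}, "required": ["title"]}
b = {"required": ["title"], "properties": {"title": {"type": "string"}}, "type": "object"}
assert schema_name_for(a) == schema_name_for(b)
```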

However, this could result in many thousands of schemas being cached under the same API key, including no-longer-relevant schemas that had unique names but were only partially developed. Are there problems with this? How many schemas is the system intended to handle? Will it start evicting schemas from the cache and cause the system to break down? Is there a better way to build this kind of dynamic schema system?

Schema artifact creation is handled passively by the API.

You would never know that any grammar-building process or server-side caching is happening, except that an initial request can take a few extra seconds before it starts completing.

Therefore, I don’t see much reason to keep track of response schemas yourself if you are allowing users to create them. The first response for a new schema may simply have higher latency, and your users can work from the same understanding you have: if the returned schema is identical, the request should hit the prebuilt schema object.
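
As a rough sketch of what that reuse looks like (assuming the Chat Completions endpoint with a `json_schema` response format and the official `openai` Python SDK; the model name and schema below are placeholders), every request that passes an identical schema and name should benefit from whatever the server built on the first call:

```python
from openai import OpenAI

client = OpenAI()

# A user-supplied schema, passed through unchanged on every request.
user_schema = {
    "type": "object",
    "properties": {"answer": {"type": "string"}},
    "required": ["answer"],
    "additionalProperties": False,
}

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "Give me an answer."}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "user_response",  # same name + same schema -> same cached artifact
            "strict": True,
            "schema": user_schema,
        },
    },
)

print(response.choices[0].message.content)
```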

OpenAI can keep a billion chat interactions covering everything ever typed into ChatGPT, 60 days of stale Assistants threads, and 100 GB of files, without ever encouraging you to clean anything up. It is unlikely that behind the scenes they decide “too many schemas” and start discarding the cache on that basis.