Hi,
Structured outputs is a very welcome new feature, and something we have been looking forward to for a while.
Our app, Onsen (onsenapp dot com), an AI companion for mental health, makes extensive use of JSON responses to dynamically generate its UI. Until now we have relied on `response_format={"type": "json_object"}`, and we are excited about the new opportunities that structured outputs bring.
Two questions / concerns:
1) Token use
I have seen it mentioned elsewhere that the JSON schema definition counts toward input token usage. However, I am unable to reproduce this.
For example, when I attach a large, complex JSON schema containing a string property with very long enum values (thousands of characters), my prompt usage stats do not show any increase in tokens.
So it seems that either the JSON schema is applied for "free", or the API is not correctly reporting the JSON schema tokens in the usage metrics.
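For reference, this is roughly how I am testing it. A minimal sketch assuming the official Python SDK; the schema name (`ui_payload`), helper names, and enum sizes are made up for illustration, and the comparison function requires a live `OpenAI` client:

```python
import json

def build_enum_schema(n_values: int = 50, value_len: int = 80) -> dict:
    """Build a json_schema payload with one string property whose enum
    contains many long values, to stress-test schema token accounting.
    (Hypothetical helper; "ui_payload" is an arbitrary schema name.)"""
    return {
        "name": "ui_payload",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {
                "widget": {
                    "type": "string",
                    "enum": [f"option_{i}_" + "x" * value_len for i in range(n_values)],
                }
            },
            "required": ["widget"],
            "additionalProperties": False,
        },
    }

def compare_prompt_tokens(client, model: str, messages: list) -> tuple:
    """Run the same prompt with and without the schema and return both
    usage.prompt_tokens values. Needs a live OpenAI client, so this is
    not called here."""
    plain = client.chat.completions.create(
        model=model,
        messages=messages,
        response_format={"type": "json_object"},
    )
    structured = client.chat.completions.create(
        model=model,
        messages=messages,
        response_format={"type": "json_schema", "json_schema": build_enum_schema()},
    )
    return plain.usage.prompt_tokens, structured.usage.prompt_tokens

# The serialized schema alone is thousands of characters, so if it were
# billed as input tokens, the difference in prompt_tokens should be obvious.
schema_chars = len(json.dumps(build_enum_schema()))
```

In my runs, the two `prompt_tokens` values come back identical even though the serialized schema is several kilobytes.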
2) Latency
The docs say that the first time a prompt with structured outputs runs, it takes extra time to create some artefacts (presumably caching the processed JSON schema somewhere).
However, I am also seeing a significant increase in latency on repeat runs. For example, a prompt that takes ~3 seconds with the plain `json_object` type consistently takes ~4-5 seconds with the new structured output `json_schema` type.
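This is how I am measuring it, roughly. A sketch assuming the Python SDK; the function names are hypothetical, and the actual comparison needs a live `OpenAI` client:

```python
import time
import statistics

def time_repeated(fn, n: int = 5) -> float:
    """Median wall-clock seconds over n calls of fn.
    Median rather than mean, so one cold-start outlier does not skew it."""
    samples = []
    for _ in range(n):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    return statistics.median(samples)

def compare_latency(client, model: str, messages: list, json_schema: dict, n: int = 5):
    """Hypothetical comparison: median latency of the same prompt under
    json_object vs json_schema. Both branches run n times, so any one-off
    schema-compilation cost should be amortized away."""
    plain = time_repeated(
        lambda: client.chat.completions.create(
            model=model,
            messages=messages,
            response_format={"type": "json_object"},
        ),
        n,
    )
    structured = time_repeated(
        lambda: client.chat.completions.create(
            model=model,
            messages=messages,
            response_format={"type": "json_schema", "json_schema": json_schema},
        ),
        n,
    )
    return plain, structured
```

Even with the first (artefact-creating) run excluded from the median, the `json_schema` variant stays ~1-2 seconds slower for me.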
Are others experiencing this?
Can OpenAI comment on this undocumented additional latency?