I’m using the Assistants API to extract structured JSON output from my text input.
I’m using GPT-4o.
I have an input that generates a lot of JSON output. It appears that when the output reaches around 7,800 characters (about 16K), it simply stops and the result is invalid JSON.
This behavior occurs with JSON Mode both enabled and disabled.
Are there any suggestions for something I could try to get around this? Is this an OpenAI limitation?
'message': 'max_tokens is too large: 4111. This model supports at most 4096 completion tokens, whereas you provided 4111.'
max_tokens sets the largest response you can receive back. It is capped by OpenAI, likely because output quality on long responses degrades even more than it already does.
On top of that, the model is trained to curtail its own output well below that cap.
max_tokens is a parameter you control directly when using Chat Completions; with Assistants, these settings are out of your hands.
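For illustration, here is a minimal sketch of the Chat Completions side of this. It assumes you would pass `max_tokens` yourself (e.g. `client.chat.completions.create(..., max_tokens=4096)`) and then inspect `choices[0].finish_reason` on the response: a value of `"length"` means the model hit the token cap mid-output, so extracted JSON is almost certainly cut off. The helper name `parse_extraction` is hypothetical:

```python
import json

def parse_extraction(text, finish_reason):
    """Return the parsed JSON object, or None if the model output was
    truncated or otherwise invalid. `text` and `finish_reason` would come
    from a Chat Completions response's choices[0]."""
    if finish_reason == "length":
        # Hit max_tokens (or the model's completion-token cap) mid-output;
        # don't bother parsing, the JSON is almost certainly incomplete.
        return None
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None  # malformed for some other reason

# Illustrative values, standing in for a real API response:
print(parse_extraction('{"ok": true}', "stop"))  # parses fine
print(parse_extraction('{"ok": tr', "length"))   # truncated -> None
```

If you detect truncation this way, the usual workarounds are to split the input into smaller chunks or to ask the model for a more compact output schema.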
On that same page, some preview models such as gpt-4-0125-preview are documented with a “maximum of 4,096 output tokens,” but GPT-4o does not appear to be documented as having such a limitation. If such a limit does exist, it would explain the behavior I’m seeing.