Large JSON responses from the Assistants API are truncated

I’m using the Assistants API with GPT-4o to extract structured JSON output from my text input.

I have an input that generates a lot of JSON output. It appears that when the output reaches around 7,800 characters (about 16K), it simply stops and the result is invalid JSON.

This behavior is observed with JSON Mode both enabled and disabled.

Are there any suggestions for a way around this? Is this an OpenAI limitation?
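For reference, this is roughly how the failure shows up on my end (a minimal sketch with a made-up truncated payload):

```python
import json

# Hypothetical example of a response that was cut off mid-object:
raw = '{"items": [{"id": 1, "name": "alpha"}, {"id": 2, "na'

try:
    json.loads(raw)
except json.JSONDecodeError as err:
    # A response that stops mid-string fails to parse at all.
    print(f"Invalid JSON after {len(raw)} chars: {err}")
```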

As far as I know, GPT-4o still has an 8K token limit, so yes, this is expected.

You might find this thread interesting:

TL;DR: See if you can prepend line numbers to the input and have the model return start and end line numbers instead of the full text.
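A rough sketch of that idea with Chat Completions (the prompt wording, the input file name, and the range format are illustrative, not from the linked thread):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

document = open("input.txt").read()  # hypothetical input file

# Prepend a line number to every input line so the model can cite ranges.
numbered = "\n".join(
    f"{i + 1}: {line}" for i, line in enumerate(document.splitlines())
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": (
                'For each extracted passage, return JSON such as '
                '{"start": 12, "end": 40} with line numbers instead of '
                'repeating the text itself.'
            ),
        },
        {"role": "user", "content": numbered},
    ],
    response_format={"type": "json_object"},
)

print(response.choices[0].message.content)
```

Because the model only emits line ranges, the limited output budget goes much further; you then slice the actual text out of the original input locally.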

'message': 'max_tokens is too large: 4111. This model supports at most 4096 completion tokens, whereas you provided 4111.'

max_tokens sets the largest response you can receive back, and it is capped by OpenAI, likely because output quality degrades even further on long responses than it already does.

On top of that, the model is trained to keep its output well below even that cap.

max_tokens is a parameter you control when using Chat Completions. Assistants is built from pieces that are out of your control.
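For illustration, a minimal Chat Completions call that pins the parameter explicitly (the message content is a placeholder):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Extract the records as JSON."}],
    response_format={"type": "json_object"},
    max_tokens=4096,  # explicit output cap; anything higher is rejected
)
print(response.choices[0].message.content)
```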

https://platform.openai.com/docs/models/overview indicates a 128K context window for gpt-4o.

On that same page, some preview models such as gpt-4-0125-preview are listed with a “maximum of 4,096 output tokens,” but gpt-4o does not appear to be documented as having such a limitation. If it does exist, it would explain the behavior I’m seeing.


I have run into the exact same issue. It would be nice to get some clarification on the actual output token limit for GPT-4o.

GPT-4o has a semi-documented limit of 4,096 output tokens. If you try to push max_tokens higher, the API complains very loudly: gpt4o maxes at 4096 output tokens — Prompt Fiddle
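You can reproduce that complaint in a few lines (a sketch assuming the v1 Python SDK, where a 400 response raises BadRequestError):

```python
from openai import OpenAI, BadRequestError

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

try:
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello"}],
        max_tokens=5000,  # deliberately above the 4,096-token output cap
    )
except BadRequestError as err:
    # Prints something like: "max_tokens is too large: 5000. This model
    # supports at most 4096 completion tokens, whereas you provided 5000."
    print(err)
```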