Agent using WebSearchTool with structured outputs fails validation: JSON unexpectedly ends with EOF at around 6,000 characters

I’ve tested both gpt-4o and gpt-4o-mini: when the model is given WebSearchTool together with structured outputs via output_type, the JSON prematurely ends in EOF at around 6,000 characters, resulting in a validation error.

ValidationError raised from TypeAdapter(GraphConstruction): 1 validation error for GraphConstruction
  Invalid JSON: EOF while parsing an object at line 1 column 6752 [type=json_invalid, input_value='{"entities":[{"entityId"...}],"links":[{"from_":1 ', input_type=str]
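The failure mode is plain truncation: the stream simply stops mid-object, so any strict JSON parser errors out at end-of-input. pydantic-core phrases this as "EOF while parsing"; the standard library reports the same condition as an expectation failure at the last character, which a short stdlib-only snippet can illustrate:

```python
import json

# Output cut off mid-generation, like the input_value in the error above.
truncated = '{"entities":[{"entityId"'

try:
    json.loads(truncated)
except json.JSONDecodeError as e:
    err = e

# err.pos lands at the very end of the string: the input ran out;
# the JSON was not structurally wrong earlier on.
print(err.msg, "at char", err.pos, "of", len(truncated))
```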

In fact, almost the same issue happens in consumer-facing ChatGPT when search is turned on and the model is instructed to find information and return it in a JSON schema. At around 6,000 characters of output in ChatGPT, instead of EOF, I consistently see the JSON interrupted by:
::contentReference[oaicite:0]{index=0}

Temporary solution: limiting max tokens in ModelSettings to 128,000 mysteriously gets rid of the JSON-integrity problems and removes the validation error caused by the JSON terminating mid-generation. I consistently got a validation error without max tokens set, and once I set max tokens to any arbitrary number, the error never occurred again. Asking the model not to go beyond 5,500 characters also works.
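Until this is fixed server-side, a small client-side guard can at least distinguish truncated output from genuinely malformed JSON, so the caller knows whether a retry with an explicit max-tokens cap is worth attempting. This is a hypothetical stdlib-only helper (`parse_model_json` is not part of any SDK), complementary to the max-tokens workaround above:

```python
import json


def parse_model_json(raw: str):
    """Parse model output; classify failures as 'truncated' or 'malformed'.

    Returns (parsed, None) on success, (None, reason) on failure.
    """
    try:
        return json.loads(raw), None
    except json.JSONDecodeError as e:
        # If the parser died at (or past) the last non-whitespace character,
        # the output was almost certainly cut off mid-generation rather than
        # being structurally wrong somewhere in the middle.
        reason = "truncated" if e.pos >= len(raw.rstrip()) else "malformed"
        return None, reason
```

A caller can then retry only the "truncated" case after applying an explicit max-tokens setting, instead of treating every validation failure the same way.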

Hope this gets patched soon.


Welcome to the dev forum, @alish.sult!

I’ve reproduced the issue during my testing. Thanks for reporting it.


To confirm: I am intermittently seeing a similar error while using the Responses API, WebSearchTool, and JSON output; the model is gpt-4o-mini:

1 validation error for SolutionAnswerFormat Invalid JSON: EOF while parsing a string at line 1 column 7233 [type=json_invalid, input_value='{"solutions":[{"Solution…ution Title":"Angel Co ', input_type=str] For further information visit …

Happy to see this post. This is affecting all of our production prompts, making web search a non-starter.

Edit:
Wow, after setting max tokens to 128,000 I finally get an error: Context Length Exceeded with 128,867 tokens! My input tokens are 867 and my output tokens are about 1,500, so the remaining ~126k is probably from the web search. Previously I had max tokens set to 16k; I also tried removing it (which I thought defaults to the max). Both would fail JSON validation while still returning a successful API response.

This provides a huge amount of insight into what's failing under the hood: OpenAI is filling up the context window with web-search results and breaking its own request in the process, and since it's an internal process, the error isn't even surfaced.