GPT-5 mini returns incomplete response when using web_search_preview due to max_output_tokens limit

taketsuyo · September 21, 2025, 7:36am

I’m using GPT-5 mini with the Responses API and the web_search_preview tool, but I consistently
get incomplete responses, even with max_output_tokens set to 8000.

Problem

The model spends almost the entire budget on reasoning across its web searches (7,500+ of the
8,000 output tokens), leaving no room for the actual JSON output.

Code

import OpenAI from 'openai';

// The client reads the API key from the OPENAI_API_KEY environment variable.
const openai = new OpenAI();

const response = await openai.responses.create({
  model: 'gpt-5-mini',
  max_output_tokens: 8000,
  tools: [{
    type: 'web_search_preview',
    search_context_size: 'high'
  }],
  input: "Search for events in Kagoshima...",
  text: {
    format: {
      type: 'json_schema',
      name: 'EventList',
      schema: { /* ... */ },
      strict: true
    }
  }
});

Response

{
  "status": "incomplete",
  "incomplete_details": { "reason": "max_output_tokens" },
  "output": [
    { "type": "reasoning" },
    // 15-20 web_search_call items
    // No message with JSON output
  ],
  "usage": {
    "output_tokens": 8000,
    "output_tokens_details": {
      "reasoning_tokens": 7500+
    }
  }
}
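
To make the breakdown above easier to check, here is a minimal sketch of a helper that summarizes where the output budget went. It only reads the fields shown in the response above; the function name summarizeTokenUse and the sample object are my own, and 7500 is just a stand-in for the "7500+" I actually see.

```javascript
// Summarize how a response spent its output-token budget.
// `response` is a plain object shaped like the Responses API result above.
function summarizeTokenUse(response) {
  const usage = response.usage ?? {};
  const outputTokens = usage.output_tokens ?? 0;
  const reasoningTokens = usage.output_tokens_details?.reasoning_tokens ?? 0;
  const items = response.output ?? [];
  return {
    // True only when the run stopped because it hit max_output_tokens.
    truncated:
      response.status === 'incomplete' &&
      response.incomplete_details?.reason === 'max_output_tokens',
    reasoningTokens,
    // Tokens left over for anything other than reasoning.
    visibleTokens: outputTokens - reasoningTokens,
    searchCalls: items.filter((item) => item.type === 'web_search_call').length,
    hasMessage: items.some((item) => item.type === 'message'),
  };
}

// Illustrative values only; 7500 stands in for the "7500+" reported above.
const example = {
  status: 'incomplete',
  incomplete_details: { reason: 'max_output_tokens' },
  output: [{ type: 'reasoning' }, { type: 'web_search_call' }],
  usage: { output_tokens: 8000, output_tokens_details: { reasoning_tokens: 7500 } },
};
console.log(summarizeTokenUse(example));
```

With numbers like these, only a few hundred tokens remain outside reasoning, and hasMessage is false because no message item was ever produced.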

Question

How can I get complete JSON output when using web_search_preview with GPT-5 mini? Is there a
way to limit reasoning/search tokens to preserve space for the actual output?
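
One direction that might be worth trying, sketched below under assumptions: the Responses API has a reasoning.effort setting and a max_tool_calls setting, so a request could lower the effort, cap the number of searches, and raise max_output_tokens. The helper name buildRequest and the default numbers are hypothetical and untuned, and I have not verified that this combination actually fixes the truncation or that my SDK version accepts these fields.

```javascript
// Build Responses API parameters that try to leave room for the final message.
// The default numbers are untuned guesses, not recommendations.
function buildRequest(input, schema, options = {}) {
  const { maxOutputTokens = 16000, effort = 'low', maxToolCalls = 5 } = options;
  return {
    model: 'gpt-5-mini',
    input,
    // Larger overall budget, since reasoning tokens count against it.
    max_output_tokens: maxOutputTokens,
    // Assumption: lower effort means fewer reasoning tokens per run.
    reasoning: { effort },
    // Assumption: caps the total number of built-in tool calls (web searches).
    max_tool_calls: maxToolCalls,
    tools: [{ type: 'web_search_preview', search_context_size: 'high' }],
    text: {
      format: { type: 'json_schema', name: 'EventList', schema, strict: true },
    },
  };
}
```

It would be called as openai.responses.create(buildRequest("Search for events in Kagoshima...", schema)), where schema is the same JSON schema object as in the code above.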

Environment: OpenAI SDK 4.77.0, Node.js
