I’m using GPT-5 mini with the Responses API and the web_search_preview tool, but I consistently
get incomplete responses, even with max_output_tokens set to 8000.
Problem
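The model spends nearly the whole output budget on reasoning around its web searches (7500+ of
the 8000 output tokens are reasoning tokens), so the response is cut off before the JSON
message is ever emitted.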
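To make the budget arithmetic concrete, here is a minimal sketch that subtracts reasoning tokens from the total to see how much was left for visible output. The helper name visibleOutputTokens is hypothetical; the field names are the ones in the usage object of the response, and 7500 stands in for the "7500+" reported there.

```javascript
// Hypothetical helper: output tokens left for visible text (the JSON message)
// after reasoning tokens are subtracted from the total.
function visibleOutputTokens(usage) {
  // Treat a missing details object as "no reasoning tokens reported".
  const reasoning = usage.output_tokens_details?.reasoning_tokens ?? 0;
  return usage.output_tokens - reasoning;
}

// Assumed figures: 8000 total, with 7500 as a stand-in for "7500+".
const reportedUsage = {
  output_tokens: 8000,
  output_tokens_details: { reasoning_tokens: 7500 },
};

console.log(visibleOutputTokens(reportedUsage)); // 500
```

With these numbers, at most 500 tokens remained for the structured output, which is consistent with no message item being produced.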
Code
const response = await openai.responses.create({
  model: 'gpt-5-mini',
  max_output_tokens: 8000,
  tools: [{
    type: 'web_search_preview',
    search_context_size: 'high'
  }],
  input: "Search for events in Kagoshima...",
  text: {
    format: {
      type: 'json_schema',
      name: 'EventList',
      schema: { /* ... */ },
      strict: true
    }
  }
})
Response
{
  "status": "incomplete",
  "incomplete_details": { "reason": "max_output_tokens" },
  "output": [
    { "type": "reasoning" },
    // 15-20 web_search_call items
    // No message with JSON output
  ],
  "usage": {
    "output_tokens": 8000,
    "output_tokens_details": {
      "reasoning_tokens": 7500+
    }
  }
}
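For anyone trying to reproduce this, a small sketch of how the truncated case can be detected programmatically. inspectResponse is a hypothetical helper, not part of the SDK; it relies only on the fields visible in the response above (status, incomplete_details.reason, and the type of each output item). The sample object uses two web_search_call items as a stand-in for the 15-20 seen in practice.

```javascript
// Hypothetical helper: report whether a response was cut off by the token
// limit and whether it contains a final message item at all.
function inspectResponse(response) {
  const output = response.output ?? [];

  // Count output items per type (reasoning, web_search_call, message, ...).
  const counts = {};
  for (const item of output) {
    counts[item.type] = (counts[item.type] ?? 0) + 1;
  }

  return {
    truncated:
      response.status === 'incomplete' &&
      response.incomplete_details?.reason === 'max_output_tokens',
    hasMessage: output.some((item) => item.type === 'message'),
    counts,
  };
}

// Sample shaped like the response above (item count shortened for brevity).
const sampleResponse = {
  status: 'incomplete',
  incomplete_details: { reason: 'max_output_tokens' },
  output: [
    { type: 'reasoning' },
    { type: 'web_search_call' },
    { type: 'web_search_call' },
  ],
};

console.log(inspectResponse(sampleResponse));
```

A result with truncated set to true and hasMessage set to false corresponds to the failure described here: the limit was hit before any JSON was written.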
Question
How can I get complete JSON output when using web_search_preview with GPT-5 mini? Is there a
way to limit reasoning/search tokens to preserve space for the actual output?
Environment: OpenAI SDK 4.77.0, Node.js
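One direction that might be worth testing, sketched under assumptions: since max_output_tokens caps reasoning and visible output together, the two obvious knobs are a larger cap and a lower reasoning effort. The sketch assumes the reasoning.effort request field is accepted for this model and by the installed SDK version. buildRequest is a hypothetical helper, and the value 16000 is an arbitrary illustration, not a recommendation. Whether either change actually leaves enough room for the JSON is untested.

```javascript
// Hypothetical helper: build the same request as in the post, with the
// token cap and reasoning effort exposed as options.
function buildRequest(input, schema, options = {}) {
  const { maxOutputTokens = 8000, effort = 'low' } = options;
  return {
    model: 'gpt-5-mini',
    // Upper bound shared by reasoning tokens and the visible output.
    max_output_tokens: maxOutputTokens,
    // Assumption: a lower effort reduces the share spent on reasoning.
    reasoning: { effort },
    tools: [{ type: 'web_search_preview', search_context_size: 'high' }],
    input,
    text: {
      format: { type: 'json_schema', name: 'EventList', schema, strict: true },
    },
  };
}

// The schema stays elided, as in the original request.
const params = buildRequest(
  'Search for events in Kagoshima...',
  { /* ... */ },
  { maxOutputTokens: 16000 }
);

// The actual call would then be:
// const response = await openai.responses.create(params);
console.log(params.max_output_tokens, params.reasoning.effort);
```

Keeping the request construction in a plain function makes it easy to compare runs with different caps and effort levels without touching the rest of the code.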