I have the following API call:
response = openai_client.responses.create(
    model="gpt-4o-mini",
    input="Your input prompt here",
    tools=[
        {
            "type": "web_search_preview",
            "search_context_size": "high"
        }
    ]
)
and it mostly works, but occasionally I get an incomplete response like this (note that `error` is actually null):
{
  "id": "resp_67e158973b208191bc42b115727c0aa20e1a648aff9c28ee",
  "created_at": 1742821527.0,
  "error": null,
  "incomplete_details": {"reason": "max_output_tokens"},
  "instructions": null,
  "metadata": {},
  "model": "gpt-4o-mini-2024-07-18",
  "object": "response",
  "output": [
    {
      "id": "ws_67e15897c464819194bcb16b1a31cbdb0e1a648aff9c28ee",
      "status": "completed",
      "type": "web_search_call"
    }
  ],
  "parallel_tool_calls": true,
  "temperature": 1.0,
  "tool_choice": "auto",
  "tools": [
    {
      "type": "web_search_preview",
      "search_context_size": "high",
      "user_location": {
        "type": "approximate",
        "city": null,
        "country": "US",
        "region": null,
        "timezone": null
      }
    }
  ],
  "top_p": 1.0,
  "max_output_tokens": null,
  "previous_response_id": null,
  "reasoning": {"effort": null, "generate_summary": null},
  "status": "incomplete",
  "text": {"format": {"type": "text"}},
  "truncation": "auto",
  "usage": {
    "input_tokens": 372,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens": 16384,
    "output_tokens_details": {"reasoning_tokens": 0},
    "total_tokens": 16756
  },
  "user": null,
  "_request_id": "req_1905918e6fdb561e37fcc310e6cbe5b4"
}
The response comes back with status "incomplete" and incomplete_details.reason set to "max_output_tokens", so the generation appears to be truncated at an output-token cap (output_tokens is 16384), yet I am not passing max_output_tokens (it is null in the response) and I don't see a way to control or raise the limit. How should I resolve this, or is this a bug?
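For context, this is roughly how I detect the truncated responses in my code (a minimal sketch; the `is_truncated` helper name is mine, and the stub object below just mimics the relevant fields of the JSON above rather than making a real API call):

```python
from types import SimpleNamespace

def is_truncated(response) -> bool:
    """Return True when a Responses API result was cut off by the output-token cap.

    Checks the two fields visible in the incomplete response above:
    status == "incomplete" and incomplete_details.reason == "max_output_tokens".
    """
    return (
        getattr(response, "status", None) == "incomplete"
        and getattr(response, "incomplete_details", None) is not None
        and response.incomplete_details.reason == "max_output_tokens"
    )

# Stub shaped like the incomplete response shown above (not a real API object).
stub = SimpleNamespace(
    status="incomplete",
    incomplete_details=SimpleNamespace(reason="max_output_tokens"),
)

print(is_truncated(stub))  # detects the truncated case
```

I currently just retry when this returns True, which is obviously wasteful, hence the question.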