I am using the responses API to hit the deep research models (both o4 and o3).
It seems to totally ignore what I set for max_tool_calls, often having 100+ “response.web_search_call.completed” in the stream output.
These docs https://platform.openai.com/docs/guides/deep-research state that this should be respected, and is the main way of controlling costs. Am I missing something?
Thanks!
MODEL = “o4-mini-deep-research”
stream = client.responses.create(
model=MODEL,
input=[
{“role”: “developer”, “content”: [{“type”: “input_text”, “text”: “You are a research assistant. Cite sources.”}]},
{“role”: “user”, “content”: [{“type”: “input_text”, “text”: prompt}]}
],
tools=[{“type”: “web_search”}],
max_tool_calls=5,
background=True,
stream=True,
store=True # Required for background mode
)