In general the growth of our internal token usage over time is making me realize that there is no upper limit for this kind of stuff. It will continue to grow.
But I can also see that for stuff like deep research good prompting and guard rails are just as important or even more. Because now you get back pages of results that cannot be easily validated anymore.
I mainly use API not ChatGPT for anything important… Spent WAY MORE on ChatGPT than API
It really is about asking the right questions ^^. And for us “cheaps”… At the right time
(Best results still from API but maybe the questions I ask there)
I have three live websites using the API. They are tightly and periodically optimised but still cost $. There’s no way around that.
Yeah I’m a selfish nub who doesn’t run websites and just relays my knowledge locally (kinda weird I know)…
I used to build websites… Is that still where the world is going?
They should have sources, no?
This is a website (and PWA) … so yeah, I think websites are as relevant as ever. If anything I think apps are becoming less relevant and redundant. A lot of apps are just wrapped websites these days.
(this is super off topic now so very happy if these posts are moved)
Oooh… “Wasted thought” I think it’s still on topic
I am not thinking apps… Apps?
I only think people.
No disrespect… I know how hard websites are…
But this is not the mission statement of OpenAI…
“web” / “internet” doesn’t even feature in the charter
Sure, but there are no easy “evals” to run on a 10-page report. You can of course let an AI read it
Yeah. Even checking one page summaries is time consuming
Can you break the problem down?
I likewise keep running into the same issue, even when using the o4-mini-deep-research variant instead of o3-deep-research. Even when I pass an additional `max_tool_calls = 10`, the issue keeps happening. Running the requests in the background (background=true, store=true) does not seem to fix it either. It feels as if the power of these deep research API requests is only available to the higher-tier consumers rather than individual developers. I’ll have to spend a bit more on the API side if that is the only fix
I submitted a Deep Research API query with an input prompt of only 165 tokens. However, the system ultimately reported around 2 million input tokens consumed.
Why is that? From my understanding, reasoning and response generation are accounted for in the output token calculation, not input. “Advanced processing like reasoning and analysis” is already reflected in the output tokens, so I don’t understand why the input token count is so high.
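One plausible explanation (worth verifying against OpenAI's pricing docs) is that each reasoning turn re-sends the growing context — your prompt plus all the web content fetched by earlier tool calls — and every re-send is billed as input tokens. The numbers below are made up, but the arithmetic shows how a 165-token prompt can plausibly balloon to ~2 million billed input tokens:

```python
# Sketch: why a tiny prompt can bill millions of input tokens.
# Assumption (not confirmed by the docs quoted in this thread):
# each reasoning turn re-sends the full accumulated context, and
# billed input tokens are the sum across all turns.

def total_input_tokens(prompt_tokens: int, tool_result_tokens: list[int]) -> int:
    """Sum the input tokens billed across turns, assuming each turn
    re-sends the prompt plus all tool results gathered so far."""
    context = prompt_tokens
    billed = context              # first turn: prompt only
    for result in tool_result_tokens:
        context += result         # fetched page content joins the context
        billed += context         # next turn re-sends the whole context
    return billed

# A 165-token prompt plus 20 searches returning ~10k tokens each
# already lands around 2.1 million billed input tokens:
print(total_input_tokens(165, [10_000] * 20))  # → 2103465
```

If that model of billing is roughly right, the fetched-page volume, not your prompt, dominates the bill — which is also why capping tool calls helps.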
And also raised this question in separate branch:
I’m getting the same problem. I had 4.5 million input tokens consumed for a prompt that was a hundred or so tokens, with nothing in the output.
Does anyone know how to set something like a max input token limit? Moreover, it only returned 8 sources for me, so I have no clue why 4.5 million tokens were necessary.
You set max_tokens in the API call.
For deep research models, the closest thing is setting max_tool_calls.
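As a rough illustration of that advice, here is how a capped deep research request might be assembled for the Responses API. The parameter names follow what posters in this thread report using (`max_tool_calls`, `background`, `store`); check the current API reference before relying on them:

```python
# Sketch of a deep research request capped with max_tool_calls.
# No network call is made here; the dict is what you would pass to
# client.responses.create(**params) with the official openai SDK.

def build_request(prompt: str, max_tool_calls: int = 10) -> dict:
    """Assemble keyword arguments for a capped deep research call."""
    return {
        "model": "o4-mini-deep-research",
        "input": prompt,
        "tools": [{"type": "web_search_preview"}],
        "max_tool_calls": max_tool_calls,  # caps searches/fetches, and with them token spend
        "background": True,                # long runs survive client-side timeouts
        "store": True,                     # needed to poll a background response later
    }

params = build_request("Summarise recent findings on topic X", max_tool_calls=5)
# client = openai.OpenAI(); resp = client.responses.create(**params)
```

There is no documented cap on input tokens themselves; `max_tool_calls` is the nearest lever because it limits how much fetched content enters the context.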
had this exact same experience today: sent one test request, saw it was 1.7 million tokens, and my immediate reaction was “WTF, this makes no sense”.
from my understanding (as others have mentioned), the options as they exist today are:
- use `o4-mini-deep-research` if you can get comparable output. it’ll consume the same number of tokens, but they cost less
- use the Batch API (again, if feasible)
- set `max_tool_calls` to limit the tokens used on web search and data ingestion
we should also expect the costs for these to come down over time as these models mature and new models are introduced. using o3-deep-research via API is almost prohibitively expensive for most use cases today unless you have a lot of token budget to burn, but that will likely change and follow the same patterns as GPT-4
In my case the initial problem was not having `max_output_tokens` high enough. (Which I know sounds strange…)
Bit late to the party - but I keep getting this error after the request has been running for a couple of hours:
"RuntimeError: Response failed: ResponseError(code=âserver_errorâ, message=âAn error occurred while processing your request. You can retry your request, or contact us through our help center at help.openai.com if the error persists. Please include the request ID wfr_019ac5750fc07223ad10ea67ddb303b9 in your message.â)
"
In addition, I am finding that max_tool_calls doesn’t do anything - I set it to 5 (as a test) and we still get 100+ tool calls, on the runs that do work!
Any ideas?
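Since the error message itself says the request can be retried, one practical mitigation for the multi-hour runs is a retry wrapper with backoff. A minimal sketch (the `RuntimeError` type matches the traceback above; swap in whatever your client actually raises):

```python
import time

# Sketch: retry transient server_error failures with exponential backoff.
# `call` is any zero-argument function that creates or polls the response.

def with_retries(call, attempts: int = 3, base_delay: float = 2.0):
    """Retry `call` on RuntimeError, backing off exponentially.
    Re-raises the last error so the request ID in its message survives
    for an OpenAI support ticket."""
    for attempt in range(attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```

This doesn't explain the ignored `max_tool_calls` cap, though; if 5 reliably becomes 100+, that is worth reporting with the request ID from the error.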
