I submitted a Deep Research API query with an input prompt of only 165 tokens, yet the system ultimately reported around 2 million input tokens consumed.
Why is that? From my understanding, reasoning and response generation are counted as output tokens, not input; "advanced processing like reasoning and analysis" should already be reflected in the output token count. So I don't understand why the reported input token count is so high.
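For reference, here is how I am reading the billing breakdown from the `usage` object on the response. This is a minimal sketch with made-up numbers (not my actual run), assuming the Responses API shape with `input_tokens`, `output_tokens`, and `output_tokens_details.reasoning_tokens`:

```python
# Illustrative payload shaped like a Responses API `usage` object.
# The numbers are invented for this example; my real run reported
# ~2M input tokens against a 165-token prompt.
usage = {
    "input_tokens": 2_000_000,
    "output_tokens": 50_000,
    "output_tokens_details": {"reasoning_tokens": 45_000},
    "total_tokens": 2_050_000,
}

# My expectation: reasoning is part of the *output* accounting,
# so it should not inflate the input side at all.
reasoning = usage["output_tokens_details"]["reasoning_tokens"]
visible_output = usage["output_tokens"] - reasoning

print(f"input tokens:  {usage['input_tokens']:,}")
print(f"output tokens: {usage['output_tokens']:,} "
      f"({reasoning:,} reasoning + {visible_output:,} visible)")
```

Given that breakdown, reasoning tokens are clearly a subset of `output_tokens`, which is why the ~2M figure showing up under `input_tokens` is confusing to me.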
I have also raised this question in a separate thread: