I’m currently using the OpenAI Agent SDK with an Azure-hosted reasoning model (o1). The setup is working well, and I’m successfully receiving responses from the Agent via chat completions.
In the response object, I can see the `usage` node, which includes both `output_tokens` and `reasoning_tokens`.
I have a couple of questions about this:
- Is the `output_tokens` count inclusive of `reasoning_tokens`? I’m trying to determine the actual token usage cost, and understanding this breakdown will help.
- Is there any way to configure or enforce the model not to emit `reasoning_tokens` in the Agent’s response?
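For reference, here is the arithmetic I’m assuming when I try to split the cost — just a sketch over a plain dict that mirrors the `usage` node I see; the field names and the sample numbers are illustrative, and whether `output_tokens` really includes `reasoning_tokens` is exactly what I’m asking:

```python
# Hypothetical `usage` payload mirroring what I see in the response.
# Values are made up for illustration.
usage = {
    "input_tokens": 50,
    "output_tokens": 300,
    "output_tokens_details": {"reasoning_tokens": 220},
}

reasoning = usage["output_tokens_details"]["reasoning_tokens"]

# If output_tokens is inclusive of reasoning_tokens, the visible
# (user-facing) portion of the completion would be the remainder:
visible = usage["output_tokens"] - reasoning

print(f"reasoning={reasoning}, visible={visible}")
```

If the counts are instead disjoint, the billed total would be `output_tokens + reasoning_tokens`, which is why I’d like to confirm the semantics.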
Any insights, documentation pointers, or best practices would be highly appreciated. Thanks in advance!