Since o3-mini and other reasoning models do not return their reasoning tokens, even though those tokens are billed, counting tokens manually with tiktoken is not feasible.
So is there any way to find out the cost of each request when using the streaming API?
BTW, I'm using the Azure OpenAI API, but the situation is probably similar with the OpenAI API.
With the Responses endpoint, the stream is a sequence of events. The final event carries the full response object, just like a non-streaming call, including usage:
"usage":{"input_tokens":243,"input_tokens_details":{"cached_tokens":0},"output_tokens":60,"output_tokens_details":{"reasoning_tokens":0},"total_tokens":303},"user":null, ...
With the Chat Completions endpoint, you set the stream_options parameter to {"include_usage": True}. One extra chunk is then sent after the normal response chunks, carrying the usage.
And if you want a similar stream of events:
with client.beta.chat.completions.stream(
    model="gpt-4o",
    messages=[{"role": "system", "content": "You are a helpful AI assistant"},
              {"role": "user", "content": request_content}],
    stream_options={"include_usage": True},
    ...
) as stream:
    for event in stream:
        process_event(event)