Is it possible to get the token usage for a tool call while streaming? When I send a streaming request, the model processes the prompt and returns a tool call, which ends the first stream. After the tool call is made, a second streaming request continues the conversation. With the “include_usage” parameter I can get the token usage for the second stream (it arrives in the last chunk of the stream), but what about the first request, the one used to decide whether the current prompt needs a tool call or not?
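To make the setup concrete, here is a minimal sketch of the bookkeeping I have in mind. It assumes both requests are sent with `stream_options={"include_usage": True}`, and that a stream ending in a tool call also emits a final chunk carrying `usage` (that second point is exactly what I am unsure about, so treat it as an assumption). The chunks are modelled as plain dicts for illustration; `usage_from_stream`, `total_usage`, and the sample numbers are hypothetical, not part of any SDK.

```python
from typing import Any, Dict, Iterable, List, Optional


def usage_from_stream(chunks: Iterable[Dict[str, Any]]) -> Optional[Dict[str, int]]:
    """Return the usage dict from the last chunk that carries one, else None."""
    usage = None
    for chunk in chunks:
        # The stream has to be read to the end: usage is expected only in the
        # final chunk, and earlier chunks carry usage=None.
        if chunk.get("usage"):
            usage = chunk["usage"]
    return usage


def total_usage(streams: Iterable[Iterable[Dict[str, Any]]]) -> Dict[str, int]:
    """Sum token usage over every stream in one tool-calling turn."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for chunks in streams:
        usage = usage_from_stream(chunks)
        if usage is None:
            continue  # this stream reported no usage
        for key in totals:
            totals[key] += usage.get(key, 0)
    return totals


# Hypothetical chunk data: the first stream ends in a tool call,
# the second stream is the follow-up after the tool result is sent back.
first_stream: List[Dict[str, Any]] = [
    {"choices": [{"delta": {"tool_calls": [{"index": 0}]}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 50, "completion_tokens": 12, "total_tokens": 62}},
]
second_stream: List[Dict[str, Any]] = [
    {"choices": [{"delta": {"content": "Done."}}], "usage": None},
    {"choices": [], "usage": {"prompt_tokens": 80, "completion_tokens": 20, "total_tokens": 100}},
]

totals = total_usage([first_stream, second_stream])
print(totals)
```

If the first stream really does include that final usage chunk, the total for the turn is just the sum over both streams. If it does not, `usage_from_stream` returns `None` for it and only the second stream is counted.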