when using stream response api meta data always return usage token as 0.
“lastModelResponse”: {
"usage": {
“requests”: 1,
“inputTokens”: 0,
“outputTokens”: 0,
“totalTokens”: 0
},
“output”: [
{
“id”: “FAKE_ID”,
“type”: “message”,
“role”: “assistant”,
“status”: “completed”,
“content”: [ …
You’re actually seeing the expected behavior.
When you use streaming, the response metadata will always show:
usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 }
Why?
Because token accounting only happens after the model finishes generating.
During a stream, the model hasn’t completed the output yet, so there’s nothing to count.
If you need accurate usage numbers, you have two options:
-
Use a non-streamed request (usage will be included normally), or
-
Gather the streamed chunks and inspect the final aggregated response, which is the only moment where the API can compute real token usage.
So zero values here aren’t a bug, just how the streaming pipeline reports metadata.