When using a streamed response, the API metadata always reports the usage tokens as 0:

"usage": { "requests": 1, "inputTokens": 0, "outputTokens": 0, "totalTokens": 0 }
"lastModelResponse": {
  "usage": {
    "requests": 1,
    "inputTokens": 0,
    "outputTokens": 0,
    "totalTokens": 0
  },
  "output": [
    {
      "id": "FAKE_ID",
      "type": "message",
      "role": "assistant",
      "status": "completed",
      "content": [ …

You’re actually seeing the expected behavior.

When you use streaming, the response metadata will always show:

usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 }

Why?

Because token accounting only happens after the model finishes generating.

During a stream, the model hasn’t completed the output yet, so there’s nothing to count.

If you need accurate usage numbers, you have two options:

  1. Use a non-streamed request (usage will be included normally), or

  2. Gather the streamed chunks and inspect the final aggregated response, which is the only moment where the API can compute real token usage.

So zero values here aren’t a bug, just how the streaming pipeline reports metadata.

Please do not produce AI answers that pretend to have first-hand experience. What you have replied with is a fiction.

Perhaps you can clarify which model you are using, which endpoint you are using, and which SDK, library, or programming language is being referred to — or where you are retrieving the data from. "stream response API" means little. There is no "requests" count within a stream.

When using "stream":true on the Responses API endpoint, there will be three events that report a “usage” in their shape:

  • event: response.created - the initial stream event with an echo of the API call. This will include "usage":null
  • event: response.in_progress - the second event. Essentially identical contents, also with "usage":null
  • event: response.completed - the final event after the stream of contents, where “usage” is finally populated.

The usage in the final event will not have the fields you indicated. From a success, it looks like:

"usage":{"input_tokens":53,"input_tokens_details":{"cached_tokens":0},"output_tokens":271,"output_tokens_details":{"reasoning_tokens":256},"total_tokens":324},"user":null,"metadata":{}}}

So I cannot determine what you are reporting on and where you are getting this information that would have a “lastModelResponse” or even a “last_model_response”.

To note: when you are using streaming, you also must detect "error" events and "refusal" events, as a malformed or rejected request may not raise an HTTP error, but will instead report the failure via different SSE event types.
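The event sequence above can be sketched as a pure walk over recorded events — no network, no SDK; the event shapes mirror the Responses API streaming events described above, and the helper name is mine:

```typescript
// Usage is null on the early events and only populated on response.completed.
type Usage = {
  input_tokens: number;
  output_tokens: number;
  total_tokens: number;
} | null;

type StreamEvent =
  | { type: "response.created" | "response.in_progress" | "response.completed";
      response: { usage: Usage } }
  | { type: "error"; message: string };

// Returns usage only once the terminal event arrives; null until then.
function usageFromEvent(event: StreamEvent): Usage {
  if (event.type === "error") {
    // A rejected request may surface here instead of as an HTTP error.
    throw new Error(`stream error: ${event.message}`);
  }
  return event.type === "response.completed" ? event.response.usage : null;
}

// A recorded stream: only the final event yields a usage object.
const events: StreamEvent[] = [
  { type: "response.created", response: { usage: null } },
  { type: "response.in_progress", response: { usage: null } },
  { type: "response.completed",
    response: { usage: { input_tokens: 53, output_tokens: 271, total_tokens: 324 } } },
];
const usage = events.map(usageFromEvent).find((u) => u !== null);
console.log(usage?.total_tokens); // 324
```

Any real consumer should follow the same shape: treat usage as absent until the terminal event, and treat an "error" event as a failure path even though the HTTP status was 200.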

I’ll note also:

If using Chat Completions, usage is not returned in a stream unless you employ the "stream_options" parameter, sending an object that includes "include_usage": true

Any code written for Chat Completions in streaming needing usage will need this parameter, and then will need to parse an additional final SSE that includes “usage” instead of message content or deltas.

Code should fail gracefully if not receiving this Chat Completions usage object, instead of leaving any pre-defined values unpopulated or at 0.
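That graceful-degradation advice can be sketched as a pure chunk walk-through — no network; the chunk shapes mirror what Chat Completions sends with "include_usage": true (an extra final chunk with an empty choices array and usage populated), and the helper name is mine:

```typescript
// With stream_options.include_usage, an extra final chunk arrives carrying
// the usage object and an empty choices array.
type ChatChunk = {
  choices: { delta?: { content?: string } }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number } | null;
};

// Accumulate the text deltas and pick up usage if it ever arrives;
// usage stays null (not a fake 0) when the parameter was not sent.
function consume(chunks: ChatChunk[]): { text: string; usage: ChatChunk["usage"] } {
  let text = "";
  let usage: ChatChunk["usage"] = null;
  for (const chunk of chunks) {
    for (const choice of chunk.choices) {
      text += choice.delta?.content ?? "";
    }
    if (chunk.usage) usage = chunk.usage; // only present on the final chunk
  }
  return { text, usage };
}

const sample: ChatChunk[] = [
  { choices: [{ delta: { content: "Hel" } }], usage: null },
  { choices: [{ delta: { content: "lo" } }], usage: null },
  { choices: [], usage: { prompt_tokens: 9, completion_tokens: 2, total_tokens: 11 } },
];
const { text, usage } = consume(sample);
console.log(text, usage?.total_tokens); // Hello 11
```

Returning null rather than a zeroed-out object makes the "usage never arrived" case distinguishable from a real zero-token response.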


I have an eidetic memory…I simply condensed the explanation. But the correction is valid. Noted.

Yes, even after the model run completes, those values stay unchanged.

I'm using the OpenAI Agents SDK with TypeScript, and here is my base class again:

async runStream(
  input: string | AgentInputItem[],
  options?: { config?: Partial<RunConfig>; context?: TContext }
) {
  const agent = this.buildAgent();
  const { config, context } = options ?? {};
  const result = config
    ? await new Runner(config).run(agent, input, { stream: true, context })
    : await run(agent, input, { stream: true, context });
  const stream = result.toTextStream({ compatibleWithNodeStreams: true });
  return { stream, result };
}

and I tried to get the final response usage data:

result.completed.then(() => {
  console.log('result completed');
  // writeFileSync('./result.txt', JSON.stringify(accumulatedData), 'utf8')
  writeFileSync('./response_final.json', JSON.stringify(result, null, 2), 'utf8');
});

but the final response usage is still 0,

but according to the documentation it should be populated in the response.completed state.

Also, the console doesn't show the usage tokens,

but it works fine for non-streamed usage.

Here is sample code:


async function main() {
  const modelP = modelProvider("openai", "gpt-5-nano");

  const agent = new Agent({
    model: modelP,
    name: 'user data',
    instructions: ' prompt .... ',
    // tools: [notificationTool, getWeatherTool],
    outputType: schema,
    // text: {}
    // outputType: ,
  });

  const result = await run(agent, 'prompt ...', { stream: true });

  // stream from the existing agent result
  const stream = result.toTextStream({ compatibleWithNodeStreams: true });

  console.log('Streaming result:');
  stream.on('data', (chunk: Buffer | string) => {
    const text = chunk instanceof Buffer ? chunk.toString('utf8') : chunk;
    process.stdout.write(text);
  });

  let accumulatedData: any = {};
  const jsonParser = new JSONParser();

  jsonParser.onValue = ({ value, key, parent, stack }) => {
    // When a complete field is parsed
    if (stack.length === 1 && key) {
      accumulatedData[key] = value;

      console.log(`\n\nParsed field: ${key}`);
      console.log(JSON.stringify({ [key]: value }, null, 2));

      // Update template with new field
    }
  };

  stream.on('data', (chunk: Buffer | string) => {
    const text = chunk instanceof Buffer ? chunk.toString('utf8') : chunk;
    jsonParser.write(text);
  });

  stream.on('end', () => {
    console.log("end");
    console.log('Accumulated data:', accumulatedData);
  });

  result.completed.then((resultd) => {
    console.log("result completed", resultd);
    writeFileSync('./meata.json', JSON.stringify(result), 'utf8');
  });
}

Yeah, it works fine with Chat Completions.

Thanks for sharing the sample code — this helps clarify how you are configuring the agent.
At a glance the setup looks reasonable, so the issue likely is not in this block itself.

Good to know thanks for confirming that.

I assume it's a bug. I tried everything, every solution; even when I dump the whole event, all usage values are 0.

Since you have already tested all the obvious paths and still see zero usage, it does seem it could be a bug.
