When using stream response usage meta data returns "usage": { "requests": 1, "inputTokens": 0, "outputTokens": 0, "totalTokens": 0 },

cent_dxv · December 21, 2025, 3:33pm

when using stream response api meta data always return usage token as 0.
“lastModelResponse”: {

  "usage": {

“requests”: 1,

“inputTokens”: 0,

“outputTokens”: 0,

“totalTokens”: 0

},

“output”: [

“id”: “FAKE_ID”,

“type”: “message”,

“role”: “assistant”,

“status”: “completed”,

“content”: [ …

LarisaHaster · December 21, 2025, 3:57pm

You’re actually seeing the expected behavior.

When you use streaming, the response metadata will always show:

usage: { inputTokens: 0, outputTokens: 0, totalTokens: 0 }

Why?

Because token accounting only happens after the model finishes generating.

During a stream, the model hasn’t completed the output yet, so there’s nothing to count.

If you need accurate usage numbers, you have two options:

Use a non-streamed request (usage will be included normally), or
Gather the streamed chunks and inspect the final aggregated response, which is the only moment where the API can compute real token usage.

So zero values here aren’t a bug, just how the streaming pipeline reports metadata.

_j · December 21, 2025, 7:27pm

Please do not produce AI answers that pretend to have first-hand experience. What you have replied with is a fiction.

Perhaps you can clarify which model you are using, which endpoint you are using, and which SDK or library or programming language is being referred to. Or, where you are retrieving the data from. “stream response API” means little. There is no “requests” count within a stream.

When using "stream":true on the Responses API endpoint, there will be three events that report a “usage” in their shape:

event: response.created - the initial stream event with an echo of the API call. This will include "usage":null
event: response.in_progress - the second event. Essentially identical contents, also with "usage":null
event: response.completed - the final event after the stream of contents, where “usage” is finally populated.

The usage in the final event will not have the fields you indicated. From a success, it looks like:

"usage":{"input_tokens":53,"input_tokens_details":{"cached_tokens":0},"output_tokens":271,"output_tokens_details":{"reasoning_tokens":256},"total_tokens":324},"user":null,"metadata":{}}}

So I cannot determine what you are reporting on and where you are getting this information that would have a “lastModelResponse” or even a “last_model_response”.

To note: when you are using streaming, you also must detect “error” events and “refusal” events, as a malformed or rejected request may not raise a http error, but instead will have different SSE event types reporting on the failure.

_j · December 21, 2025, 7:37pm

I’ll note also:

If using Chat Completions, usage is not returned in a stream unless you employ the "stream_options" parameter, sending an object that includes "include_usage": true

Any code written for Chat Completions in streaming needing usage will need this parameter, and then will need to parse an additional final SSE that includes “usage” instead of message content or deltas.

Code should fail gracefully if not receiving this Chat Completions usage object, instead of leaving any pre-defined values unpopulated or at 0.

LarisaHaster · December 21, 2025, 8:30pm

I have an eidetic memory…I simply condensed the explanation. But the correction is valid. Noted.

cent_dxv · December 22, 2025, 4:27am

yes even after model run complete those result stay unchanged

i’m using open ai agent sdk with type script and
here is my again base class

…

async runStream(

input: string | AgentInputItem\[\],

options?: { config?: Partial; context?: TContext }

) {

const agent = this.buildAgent();

const { config, context } = options ?? {};

const result = config

? await new Runner(config).run(agent, input, { stream: true, context })

: await run(agent, input, { stream: true, context });

const stream = result.toTextStream({ compatibleWithNodeStreams: true });

return { stream, result };

}

and i tired to get the final response usage data

result.completed.then(() => {

console.log(“result completed”)

// writeFileSync(‘./result.txt’, JSON.stringify(accumulatedData), ‘utf8’)

writeFileSync(‘./response_final.json’, JSON.stringify(result, null, 2), ‘utf8’)

})

but the final response usage is still 0,

but according to the documentation is should populate on the responce.complete state

cent_dxv · December 22, 2025, 4:50am

also in the console it doesn’t show usage token

but work fine for none stream usage

cent_dxv · December 22, 2025, 4:53am

here is sample code


async function main() {

  const modelP = modelProvider("openai", "gpt-5-nano")




const agent = new Agent({

model: modelP,

name: 'user data',

instructions: ' prompt .... ',

// tools: [notificationTool, getWeatherTool],

outputType: schema,

// text: {}

// outputType: ,

  });




const result = await run(agent, 'prompt ...',

    { stream: true }

  );





// stream from the existing agent result

const stream = result.toTextStream({ compatibleWithNodeStreams: true });




console.log('Streaming result:');

stream.on('data', (chunk: Buffer | string) => {

const text = chunk instanceof Buffer ? chunk.toString('utf8') : chunk;

process.stdout.write(text);

  });




let accumulatedData: any = {};

const jsonParser = new JSONParser();




jsonParser.onValue = ({ value, key, parent, stack }) => {

// When a complete field is parsed

if (stack.length === 1 && key) {

accumulatedData[key] = value;




console.log(`\n\nParsed field: ${key}`);

console.log(JSON.stringify({ [key]: value }, null, 2));




// Update template with new field

    }

  };




stream.on('data', (chunk: Buffer | string) => {

const text = chunk instanceof Buffer ? chunk.toString('utf8') : chunk;

jsonParser.write(text);

  });




stream.on('end', () => {

console.log("end")

console.log('Accumulated data:', accumulatedData);

  });

result.completed.then((resultd) => {

console.log("result completed" ,resultd)

writeFileSync('./meata.json', JSON.stringify(result), 'utf8')

  })

}

cent_dxv · December 22, 2025, 5:31am

yeah it work fine with chat completion

olivia11 · December 22, 2025, 5:38am

cent_dxv:

here is sample code

async function main() {

  const modelP = modelProvider("openai", "gpt-5-nano")




const agent = new Agent({

model: modelP,

name: 'user data',

instructions: ' prompt .... ',

// tools: [notificationTool, getWeatherTool],

outputType: schema,

// text: {}

// outputType: ,

  });

Thanks for sharing the sample code this helps clarify how you are configuring the agent.
At a glance the setup looks reasonable so the issue likely is not in this block itself

olivia11 · December 22, 2025, 5:40am

Good to know thanks for confirming that.

cent_dxv · December 22, 2025, 5:43am

i assume it’s a bug i tried everything every solution , even dump the whole event all usage values are 0

LarisaHaster · December 22, 2025, 10:50am

Since you already tested all the obvious paths and still see zero usage, it seems that it could be a bug.

Topic		Replies	Views
Issue with Token Usage in Streaming Responses Bugs api	17	1550	February 21, 2025
Why there is no USAGE object returned with Streaming Api Call? API api , chat-completion , completions	20	5953	February 20, 2025
Usage stats now available when using streaming with the Chat Completions API or Completions API API api , api-usage , streaming	25	22831	January 23, 2025
KeyError: "usage" for gpt-3.5-turbo-16k Bugs gpt-35-turbo , chatgpt	6	902	February 9, 2024
OpenAi API - get usage tokens in response when set stream=True API	34	41109	August 17, 2025

When using stream response usage meta data returns "usage": { "requests": 1, "inputTokens": 0, "outputTokens": 0, "totalTokens": 0 },

Related topics