Please do not produce AI answers that pretend to have first-hand experience. What you have replied with is a fiction.
Perhaps you can clarify which model you are using, which endpoint you are using, and which SDK or library or programming language is being referred to. Or, where you are retrieving the data from. “stream response API” means little. There is no “requests” count within a stream.
When using "stream":true on the Responses API endpoint, there will be three events that report a “usage” in their shape:
event: response.created - the initial stream event with an echo of the API call. This will include "usage":null
event: response.in_progress - the second event. Essentially identical contents, also with "usage":null
event: response.completed - the final event after the stream of contents, where “usage” is finally populated.
The usage in the final event will not have the fields you indicated. From a success, it looks like:
So I cannot determine what you are reporting on and where you are getting this information that would have a “lastModelResponse” or even a “last_model_response”.
To note: when you are using streaming, you also must detect “error” events and “refusal” events, as a malformed or rejected request may not raise a http error, but instead will have different SSE event types reporting on the failure.
If using Chat Completions, usage is not returned in a stream unless you employ the "stream_options" parameter, sending an object that includes "include_usage": true
Any code written for Chat Completions in streaming needing usage will need this parameter, and then will need to parse an additional final SSE that includes “usage” instead of message content or deltas.
Code should fail gracefully if not receiving this Chat Completions usage object, instead of leaving any pre-defined values unpopulated or at 0.
Thanks for sharing the sample code this helps clarify how you are configuring the agent.
At a glance the setup looks reasonable so the issue likely is not in this block itself