Issue with Token Usage in Streaming Responses

I’m encountering an issue with obtaining token usage information when streaming responses from the OpenAI API. According to the API docs, token usage should be included in the response chunks when using the stream_options parameter.

Here’s my setup:

  • OpenAI Python SDK version: openai==1.38.0
  • Python version: 3.11.3

I’ve tried using both asynchronous and synchronous OpenAI client configurations:

import os

from openai import AsyncOpenAI, OpenAI

# client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))  # async variant
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

Request configuration:

response = client.chat.completions.create(
    model="gpt-4o-mini",#also with 3.5 and 4o
    messages=messages,
    stream=True,
    temperature=0.5,
    tool_choice="auto",
    tools=gpt_tools,
    max_tokens=300,
    stream_options={"include_usage": True}
)

for chunk in response:
    print(chunk.usage)

Output

ChatCompletionChunk(id='chatcmpl-9sBbvJTm6vBbDls3RUcInszCd2kqj', choices=[Choice(delta=ChoiceDelta(content=' today', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1722701371, model='gpt-3.5-turbo-0125', object='chat.completion.chunk', service_tier=None, system_fingerprint=None, usage=None)
ChatCompletionChunk(id='chatcmpl-9sBbvJTm6vBbDls3RUcInszCd2kqj', choices=[Choice(delta=ChoiceDelta(content='?', function_call=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1722701371, model='gpt-3.5-turbo-0125', object='chat.completion.chunk', service_tier=None, system_fingerprint=None, usage=None)
ChatCompletionChunk(id='chatcmpl-9sBbvJTm6vBbDls3RUcInszCd2kqj', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1722701371, model='gpt-3.5-turbo-0125', object='chat.completion.chunk', service_tier=None, system_fingerprint=None, usage=None)

Formatted Output (Only Usage)

None
None
None
None
None
None
None
None
None
None
None

I’ve tested with different models (gpt-4o-mini, gpt-3.5-turbo) and both async and sync configurations, but the usage field is always None in the response chunks.

Has anyone else experienced this issue or found a solution? Any insights or suggestions would be greatly appreciated!

Thanks in advance!


Did you present this issue to ChatGPT? I am not a coder at all! So when I try to write code and it does not work, I ask ChatGPT and it spits out the resolution pronto!

I thought I had the same issue but I got it to work fine in JS:
request.stream_options = {};
request.stream_options.include_usage = true;

I already have the option set to true.

So you first set the option to an empty object and then add the flag afterwards?

Hi @conciergeai

Usage stats are being returned. Here’s how:

include_usage: bool
If set, an additional chunk will be streamed before the data: [DONE] message.

The usage field on this chunk shows the token usage statistics for the entire request, and the choices field will always be an empty array. All other chunks will also include a usage field, but with a null value.

Here’s some code to get you started:

from openai import OpenAI

client = OpenAI()
prompt = "Tell me a dad-joke"

response = client.chat.completions.create(
    model="chatgpt-4o-latest",
    messages=[{"role":"user", "content": prompt}],
    stream=True,
    temperature=0.5,
    stream_options={"include_usage": True}
)

for chunk in response:
    if chunk.choices:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")
    
    # Handle the case where choices is empty but usage data is present
    elif chunk.usage:
        print("\n\n", chunk.usage)

I’m printing all the chunks regardless of status, and chunk.usage is always None.


Here is my code snippet:

import os

from openai import AsyncOpenAI

client = AsyncOpenAI(api_key=os.getenv("OPENAI_API_KEY"))


async def chat_completion(websocket, listen_task, call: Call, solutions: Solutions, llm_model, client_tools):
    try:

        # Append client tools to the LLM tools
        all_tools = []
        for doc in client_tools:
            all_tools.append(doc.get("tool", {}))

        # Initialize default tools for the session
        gpt_tools = default_gpt_tools.copy()

        # Convert tools to respective formats and append to default tools
        gpt_tools.extend(convert_to_gpt_format(all_tools))

        response = await client.chat.completions.create(
            model=llm_model,
            messages=call.messages,
            stream=True,
            temperature=0.5,
            tool_choice="auto",
            tools=gpt_tools,
            max_tokens=300,
            stream_options={"include_usage": True}
            )

        async def text_iterator():
            nonlocal call
            full_resp = ""
            arguments_buffer = ""
            tool_name = None
            async for chunk in response:
                print(chunk)
                if chunk.choices:
                    content = chunk.choices[0].delta.content
                    tool_call = chunk.choices[0].delta.tool_calls
                    finish_reason = chunk.choices[0].finish_reason
                    # if there is content
                    if content is not None:#when content is not a tool
                        full_resp += content
                        #print(content,flush=True,end="")
                        yield content
                    #if call is a tool
                    elif tool_call:
                        for chat_call in tool_call:#loop in the tools
                            if chat_call.function.name:#get tool name
                                tool_name = chat_call.function.name
                                full_resp += f"Just Used: {tool_name}"
                            if chat_call.function.arguments:#get arguments into json-string
                                arguments_buffer += chat_call.function.arguments
                    else:
                        if tool_name:
                            # get tool URL
                            tool_arguments = await get_tool_arguments(tools_data=client_tools, tool_name=tool_name)
                            # get the last 3 messages from the messages list
                            last_messages_str = await get_last_messages(message_list=call.messages)

                            tool_response, category = await agent.resolve_function(tool_name=tool_name,arguments=arguments_buffer,websocket=websocket,language=call.lang,university_id=call.university_id,student_data=call.student_data,tool_arguments=tool_arguments,last_messages=last_messages_str)
                            yield tool_response
                            call.category = category
                        if finish_reason:
                            # print("end of response")
                            # NOTE: breaking here exits the stream at the 'stop' chunk,
                            # before the final usage-only chunk arrives, so the
                            # elif chunk.usage branch below is never reached
                            break
                elif chunk.usage:
                    print("\n\n", chunk.usage)
                        


It prints:

ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content='', function_call=None, refusal=None, role='assistant', tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content="You're", function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' welcome', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content='!', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' How', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' can', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' I', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' assist', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' you', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=' today', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content='?', function_call=None, refusal=None, role=None, tool_calls=None), finish_reason=None, index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)
ChatCompletionChunk(id='chatcmpl-9z6MQoBIMgPrfeNhVtZr1fIYhw6Gm', choices=[Choice(delta=ChoiceDelta(content=None, function_call=None, refusal=None, role=None, tool_calls=None), finish_reason='stop', index=0, logprobs=None)], created=1724349486, model='gpt-4o-2024-08-06', object='chat.completion.chunk', service_tier=None, system_fingerprint='fp_d794a2177f', usage=None)

So usage is always None


Hi, any update on this? I am experiencing the same problem and no one seems to give feedback about it.


Does this setting take time to propagate through their servers or something?

Like others who have commented after your solution here, I have done what you’ve posted, yet every usage is null.

  • I changed the setting in my call
  • I’ve made the call a dozen or more times, using different prompts etc.
  • Every single one comes back with null usages for every chunk

It’s like there’s some other setting that’s playing into this, and it’s not as simple as including the include_usage: true flag in stream_options.

What are we missing?

TYIA

Hi @multitechvisions

The usage is only sent in the second-to-last chunk, just before data: [DONE], where choices is an empty array. In the rest of the chunks, it will always be a null value.

It’s really as simple as setting "stream": true and "stream_options" to { "include_usage": true }.
Here’s a cURL call for you to test it:

curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'

Here’s the output that I got:

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":"Hi"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" there"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" How"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" can"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" assist"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" you"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":" today"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":null}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"usage":null}

data: {"id":"chatcmpl-xxxxxxxxxxxxx8ZVFOjWthz6MHp","object":"chat.completion.chunk","created":1725882908,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_25624ae3a5","choices":[],"usage":{"prompt_tokens":19,"completion_tokens":10,"total_tokens":29}}

data: [DONE]

Pay close attention to the second-to-last chunk’s usage value above.


The same happens for me with a cURL request: I’m not getting any chunk containing usage info.


Thank you for your reply, it’s greatly appreciated.

I’m at a complete loss here, I apologize if I’m missing something basic.

Here is my code for this particular part; you can see I’m logging and checking whether there is anything inside usage…

  • When I run this, every usage is reported as null
  • Not a single chunk had anything inside usage

And for reference: yes, stream usage is included in the call.

It’s like the server is ignoring the include_usage flag…

  • But I can confirm, through logging the body contents just before we send it out, that all of those settings ARE indeed being sent.

That goes out… but no usage comes back.


@sps You say it’s as easy as adding the flag, and pulling whatever you find out of the usage object… yet it doesn’t work for me. ¯\_(ツ)_/¯

  • I must be missing something
  • There must be something else going on
  • Something NOT on the surface
  • Like a setting or something somewhere that’s turned off… and that’s the reason why this isn’t working
    • Because otherwise… it should be working. lol

Thank you again for your time and brain power!


Note: Continuing this discussion here, for those who land here in the future looking for a solution.

=====================================================

Figured it out :tada:

The problem was that I had a snippet of code that caught the stop signal.

The reason this matters is that the usage chunk comes AFTER the stop signal!

  • Since I was stopping when we received the stop signal, I was not receiving the “2nd to last” chunk… which is the usage chunk.
    • Now that I’m cycling through every piece where piece.startsWith('data: {'), I’m getting usage (see the sketch below).
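
For anyone who wants the same fix in Python, here is a minimal sketch with the openai SDK (the model and prompt are just placeholders, and it assumes OPENAI_API_KEY is set), letting the loop run to exhaustion instead of breaking on finish_reason:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True},
)

usage = None
for chunk in stream:
    if chunk.choices:
        delta = chunk.choices[0].delta
        if delta.content:
            print(delta.content, end="")
        # Don't break on finish_reason == "stop": the usage-only
        # chunk is still to come after it.
    elif chunk.usage:
        # Final chunk: choices is an empty list, usage has the totals.
        usage = chunk.usage

print("\n", usage)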

@mainstreamstudios are you doing something similar with the stop signal?


Hi there,

Today I met the same issue with the streaming API call.

Here is my attempt to use cURL, but with a third-party API:

curl https://api.***/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true,
    "stream_options": {
      "include_usage": true
    }
  }'

And here is the output:

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"role":"assistant","content":"","refusal":null},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":"Hello"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":"!"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":" How"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":" can"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":" I"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":" assist"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":" you"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":" today"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{"content":"?"},"logprobs":null,"finish_reason":null}]}

data: {"id":"chatcmpl-A8vmHRzjdgbwTmE62d8J3xy0ECrPG","object":"chat.completion.chunk","created":1726692085,"model":"gpt-4o-2024-05-13","system_fingerprint":"fp_a5d11b2ef2","choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}]}

data: [DONE]

From what I can see, there is no usage property at all. Is it because I’m using a third-party API?
Thanks in advance for your support!

No, still not working…
It seems OpenAI does not really care about this.

Alright, I was having the same issue. As noted above, the usage comes after finish_reason: "stop", but the choices array will be empty at that point, so your code might break; you just need to ensure choice.delta exists before trying to destructure it.

This is what finally worked for me and printed out the usage -

for await (const chunk of response) {
  console.log('Received chunk:', JSON.stringify(chunk, null, 2));
  console.log('usage - ', chunk.usage);
  // On the final usage chunk, choices is an empty array, so
  // choices[0] is undefined; optional chaining guards against that.
  const choice = chunk.choices[0];

  if (choice?.delta) {
    const { content } = choice.delta;
    if (content) {
      ctx.res.write(content);
    }
  }
}

This is correct; the documentation clearly states that the usage data comes in the last chunk, in other words, after finish_reason: 'stop'.

My implementation: whenever I see the 'stop' finish reason, I set a stop flag that doesn’t halt execution but prevents any logic other than receiving one more chunk with the usage data.
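
Here is a minimal sketch of that stop-flag approach in Python (model and prompt are illustrative; assumes OPENAI_API_KEY is set):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
    stream_options={"include_usage": True},
)

stopped = False  # set once finish_reason == "stop" arrives
for chunk in stream:
    if chunk.choices:
        choice = chunk.choices[0]
        if not stopped and choice.delta.content:
            print(choice.delta.content, end="")
        if choice.finish_reason == "stop":
            stopped = True  # stop emitting text, but keep reading the stream
    elif chunk.usage:
        print("\n", chunk.usage)  # the one extra chunk after 'stop'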