Usage stats are not included when streaming, I think mostly down to the difficulty of knowing when a stream might be terminated from the other side, at what point do you send the usage stats ? You can use tiktoken to count the tokens in the response deltas.
Yes, I am aware of tiktoken and other token counter libs. We are using node.js so I wasn’t able to use tiktoken since it is not available in node.js but I used gpt-encoder lib instead. Thanks. I still think it will be useful if GPT returns at the end when stream is completed with finish_reason = stop
OpenAI really should consider adding a StreamGUID parameter where we can make a call that asks how much was consumed for any streaming attempt, up to like 1 or 2 minutes after the streaming completes. Using tiktoken, even if it works fine, is not ideal.
So I am aware of some 3rd party libs as well but I was hesitant using some of them even if they are openSource. I handle it with this one recommend in OpenAI cookbook or in tiktoken readme , which is gpt-3 encoder . My problem is that how accurate this libraries are and they will be up to date with gpt side changes if there is some. Thats why it will be alot better to use results returned from GPT itself, such as in regular completion api.
When we look at this table from tiktoken, we can see that gpt-3-encoder doesn’t even support cl100k_base encoding.
So I see that this is recommended by OpenAI in their token counting page now (scroll all the way to bottom). This is also equivalent to tiktoken in python and handles it with cl100k_base encoding with gpt3-5, gpt4 supports.
With chat completion, you can absolutely measure the input tokens and response you receive yourself, by use of a library such as tiktoken, and then adding token count metadata to accounts and to the chat history messages for utility. One only needs to add the fixed per call/per message overhead to the inputs sent, and function specification size can be measured by switching them off on a non-stream call. It is only occasional failures that still get billed you have to allow for.
With assistants, you absolutely have no idea what the agent has been up to until the daily charges start showing up on your account.
Quick update on this: we have a version working that we are testing but are not yet happy with the overall design (there’s a lot of complexity in streaming and billing together). Hopefully this is something we can land for you all soon. Stay tuned!