The response looks like this:
{
  "choices": [
    {
      "delta": {
        "content": "Hab"
      },
      "finish_reason": null,
      "index": 0
    }
  ],
  "created": 1680676704,
  "id": "chatcmpl-.....",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
{
  "choices": [
    {
      "delta": {},
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "created": 1680676704,
  "id": "chatcmpl-.....",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
When I send an API call with stream=False, token usage information is included in the response body. But when I set stream=True, there is nothing like this in the response. How can I calculate the number of tokens used myself?
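For reference, a minimal sketch of counting the completion tokens yourself, assuming the pre-1.0 openai Python SDK that was current when this thread was written (the model name and prompt are placeholders): accumulate the streamed delta content, then encode the full text once with tiktoken.

import os
import openai
import tiktoken

# Sketch only: assumes the pre-1.0 openai SDK (openai.ChatCompletion).
openai.api_key = os.getenv("OPENAI_API_KEY")
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello in German."}],
    stream=True,
)

# Accumulate the streamed content, then count tokens over the full text.
completion_text = ""
for chunk in response:
    completion_text += chunk["choices"][0]["delta"].get("content", "")

print("completion tokens:", len(enc.encode(completion_text)))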
sajo · May 3, 2023, 6:12am · #2
I have the same request. Please provide token usage in stream mode like you provide “finish_reason” in the last chunk.
Genuinely trying to be helpful here…
I am using the Node.js API. Previously, when stream:false, the API would return prompt_token and completion_token (maybe these are the field names). But after using streams, I cannot find these two fields. How can I accurately retrieve them? --Translated from ChatGPT
However, after extensive testing, I found that the token counts produced by offline token calculators are far from the actual values used. So Python’s tiktoken is not reliable.
That said, here is the method I am currently using to calculate tokens:
Each streamed chunk that contains content is treated as one token, and adding up all of these chunks gives the total number of tokens in the response. This is my method for calculating the response (completion) tokens.
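A rough sketch of that chunk-counting heuristic, again assuming the pre-1.0 openai Python SDK. Note it is an approximation: a chunk usually, but not always, carries exactly one token.

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hallo!"}],
    stream=True,
)

# Count each chunk whose delta carries content as one token (approximate).
approx_completion_tokens = 0
for chunk in response:
    if chunk["choices"][0]["delta"].get("content"):
        approx_completion_tokens += 1

print("approximate completion tokens:", approx_completion_tokens)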
The prompt-token method: use tiktoken (GitHub - openai/tiktoken: a fast BPE tokeniser for use with OpenAI's models), accumulate all the tokens, and calculate the total; see the sketch after the excerpt below.
Excerpt from the cookbook notebook “How to count tokens with tiktoken”:

# How to count tokens with tiktoken

[`tiktoken`](https://github.com/openai/tiktoken/blob/main/README.md) is a fast open-source tokenizer by OpenAI.

Given a text string (e.g., `"tiktoken is great!"`) and an encoding (e.g., `"cl100k_base"`), a tokenizer can split the text string into a list of tokens (e.g., `["t", "ik", "token", " is", " great", "!"]`).

Splitting text strings into tokens is useful because GPT models see text in the form of tokens. Knowing how many tokens are in a text string can tell you (a) whether the string is too long for a text model to process and (b) how much an OpenAI API call costs (as usage is priced by token).

## Encodings

Encodings specify how text is converted into tokens. Different models use different encodings.

(The notebook continues beyond this excerpt.)
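A minimal sketch of that prompt-side counting, following the cookbook's recipe. The overhead constants (4 tokens per message plus 3 for the reply primer) are the cookbook's figures for gpt-3.5-turbo-0301; newer models use slightly different values.

import tiktoken

def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301"):
    # Approximate prompt tokens for a chat request, per the OpenAI cookbook.
    # Overhead values are for gpt-3.5-turbo-0301; other models differ slightly.
    enc = tiktoken.encoding_for_model(model)
    num_tokens = 0
    for message in messages:
        num_tokens += 4  # per-message wrapper: <|start|>{role/name}\n{content}<|end|>\n
        for key, value in message.items():
            num_tokens += len(enc.encode(value))
            if key == "name":  # a name field replaces the role, saving one token
                num_tokens -= 1
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I count tokens?"},
]
print(num_tokens_from_messages(messages))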
wolf · July 4, 2023, 5:12pm · #5
I use the API from JavaScript, not Python. So how can I calculate the request tokens?
By the way, in my opinion the missing token information is an open issue.
wolf · July 4, 2023, 6:28pm · #7
Looks good, but when I include it in my page:
const { encode, decode, encodeChat } = GPTTokenizer_cl100k_base
const chatTokens = encodeChat(chat,'gpt-3.5-turbo')
I get the following error:
gpt-tokenizer:1 Uncaught TypeError: Cannot read properties of undefined (reading 'encodeChatGenerator')
at encodeChat (gpt-tokenizer:1:2093142)
at Object.javascriptFunction (home?session=8369786809309:842:20)
at da.doAction (desktop_all.min.js?v=22.2.4:24:5629)
So can you help me figure out how to use it?
I’m sorry, I don’t have any personal experience using that package.
I’m a bit busy today, but there’s a chance I can take a look at it tomorrow if you’re still having issues.
wb · July 6, 2023, 8:04am · #9
Hi, did you find a way to use it on a normal website?