The response looks like this:
{
  "choices": [
    {
      "delta": {
        "content": "Hab"
      },
      "finish_reason": null,
      "index": 0
    }
  ],
  "created": 1680676704,
  "id": "chatcmpl-.....",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
{
  "choices": [
    {
      "delta": {},
      "finish_reason": "stop",
      "index": 0
    }
  ],
  "created": 1680676704,
  "id": "chatcmpl-.....",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion.chunk"
}
When I send an API call with stream=False, token usage information is included in the response body. But when I set stream=True, there is nothing like this in the response. How can I calculate the number of tokens used myself?
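For reference, a minimal sketch of counting the completion tokens yourself, assuming the pre-1.0 openai Python SDK that was current when this thread was written (the model name and prompt are placeholders): accumulate the streamed delta content, then encode the full text once with tiktoken.

import os
import openai
import tiktoken

# Sketch only: assumes the pre-1.0 openai SDK (openai.ChatCompletion).
openai.api_key = os.getenv("OPENAI_API_KEY")
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Say hello in German."}],
    stream=True,
)

# Accumulate the streamed content, then count tokens over the full text.
completion_text = ""
for chunk in response:
    completion_text += chunk["choices"][0]["delta"].get("content", "")

print("completion tokens:", len(enc.encode(completion_text)))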
sajo · May 3, 2023, 6:12am · #2
I have the same request. Please provide token usage in stream mode like you provide “finish_reason” in the last chunk.
Genuinely trying to be helpful here…
I am using the Node.js API. Previously, when stream:false, the API would return prompt_token and completion_token (maybe these are the field names). But after using streams, I cannot find these two fields. How can I accurately retrieve them? --Translated from ChatGPT
However, after extensive testing, I found that the token counts produced by offline token calculators are far from the actual values used. So Python’s tiktoken is not reliable.
That said, here is the method I am currently using to calculate tokens:
Each streamed chunk that contains content is treated as one token, and adding up all of these chunks gives the total number of tokens in the response. This is my method for calculating the response (completion) tokens.
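A rough sketch of that chunk-counting heuristic, again assuming the pre-1.0 openai Python SDK. Note it is an approximation: a chunk usually, but not always, carries exactly one token.

import os
import openai

openai.api_key = os.getenv("OPENAI_API_KEY")

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hallo!"}],
    stream=True,
)

# Count each chunk whose delta carries content as one token (approximate).
approx_completion_tokens = 0
for chunk in response:
    if chunk["choices"][0]["delta"].get("content"):
        approx_completion_tokens += 1

print("approximate completion tokens:", approx_completion_tokens)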
The prompt-token method: use tiktoken (GitHub - openai/tiktoken: a fast BPE tokeniser for use with OpenAI's models), accumulate all the tokens, and calculate the total; see the sketch after the excerpt below.
Excerpt from the cookbook notebook “How to count tokens with tiktoken”:

# How to count tokens with tiktoken

[`tiktoken`](https://github.com/openai/tiktoken/blob/main/README.md) is a fast open-source tokenizer by OpenAI.

Given a text string (e.g., `"tiktoken is great!"`) and an encoding (e.g., `"cl100k_base"`), a tokenizer can split the text string into a list of tokens (e.g., `["t", "ik", "token", " is", " great", "!"]`).

Splitting text strings into tokens is useful because GPT models see text in the form of tokens. Knowing how many tokens are in a text string can tell you (a) whether the string is too long for a text model to process and (b) how much an OpenAI API call costs (as usage is priced by token).

## Encodings

Encodings specify how text is converted into tokens. Different models use different encodings.

(The notebook continues beyond this excerpt.)
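A minimal sketch of that prompt-side counting, following the cookbook's recipe. The overhead constants (4 tokens per message plus 3 for the reply primer) are the cookbook's figures for gpt-3.5-turbo-0301; newer models use slightly different values.

import tiktoken

def num_tokens_from_messages(messages, model="gpt-3.5-turbo-0301"):
    # Approximate prompt tokens for a chat request, per the OpenAI cookbook.
    # Overhead values are for gpt-3.5-turbo-0301; other models differ slightly.
    enc = tiktoken.encoding_for_model(model)
    num_tokens = 0
    for message in messages:
        num_tokens += 4  # per-message wrapper: <|start|>{role/name}\n{content}<|end|>\n
        for key, value in message.items():
            num_tokens += len(enc.encode(value))
            if key == "name":  # a name field replaces the role, saving one token
                num_tokens -= 1
    num_tokens += 3  # every reply is primed with <|start|>assistant<|message|>
    return num_tokens

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "How do I count tokens?"},
]
print(num_tokens_from_messages(messages))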
wolf · July 4, 2023, 5:12pm · #5
I use the API from JavaScript, not Python. So how can I calculate the request tokens?
By the way, in my opinion the missing token information is an open issue.
wolf · July 4, 2023, 6:28pm · #7
Looks good, but when I include it in my page:
const { encode, decode, encodeChat } = GPTTokenizer_cl100k_base
const chatTokens = encodeChat(chat,'gpt-3.5-turbo')
I get the following error:
gpt-tokenizer:1 Uncaught TypeError: Cannot read properties of undefined (reading 'encodeChatGenerator')
at encodeChat (gpt-tokenizer:1:2093142)
at Object.javascriptFunction (home?session=8369786809309:842:20)
at da.doAction (desktop_all.min.js?v=22.2.4:24:5629)
So can you help me figure out how to use it?
I’m sorry, I don’t have any personal experience using that package.
I’m a bit busy today, but there’s a chance I can take a look at it tomorrow if you’re still having issues.
wb · July 6, 2023, 8:04am · #9
Hi, did you find a way to use it on a normal website?