Example Call
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-0613",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
Example Response
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-3.5-turbo-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "\n\nHello there, how may I assist you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}
Every API call returns the tokens used for that call. At a minimum, you should be logging, for every call:
- The unique identifier of the user
- From the call: messages
- From the response:
a. choices[0].message (assuming you’re only generating a single response)
b. usage
- Any other identifier you have for keeping track of the individual conversations
Then you’ll be able to very quickly and readily drill down into where your usage is coming from; a minimal logging sketch follows.
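Here is one way that logging might look, assuming a Python app writing to SQLite; the usage_log table, its columns, and the log_call helper are illustrative names, not part of the OpenAI API:

import json
import sqlite3

# Illustrative schema; adapt the identifiers to your own application.
db = sqlite3.connect("usage.db")
db.execute("""CREATE TABLE IF NOT EXISTS usage_log (
    user_id TEXT,            -- your unique identifier for the user
    conversation_id TEXT,    -- any identifier you use per conversation
    messages TEXT,           -- the messages sent in the call
    reply TEXT,              -- choices[0].message from the response
    prompt_tokens INTEGER,
    completion_tokens INTEGER,
    total_tokens INTEGER)""")

def log_call(user_id, conversation_id, messages, response):
    """Record one call's request, reply, and reported token usage."""
    usage = response["usage"]
    db.execute(
        "INSERT INTO usage_log VALUES (?, ?, ?, ?, ?, ?, ?)",
        (user_id, conversation_id,
         json.dumps(messages),
         json.dumps(response["choices"][0]["message"]),
         usage["prompt_tokens"],
         usage["completion_tokens"],
         usage["total_tokens"]),
    )
    db.commit()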
Awesome! Thank you, sir!!! This was the answer I needed.
Happy to help.
Please mark the previous post as a solution so it is more easily discovered by others in the future who encounter a similar situation.
In this first image, you’ll see a screenshot asserting that the paragraph is 35 tokens.
Putting this through the tokenizer here…
You can see that the OpenAI Tokenizer states it’s 10 tokens more than the previous 35, at 45 tokens. That is a massive 28.5% difference.
Not trying to accuse anyone, but as someone trying to start a small business with the use of the OpenAI API, it’s becoming increasingly apparent that there is a real lack of transparency on tokenization, and it becomes incredibly difficult (not impossible) for the small business owner to calculate costs. I really hope in the future OpenAI makes it a priority to be transparent with developers so they can understand THEIR OWN pricing models so they can set up THEIR OWN products and profit off of them. The lack of transparency on tokenization is a problem.
Regardless, because I know I’m just going to get a bunch of hate: it’s still not right that one example states it is exactly 35 tokens and their own tokenizer says 45.
According to the OpenAI Tokenizer, I will actually be charged for 28.5% more tokens than what the website itself states. That’s not transparent, and that’s a massive cost differential to understand as someone running a business.
wclayf
It sure seems like they should charge by the number of bytes or words, so people can verify their accounting (verify their charges are correct), without OpenAI just saying “trust us”.
Using tokens means that each model (GPT-4 vs. GPT-3) might split things into tokens differently, as you pointed out.
Hey wclayf,
This is basically exactly what my accounting person is telling me. The whole “just trust us” thing doesn’t really work when you have two sources giving a 30 percent difference on the number of tokens you’ll be charged for. Remember, that’s not a price difference; it doesn’t have to do with GPT-3 vs. 4 - that is the token count itself.
@wclayf @tventura94
Or, and hear me out…
It’s a typo. It wouldn’t be the first typo on their page and documentation and it won’t be the last.
The pricing is incredibly transparent. With each API call you get an accounting of the number of input and output tokens used, and you can verify the token counts with the tiktoken library.
It’s trivially easy to do.
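For example, a quick check with tiktoken (the sample string here is just the assistant reply from the example response above):

import tiktoken  # pip install tiktoken

# encoding_for_model picks the right encoding (cl100k_base for the chat models).
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
text = "Hello there, how may I assist you today?"
print(len(enc.encode(text)))  # the token count for this piece of text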
Or, it’s a typo. I’m sure if you submit a report on it to support, they’ll update the page as soon as they can.
_j
Here it is when using the right tokenizer for the chat models. I fixed the text.

100 tokens for under a penny, though.
Cool, I’ll check this out. Thanks for the info.
That’s still not the right number
Seems like a pretty bad thing to have a typo on a 30 percent difference for your PRICING.
Listen, I understand you probably work for OpenAI, but you don’t have to cut them slack on typos relating to very pertinent and serious information.
This wasn’t a typo in a random paragraph; it’s a massive mistake, and you sound like you just work for OpenAI.
The page probably hasn’t been updated since before they implemented the cl100k_base BPE tokenizer.
Using p50k_base and earlier, it’s 45 tokens. My guess is someone meant to type 45, keyed a 3 instead of the 4, and it was never caught.
It happens. When they updated the models with a new tokenizer, there were a lot of pages that needed to get updated, probably by people not intimately familiar with how the models work, so this particular number on this particular page didn’t jump out as something that needed changing.
At this time, using any of the modern models, the count for that text is 49. Using any of the legacy (GPT-3) models it’s 45.
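Both figures are straightforward to reproduce with tiktoken. This sketch assumes the paragraph text as quoted in this thread; the exact wording on the page may have differed slightly:

import tiktoken

# The pricing-page paragraph, per the discussion above.
text = ("Multiple models, each with different capabilities and price points. "
        "Prices are per 1,000 tokens. You can think of tokens as pieces of "
        "words, where 1,000 tokens is about 750 words. This paragraph is "
        "35 tokens.")

for name in ("p50k_base", "cl100k_base"):  # legacy GPT-3 vs. current chat encoding
    enc = tiktoken.get_encoding(name)
    print(name, len(enc.encode(text)))
# Expect counts near 45 and 49 respectively, per the posts above.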
_j
I try not to be wrong.
{
  "id": "chatcmpl-no",
  "object": "chat.completion",
  "created": 1693379999,
  "model": "gpt-3.5-turbo-16k-0613",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Multiple models, each with different capabilities and price points. Prices are per 1,000 tokens. You can think of tokens as pieces of words, where 1,000 tokens is about 750 words. This paragraph is 49 tokens."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 82,
    "completion_tokens": 49,
    "total_tokens": 131
  }
}
No I understand it’s 49
It says 35 on their website
That’s a 30 percent difference on what they’re charging
They are currently charging 30 percent more than the site says.
And you’re both gaslighting me to make me feel stupid for presenting you with public information on the company’s site.
Tish are tpyos.
And this is me lying saying our tokens cost 30 percent less than they actually do.
It is my mistake that I missed that the API call reports the token usage - and that does make me stupid.
Have a good night.
I don’t work for OpenAI; OpenAI staff have a “staff” flair.
I’m just explaining that a single character typo on a page isn’t that big of a deal. There have been many much bigger typos in the world.
Should they fix it? Absolutely.
But, it’s not some big conspiracy to fleece people.
That snippet of text being 35, 45, or 49 tokens doesn’t really make any difference in anyone’s billing unless you’re sending that specific text to the model. There’s no 1:1 correlation between character or word counts and the number of tokens a piece of text is encoded with. So, you’d still need to get the token counts on any other text anyway.
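To see why, here is a small illustration; cl100k_base is the chat models’ encoding, and the sample strings are arbitrary:

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
# Same number of words, very different token counts: common short words
# are usually one token each, while long rare words split into many pieces.
for s in ("the cat sat on the mat",
          "antidisestablishmentarianism floccinaucinihilipilification "
          "pneumonoultramicroscopicsilicovolcanoconiosis qzx vwx brr"):
    print(len(s.split()), "words ->", len(enc.encode(s)), "tokens")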
Let’s be clear, I am agreeing with you. It’s a bad mistake and it’s more than a little embarrassing for them I’m sure, but it’s explainable and understandable.
Just submit a report, I’m sure it’ll get fixed relatively soon.
Thank you. This has been a really frustrating process for me and, I assume, others. It’d be nice if they could just display all this data nicely on the site so I didn’t have to do all that, but no matter; it’ll be easy enough to send that data to my database with each user’s message, and the problem will be fixed.
Nobody’s stupid here.
But, I’m sure you can agree that 35 can be a typo for 45.
Look, it’s not that big of a deal; they’re not misrepresenting the cost per token, which is the only thing that matters.
Right.
But like
Let’s say your mother passed away and you wanted to know the funeral time.
I told you 4pm
It was actually at 2pm.
I tell you it’s just a typo and my bad.
You tell me you missed your mother’s funeral.
Get it?
Sure, this is a trivial and exaggerated example, but the point is it’s not JUST a typo; it’s serious information.
wclayf
I still hope the OpenAI marketing team will consider just setting a price by word. There’s got to be a way to sort of “average it all out” so the price is cheap and transparent. Having to go run a piece of code to generate a token count for every request is not practical for consumers to do for billing purposes. Too cryptic.
The website tokenizer is just weird. It doesn’t seem to correspond to any known token base. If you want to better simulate actual token usage with gpt-3.5 or gpt-4, you can use the tiktoken package in Python.
Also, other parts of the API call consume tokens as well, not just the chat history and “content” returned, because they are either fed to the model or generated by the model. For very short API calls, this will inflate the total tokens consumed. It’s not that OpenAI is overcharging you for tokens somehow, but that you are sending and receiving more tokens than you think.
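If you want a local estimate that lines up with the usage field, you have to account for that per-message overhead. Here is a rough estimator modeled on the approximation in OpenAI’s cookbook; the per-message constants are approximations for the -0613 chat models, not an exact specification:

import tiktoken

def estimate_prompt_tokens(messages, model="gpt-3.5-turbo-0613"):
    """Approximate prompt tokens, including chat-format overhead."""
    enc = tiktoken.encoding_for_model(model)
    count = 0
    for message in messages:
        count += 3  # each message is wrapped in a few formatting tokens
        for value in message.values():
            count += len(enc.encode(value))
    count += 3  # the reply is primed with assistant formatting tokens
    return count

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
# Compare this estimate with the prompt_tokens the API reports back.
print(estimate_prompt_tokens(messages))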