Pricing, billing, and tokens: the math is not adding up

Hi all.

I decided to give the API a shot after being blown away by the ChatGPT web app, and built a basic ChatGPT clone to see if I could replicate it.

The issue I am currently concerned with is how requests are billed, since the impression I got from the pricing page is far different from what I am experiencing in reality.

I based my calculations on the token system (1 token ≈ 4 characters). I am currently using gpt-3.5-turbo and implementing it through PHP cURL.

Endpoint:
https://api.openai.com/v1/engines/davinci/completions

Options:
max_tokens = 1000,
temperature = 0.2,
n = 1,
logprobs = 0,
stop = \nAssistant:
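
For completeness, here is roughly what my PHP cURL call looks like (a simplified sketch; the prompt shown is just an example and OPENAI_API_KEY is a placeholder for my real key):

```php
<?php
// Roughly my request as described above (simplified sketch).
$payload = json_encode([
    'prompt'      => "User: who is the president of the US?\nAssistant:",
    'max_tokens'  => 1000,
    'temperature' => 0.2,
    'n'           => 1,
    'logprobs'    => 0,
    'stop'        => "\nAssistant:",
]);

$ch = curl_init('https://api.openai.com/v1/engines/davinci/completions');
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => [
        'Content-Type: application/json',
        'Authorization: Bearer ' . getenv('OPENAI_API_KEY'),
    ],
    CURLOPT_POSTFIELDS     => $payload,
]);

$response = json_decode(curl_exec($ch), true);
curl_close($ch);
```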

Pricing:
gpt-3.5-turbo: $0.002 / 1K tokens

Based on that pricing and the 1 token ≈ 4 characters rule, my assumption was that roughly 4,000 characters of requests and responses would cost $0.002, but that is not what I am seeing in reality.

Below is an example of the requests and responses I just ran as a test.


ME: who is the president of the US?
AI: Joe Biden.
ME: can you write me a poem about flowers?
AI: I’m sorry, I don’t understand.
ME: how do say hello in french?
AI: Merci.
ME: what is the largest continent on earth?
AI: Paris

The total cost for this exchange, according to the Usage section of my account, is $0.08. At these rates, a production environment would get very expensive very quickly.

Am I just doing something obviously wrong here or is this something other people are experiencing?

Any suggestions or replies would be helpful so that I can get to the bottom of this. I am excited to start using this tech.

All the best!

That's Davinci, not GPT-3.5. Different pricing.

Thanks for the response.

I am getting mixed messages on that from the ChatGPT web app.

https://api.openai.com/v1/chat/completions
It said that was from an old chat and gave me this one for turbo:
https://api.openai.com/v1/engines/davinci/completions

Today I asked a similar question and it gave me this endpoint:
https://api.openai.com/v1/engines/davinci-codex/completions

In any case, even if you factor in the different pricing, the math still does not add up for me.

Cheers!

Don’t trust GPT, it wasn’t trained on the docs. Look at the docs, they’re pretty decent.

Were you including chat history in each request? And was the API key used only for the messages you shared?

You can look at the Usage page and select the day under Daily usage to see the number of requests and tokens counted. Token counts are also returned in the API response, since it is not as simple as 4 characters = 1 token.
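
For example, the usage block in a chat completion response can be read straight out of the decoded JSON. A minimal sketch with a trimmed example response:

```php
<?php
// Trimmed example of a /v1/chat/completions response body.
$raw = '{"choices":[{"message":{"role":"assistant","content":"Joe Biden."}}],
         "usage":{"prompt_tokens":14,"completion_tokens":3,"total_tokens":17}}';

$response = json_decode($raw, true);
$usage    = $response['usage'];

printf("prompt: %d, completion: %d, total: %d tokens\n",
    $usage['prompt_tokens'], $usage['completion_tokens'], $usage['total_tokens']);
```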

Thanks for the response. The web app's AI is a bit all over the place.

I reverted to the https://api.openai.com/v1/chat/completions endpoint, and the responses to my questions are orders of magnitude better. Not sure what was going on with the AI in the examples I posted from the other endpoint.

Brilliant! I've been looking for the token and request stats since I opened the account but couldn't find them.

Here are the results:
03:00
Local time: 7 Jun 2023, 13:00
davinci, 1 request
13 prompt + 1,000 completion = 1,013 tokens

03:05
Local time: 7 Jun 2023, 13:05
davinci, 1 request
14 prompt + 1,000 completion = 1,014 tokens

03:10
Local time: 7 Jun 2023, 13:10
davinci, 1 request
12 prompt + 1,000 completion = 1,012 tokens

03:15
Local time: 7 Jun 2023, 13:15
davinci, 1 request
13 prompt + 1,000 completion = 1,013 tokens
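
Doing the math now with what I believe was Davinci's rate at the time ($0.02 per 1K tokens, ten times the turbo rate), those four requests line up with the $0.08 charge:

```php
<?php
// Sanity check on the four Davinci requests from the usage log above.
// Assumes Davinci was billed at $0.02 per 1K tokens (not $0.002).
$tokens = 1013 + 1014 + 1012 + 1013;   // 4,052 tokens total
$cost   = ($tokens / 1000) * 0.02;     // ≈ $0.081

printf("%d tokens at \$0.02/1K => \$%.3f\n", $tokens, $cost);
```

It also looks like the completions hit the max_tokens cap every time (exactly 1,000 completion tokens per request), which accounts for most of the cost.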

Here is the result after I switched back to the /v1/chat/completions endpoint:

04:55
Local time: 7 Jun 2023, 14:55
gpt-3.5-turbo-0301, 2 requests
45 prompt + 106 completion = 151 tokens

The second turbo example is much closer to my calculations, although the 1 token = 4 characters rule doesn't seem to hold exactly: removing whitespace, my prompt was 50 characters and the response was 1,128 characters.

The Davinci model is extremely confusing, especially considering the poor quality of the answers in my original post.

Thanks for all the help!

Compare against the actual tokenizer logic using the token calculator (https://platform.openai.com/tokenizer).
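
The 4-characters figure is only a rough average, so a character-based estimate will drift from the real count. A trivial sketch of that estimate, to compare against what the tokenizer and the API's usage numbers report:

```php
<?php
// Rough estimate only: the real count comes from the tokenizer
// (or the usage block in the API response).
$prompt   = 'what is the largest continent on earth?';
$estimate = (int) ceil(strlen($prompt) / 4);

echo "chars/4 estimate: $estimate tokens\n";
```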

Excellent! Thanks for that.

A quick question on the “chat history”: is that included in the token usage?

Thanks again.

The API is stateless, so you need to send any chat history as additional messages in each request. You are responsible for pruning or summarizing the history to keep it under the model's token limit, and you are charged for the full request, including all the past messages you send.
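
A rough PHP sketch of what that looks like (naive pruning that keeps the system message plus the last ten messages; summarizing older turns is another option; OPENAI_API_KEY is a placeholder):

```php
<?php
// Keep a running messages array and resend it on every turn.
$messages = [
    ['role' => 'system', 'content' => 'You are a helpful assistant.'],
];

function ask(array &$messages, string $userText): string
{
    $messages[] = ['role' => 'user', 'content' => $userText];

    // Naive pruning: keep the system message plus the last 10 messages.
    if (count($messages) > 11) {
        $messages = array_merge([$messages[0]], array_slice($messages, -10));
    }

    $ch = curl_init('https://api.openai.com/v1/chat/completions');
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => [
            'Content-Type: application/json',
            'Authorization: Bearer ' . getenv('OPENAI_API_KEY'),
        ],
        CURLOPT_POSTFIELDS => json_encode([
            'model'    => 'gpt-3.5-turbo',
            'messages' => $messages,
        ]),
    ]);
    $response = json_decode(curl_exec($ch), true);
    curl_close($ch);

    // Append the assistant's reply so the next turn has full context.
    $reply = $response['choices'][0]['message']['content'];
    $messages[] = ['role' => 'assistant', 'content' => $reply];
    return $reply;
}

echo ask($messages, 'Who is the president of the US?');
```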

Ok cool. I will just store the history locally.

Thanks again for all your help!