How to calculate the cost of a specific request made to the web API (and its reply), in tokens?

TL;DR: How can I calculate the cost, in tokens, of a specific request made to the OpenAI API?


Hi all. I’ve just used the OpenAI Playground (model: gpt-3.5-turbo) to submit a user message and obtain an assistant message in reply. Is it possible to calculate the actual cost of this, in terms of tokens? If so, how can I do that? The user message, assistant message, and system message are below.

ChatGPT says the user message contains 2 tokens and the assistant message contains 3 tokens. I didn’t include the system message in my question. Does it affect the number of tokens used?

Motivation for my question: As a side project I want to build a web app using the OpenAI API. But before I do so, I’d like to estimate the costs I can expect to incur during the project.

Thanks in advance.


System message

You are an expert in American cuisine, and creating different dishes from items found in US grocery stores.

User message

I will give you the name of a common product found in a US grocery store.  I would like you to tell me the dish most commonly consumed in the US that contains this item as its featured ingredient.  If possible, ensure the dish is one that most would consider to be 'American cuisine.'   Please restrict your answer to one item, and include only the name of the dish.

The item is:

Boneless skinless chicken thighs

Assistant message

Barbecue Chicken Thighs


Are you seeking the tokenizer?

https://platform.openai.com/tokenizer

Text

I will give you the name of a common product found in a US grocery store.  I would like you to tell me the dish most commonly consumed in the US that contains this item as its featured ingredient.  If possible, ensure the dish is one that most would consider to be 'American cuisine.'   Please restrict your answer to one item, and include only the name of the dish.

The item is:

Boneless skinless chicken thighs

Token ids
[40, 481, 1577, 345, 262, 1438, 286, 257, 2219, 1720, 1043, 287, 257, 1294, 16918, 3650, 13, 220, 314, 561, 588, 345, 284, 1560, 502, 262, 9433, 749, 8811, 13529, 287, 262, 1294, 326, 4909, 428, 2378, 355, 663, 8096, 18734, 13, 220, 1002, 1744, 11, 4155, 262, 9433, 318, 530, 326, 749, 561, 2074, 284, 307, 705, 7437, 33072, 2637, 220, 220, 4222, 4239, 534, 3280, 284, 530, 2378, 11, 290, 2291, 691, 262, 1438, 286, 262, 9433, 13, 198, 198, 464, 2378, 318, 25, 198, 198, 20682, 5321, 4168, 1203, 9015, 30389]



[Screenshots: Playground responses, 2023-06-19 at 8:58 PM and 8:59 PM]

I didn’t get the same “Barbecue Chicken Thighs” response, but you get the idea. You can use Knit to test the prompt, and it provides the token/cost analytics you want. Disclaimer: I built Knit, and it’s currently free for everyone :slight_smile:


Hi @cagross

The response to every request contains a usage object; you can read response.usage to see how many tokens were consumed.

Here’s what the complete response looks like:

{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?",
    },
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21
  }
}

There’s a newer package called “tiktoken” that does exactly this. I have no relationship with them and have not used it (yet) myself, but maybe it’s just what you need? It has 5.4k GitHub stars.


If you took the time to read the link in my post, it notes:

If you need a programmatic interface for tokenizing text, check out our tiktoken package for Python.

This sentence doesn’t appear anywhere in this thread for me, sorry mate.


OK, thanks all.

I think the Tokenizer should be suitable for this purpose.

But to be clear, let’s say I post a question to ChatGPT (in the ‘User’ message), and it replies with a response (in the ‘Assistant’ message). The total number of tokens used is then equal to # of tokens in User message + # of tokens in Assistant message. Is that correct?

Or does the ‘System’ message alter the number of tokens used during a single request?

@sps Thanks for the info about response.usage. But is there a way to access that object if I’m simply using the OpenAI Playground? When I make the request, I can see in the dev tools network tab that the fetch request returns an event stream (screenshot). I’ve never encountered one of these before. How can I get the response object from it? Or is response.usage obtained another way (maybe only programmatically)?

System, User, and Assistant/Response messages are all included in the token count and costs.
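If you want to estimate this yourself before making a call, here is a rough sketch along the lines of OpenAI’s cookbook guidance. The per-message overhead constants are approximations for gpt-3.5-turbo-style chat models, and the example messages are just the ones from this thread:

import tiktoken

def num_tokens_from_messages(messages, model="gpt-3.5-turbo"):
    """Approximate the prompt tokens for a list of chat messages."""
    enc = tiktoken.encoding_for_model(model)
    tokens_per_message = 3  # approximate overhead for each message's role wrapper
    num_tokens = 3          # approximate priming tokens for the assistant's reply
    for message in messages:
        num_tokens += tokens_per_message
        for value in message.values():
            num_tokens += len(enc.encode(value))
    return num_tokens

messages = [
    {"role": "system", "content": "You are an expert in American cuisine..."},
    {"role": "user", "content": "The item is: Boneless skinless chicken thighs"},
]
print(num_tokens_from_messages(messages))  # system + user both count toward the prompt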


This tokenizer was used by earlier GPT-3 models, and the Codex one was used by the Codex series of models.

gpt-3.5 and gpt-4 use tiktoken.
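Here’s a minimal sketch of how to use it (the example string is just the one from this thread):

# pip install tiktoken
import tiktoken

# gpt-3.5-turbo and gpt-4 both map to the cl100k_base encoding
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

token_ids = enc.encode("Boneless skinless chicken thighs")
print(len(token_ids))  # number of tokens
print(token_ids)       # the token ids themselves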

To my knowledge, the Playground UI doesn’t show the whole response object, so response.usage isn’t accessible via the Playground.

If you want to track usage via OpenAI’s own UI, you can go to the usage page. The data takes some time to update, though.

Prompt tokens = the token count for the data you send when making the API call.
Completion tokens = the tokens generated by the model.
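For completeness, a minimal sketch of reading those counts programmatically (this assumes the pre-1.0 openai Python package that was current at the time, with OPENAI_API_KEY set in the environment):

import openai  # pre-1.0 SDK; reads OPENAI_API_KEY from the environment

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "The item is: Boneless skinless chicken thighs"}],
)

usage = response["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])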


Great, thanks to you both. For now, I’ll try to use the usage page per the suggestion by @sps.

Hey community,
You can use this Python function to report the total cost:

def openai_api_calculate_cost(usage, model="gpt-3.5-turbo-16k"):
    # Prices in USD per 1,000 tokens at the time of this post; the keys are
    # labels by context window, not exact API model names.
    pricing = {
        'gpt-3.5-turbo-4k': {
            'prompt': 0.0015,
            'completion': 0.002,
        },
        'gpt-3.5-turbo-16k': {
            'prompt': 0.003,
            'completion': 0.004,
        },
        'gpt-4-8k': {
            'prompt': 0.03,
            'completion': 0.06,
        },
        'gpt-4-32k': {
            'prompt': 0.06,
            'completion': 0.12,
        },
        'text-embedding-ada-002-v2': {
            'prompt': 0.0001,
            'completion': 0.0001,
        }
    }

    try:
        model_pricing = pricing[model]
    except KeyError:
        raise ValueError("Invalid model specified")

    prompt_cost = usage['prompt_tokens'] * model_pricing['prompt'] / 1000
    completion_cost = usage['completion_tokens'] * model_pricing['completion'] / 1000

    total_cost = prompt_cost + completion_cost
    print(f"\nTokens used:  {usage['prompt_tokens']:,} prompt + {usage['completion_tokens']:,} completion = {usage['total_tokens']:,} tokens")
    print(f"Total cost for {model}: ${total_cost:.4f}\n")

    return total_cost
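For example, with a usage dict shaped like the API’s usage object (the numbers here are made up):

usage = {"prompt_tokens": 95, "completion_tokens": 6, "total_tokens": 101}
openai_api_calculate_cost(usage, model="gpt-3.5-turbo-16k")
# Tokens used:  95 prompt + 6 completion = 101 tokens
# Total cost for gpt-3.5-turbo-16k: $0.0003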

Here is an updated version of the cost function by curioustoknownow, adapted for version 1.3 of the OpenAI Python library, with updated prices (as of 2023-12-26) but only for the models I use myself. You can add the rest quite easily; if you do add them all, please reply here with an updated ‘pricing’ object for the rest of us! :slight_smile:

I also added a rounded cost to the output.


def openai_api_calculate_cost(usage, model="gpt-4-1106-preview"):
    # Prices in USD per 1,000 tokens (as of 2023-12-26).
    pricing = {
        'gpt-3.5-turbo-1106': {
            'prompt': 0.001,
            'completion': 0.002,
        },
        'gpt-4-1106-preview': {
            'prompt': 0.01,
            'completion': 0.03,
        },
        'gpt-4': {
            'prompt': 0.03,
            'completion': 0.06,
        }
    }

    try:
        model_pricing = pricing[model]
    except KeyError:
        raise ValueError("Invalid model specified")

    prompt_cost = usage.prompt_tokens * model_pricing['prompt'] / 1000
    completion_cost = usage.completion_tokens * model_pricing['completion'] / 1000

    total_cost = prompt_cost + completion_cost
    # round to 6 decimals
    total_cost = round(total_cost, 6)

    print(f"\nTokens used:  {usage.prompt_tokens:,} prompt + {usage.completion_tokens:,} completion = {usage.total_tokens:,} tokens")
    print(f"Total cost for {model}: ${total_cost:.4f}\n")

    return total_cost
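A sketch of calling it with the v1 Python SDK (the model and message here are just examples, and OPENAI_API_KEY is assumed to be set in the environment):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Hello"}],
)

# usage is an object with attribute access, matching the function above
openai_api_calculate_cost(response.usage, model="gpt-4-1106-preview")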

Calculating cost for GPT-4o-mini.
The prompt and data go into the API call, and the response is the output from it. The total cost incurred comes out as follows:

# !pip install tiktoken
import tiktoken

# Initialize the tokenizer for the GPT model
tokenizer = tiktoken.encoding_for_model("gpt-4o-mini")

# Placeholders: substitute your own prompt, input data, and model output
prompt = "I will give you the name of a common product found in a US grocery store. ..."
data = "Boneless skinless chicken thighs"
out = "Barbecue Chicken Thighs"

# Request and response text
request = str(prompt) + str(data)
response = str(out)

# Tokenize
request_tokens = tokenizer.encode(request)
response_tokens = tokenizer.encode(response)

# Count the tokens for request and response separately
input_tokens = len(request_tokens)
output_tokens = len(response_tokens)

# Actual costs per 1 million tokens
cost_per_1M_input_tokens = 0.15   # $0.150 per 1M input tokens
cost_per_1M_output_tokens = 0.60  # $0.600 per 1M output tokens

# Calculate the costs
input_cost = (input_tokens / 10**6) * cost_per_1M_input_tokens
output_cost = (output_tokens / 10**6) * cost_per_1M_output_tokens
total_cost = input_cost + output_cost

print(f"Input tokens: {input_tokens}")
print(f"Output tokens: {output_tokens}")
print(f"Total tokens: {input_tokens + output_tokens}")
print(f"Cost: ${total_cost:.5f}")