How to get the cost for each API call?

If I execute a Python script that makes an API call, can I find out how much that particular call cost?
I'm not able to see anything in the dashboard that breaks this down.
Is there a way to do so?


Hi!
You can look into the Usage API, which has a separate Costs endpoint to keep track of your spending.

https://platform.openai.com/docs/api-reference/usage

https://platform.openai.com/docs/api-reference/usage/costs

Billing is in terms of tokens. You can translate token counts directly into cost using the input and output prices of the model you are employing: divide the per-million-token price by one million, then multiply by your usage.

Just typing up a request to an API endpoint in the Python REPL:

>>> import os, httpx
>>> def send_chat_request(conversation_messages):
...     # Set your API key in OPENAI_API_KEY env variable (never hard-code it).
...     api_key = os.environ.get("OPENAI_API_KEY")
...     if not api_key:
...         raise ValueError("Set OPENAI_API_KEY environment variable!")
...     api_url = "https://api.openai.com/v1/chat/completions"
...     headers = {"Authorization": f"Bearer {api_key}"}
...     payload = {"model": "gpt-4o-mini", "messages": conversation_messages,
...                "max_completion_tokens": 2345}
...     try:
...         response = httpx.post(api_url, headers=headers, json=payload, timeout=180.0)
...         response.raise_for_status()
...     except Exception as error:
...         print("API error:", error)
...         return None
...     return response
... 
>>> system = [{"role": "system", "content":
...            "You are a helpful AI assistant."}]
>>> user = [{"role": "user", "content":
...            "Are OpenAI AI models expensive on API?"}]
>>> response = send_chat_request(system+user)

From this call, the API sends back usage information. We can obtain a Python dictionary from the JSON response bytes, ready for some math against the model pricing page in “Documentation” (on this forum’s sidebar).

>>> response.json()["usage"]
{'prompt_tokens': 27, 'completion_tokens': 128, 'total_tokens': 155, 'prompt_tokens_details': {'cached_tokens': 0, 'audio_tokens': 0}, 'completion_tokens_details': {'reasoning_tokens': 0, 'audio_tokens': 0, 'accepted_prediction_tokens': 0, 'rejected_prediction_tokens': 0}}
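For example, plugging the usage above into gpt-4o-mini’s per-million-token prices ($0.15 input / $0.60 output as of this writing; verify on the pricing page, since prices change):

```python
# Manual cost math for the usage dict shown above.
input_price, output_price = 0.15, 0.60  # dollars per million tokens
usage = {"prompt_tokens": 27, "completion_tokens": 128}
cost = (usage["prompt_tokens"] * input_price
        + usage["completion_tokens"] * output_price) / 1_000_000
print(f"cost: ${cost:.6f}")  # cost: $0.000081
```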

Ready to do some math? Let’s make a function that takes the httpx response (some tweaks are needed for the openai client’s response object) and either answers for you directly or provides costs for you to log.

Here is an entire API call script; scroll down to the reusable calculator:

import os
import httpx

def send_chat_request(
    model: str = "gpt-4o-mini",
    conversation_messages: list[dict[str, str]] = [],
) -> httpx.Response | None:
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise ValueError("Set OPENAI_API_KEY environment variable!")
    api_url = "https://api.openai.com/v1/chat/completions"
    headers = {"Authorization": f"Bearer {api_key}"}
    payload = {
        "model": model,
        "messages": conversation_messages,
        "max_completion_tokens": 2345
    }
    try:
        response = httpx.post(api_url, headers=headers, json=payload, timeout=180.0)
        response.raise_for_status()
    except Exception as error:
        print("API error:", error)
        return None
    return response

def calculate_api_cost(
    response: httpx.Response,
    do_print: bool = False
) -> tuple[float, float, float, float]:
    """
    Calculate usage cost from an OpenAI API response using dynamic model pricing.

    Token pricing is provided per million tokens. (Thus, we divide by 1,000,000.)
    Note that total prompt tokens include both uncached and cached tokens. We subtract
    cached tokens to get the uncached amount because they are billed at a different rate.
    
    For models that do not return any discount (indicated by a cached price of None),
    the cached input price is treated as 0; the returned usage details should show 0 cached tokens.
    
    Returns:
      A tuple: (cost_uncached_input, cost_cached_input, cost_output, total_cost)
    """
    pricing_table = [
        # Exact match (highest precedence).
        ("gpt-4o-2024-05-13", {
            "input": 5.00,        # dollars per million tokens for uncached input
            "cached_input": None,  # no discount pricing; treat as 0
            "output": 15.00       # dollars per million tokens for output
        }),
        # Broader prefix matches.
        ("gpt-4o-mini", {
            "input": 0.15,        # dollars per million tokens for uncached input
            "cached_input": 0.075,  # dollars per million tokens for cached input
            "output": 0.60        # dollars per million tokens for output
        }),
        ("gpt-4o", {
            "input": 2.50,
            "cached_input": 1.25,
            "output": 10.00
        }),
        # Additional models can be added here.
    ]

    model_used = response.json()["model"]
    usage = response.json()["usage"]

    matched_pricing = None
    for prefix, pricing in pricing_table:
        if model_used == prefix or model_used.startswith(prefix):
            matched_pricing = pricing
            break

    if matched_pricing is None:
        raise ValueError(f"No pricing data available for model '{model_used}'.")

    prompt_tokens = usage["prompt_tokens"]
    cached_tokens = usage["prompt_tokens_details"]["cached_tokens"]
    completion_tokens = usage["completion_tokens"]
    reasoning_tokens = usage["completion_tokens_details"]["reasoning_tokens"]  # informational; already counted within completion_tokens

    # Determine uncached tokens:
    if matched_pricing["cached_input"] is None:
        # If the model doesn't offer a discount for cached tokens,
        # then all prompt tokens are treated as uncached.
        uncached_prompt_tokens = prompt_tokens
        cost_cached_input = 0.0
    else:
        # Subtract cached tokens from total prompt tokens to compute the uncached tokens.
        uncached_prompt_tokens = prompt_tokens - cached_tokens
        cost_cached_input = cached_tokens * matched_pricing["cached_input"] / 1_000_000

    # reasoning_tokens are already included in completion_tokens, so the
    # completion count alone covers all billed output tokens.
    output_tokens = completion_tokens

    cost_uncached_input = uncached_prompt_tokens * matched_pricing["input"] / 1_000_000
    cost_output = output_tokens * matched_pricing["output"] / 1_000_000
    total_cost = cost_uncached_input + cost_cached_input + cost_output

    if do_print:
        print(f"API Call Cost Breakdown ({model_used}):")
        print(f"  • Uncached Input ({uncached_prompt_tokens} tokens): ${cost_uncached_input:.6f}")
        print(f"  • Cached Input ({cached_tokens} tokens): ${cost_cached_input:.6f}")
        print(f"  • Output ({output_tokens} tokens): ${cost_output:.6f}")
        print(f"  → Total API Call Cost: ${total_cost:.6f}")

    return cost_uncached_input, cost_cached_input, cost_output, total_cost

# === Procedural Demonstration ===
if __name__ == "__main__":
    system_message = [{"role": "system", "content": "You are a helpful AI assistant."}]
    user_message = [{"role": "user", "content": "Are OpenAI AI models expensive on API?"}]
    
    # Example call using a model. The send_chat_request function takes the model as a positional argument.
    response = send_chat_request("gpt-4o-mini", system_message + user_message)

    if response:
        msg = response.json()['choices'][0]['message']
        msg = msg['content']
        print("AI said:\n" + msg)
        costs = calculate_api_cost(
            response, do_print=True
        )
        (
         uncached_cost, cached_cost, output_cost, total_cost
        ) = costs
    else:
        print("No valid response received from the API.")

Gives us:

AI said:
The cost of using OpenAI’s API can vary depending on several factors, including the specific model you choose, the volume of usage, and any pricing plan you select. OpenAI typically offers various tiers and pricing structures, which may include charges based on the number of tokens (characters or words) processed in requests and responses.

For instance, as of my last update, models like GPT-3 and newer ones like GPT-4 may have different pricing per token. Additionally, OpenAI may offer subscription plans or discounts for higher usage levels.

To get the most accurate and up-to-date information regarding API pricing, it’s best to visit OpenAI’s official website or their API documentation, where they provide detailed pricing information and usage plans.

  • Uncached Input (27 tokens): $0.000004
  • Cached Input (0 tokens): $0.000000
  • Output (148 tokens): $0.000089
  → Total API Call Cost: $0.000093
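As mentioned, a small tweak adapts the calculator to the official openai client: its response object is not an httpx.Response, but a minimal wrapper around a plain dict can feed `calculate_api_cost` unchanged (a sketch; `client` and `completion` below are hypothetical names, and this assumes the client response’s `model_dump()` matches the raw JSON shape):

```python
class DictResponse:
    """Minimal adapter exposing a .json() method over a plain dict,
    so a function expecting an httpx-style response can consume
    openai-client results."""
    def __init__(self, data: dict):
        self._data = data

    def json(self) -> dict:
        return self._data

# Hypothetical usage with the official client:
#   completion = client.chat.completions.create(model="gpt-4o-mini", messages=msgs)
#   costs = calculate_api_cost(DictResponse(completion.model_dump()), do_print=True)
```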

There is no API to retrieve model prices or other metadata; nothing other than token counts comes back with a call. I just put enough models in the function’s table to demonstrate.

Note that this shows the cost as if you were actually billed. Since I’m enrolled in “complimentary 10M tokens through April”, the printout is not my actual bill.