How do I calculate the pricing for generation of text?

I’m really new to using the API. Can anyone tell me how I can calculate the price it would take me to generate a given text ? Is there any function which can help me to calculate my bill before I generate the text?

1 Like

Hello @agrover112 and welcome to the OpenAI community!

Pricing details are mention on OpenAI’s pricing page here: OpenAI API

Essentially, you can use a function to count the tokens in a text, and with the price to token ratio, you can get how much the price totals out to.

1 Like

Can you tell me which function?

1 Like

I isolated the main URI, “https://zero-workspace-server.uc.r.appspot.com/tokenizer”, from the website that does the actual tokenization and API response, then parsed it into structured data. Essentially like this website but without an actual web interface to display the data on: Token estimator
(I hope that’s okay to do! If not, I’ll take this down)

:snake: Python Code: :snake:

from tkinter import *
from tkinter import ttk
import requests, math, json
from requests.structures import CaseInsensitiveDict

# Functions and Objects
def num_tokens(prompt):
	url = "https://zero-workspace-server.uc.r.appspot.com/tokenizer"

	headers = CaseInsensitiveDict()
	headers["authority"] = "zero-workspace-server.uc.r.appspot.com"
	headers["sec-ch-ua"] = '"Chromium";v="94", "Google Chrome";v="94", ";Not A Brand";v="99"'
	headers["accept"] = "application/json, text/javascript, */*; q=0.01"
	headers["content-type"] = "application/json"
	headers["sec-ch-ua-mobile"] = "?0"
	headers["user-agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36"
	headers["sec-ch-ua-platform"] = "Windows"
	headers["origin"] = "https://gpttools.com"
	headers["sec-fetch-site"] = "cross-site"
	headers["sec-fetch-mode"] = "cors"
	headers["sec-fetch-dest"] = "empty"
	headers["referer"] = "https://gpttools.com/"
	headers["accept-language"] = "en-US,en;q=0.9"

	data = '{"text":"'+prompt+'"}'


	resp = requests.post(url, headers=headers, data=data)

	print(resp.status_code)
	if resp.status_code==200:
		tokenized_info = json.loads(resp.content.decode())
		print(f"Text Input: {json.loads(data)['text']}\nNumber of Tokens: {tokenized_info['num_tokens']}\nTokens: {tokenized_info['tokens']}")

	return tokenized_info["num_tokens"]

prompt_inspect = ""
while prompt_inspect != "Quit":
	prompt_inspect = input("Enter Prompt Here (Type 'Quit' to quit):\n")
	num_tokens(prompt_inspect)

:computer_mouse: PowerShell Code: :computer_mouse:

$prompt = Read-Host("Paste Prompt Here:\n")

$session = New-Object Microsoft.PowerShell.Commands.WebRequestSession
$session.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36"

$tokenizer_response = Invoke-WebRequest -UseBasicParsing -Uri "https://zero-workspace-server.uc.r.appspot.com/tokenizer" `
-Method "POST" `
-WebSession $session `
-Headers @{
"method"="POST"
  "authority"="zero-workspace-server.uc.r.appspot.com"
  "scheme"="https"
  "path"="/tokenizer"
  "sec-ch-ua"="`"Chromium`";v=`"94`", `"Google Chrome`";v=`"94`", `";Not A Brand`";v=`"99`""
  "accept"="application/json, text/javascript, */*; q=0.01"
  "sec-ch-ua-mobile"="?0"
  "sec-ch-ua-platform"="`"Windows`""
  "origin"="https://gpttools.com"
  "sec-fetch-site"="cross-site"
  "sec-fetch-mode"="cors"
  "sec-fetch-dest"="empty"
  "referer"="https://gpttools.com/"
  "accept-encoding"="gzip, deflate, br"
  "accept-language"="en-US,en;q=0.9"
} `
-ContentType "application/json" `
-Body "{`"text`":`"$prompt`"}"

Write-Host "Original Prompt:
$prompt
Number of Tokens in Prompt:"
($tokenizer_response.Content | ConvertFrom-Json).num_tokens

Hope this helps!


openai_twitter_logoHelpfulDev

3 Likes

There’s an easier way to count tokens on your end. Then you can check our pricing page to get the cost per 1000 tokens for different models.

from transformers import GPT2TokenizerFast

# Load the tokenizer.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
n_tokens = len(tokenizer.encode("Specific text which you want to see how many tokens it has"))
2 Likes

Oh my!

I should’ve known that there was an easier way, and I even left part of the rest of the question unanswered.

Thank you Boris for the save, my bad bud!

3 Likes

Will this work for non-English texts?

Is there any reason why the output & total tokens used are not found in the API?

1 Like

Yes, we’re planning to implement a tokenizer into the client to make it easier. And adding an easy cost estimator into the api would be a great addition too. Thanks!

3 Likes

great to hear. thank you!

1 Like

Hi Boris, this is painfully slow. Could you please recommend a faster method?

1 Like

Thank you Boris, I have tried it with Portuguese, and its working great. Now I can pass the value of max_token using all available tokens minus the prompt. :rocket:

Is the token count based on the number of input Prompt tokens, the number of tokens generated by the model, or the sum of the two?