How do I calculate the pricing for generation of text?

I’m really new to using the API. Can anyone tell me how to calculate the price of generating a given text? Is there a function that can help me estimate my bill before I generate the text?


Hello @agrover112 and welcome to the OpenAI community!

Pricing details are mentioned on OpenAI’s pricing page here: OpenAI API

Essentially, you can use a function to count the tokens in a text, then multiply the token count by the per-token price to get the total cost.
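As a quick sketch of that calculation (the $0.02 per 1K tokens rate below is just an assumed example; check the pricing page for the real per-model rates):

```python
def estimate_cost(num_tokens, price_per_1k=0.02):
    """Estimated cost in dollars for a given token count.

    price_per_1k is an assumed example rate, not an official figure.
    """
    return num_tokens / 1000 * price_per_1k

# e.g. a 750-token prompt at the example rate
print(f"${estimate_cost(750):.4f}")
```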


Can you tell me which function?


I isolated the main URI, “”, that the tokenizer website actually posts to, and parsed the API response into structured data. Essentially like this website, but without a web interface to display the data: Token estimator
(I hope that’s okay to do! If not, I’ll take this down)

:snake: Python Code: :snake:

import requests, json
from requests.structures import CaseInsensitiveDict

# Functions and Objects
def num_tokens(prompt):
	url = ""

	headers = CaseInsensitiveDict()
	headers["authority"] = ""
	headers["sec-ch-ua"] = '"Chromium";v="94", "Google Chrome";v="94", ";Not A Brand";v="99"'
	headers["accept"] = "application/json, text/javascript, */*; q=0.01"
	headers["content-type"] = "application/json"
	headers["sec-ch-ua-mobile"] = "?0"
	headers["user-agent"] = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36"
	headers["sec-ch-ua-platform"] = "Windows"
	headers["origin"] = ""
	headers["sec-fetch-site"] = "cross-site"
	headers["sec-fetch-mode"] = "cors"
	headers["sec-fetch-dest"] = "empty"
	headers["referer"] = ""
	headers["accept-language"] = "en-US,en;q=0.9"

	# Build the JSON body with json.dumps so quotes and newlines in the
	# prompt are escaped correctly, instead of concatenating strings.
	data = json.dumps({"text": prompt})

	resp = requests.post(url, headers=headers, data=data)

	if resp.status_code == 200:
		tokenized_info = json.loads(resp.content.decode())
		print(f"Text Input: {prompt}\nNumber of Tokens: {tokenized_info['num_tokens']}\nTokens: {tokenized_info['tokens']}")
		return tokenized_info["num_tokens"]

	# Raise a descriptive error instead of returning an undefined value.
	resp.raise_for_status()

prompt_inspect = input("Enter Prompt Here (Type 'Quit' to quit):\n")
while prompt_inspect != "Quit":
	num_tokens(prompt_inspect)
	prompt_inspect = input("Enter Prompt Here (Type 'Quit' to quit):\n")

:computer_mouse: PowerShell Code: :computer_mouse:

# Read-Host appends ": " to the prompt text automatically
$prompt = Read-Host "Paste Prompt Here"

$session = New-Object Microsoft.PowerShell.Commands.WebRequestSession
$session.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.71 Safari/537.36"

$tokenizer_response = Invoke-WebRequest -UseBasicParsing -Uri "" `
-Method "POST" `
-WebSession $session `
-Headers @{
  "sec-ch-ua"="`"Chromium`";v=`"94`", `"Google Chrome`";v=`"94`", `";Not A Brand`";v=`"99`""
  "accept"="application/json, text/javascript, */*; q=0.01"
  "accept-encoding"="gzip, deflate, br"
} `
-ContentType "application/json" `
-Body "{`"text`":`"$prompt`"}"

Write-Host "Original Prompt: $prompt"
Write-Host "Number of Tokens in Prompt:"
($tokenizer_response.Content | ConvertFrom-Json).num_tokens

Hope this helps!



There’s an easier way to count tokens on your end. Then you can check our pricing page to get the cost per 1000 tokens for different models.

from transformers import GPT2TokenizerFast

# Load the tokenizer.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
n_tokens = len(tokenizer.encode("Specific text which you want to see how many tokens it has"))

Oh my!

I should’ve known that there was an easier way, and I even left part of the rest of the question unanswered.

Thank you Boris for the save, my bad bud!


Will this work for non-English texts?

Is there any reason why the output & total tokens used are not found in the API?


Yes, we’re planning to implement a tokenizer in the client to make this easier. Adding an easy cost estimator to the API would be a great addition too. Thanks!


Great to hear, thank you!


Hi Boris, this is painfully slow. Could you please recommend a faster method?


Thank you Boris, I have tried it with Portuguese, and it’s working great. Now I can set max_tokens to all the available tokens minus the prompt length. :rocket:
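That budgeting step can be sketched like this (the 2048-token context window is an assumption for the davinci-style models; substitute your model’s actual limit):

```python
def remaining_budget(prompt_tokens, context_limit=2048):
    """Largest max_tokens value that still fits: context window minus prompt.

    context_limit=2048 is an assumed example; use your model's real limit.
    """
    return max(context_limit - prompt_tokens, 0)

print(remaining_budget(300))  # -> 1748
```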

Is the token count based on the number of input Prompt tokens, the number of tokens generated by the model, or the sum of the two?
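For estimation purposes, both the prompt tokens and the generated completion tokens count toward usage, so a combined estimate would look like this sketch (the per-1K rate is again an assumed example, not an official figure):

```python
def total_cost(prompt_tokens, completion_tokens, price_per_1k=0.02):
    """Estimated cost when both sides of a request are billed.

    price_per_1k is an assumed example rate; see the pricing page
    for the real per-model figures.
    """
    billed = prompt_tokens + completion_tokens  # both sides count
    return billed / 1000 * price_per_1k

# e.g. a 100-token prompt that produced a 400-token completion
print(f"${total_cost(100, 400):.4f}")
```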