Estimating OpenAI GPT-3.5-Turbo usage costs for French inputs: is this the right approach?

adrien1 · May 25, 2023, 9:58am

Hello,

I have a corpus of French documents that will all undergo the same processing with OpenAI. I’ll be extracting information from the texts using French prompts.
Each prompt will consist of the text itself plus a question specifying the task we’d like to accomplish.
I am using tiktoken to estimate the number of tokens, and my code is as follows:

import tiktoken

# gpt-3.5-turbo maps to the cl100k_base encoding; look it up once and reuse it.
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")

def count_tokens(text) -> int:
    """Return the number of tokens in a text string."""
    return len(encoding.encode(str(text)))

# df is a pandas DataFrame with one document per row in the 'text' column.
df['estimation'] = df['text'].apply(count_tokens)
df['estimation'].sum()

Is this the right approach for the French language?
Once we have an estimate of the number of tokens, we multiply it by $0.002 per 1K tokens to get a rough estimate of the total price. Is this approach valid?
Does the number of tokens include the output / generated tokens as well?

Thanks in advance for your help

Foxalabs · May 25, 2023, 10:09am

For 3.5-Turbo, tokens cost the same for prompts and completions (sent in and returned out), so $0.002 per 1K tokens is the correct rate. You should run this count over the entire prompt sent to the model plus the result returned from it.
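
As a rough illustration of that arithmetic, here is a minimal sketch that adds up document, question, and reply tokens for a whole corpus. The function name, the sample token counts, and the fixed per-document reply length are all assumptions made for the example; the reply length in particular is a guess you would need to supply, since it is only known after the model responds. The chat format also adds a few tokens of per-message overhead, which this sketch ignores.

```python
from typing import Iterable

# Rate quoted in this thread for gpt-3.5-turbo; check current pricing before relying on it.
PRICE_PER_1K_TOKENS_USD = 0.002


def estimate_corpus_cost_usd(
    text_token_counts: Iterable[int],
    question_tokens: int,
    completion_tokens_per_doc: int,
    price_per_1k: float = PRICE_PER_1K_TOKENS_USD,
) -> float:
    """Estimate total cost when prompt and completion tokens share one rate."""
    total_tokens = 0
    for text_tokens in text_token_counts:
        # Each request is billed for the document, the question, and the reply.
        total_tokens += text_tokens + question_tokens + completion_tokens_per_doc
    return total_tokens / 1000 * price_per_1k


# Hypothetical corpus: three documents, a 50-token question,
# and an assumed 200 reply tokens per document.
cost = estimate_corpus_cost_usd(
    [1200, 800, 1500], question_tokens=50, completion_tokens_per_doc=200
)
print(f"Estimated cost: ${cost:.4f}")
```

In practice, the per-document counts would come from the tiktoken estimate above (for example `df['estimation']`), and the question would be tokenized once with the same encoding.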
