Hi guys, so via the API, we ask GPT to create a piece of content in Italian. What we see is that it creates it in English and then translates it to Italian. So we get charged for twice the number of words… first the English version and then the Italian one. Anyone else facing this?
How exactly are you observing this? Is the model outputting the same text back in English before it outputs the Italian version?
Or are you seeing the cost higher than you expected in the API Usage page?
Yes we are seeing the cost higher than expected in the API usage. I am wondering if we are doing something wrong here or is this normal?
We are for sure seeing more tokens being used for other languages than English.
This seems like a prompt issue; I’m pretty sure that can be fixed.
Yeah - this is somewhat expected. The tokenizer is optimized for English text, probably since it’s the most common language in the training data. If you take a story, translate it into multiple languages, and then count the tokens, you’ll see a big disparity:
Are you aware that the API charges for both the tokens in and tokens out?
Providing an example of your prompt and the model response might help us understand.