gpt-3.5-turbo API pricing

Hi there, I ran a test with 'gpt-3.5-turbo' and made 403 API requests, totaling 142k context/input tokens and 73k generated tokens. The bill says $0.36.

The pricing page says 'gpt-3.5-turbo-0125' costs $0.0005/1k input tokens and $0.0015/1k output tokens. With these numbers I arrive at about $0.16, not $0.36. I only get to $0.36 if I use the numbers for 'gpt-3.5-turbo-instruct' ($0.0015/1k input, $0.002/1k output).
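For anyone who wants to reproduce the arithmetic, here is a quick sanity check of both price points against the rounded token counts above (with 142k/73k taken as exact, the -0125 rates land near $0.18; the precise figure depends on the unrounded counts):

```python
# Rounded token counts from the test run above (assumed exact here).
input_tokens = 142_000
output_tokens = 73_000

# Published per-1k prices as (input, output) pairs.
prices = {
    "gpt-3.5-turbo-0125": (0.0005, 0.0015),
    "gpt-3.5-turbo-instruct": (0.0015, 0.0020),
}

for model, (p_in, p_out) in prices.items():
    cost = input_tokens / 1000 * p_in + output_tokens / 1000 * p_out
    print(f"{model}: ${cost:.2f}")

# gpt-3.5-turbo-0125: $0.18
# gpt-3.5-turbo-instruct: $0.36  <- matches the bill
```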

What really confused me is that I cannot call the OpenAI endpoint with 'gpt-3.5-turbo-0125'; it returns:

ValueError: Unknown model 'gpt-3.5-turbo-0125'. Please provide a valid OpenAI model name in:
[…]
gpt-3.5-turbo
gpt-3.5-turbo-16k
gpt-3.5-turbo-1106
gpt-3.5-turbo-0613
gpt-3.5-turbo-16k-0613
gpt-3.5-turbo-0301
gpt-35-turbo-16k
gpt-35-turbo
gpt-35-turbo-1106
gpt-35-turbo-0613
gpt-35-turbo-16k-0613
[…]

So what is going on here? Is the pricing page outdated, is the backend not up to date with the pricing page, or am I doing something wrong?

Many thanks to anyone who has an idea!


gpt-3.5-turbo is the “full price” version. It is still an alias for gpt-3.5-turbo-0613 and is priced the same as it has always been, which is the same as listed for “instruct”. (Update: gpt-3.5-turbo has since been re-pointed to gpt-3.5-turbo-0125.)

OpenAI has been unclear and cagey about what they show for pricing since DevDay, perhaps intentionally so: they haven’t reduced the price of existing models (except fine-tuning, which was way overpriced); they have just introduced new ones that are cheaper to operate.

In my chart, cost is per 1 million tokens so you can compare the prices more easily:

| Model | Training /1M | Input /1M | Output /1M | Context length |
|---|---|---|---|---|
| gpt-3.5-turbo-0125 | n/a | $0.50 | $1.50 | 16k (4k out) |
| gpt-3.5-turbo-1106 | n/a | $1.00 | $2.00 | 16k (4k out) |
| gpt-3.5-turbo-0613 | n/a | $1.50 | $2.00 | 4k |
| gpt-3.5-turbo-0301 | n/a | $1.50 | $2.00 | 4k |
| gpt-3.5-turbo-16k-0613 | n/a | $3.00 | $4.00 | 16k |
| gpt-3.5-turbo fine-tune (all?) | $8.00 | $3.00 | $6.00 | 4k |
| gpt-4-turbo (all) | n/a | $10.00 | $30.00 | 125k (4k out) |
| gpt-4 | n/a | $30.00 | $60.00 | 8k |
| babbage-002 base | n/a | $0.40 | $0.40 | 16k |
| babbage-002 fine-tune | $0.40 | $1.60 | $1.60 | 16k |
| davinci-002 base | n/a | $2.00 | $2.00 | 16k |
| davinci-002 fine-tune | $6.00 | $12.00 | $12.00 | 16k |

I can’t explain why you’d be denied the newest model unless it is still rolling out; I’ve been plugging away at bugs on it since yesterday. Note that what you show is not an API error: that ValueError is raised client-side by a library that doesn’t know the model name yet. Update the openai and tiktoken libraries to the latest versions that know about the new models.
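If you want to confirm that the endpoint itself accepts the snapshot name, here is a minimal sketch using the current (v1) openai Python library, assuming OPENAI_API_KEY is set in your environment:

```python
# pip install -U openai tiktoken
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
resp = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.model)   # the snapshot that actually served the request
print(resp.usage)   # prompt/completion token counts, i.e. what you get billed for
```

If this succeeds but the same model name fails when routed through a wrapper library, the wrapper’s model list is what needs updating, not the endpoint.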


Thank you very much!

I realized that I imported OpenAI through llama-index, and they probably haven’t updated their library yet.
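For anyone landing here with the same ValueError: it is raised from llama-index’s hard-coded model table, so upgrading that package (not just openai) is what clears it. A rough sketch, assuming the 0.9.x import path (newer releases moved this to llama_index.llms.openai):

```python
# pip install -U llama-index
from llama_index.llms import OpenAI  # 0.9.x import path

# Raises "ValueError: Unknown model ..." on versions whose model table
# predates the -0125 snapshot; works after the upgrade.
llm = OpenAI(model="gpt-3.5-turbo-0125")
print(llm.complete("Say hello."))
```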


It took a while to catch on, but OpenAI also just updated their pricing page so it can be read in units of 1M tokens…

You have to scroll down to find the higher prices of “older models” in their own section.


They need to state clearly whether the fine-tune price reductions announced at DevDay apply only to the cheaper -1106 and -0125 models, and whether all fine-tunes have the same inference cost.

Or whether, right below the June 2024 shutdown notice for the better-performing gpt-3.5-turbo-0613, fine-tune cost doesn’t matter at all, because: “Fine-tuned models created from these base models are not affected by this deprecation, but you will no longer be able to create new fine-tuned versions with these models.” (You can still choose -0613 in the UI, but I haven’t run one…)


Or more fine print, about the most accomplished gpt-3.5-turbo-0301 and gpt-4-0314 models:

As of 01/10/2024, only existing users of this model will be able to continue using this model.


This text from the pricing page continues to be incorrect:

[Screenshot from the pricing page: a text excerpt explaining “tokens” as units for counting words, stating that 1,000 tokens are roughly equivalent to 750 words and that the paragraph shown is 35 tokens long, while the token counter at the top reads “58”.]
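The counts are easy to verify yourself with tiktoken; a minimal sketch (the sample string is a stand-in for the paragraph in the screenshot):

```python
import tiktoken

# The gpt-3.5-turbo family uses the cl100k_base encoding.
enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

text = "You can think of tokens as pieces of words..."  # paste the pricing-page paragraph here
print(len(enc.encode(text)))  # actual token count, e.g. the "58" shown in the counter
```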
