Trying to understand why usage cost does not match usage tokens

Hey all!

I’m doing some testing to see if 3.5 Turbo is financially viable for my use case, and I’m confused about the cost I’m seeing. Specifically, there seems to be a discrepancy between the billed cost and the official token usage.

The usage page shows 48k input tokens and 292 output tokens. For all of my testing I’ve been using the new 0125 model, which is priced at $0.0005 per 1k input tokens, so I’d expect a cost of roughly $0.024. Instead, my cost for today is $0.13, an effective rate of about $0.0027 per 1k input tokens, more than 5x the advertised price.
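
For anyone who wants to check my math, here’s a quick sketch of the calculation (the per-1k prices are the published gpt-3.5-turbo-0125 rates; the token counts are what my usage page reported):

```python
# Sanity check: expected cost from the usage page vs. what was billed.
# Prices per 1k tokens for gpt-3.5-turbo-0125 (published rates).
INPUT_PRICE_PER_1K = 0.0005
OUTPUT_PRICE_PER_1K = 0.0015

input_tokens = 48_000   # from the usage page
output_tokens = 292     # from the usage page
billed = 0.13           # today's cost shown on the usage page

expected = (input_tokens / 1000) * INPUT_PRICE_PER_1K \
         + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K
implied_input_rate = billed / (input_tokens / 1000)

print(f"expected cost:      ${expected:.4f}")                    # ~$0.0244
print(f"billed cost:        ${billed:.2f}")
print(f"implied input rate: ${implied_input_rate:.4f} per 1k")   # ~$0.0027
```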

That said, I know I generated more than 292 output tokens today. Could the discrepancy be there, i.e. the usage statistics not accurately reflecting what I’m actually using? If so, how are we supposed to do any kind of testing or estimation of future costs?

What have your experiences been with pricing of the new model, and with pricing discrepancies in general?


Hey there and welcome to the forum!

Are you using an Assistant? Or just chat completion?


Just regular chat completion, no Assistants.

The usage page is essentially useless for determining what you were actually billed for a given model.

That used to not be the case.

The lack of transparency is a feature. For OpenAI.
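
If you need numbers you can trust, your best bet is to total the usage field that comes back on every chat completion response yourself, client-side. A minimal sketch, assuming the v1.x openai Python SDK and an OPENAI_API_KEY in the environment:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Running totals, kept client-side so you don't depend on the usage page.
total_input = 0
total_output = 0

def tracked_completion(messages):
    """Call chat completions and accumulate token usage locally."""
    global total_input, total_output
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-0125",
        messages=messages,
    )
    # Every non-streaming chat completion response includes a usage object.
    total_input += response.usage.prompt_tokens
    total_output += response.usage.completion_tokens
    return response

reply = tracked_completion([{"role": "user", "content": "Say hi."}])
print(reply.choices[0].message.content)
print(f"input tokens so far: {total_input}, output: {total_output}")
```

Multiply those totals by the published per-token prices and you have a cost estimate that doesn’t depend on the dashboard updating.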

That’s… disappointing. Has anyone had any success getting in touch with support for clarification?

One thing to note: the bare gpt-3.5-turbo alias does not yet point to the -0125 model.

I’m aware; I’m specifically using 0125.

It turns out it was just a matter of the token counts not updating properly. The page is now accurately displaying 204k input tokens and 14k output tokens, which matches the expected cost (204 thousand input tokens × $0.0005/1k + 14 thousand output tokens × $0.0015/1k ≈ $0.12, in line with the $0.13 charge).

I’m glad it turned out that the displayed usage was wrong rather than the billed price, because the reverse could cause real problems in the future.