# Number of tokens used and costs randomly exploded overnight

From yesterday to today, my costs have randomly exploded. I am categorizing job titles, which costs approximately 1.1k tokens per call (context tokens), and I am tracking the tokens myself. I am using GPT-3.5 Turbo.

However, today I have categorized around 1,500 jobs, and this has been billed at a total of 5.8 million context tokens. It should be roughly 1.5 million, so the expected usage is barely a quarter of what was actually billed.
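As a rough sanity check, the numbers work out like this (the per-call figure is my own approximation):

```python
# Rough sanity check: expected vs. billed context tokens
# (numbers from above; tokens-per-call is approximate).
TOKENS_PER_CALL = 1_100        # ~1.1k context tokens per categorization call
JOBS_CATEGORIZED = 1_500
BILLED_TOKENS = 5_800_000

expected = TOKENS_PER_CALL * JOBS_CATEGORIZED   # ~1.65 million
ratio = BILLED_TOKENS / expected                # ~3.5x what it should be

print(f"expected: {expected:,} tokens")
print(f"billed:   {BILLED_TOKENS:,} tokens ({ratio:.1f}x expected)")
```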

Has anyone noticed anything changing?

2 Likes

Systems are nominal on my end. Costs tracking as expected today.

Is this all purely internal work, as in you have no other employees or anyone else with access to your API keys? Also, the keys are not used in any public application?

Do you use Assistants or any of the new features? Or is this all based on traditional API calls?

1 Like

No additional elements. I use the requests library directly, so I am fully in control over how many tokens I use.
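For context, my calls look roughly like this, and I compare the usage block the API returns against my own tracking (the model and prompt here are just illustrative):

```python
import os
import requests

# Minimal sketch of a raw Chat Completions call that logs the token usage
# reported by the API itself (model and prompt are illustrative).
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "user", "content": "Categorize this job title: Senior Data Analyst"}
        ],
    },
    timeout=60,
)
resp.raise_for_status()
usage = resp.json()["usage"]
print(usage["prompt_tokens"], usage["completion_tokens"], usage["total_tokens"])
```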

I had the same experience yesterday, with roughly 40 million tokens in just a few minutes. Multiple threads were generating messages in a loop. You might want to check your runs and see whether the messages were re-created multiple times.
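If you are on the Assistants API, one way to check is to list a thread's messages and count repeats. A rough sketch with the Python SDK (the thread ID is a placeholder):

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()

# List the messages in one thread and count duplicates, to spot content
# that was re-generated in a loop (thread ID is a placeholder).
messages = client.beta.threads.messages.list(thread_id="thread_abc123", limit=100)

texts = []
for message in messages.data:
    for part in message.content:
        if part.type == "text":
            texts.append(part.text.value)

for text, count in Counter(texts).most_common():
    if count > 1:
        print(f"{count}x: {text[:80]}")
```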

Check my posting for more details.

1 Like

I'm seeing the same: almost a 4x increase in the number of tokens per job. I'm certain the amount of context is the same as two days ago.

2 Likes

Do you see that on GPT-3.5 or on the GPT-4 assistant model?

GPT-4 Turbo Chat Completion.

I switched to Azure a few hours ago and that works as expected. I am sending exactly the same request, just via Azure, and the token count looks normal again.
So there seems to be an issue with the way OpenAI counts the tokens.
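If anyone wants to verify which side is miscounting, you can tokenize the same prompt locally with tiktoken and compare the result against what the usage page bills. A rough sketch (the prompt is illustrative):

```python
import tiktoken

# GPT-3.5 Turbo and GPT-4 Turbo both use the cl100k_base encoding. A local
# count ignores the few tokens of per-message chat overhead, so expect a
# small constant offset against usage.prompt_tokens, not a 4-8x difference.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "Categorize this job title: Senior Data Analyst"
print(f"local token count: {len(enc.encode(prompt))}")
```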

I wonder how something like this can fly under the radar while OpenAI just ignores support messages. To me, it looks like they are not taking their business seriously. It feels like a scam, or at least a rip-off.

Is the Assistants API working on Azure?

The Assistants API is not yet available there. Vision is, though.

1 Like

I am having a similar issue. I am very upset about this, although luckily I have usage caps set, so it is not costing me that much money, but it's very concerning. I am using a WordPress plugin on my website to make requests to GPT-4 Turbo, and the tokens billed do not at all match the actual usage reported by the plugin (it's like 7 or 8 times what I expect). I have been discussing this with the plugin creator, but now I'm not sure it is their fault; it seems this may be an OpenAI issue?


Yeah, it really seems to always be in the range of 7 to 8 times the normal token usage. Some messages work normally and some are done 20 times... and the average then comes out at around 8x. That's what I have observed. There will be a team around @nikunj investigating the issue, so it's good to see that it's acknowledged as an issue.

1 Like

Glad to see they are looking into it. Just to add more clarity: the model version I am seeing this issue with is listed as 'GPT-4-1106-preview' on the https://platform.openai.com/usage page.

Hello @nikunj, is there any way I can be added to any updates that relate to this issue? I have had to pause all ChatGPT usage for the moment due to this issue, so I would like to know as soon as it's resolved. Thanks.

We've seen this happen before when someone is expecting the API to return JSON and has retry code for when it doesn't, and the model started replying with the JSON wrapped in ```json fences, resulting in the retry logic being hit far more often than before.

So check if this is the case, and you can try using JSON mode to avoid this issue.
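Something along these lines, with a defensive fence-strip as a fallback (the model and prompt are illustrative):

```python
import json
import re
from openai import OpenAI

client = OpenAI()

# Request JSON mode so the model returns a bare JSON object instead of one
# wrapped in ```json fences. Note: JSON mode requires the word "JSON" to
# appear somewhere in the messages.
resp = client.chat.completions.create(
    model="gpt-4-1106-preview",
    response_format={"type": "json_object"},
    messages=[
        {"role": "user", "content": "Return the category of 'Senior Data Analyst' as JSON."}
    ],
)
raw = resp.choices[0].message.content

# Defensive fallback: strip any markdown fences before parsing, so a fenced
# reply does not trip the retry logic.
raw = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
data = json.loads(raw)
print(data)
```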

1 Like

They "only" started checking it this week. So far I have not heard back from them, but they made me aware that it could take some weeks to understand the underlying issue.

We do, in fact, output JSON, and we were having these issues with content re-generated in a loop. Sadly, JSON mode is only available in the Chat Completions API and not in the Assistants API.

May I ask what Vision is? Is it another product?