Has the GPT-3.5 Turbo API gone up in price?

Hi all! I noticed that I was charged $40 for 540 requests, whereas earlier I paid $1.70 for 750 requests. I tried to find news about a price increase but couldn't find any. Can anyone send a link to a related post? I would like to know the reasons.


Have you looked in the account section of the API platform (platform.openai.com) to see which model was used? GPT-4 is more costly, and so are fine-tuned models; you should be able to find which ones were used. Prices have in fact only gone down; there have been no price rises that I am aware of.


Also note that the 3.5 16k-context model is more expensive, if you happened to switch over to that one.

You can view the current pricing and compare it to historical versions of the page at archive.org.


Please review my screenshot. As I can see, I use two types of 3.5 models: v0301 and v0613. But I think these are also very expensive, or am I wrong in my calculations?

Here is a comparison between 10 July and 28 July. The difference is huge (40x), and the models are the same.

It looks like your token usage has gone way up. The price is per token, not per request. Your screenshot shows that your recent usage involves far more tokens.
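To make the per-token billing concrete, here is a minimal sketch of the arithmetic. The rates below are illustrative placeholder values, not current pricing (check platform.openai.com for real rates); the point is only that two requests to the same model can differ enormously in cost:

```python
# Illustrative only: the rates below are example values, not real pricing.
# Chat models are billed per 1K tokens, with separate input/output rates.
INPUT_RATE_PER_1K = 0.0015   # example input rate, USD per 1K prompt tokens
OUTPUT_RATE_PER_1K = 0.002   # example output rate, USD per 1K completion tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of a single request in USD at the example rates."""
    return (prompt_tokens / 1000) * INPUT_RATE_PER_1K + \
           (completion_tokens / 1000) * OUTPUT_RATE_PER_1K

# Same model, very different cost per request:
small = request_cost(200, 100)     # short question, short answer
large = request_cost(8000, 4000)   # long history or pasted-in text
print(f"{small:.4f} USD vs {large:.4f} USD")
```

At these example rates the second request costs 40x the first, which is exactly the kind of jump you are seeing, with no price change on OpenAI's side.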


Am I correct in assuming that users have begun to use more words? I thought about that, but the activity looks too suspicious, and it started from a certain point in time.

To be sure, can you try logging the inputs and outputs? That is really the only way to see what is going on.
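A minimal logging sketch, assuming you can get at the `usage` object that Chat Completions responses include (`prompt_tokens`, `completion_tokens`, `total_tokens` are the standard field names; the `request_id` label and character count are just illustrative):

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("api-usage")

def log_usage(request_id: str, prompt: str, usage: dict) -> int:
    """Log per-request token usage from the API response's `usage` field.

    `usage` carries the standard Chat Completions keys:
    prompt_tokens, completion_tokens, total_tokens.
    Returns total_tokens so callers can aggregate costs.
    """
    total = usage.get("total_tokens", 0)
    log.info(
        "request=%s prompt_chars=%d prompt_tokens=%d completion_tokens=%d total=%d",
        request_id,
        len(prompt),
        usage.get("prompt_tokens", 0),
        usage.get("completion_tokens", 0),
        total,
    )
    return total

# Example with a usage dict shaped like a real response's `usage` field:
log_usage("req-1", "Hello, how are you?",
          {"prompt_tokens": 12, "completion_tokens": 25, "total_tokens": 37})
```

A day or two of these logs will show immediately whether a few requests with huge prompts are responsible for the spike.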

Did you recently make an update that uses more history? That could do it too.


At least some of your requests are certainly using more tokens in your prompt (see the green highlights). It is hard to provide more insight without knowing what your app does. I'd say, though, that these aren't crazy numbers: if you're maintaining history, or a user is pasting in some text/code, those numbers aren't out of line.


Thanks everyone for the info! A follow-up question: is it possible to limit, on the API side, the number of words in a message, without involving the user? And what practices are recommended for detecting that my application is being deliberately pushed off-topic? (Reading the sent content is prohibited by my own privacy policy, which I strictly adhere to.)

Can you provide some more details about what your app is and what the use-case is?
You control all the requests to the API, so you can add any input limits or logic before you make the OpenAI call. You can also analyze the content/responses for relevancy to your use case, if that's appropriate (either through basic keyword searches or something more advanced).
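A minimal sketch of that gating step, assuming a hypothetical `MAX_INPUT_CHARS` cap chosen for your app (the limit and behavior on overflow are design choices, not anything the API mandates):

```python
MAX_INPUT_CHARS = 2000  # hypothetical cap; tune to your use case

def prepare_user_message(text: str) -> str:
    """Validate and bound user input before it ever reaches the API.

    Rejects empty messages and truncates oversized ones; you could
    equally return an error to the user instead of truncating.
    """
    text = text.strip()
    if not text:
        raise ValueError("empty message")
    if len(text) > MAX_INPUT_CHARS:
        text = text[:MAX_INPUT_CHARS]
    return text
```

Because this runs in your own backend (or app) before the API call, it works regardless of which client library you use, including the Android one.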

Yes, the application performs the same function as ChatGPT itself: it answers questions. If I limit the number of characters sent by the client, that will not affect the number of characters ChatGPT generates. I could add a prompt invisible to the user, like "fit the answer in no more than 100 words", but side effects are possible. What I need is control over the possible length of the response at the API level. I use the Android library for working with the API that is listed in the official guides, and I did not find such a function there.

You can specify max_tokens in your request, but know that it will abruptly cut off the response at that limit. I'd start by experimenting with adding/editing the system message and telling it to "Respond in a concise manner" or something like that.

Controlling it exactly will be difficult, since the model doesn't really think in words. Your best bet is to guide it with a system prompt, as mentioned above.

Also make sure you are limiting your users' input too, if you are concerned about costs.