Chat GPT's API is significantly slower than the website with GPT Plus

Cytranic · April 18, 2023, 1:15am

Microsoft just dropped the nerf bat. One request every min now. Even though my cognitive bill is $400 so far this month, they restrict me now. They are clamping down on AI agents instead of just expanding infrastructure. Sad. They are limiting this technology and development.

ran.turgeman · April 18, 2023, 4:42am

is it so hard to people here to realize that they are intentionally limit this to make sure none of you is going to build something big enough before them? this industry is not for you indies and devs.
This is like the hardware chip industry, only the big companies have a place in this.

This is probably a mistake because in weeks, many small startups will start offering this exact capability, don’t forget they invent nothing new, this tech was actually developed outside the US, and most researches are outside the US, it’s just that THEY trained it with the whole internet data, having friends and access to ceo’s of Reddit, Github etc.

sandermann · April 20, 2023, 10:58pm

ok, i figured out. everything works like expected and is fast. the answer is: stream
and i was wrong. stream is working also with chat api.

qrdl · April 21, 2023, 12:45am

What I don’t get is why they don’t just raise prices for the API to something more demand/supply based. Cloud does this and it works just fine. As new resources come in, they can lower prices.

Leave the web interface alone, of course, so people can play around, but if people have high value cases which don’t use a lot of tokens, they should be able to use the API - imho.

M_J · April 25, 2023, 8:06pm

Mind explaining further what you mean by ‘stream’? How did you manage to make the API respond quickly? Thanks.

BrianLovesAI · April 25, 2023, 9:01pm

Surprisingly, it seems that with the API, GPT-3.5 Turbo, GPT-4, and Text-Davinci-003 are slower than Text-Davinci-002 recently. My company account’s logic was so slow with GPT-4 that I tested it directly using my Postman call. Apparently, all of them are slower than Text-Davinci-002, even with the same prompt and Text-Davinci-002’s longer text completion.

However, Text-Davinci-002’s accuracy is not good enough for my use case. Therefore, even though it is faster than the other models, I cannot use it.

Can anyone else try Text-Davinci-002 instead of the other models and see the speed?

sandermann · April 25, 2023, 11:23pm

search for “stream” on this site

cruvinel · May 12, 2023, 7:25pm

Any news or solution about this?
In ChatGpt plus I am having 8 times more token than compared to the API (gpt 3.5 turbo) in the same time.
This is affecting our project a lot.

jwatte · May 13, 2023, 10:38pm

The underlying implementation can summarize context and previous history to keep the thread of older conversations without needing to spend as many tokens.

It may also be helpful to only include user text, not completed bot text, in the context. It seems perfectly capable to keep the thread anyway when doing this.

That being said, models with less context will be faster, because the cost of the model goes up with approximately the square of context. It also goes up by a constant factor related to the number of parameters. Doing less work, will run faster, all else being the same.

allemanfredi · May 15, 2023, 12:36pm

Same here. I tried to buy the subscription but the API response time didn’t change it. With this reponse time API are not usable!

liger · May 16, 2023, 4:46pm

Having same issues on /chat/completion endpoint responses with or without stream option, both gpt-3 and gpt-4 are extremely slow. Used Postman to test and even OpenAI Playground response times are also very slow.

vincenzo.muni.dev · May 22, 2023, 7:41am

Same issue here, I’m using 3.5turbo. Is there any way to have a response from the support team? We all need to understand whether the time spent on developing can be repaid in the near future and we can offer a good service to our customers or not.

todd1 · May 29, 2023, 7:06pm

Just adding my +1 here in hopes that that helps get attention to this problem. I have to admit, I am a bit shocked that the paying members get less server preference then the free members? Seemingly. Doesn’t make fundamental sense to me?

dfilimonov · June 6, 2023, 3:17pm

The same issue. I’m using text_davinchi_003, and the average response latency is 45 seconds with a token limit of 500. What’s interesting is that when doing requests in one thread, the initial response is typically received in 10-15 seconds, but each subsequent request adds an additional 5-15 seconds to the latency time until the last one get back with error. Doing delay (up to 2 sec) between the requests doesn’t make effect.

lizardking · June 19, 2023, 10:24am

It seems like the delay is related to the “user” identifier passed in the request. I began to observe a gradual decline in response speed, which worsened as time went on. Eventually, it reached a point where it became extremely poor, leading me to believe that users would soon start expressing their dissatisfaction. Surprisingly, however, I haven’t received any complaints yet. In short, through deduction, I realized that requests using a new user identifier yielded much faster results compared to the one I had been using. I’m curious to know if others have encountered a similar situation.

Topic		Replies	Views
GPT-3.5 API is very slow. Any fix? API	31	9780	October 12, 2023
ChatGPT API responses are very slow API	31	28079	December 12, 2023
Very slow response time with chatgpt-3.5 turbo model API API	17	10827	December 19, 2023
Slow Chat api responses ------ API	17	6253	December 24, 2023
API calls to davinci text 3 very slow and random speeds for identical prompts API	27	6885	December 25, 2023

Chat GPT's API is significantly slower than the website with GPT Plus

Related Topics