Let's compare our API speeds. It's too slow!

Hi,

There are so many topics complaining that the API is much slower than the Playground, and there are no useful answers yet.

Let’s compare our numbers.

I get these timings for text generation:
GPT-4: 36 sec
GPT-3.5: 11 sec
GPT-3: 5 sec

curl requests from a Unix console:

GPT-4

time curl -s https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -d '{
    "model": "gpt-4",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Write text about cats"}]
  }'

GPT-3.5-TURBO

time curl -s https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "max_tokens": 1000,
    "messages": [{"role": "user", "content": "Write text about cats"}]
  }'

GPT-3

time curl -s https://api.openai.com/v1/engines/text-davinci-003/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-YOUR_API_KEY" \
  -d '{"prompt": "Write text about cats", "max_tokens": 1000}'
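A fair comparison needs the token counts too, since each completion response includes a `usage` object. A minimal sketch of normalizing a timing by that count (the 850-token and 36-second figures below are made-up examples, not measurements):

```shell
# Hypothetical numbers: suppose the GPT-4 call above returned
# "usage": {"completion_tokens": 850, ...} and `time` reported 36 s.
completion_tokens=850
elapsed_seconds=36
rate=$(awk -v t="$completion_tokens" -v s="$elapsed_seconds" 'BEGIN { printf "%.1f", t/s }')
echo "$completion_tokens tokens in ${elapsed_seconds}s = $rate tokens/s"   # 23.6 tokens/s
```

If jq is installed, the count can be read straight from a saved response, e.g. `jq '.usage.completion_tokens' response.json`.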

P.S.: maybe someone official from OpenAI could give us some advice?

It might be a timing thing, but in general, keep an eye on https://openai-status.llm-utils.org/ .
It’s an unofficial tracker for GPT speed and can give you a fair idea of the server status at a given point in time.


That is pointless without seeing how many tokens were generated.

Report for 5 trials of gpt-3.5-turbo:
For total response time (s) | Min: 1.634, Max: 4.0, Avg: 2.59
For latency (ms) | Min: 501, Max: 2823, Avg: 1492.20
For response tokens | Min: 50, Max: 50, Avg: 50.00
For total rate (tokens/s) | Min: 12.5, Max: 30.6, Avg: 21.18
For stream rate (tokens/s) | Min: 42.48, Max: 52.46, Avg: 45.67

Report for 5 trials of gpt-3.5-turbo-16k:
For total response time (s) | Min: 1.801, Max: 2.368, Avg: 2.08
For latency (ms) | Min: 295, Max: 1105, Avg: 763.80
For response tokens | Min: 50, Max: 50, Avg: 50.00
For total rate (tokens/s) | Min: 21.11, Max: 27.76, Avg: 24.24
For stream rate (tokens/s) | Min: 33.2, Max: 42.41, Avg: 38.25

Report for 5 trials of gpt-3.5-turbo-0301:
For total response time (s) | Min: 1.867, Max: 2.251, Avg: 2.11
For latency (ms) | Min: 500, Max: 793, Avg: 670.80
For response tokens | Min: 50, Max: 50, Avg: 50.00
For total rate (tokens/s) | Min: 22.21, Max: 26.78, Avg: 23.83
For stream rate (tokens/s) | Min: 33.69, Max: 36.57, Avg: 34.82
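The rates in these reports look like simple derived quantities. A sketch of the presumed formulas, using one hypothetical trial (the numbers below are illustrative, and pairing the minimum total time with the minimum latency is an assumption):

```shell
# One hypothetical trial: 50 response tokens, 1.634 s total, 501 ms latency
tokens=50
total=1.634     # seconds from request start to last token
latency=0.501   # seconds until the first streamed token arrives

# "Total rate": tokens over the whole request, startup latency included
total_rate=$(awk -v t="$tokens" -v s="$total" 'BEGIN { printf "%.1f", t/s }')
# "Stream rate": tokens over the streaming portion only
stream_rate=$(awk -v t="$tokens" -v s="$total" -v l="$latency" 'BEGIN { printf "%.1f", t/(s-l) }')

echo "total rate:  $total_rate tokens/s"    # 30.6
echo "stream rate: $stream_rate tokens/s"   # 44.1
```

The gap between the two rates is why latency matters as much as throughput: the streaming portion can be fast even when the overall call feels slow.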

I don’t think this is a timing thing; I’ve never gotten the numbers you refer to.
My current results are 46/9/3 seconds, always about twice as long as the statistics behind that link.
I agree with my customers, who say my service is unusable.
I’m currently searching for alternatives to OpenAI (unfortunately).

It took 1.5 minutes to generate a response of a mere 11 tokens.
The API has become miserably slow in the last few days.

Same for me. If I make a call in the Playground it’s really fast; the same call with the same prompt through the API is much slower.

Well, the Playground calls the same models the API does; it’s just a wrapper around the API.

It might be worth looking at your code base, updating your libraries, and investigating your networking situation.
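For the networking check, curl's `-w` timing variables can split a request into DNS, connect, TLS, time-to-first-byte, and total time, which separates network overhead from generation time. A sketch that prints the diagnostic command for review rather than running it (sk-YOUR_API_KEY is a placeholder):

```shell
# curl -w format string: each %{...} is a built-in curl timing variable
W_FORMAT='dns: %{time_namelookup}s  connect: %{time_connect}s  tls: %{time_appconnect}s  first byte: %{time_starttransfer}s  total: %{time_total}s\n'

# Printed instead of executed so it can be reviewed (and a real key added) first
cat <<EOF
curl -s -o /dev/null -w '$W_FORMAT' \\
  -H "Content-Type: application/json" \\
  -H "Authorization: Bearer sk-YOUR_API_KEY" \\
  -d '{"model": "gpt-3.5-turbo", "max_tokens": 50, "messages": [{"role": "user", "content": "Write text about cats"}]}' \\
  https://api.openai.com/v1/chat/completions
EOF
```

If `first byte` is large but the earlier phases are small, the wait is on the model side; if DNS, connect, or TLS dominate, the problem is local networking.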

This is not true.

Technically it could be just a UI for the API, but look at this forum: there are so many topics about a significant speed difference between the two.

I agree that a number of forum users have experienced a reduction in performance, and there is an announcement that users with low usage or new accounts may not be on the lower-latency servers. That still does not rule out a local infrastructure issue, and it is worth checking.

Unfortunately, the developer forum cannot investigate issues at this level; you will need to reach out via help.openai.com to leave your details and describe your issue.

help.openai.com is useless. They closed the ticket saying my issue was resolved, but there are still too many 500 errors. The API is now basically too slow to be useful.