They aren’t going to write directly: “We took a batch of low-value accounts and routed them through an API filter buffer that simulates slow output to decrease their satisfaction. Goal: get them to pay more to return to normal.”
OpenAI rewrote the text on the “rate limits” page to:
“Organizations in higher tiers also get access to lower latency models.”
Previously: " “As your usage tier increases, we may also move your account onto lower latency models behind the scenes.”"
Lower-latency “models” makes no sense. Why leave a generation running on an overloaded, time-sliced server when it is more efficient to generate 100 tokens per second and free that processing unit for the next user? The only reason would be if there were no way to serve the current customer load without renting slow, energy-inefficient GPU instances built on older technology.
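A toy back-of-envelope illustrates the occupancy argument. All numbers here are hypothetical, and real inference servers batch many concurrent requests per GPU, so this is only a sketch of the reasoning, not how any provider actually schedules:

```python
# Toy model: one request occupies one processing slot until it finishes.
# Hypothetical numbers; real serving uses continuous batching, so this
# only illustrates why slow output wastes capacity on the same hardware.

RESPONSE_TOKENS = 1_000  # length of a typical completion (assumed)


def slot_seconds(tokens_per_second: float) -> float:
    """Seconds a request occupies its slot at a given output speed."""
    return RESPONSE_TOKENS / tokens_per_second


def requests_per_slot_hour(tokens_per_second: float) -> float:
    """How many such requests one slot can serve per hour."""
    return 3600 / slot_seconds(tokens_per_second)


for tps in (100, 20):
    print(
        f"{tps:>3} tok/s: {slot_seconds(tps):4.0f} s per request, "
        f"{requests_per_slot_hour(tps):4.0f} requests per slot-hour"
    )

# 100 tok/s:   10 s per request,  360 requests per slot-hour
#  20 tok/s:   50 s per request,   72 requests per slot-hour
```

Throttling output on the same hardware ties up the slot five times longer and serves a fifth as many requests, so it frees no capacity. Deliberately slow output only pencils out if it comes from genuinely slower hardware that would otherwise sit idle.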