GPT-4 performance is not acceptable for production use cases

After receiving access to the GPT-4 models, we've rolled back to the GPT-3.5 Turbo models for all of our use cases.

A comparison of the same prompt sent to both models through the Chat Completions API can be seen below. That is the 95th-percentile latency…it's even slower on some of our other prompts.
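For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of a p95 timing harness. The helper names and the commented-out API call are my own illustration, not from the original post; it times any zero-argument callable, so you can wrap whichever client call you use.

```python
import math
import time


def nearest_rank_p95(samples):
    """95th-percentile latency using the nearest-rank method."""
    ordered = sorted(samples)
    return ordered[math.ceil(0.95 * len(ordered)) - 1]


def measure_p95(call, n=20):
    """Invoke `call` n times and return the p95 wall-clock latency in seconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        latencies.append(time.perf_counter() - start)
    return nearest_rank_p95(latencies)


# Hypothetical usage against the Chat Completions API (client setup omitted):
# measure_p95(lambda: client.chat.completions.create(
#     model="gpt-4",
#     messages=[{"role": "user", "content": "Hello"}]))
```

Running the same harness once per model, with an identical prompt and `max_tokens`, gives directly comparable numbers; note that p95 over a small `n` is noisy, so more iterations give a steadier estimate.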

Will we see performance improvements in future releases?


GPT-4 is notoriously slow at the moment, as OpenAI is still scaling its infrastructure to handle all the users. While it may be slower, is GPT-4 doing better in terms of response quality? @jredl


I’ve had a similar experience of GPT-4 being slower than GPT-3.5 Turbo. But it’s worth keeping in mind that the former is still in limited beta, and we’ll likely see speed-ups over time.

Yes, the quality of the responses was much better with GPT-4. However, the API response times aren’t acceptable from a user perspective.

We’re eagerly awaiting the “non-limited” rollout.


While what you say is true, I believe the API latency comes down to bottlenecks in OpenAI’s serving infrastructure rather than anything about the model itself.