The main bottlenecks that slow down the API response

Hi,

I am using different text/image models via the OpenAI API. I notice the response is sometimes very slow, and I want to find a way to speed it up.

So I wonder: what are the main bottlenecks that slow down the response?

Note: I must use the gpt-4 and dall-e-3 models.

I figure the main factors may be:

  1. The length of the prompt sent to OpenAI.
  2. The number of requests I bundle into one prompt. For example, if I pack several requests into a single prompt, that will slow down the response (see the sketch after this list for splitting them up).
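If the slowdown really is from bundling several requests into one prompt, one workaround is to send them as separate, concurrent calls. Here is a minimal sketch assuming the v1.x `openai` Python library and an `OPENAI_API_KEY` environment variable; the questions are just placeholders:

```python
import asyncio
from openai import AsyncOpenAI  # assumes openai python library v1.x

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment


async def ask(question: str) -> str:
    # One small request per question instead of bundling them all
    # into a single long prompt.
    response = await client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content


async def main():
    questions = [
        "Summarize the plot of Hamlet in one sentence.",  # placeholder
        "What is the capital of Australia?",              # placeholder
    ]
    # Fire the requests concurrently; total wall time is roughly the
    # slowest single request, not the sum of all of them.
    answers = await asyncio.gather(*(ask(q) for q in questions))
    for q, a in zip(questions, answers):
        print(q, "->", a)


asyncio.run(main())
```

Note that concurrent calls count against your rate limits just the same, so this trades throughput headroom for latency.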

It’s mostly your rate-limit tier + size of prompt(s) + network health… I believe they’re working on ways to make it even faster.
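You can at least see the limits your tier gives you: the API reports them in the `x-ratelimit-*` response headers. A quick sketch, assuming the v1.x `openai` Python library (whose `.with_raw_response` accessor exposes the raw HTTP response):

```python
from openai import OpenAI  # assumes openai python library v1.x

client = OpenAI()

# .with_raw_response exposes the HTTP response so the rate-limit
# headers documented by OpenAI can be inspected.
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "ping"}],
)

for header in (
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
):
    print(header, "=", raw.headers.get(header))

completion = raw.parse()  # the usual ChatCompletion object
print(completion.choices[0].message.content)
```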

Another thing you can do is try a smaller/faster model. While DALLE2 won’t replace DALLE3, you can sometimes get by with GPT-3.5-turbo or gpt-3.5-turbo-instruct with clever prompting and a one-shot or two-shot …
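For anyone unfamiliar with the term, "two-shot" just means putting two worked examples in the message history before the real input. A minimal sketch (the sentiment task and example reviews are made up for illustration):

```python
from openai import OpenAI  # assumes openai python library v1.x

client = OpenAI()

# Two worked examples ("shots") steer the smaller model toward the
# desired output format, trading the bigger model's raw capability
# for lower latency.
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "Classify each review as positive or negative."},
        {"role": "user", "content": "The battery lasts all day."},
        {"role": "assistant", "content": "positive"},
        {"role": "user", "content": "It broke after a week."},
        {"role": "assistant", "content": "negative"},
        {"role": "user", "content": "Setup was quick and painless."},
    ],
)
print(response.choices[0].message.content)  # expected: "positive"
```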

I can’t seem to find details on latency for each tier. Any idea where that info might be?