GPT-3.5 concurrent requests limit

I’m trying to send concurrent requests to OpenAI’s API using Python with the asyncio library.

It all goes well except that when I send more than 10 requests at the same time, the gpt-3.5-turbo API endpoint starts throttling.
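For context, here's a minimal sketch of the kind of test harness I'm using (assuming the async client from the openai v1.x Python SDK; the prompt and timing logic are just illustrative):

```python
import asyncio
import time

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment
semaphore = asyncio.Semaphore(50)  # cap the number of in-flight requests

async def one_request(i: int) -> float:
    # Time a single chat completion while holding a semaphore slot.
    async with semaphore:
        start = time.perf_counter()
        await client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": f"Request {i}: say hello."}],
        )
        return time.perf_counter() - start

async def main(n: int) -> None:
    # Fire n requests concurrently and report the average latency.
    latencies = await asyncio.gather(*(one_request(i) for i in range(n)))
    print(f"{n} concurrent requests, avg {sum(latencies) / n:.2f} sec")

asyncio.run(main(20))
```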

Here's some data I have; all tests were run with a semaphore limit of 50.

| Number of concurrent requests | Avg. response time |
|---|---|
| 5 | 5.3 sec |
| 10 | 6.03 sec |
| 15 | 9.71 sec |
| 20 | 35.28 sec |
| 25 | 35.19 sec |
| 50 | 26.56 sec |

What's the limitation on the number of concurrent requests? I understand there are rate limits, but this looks like a concurrency limit, and I haven't found any documentation about it yet.


Have you found any solution to this yet? Thanks.

I'm working through this same issue with the TTS endpoint. If I send 3 parallel requests it's fine, but 15 is a problem. I can't find any documentation, but ChatGPT said this:

"While OpenAI doesn't publicly list exact numbers for parallel request limits, here are some general strategies to work within typical constraints:" and then it refers back to the rate limits. It appears the concurrency limit is dynamic, but it seems to fall somewhere between 5 and 10. You'll likely need a combination of a queuing system and some sort of exponential backoff, as in the sketch below.
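By "exponential backoff" I mean something like this sketch (assuming the openai v1.x Python SDK; the chat endpoint, retry count, and delay values are illustrative, and the same pattern would apply to TTS calls):

```python
import asyncio
import random

from openai import AsyncOpenAI, RateLimitError

client = AsyncOpenAI()

async def request_with_backoff(prompt: str, max_retries: int = 5):
    # Retry on rate-limit errors, doubling the wait after each attempt.
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return await client.chat.completions.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": prompt}],
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            await asyncio.sleep(delay + random.random())  # jitter the wait
            delay *= 2
```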

Take a look at the topic linked below.
It may be a good explanation of why an increasing number of concurrent requests triggers the rate-limit warning.
