Using multiple keys to avoid request queuing

If multiple requests (~100) are made simultaneously using a single API key, does OpenAI queue the requests?
And is there a way to avoid delay by switching between API keys?

Hi and welcome to the Developer Forum!

Rate limits are set at the Organisation level, so using multiple API keys under the same Org will not help. Parallel API calls are serviced in parallel, compute bandwidth permitting.

Let’s say that I have unlimited rate limits. Would multiple requests at the same time be a problem for me?

OpenAI doesn’t offer an unlimited rate limit for any product, even at the highest public tier, nor are there infinite compute resources in the world, so I’d only be able to address that hypothetically. If there were truly no limits, you would encounter no rate-limit errors, up to other constraints such as the number of open network ports an IP address or NAT router can handle.

There is no built-in queue system with the API: either your request is satisfied because it is under your rate limit at that particular moment (such as the number of requests allowed per minute), or an error is reported back to you.
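Since the server doesn’t queue for you, the client has to wait and retry when a limit error comes back. A minimal sketch of that pattern, assuming a placeholder `RateLimitError` exception and a generic `make_request` callable rather than any specific SDK’s names:

```python
import time


class RateLimitError(Exception):
    """Stand-in for the error (HTTP 429) returned when a rate limit is hit."""


def call_with_backoff(make_request, max_retries=5, base_delay=1.0):
    """Retry a request with exponential backoff when the rate limit is hit.

    There is no server-side queue, so the client must pause and resend
    the request itself.
    """
    for attempt in range(max_retries):
        try:
            return make_request()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            time.sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
```

Exponential backoff means a brief burst over the limit recovers quickly, while a sustained overload backs the client off harder each time instead of hammering the endpoint.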

You’d have to build your own queue and parallel-calling method, and your own system to measure the size of your inputs and hold back requests when they might exceed the limit - rather than sending 20,000 parallel requests at once.
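One simple way to hold back requests is to cap how many are in flight at once. A sketch using Python’s standard thread pool, where `tasks` and `max_in_flight` are illustrative names, not part of any SDK:

```python
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tasks, max_in_flight=20):
    """Dispatch requests in parallel, but cap concurrency below the rate limit.

    `tasks` is a list of zero-argument callables (e.g. closures that each
    make one API request). The pool never runs more than `max_in_flight`
    of them at the same time; the rest wait in the pool's internal queue.
    Results come back in the same order as the input list.
    """
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        return list(pool.map(lambda fn: fn(), tasks))
```

Pair this with per-request retry logic and a check on input sizes (for token-per-minute limits) and you effectively have the client-side queue described above.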
