One API key used by multiple users at the same time

Hi, all

I created a Streamlit application and I use an API key with the 4.5 turbo model.
The question is: can multiple people use this application at the same time?

Yes, as long as you do not exceed the rate limits for your account.
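If you do exceed the limits, requests fail with a 429 error rather than slowing down, so the usual pattern is to catch that and retry with backoff. A minimal sketch, assuming the openai Python package and OPENAI_API_KEY in the environment; the model name is a placeholder:

```python
# Minimal retry-on-429 sketch. Assumes the openai Python package (>= 1.0)
# and OPENAI_API_KEY set in the environment. Model name is a placeholder.
import time

from openai import OpenAI, RateLimitError

client = OpenAI()

def ask(prompt: str, max_retries: int = 5) -> str:
    delay = 1.0
    for _ in range(max_retries):
        try:
            resp = client.chat.completions.create(
                model="gpt-3.5-turbo",  # placeholder model name
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content
        except RateLimitError:
            # Hit the rate limit (429): back off exponentially and retry.
            time.sleep(delay)
            delay *= 2
    raise RuntimeError("still rate limited after retries")
```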

There is a delay in the response when multiple users use the same API key at the same time. Is there any workaround for that, such as parallel processing?

  • What kind of delay are you experiencing?
  • How did you measure this delay?
  • Please be more specific as to what you mean by users using the same API key at the same time:
    • How many users?
    • How close together in time are these multiple requests?

To the best of our knowledge, there is no mechanism that slows responses short of hitting the rate limit, but it doesn’t sound like you’re experiencing 429 errors.

I hit this all the time. Max throughput for a single API key is with 4 or fewer simultaneous requests. Beyond that, things start to bottleneck. Requests don’t fail; they just take longer.

I don’t have a workaround at this point.
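If anyone wants to reproduce it, this is roughly the kind of harness I’d use: fire the same request at several concurrency levels and log per-request latency. A sketch assuming the openai Python package; model and prompt are placeholders:

```python
# Timing harness: send n identical requests simultaneously and record how
# long each one takes. Model and prompt are placeholders; assumes
# OPENAI_API_KEY is set in the environment.
import time
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI()

def timed_request(_: int) -> float:
    start = time.perf_counter()
    client.chat.completions.create(
        model="gpt-3.5-turbo",  # placeholder
        messages=[{"role": "user", "content": "Say hi."}],
    )
    return time.perf_counter() - start

for n in (1, 2, 4, 8, 16):
    with ThreadPoolExecutor(max_workers=n) as pool:
        latencies = list(pool.map(timed_request, range(n)))
    print(f"{n} simultaneous: avg {sum(latencies) / n:.2f}s, max {max(latencies):.2f}s")
```

If the bottleneck is real, average latency should stay flat up to about 4 workers and climb after that.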

Hmmm…

I can try to test it tonight.

Let me know if you’ve already done these tests:

  • Making multiple requests from different devices on different networks
  • Different API keys from the same project
  • Different API keys from different projects

Just trying to isolate whether the throttling you’re observing is enforced per API key, project, organization, or device. Here’s a sketch of the API-key test, if it helps (see below).
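Run an identical burst under each key and compare worst-case latency. Key values and the model name are placeholders:

```python
# Does the slowdown follow the API key? Run the same burst of simultaneous
# requests under each key and compare worst-case latency. The key values
# and model name are placeholders.
import time
from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

KEYS = {
    "key_a_same_project": "sk-...",   # placeholder
    "key_b_other_project": "sk-...",  # placeholder
}

def worst_latency(client: OpenAI, n: int = 8) -> float:
    def one(_: int) -> float:
        start = time.perf_counter()
        client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder
            messages=[{"role": "user", "content": "Say hi."}],
        )
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=n) as pool:
        return max(pool.map(one, range(n)))

for label, key in KEYS.items():
    print(f"{label}: worst of 8 simultaneous = {worst_latency(OpenAI(api_key=key)):.2f}s")
```

If both keys slow down together, the limit is probably enforced per project or organization rather than per key.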

I’m going to assume they’re using Azure API Management for throttling, and this is likely a side effect of going through that service. They may or may not be intentionally slowing down incoming requests, but this service does act as a proxy, and it really depends how they’re talking to the internal server plane from the API Management layer. There could be some natural queuing going on.
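If that’s what’s happening, you can at least keep latency predictable by capping in-flight requests on your side rather than letting them pile up in their queue. A sketch with the async client; the cap of 4 just mirrors the observation above:

```python
# Cap client-side concurrency with a semaphore so requests queue locally
# instead of upstream. The cap of 4 matches the observed sweet spot; adjust
# for your tier. Model name is a placeholder.
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()
cap = asyncio.Semaphore(4)  # at most 4 requests in flight at once

async def ask(prompt: str) -> str:
    async with cap:
        resp = await client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

async def main() -> None:
    answers = await asyncio.gather(*(ask(f"Question {i}") for i in range(16)))
    print(f"got {len(answers)} responses")

asyncio.run(main())
```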

It’s been several months since I tested this, but I was Tier 5 at the time. Tier could definitely play a role.

A few more data points…

I wasn’t able to determine whether it was IP related, and at the time I didn’t have multiple Tier 5 accounts (now I do), so I couldn’t test whether account sharding is a viable workaround.

From the same machine, I was able to test sending requests to OSS models hosted on Fireworks.ai, and I wasn’t seeing the issue. I could send 16 simultaneous requests to Fireworks.ai before I’d start seeing slowdowns, which I’d attribute to standard I/O and threading overhead. Comparing that to OpenAI requests, there’s a clear slowdown after 4 simultaneous requests that resembles queuing.
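If anyone wants to replicate the comparison, the same timing harness from earlier works; the only change is pointing the client at an OpenAI-compatible endpoint. The Fireworks base URL and model name below are assumptions from memory, so check their docs:

```python
# Point the same burst harness at a second, OpenAI-compatible provider.
# The base URL and model name are assumptions; the API key is a placeholder.
from openai import OpenAI

fireworks = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed endpoint
    api_key="fw-...",  # placeholder
)

resp = fireworks.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # placeholder
    messages=[{"role": "user", "content": "Say hi."}],
)
print(resp.choices[0].message.content)
```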