Is there any limit on creating threads and executing them in parallel?

Is there any limit on how many threads can be created and executed in parallel? Are there any issues or performance considerations when creating multiple threads and running them in parallel?

Threads can become an issue on your local machine, though roughly up to a thousand should work, depending on your hardware.
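As a rough sketch (the `call_api` function is a hypothetical stand-in for your actual request code), a bounded thread pool is usually safer than spawning a thousand raw threads yourself:

```python
from concurrent.futures import ThreadPoolExecutor

def call_api(i):
    # Placeholder for an actual API request.
    return i * 2

# Cap concurrency instead of creating one raw thread per request.
with ThreadPoolExecutor(max_workers=32) as pool:
    # pool.map preserves input order in its results.
    results = list(pool.map(call_api, range(1000)))

print(results[:5])  # [0, 2, 4, 6, 8]
```

The pool recycles a fixed number of worker threads, so 1,000 tasks never means 1,000 simultaneous threads.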

But I guess that’s not what you wanted to know. What really matters are the API rate limits, which differ depending on your tier level - you can look yours up here:

https://platform.openai.com/settings/organization/limits

Let’s say you start 1,000 parallel threads and after 500 you hit a rate limit (whether it is combined tokens per minute or requests per minute): the remaining requests will return an error.

The following may have changed since, so I’m not certain it still applies, but you used to be charged even for requests that returned such an error (429) - and to be honest, that’s on you to prevent.

To avoid that, you can gate each request like this:

1. Calculate the token count of your request/prompt/messages, e.g. with tiktoken.
2. Set a max_tokens value for the expected maximum response length - you can’t (easily) force the model to use the full amount.
3. Sum both and add the total to a data store / database.
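Steps 1 and 2 can be sketched like this (a minimal example; the fallback estimate is an assumption for when tiktoken isn’t installed, and the encoding name is the common `cl100k_base`):

```python
def estimate_budget(prompt: str, max_tokens: int) -> int:
    """Prompt tokens + reserved response tokens = worst-case token budget."""
    try:
        import tiktoken
        enc = tiktoken.get_encoding("cl100k_base")
        prompt_tokens = len(enc.encode(prompt))
    except ImportError:
        # Rough fallback: ~4 characters per token for English text.
        prompt_tokens = max(1, len(prompt) // 4)
    # Reserve the full max_tokens, since the response length is unknown upfront.
    return prompt_tokens + max_tokens

budget = estimate_budget("Summarize this article for me.", max_tokens=500)
```

The returned budget is pessimistic on purpose: you reserve the whole max_tokens even though the model usually responds with fewer.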

Then, for each new request, before you insert its token count into the data store, check whether you would hit the rate limit of your tier.

Only then start the request; otherwise, schedule it for execution in the next minute…

You should store your tier’s rate limits somewhere in your data store/database and update them when you move to a higher tier.
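A minimal in-memory sketch of that check (the TPM number is a placeholder - use your tier’s actual limit; a real setup would keep the counters in a shared database rather than a dict):

```python
import time
from collections import defaultdict

TPM_LIMIT = 200_000  # tokens per minute for your tier (placeholder value)

# Tokens already committed per minute window.
usage = defaultdict(int)

def try_reserve(budget_tokens: int) -> bool:
    """Reserve a token budget in the current minute window, or refuse."""
    window = int(time.time() // 60)
    if usage[window] + budget_tokens > TPM_LIMIT:
        return False  # over the limit: caller should reschedule for next minute
    usage[window] += budget_tokens
    return True

if try_reserve(1_500):
    pass  # safe to fire the request now
```

Requests that don’t fit in the current minute simply get retried in the next window instead of burning a 429.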

Rate limits per tier can be looked up here

https://platform.openai.com/docs/guides/rate-limits/usage-tiers