We have code that applies the same task to many files using the Assistants API and Files. (The Batch API is not an option because our code iterates automatically on each response and may choose to restart processing of the current file.)

I have 2 questions.

  1. We hit a rate limit, but we cannot tell which one. I know the https://platform.openai.com/settings/organization/limits page, but it does not tell us which limit we are stuck on. (I suspect it may be the “BATCH QUEUE LIMITS”, but we are not using the Batch API.) We also have a sufficient USD balance in the account. So the question is: how do we identify the exact limit we hit? We see this in the threads:

  2. For this process, applying the same task to many files iteratively, what is the common practice?

What is your current tier? Limits are determined by your organization’s usage tier, and different limits apply to each model. See below an overview for tier 1.

Source: https://platform.openai.com/docs/guides/rate-limits/usage-tiers

Not sure I have enough info to provide any meaningful input on this one. What are you looking to optimize for? Efficiency, cost? What’s the nature of the task?

We are on Tier 5, and I have checked that page as well.

And for:

For this process, applying the same task to many files iteratively, what is the common practice?

The story is this: there are ~200 files in a collection, and we upload them to a VectorStore. We then want the Assistants API to analyze each file, one by one. This is where we hit the rate limit, and when we restart the process 1-2 hours later we still receive the same limit error. The reason we cannot use batches is that each response has to interact with our code, and that interaction (mostly JSON with expected keys) may fail, in which case we retry the file.
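To make the loop described above concrete, here is a minimal sketch of one way it could look. It is not the poster's actual code: `analyze` stands in for whatever function sends one file to the assistant and returns the raw reply text, and `RateLimited`, `REQUIRED_KEYS`, `parse_result`, and `process_files` are hypothetical names chosen for illustration. The idea is to validate the JSON reply, retry a file whose reply is malformed, and back off exponentially when a rate limit is reported.

```python
import json
import random
import time

# Hypothetical expected keys; replace with the keys your code relies on.
REQUIRED_KEYS = {"summary", "status"}


class RateLimited(Exception):
    """Hypothetical error the caller-supplied analyze() raises on a rate limit."""


def parse_result(raw, required_keys=REQUIRED_KEYS):
    """Parse the reply text and check that every expected key is present."""
    data = json.loads(raw)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = set(required_keys) - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


def process_files(file_ids, analyze, max_attempts=4, base_delay=1.0,
                  sleep=time.sleep):
    """Run analyze(file_id) for each file, retrying bad replies and rate limits.

    Returns (results, failed): parsed replies by file id, and the last error
    message for every file that never produced a valid reply.
    """
    results, failed = {}, {}
    for file_id in file_ids:
        last_error = None
        for attempt in range(max_attempts):
            try:
                results[file_id] = parse_result(analyze(file_id))
                break
            except RateLimited as exc:
                last_error = f"rate limited: {exc}"
                # Exponential backoff with jitter before the next attempt.
                sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
            except ValueError as exc:
                # Malformed reply: restart this file's processing right away.
                last_error = f"bad reply: {exc}"
        else:
            # All attempts were used up without a valid reply.
            failed[file_id] = last_error
    return results, failed
```

Recording failures per file instead of aborting lets the remaining files proceed, and the `failed` dictionary gives a list to re-run later if the limit turns out to reset after some time.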

Have you checked that you have not exceeded the monthly limit set for your organization or the specific project under which you are running the Assistants API?

Yes. I also raised them to higher values.

Strictly speaking the error message points to either not having enough funds in your account or having exceeded your monthly limit. It sounds like you’ve already ruled that out.

If you exceeded a token limit, the error message - again strictly speaking - should read “Rate limit reached for requests”. But if you want to be on the safe side, then you’d need to check your history of API requests to the Assistant and then evaluate whether you might have consumed more tokens per minute than allowed - not entirely implausible when dealing with large documents. But you have to check it against the model you are using. If you are not using the batch API, then it can’t be related to the batch queue limit.
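As a rough sketch of that check: if you can export your request history as pairs of timestamp and total tokens used (where those numbers come from is an assumption here, for example a usage log you keep yourself), you can compute the busiest 60-second window and compare it with the tokens-per-minute limit shown for your model on the limits page. The function name `peak_tokens_per_minute` and the record format are hypothetical.

```python
from collections import deque


def peak_tokens_per_minute(records, window_seconds=60.0):
    """Return the largest token total seen in any sliding window.

    records: iterable of (unix_timestamp_seconds, total_tokens) pairs,
    an assumed export format for your own request history.
    """
    window = deque()  # requests inside the current window, oldest first
    running = 0       # token total of the requests currently in the window
    peak = 0
    for ts, tokens in sorted(records):
        window.append((ts, tokens))
        running += tokens
        # Drop requests that are a full window (or more) older than this one.
        while window and window[0][0] <= ts - window_seconds:
            running -= window.popleft()[1]
        peak = max(peak, running)
    return peak
```

If the returned peak is above the per-minute token limit for the model you are using, a token limit is a plausible cause. If it is well below, that points back to the monthly budget or to something only support can see.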

If none of this applies and the error persists, then I would reach out to support to get to the bottom of it.


Today everything seems to be working fine.

My theory is that whatever limit I hit was reset the next day.

