I am in usage tier 5, and the limits page in settings gives my RPM limit on gpt-3.5-turbo as 10k. However, I am getting this error message fairly frequently:
{
  "error": {
    "message": "You've exceeded the 200 request/min rate limit, please slow down and try again.",
    "type": "invalid_request_error",
    "param": null,
    "code": "rate_limit_exceeded"
  }
}
This error can occur regardless of which threads endpoint I call, but in this example the endpoint was:
/v1/threads/runs
Is there something different with the threads API or is this a case of my rate limit not being respected correctly?
The Assistants API has an undocumented rate limit on the API calls themselves, separate from the per-model limits shown on your limits page, perhaps to keep it "beta" for now. The 200 requests per minute you report is an increase from the long-standing limit of 60 requests per minute, which could be exhausted just by polling for a run to complete.
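Since polling is the usual way to hit this limit, here is a minimal sketch of a throttled polling loop that backs off when it gets rate limited. It is not tied to any particular client library: `fetch_status` is a hypothetical callable you would supply that returns the run's status string, `RateLimited` is a stand-in for whatever error your client raises on `rate_limit_exceeded`, and the intervals are guesses rather than documented values. I am also assuming "queued" and "in_progress" are the only statuses worth waiting on.

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error your client raises on rate_limit_exceeded."""


def poll_until_done(fetch_status, interval=1.0, max_backoff=30.0, timeout=300.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until the run leaves the pending states.

    fetch_status is a caller-supplied function (an assumption of this sketch)
    that returns the current run status as a string, or raises RateLimited.
    """
    pending = ("queued", "in_progress")  # assumed non-terminal statuses
    deadline = clock() + timeout
    delay = interval
    while clock() < deadline:
        try:
            status = fetch_status()
        except RateLimited:
            # Double the wait after each rate-limit error, up to max_backoff.
            delay = min(delay * 2, max_backoff)
            sleep(delay)
            continue
        if status not in pending:
            return status
        # Successful poll: go back to the base interval before the next one.
        delay = interval
        sleep(delay)
    raise TimeoutError("run did not finish before the timeout")
```

With a one-second base interval, a single poller makes at most about 60 requests per minute, so how many runs you can poll concurrently under a 200 requests/min cap depends on what else you are calling. If you poll many runs at once, raise the interval accordingly.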