I’m using Azure OpenAI. I’ve created an Assistant with function calling. The first call succeeds and returns the function call. I execute the function and return the result. The assistant then responds with a 'rate_limit_exceeded' error. I wait the specified time, then poll again, only to get the same error.
That’s because you’ve reached the rate limit of your model deployment. By default, the rate limit is very low.
Go to your deployments, select the deployment of the model that hit the rate limit, and edit it. There you can increase the rate limit (the tokens-per-minute setting).
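If you still hit the limit now and then after raising it, a small retry wrapper can help. The sketch below is only an illustration, not the official client API: `start_run` is a hypothetical callable you would write yourself to submit the run and return a `(status, error_message)` pair, and the `"rate_limited"` status string and the "Try again in N seconds" message format are assumptions. It also assumes that a run which failed with `rate_limit_exceeded` stays failed, which would explain why polling the same run again keeps returning the same error, so each retry starts a fresh run.

```python
import re
import time
from typing import Callable, Optional, Tuple


def parse_retry_seconds(message: str, default: float = 30.0) -> float:
    """Extract a wait time from a message such as 'Try again in 26 seconds.'

    The message format is an assumption; fall back to `default` if no
    number of seconds is found.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s*seconds?", message)
    return float(match.group(1)) if match else default


def run_with_retries(
    start_run: Callable[[], Tuple[str, Optional[str]]],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call the hypothetical start_run() until it is no longer rate limited.

    start_run() must create a NEW run on every call and return
    (status, error_message). Re-polling an already failed run is
    assumed to return the same failure.
    """
    for attempt in range(1, max_attempts + 1):
        status, error_message = start_run()
        if status != "rate_limited":
            return status
        if attempt < max_attempts:
            # Wait for the time suggested by the service before retrying.
            sleep(parse_retry_seconds(error_message or ""))
    return "rate_limited"
```

In your own `start_run` you would create the run, wait for it to finish, and map a failure with the `rate_limit_exceeded` code to the `"rate_limited"` status used here.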