OpenAI FAQ - Rate Limit Advice - Update
Rate limits can be quantized, meaning they are enforced over shorter periods of time (e.g., a limit of 60,000 requests/minute may be enforced as 1,000 requests/second). Sending short bursts of requests, or contexts (prompt + max_tokens) that are too long, can lead to rate limit errors even when you are technically below the per-minute rate limit.
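One way to avoid tripping a quantized limit is to spread requests evenly instead of sending them in bursts. The following is a minimal sketch of that idea, not an official client feature: `per_second_budget` and `Pacer` are hypothetical names introduced here, and the assumption that the per-minute limit is split evenly across seconds follows the example above rather than any documented guarantee.

```python
import time


def per_second_budget(requests_per_minute: int) -> float:
    """Per-second share of a per-minute limit, assuming an even split."""
    return requests_per_minute / 60.0


class Pacer:
    """Hypothetical helper: spaces calls at least 60 / RPM seconds apart."""

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self.clock = clock
        self.sleep = sleep
        # No previous call yet, so the first call is allowed immediately.
        self.next_allowed = float("-inf")

    def wait(self) -> float:
        """Block until the next slot is free; return the delay that was applied."""
        now = self.clock()
        start = max(now, self.next_allowed)
        if start > now:
            self.sleep(start - now)
        # Reserve the following slot one interval after this one.
        self.next_allowed = start + self.interval
        return start - now


if __name__ == "__main__":
    print(per_second_budget(60_000))  # 60,000 requests/minute as a per-second share
    pacer = Pacer(requests_per_minute=600)
    for _ in range(3):
        pacer.wait()  # call this before each request
```

Calling `pacer.wait()` before each request keeps the request rate flat, so a limit that is enforced per second sees the same load as one enforced per minute. This sketch paces request counts only; token-based limits would need a similar budget for prompt + max_tokens.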
OpenAI FAQ - How can I solve 429: ‘Too Many Requests’ errors?
Because unsuccessful requests also count toward your per-minute limit, continuously resending a failed request won’t work.
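A common alternative to immediate resending is to retry with exponential backoff and random jitter, so that each failed attempt is followed by a longer pause. The sketch below illustrates the technique under stated assumptions: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on a 429 response, and `call_with_backoff`, its parameters, and its default values are illustrative choices, not recommendations from the FAQ.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""


def call_with_backoff(send, max_retries=5, base_delay=1.0, max_delay=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call send(); on RateLimitError, wait and retry with a doubling delay.

    Raises the last RateLimitError once max_retries retries are used up.
    """
    delay = base_delay
    for attempt in range(max_retries + 1):
        try:
            return send()
        except RateLimitError:
            if attempt == max_retries:
                raise
            # Jitter stretches the wait by up to 2x so that parallel
            # clients do not all retry at the same moment.
            sleep(delay * (1.0 + rng()))
            delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_request():
        # Simulated request: fails twice with a 429, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RateLimitError("429: Too Many Requests")
        return "response"

    print(call_with_backoff(flaky_request, base_delay=0.1))
```

Because every retry still counts against the limit, capping the number of retries and growing the delay keeps failed attempts from consuming the budget that a later attempt needs in order to succeed.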