Why is the x-ratelimit-reset-requests time so small, and what is its significance?

x-ratelimit-reset-requests
The time until the rate limit (based on requests) resets to its initial state.

https://platform.openai.com/docs/guides/rate-limits/rate-limits-in-headers

The value comes from a formula that is still undisclosed. A value of 12ms means that, if you make no other calls, after 12 ms there will be no memory of your API call counting against you.
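To act on the header you first need it as a number. Here is a minimal sketch, assuming the value is a duration string built from `h`, `m`, `s`, and `ms` parts (such as `12ms`, `1s`, or `6m0s`); `parse_reset` is a hypothetical helper name, not part of any SDK.

```python
import re

# Assumed units for the header value; "ms" is listed first in the pattern
# so that "12ms" is not read as 12 minutes followed by a stray "s".
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")


def parse_reset(value: str) -> float:
    """Convert a duration string such as '12ms' or '6m0s' to seconds."""
    total = 0.0
    pos = 0
    for match in _PART.finditer(value):
        if match.start() != pos:
            # There is unrecognized text between two duration parts.
            raise ValueError(f"unrecognized duration: {value!r}")
        total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(value):
        # Nothing matched, or trailing text was left over.
        raise ValueError(f"unrecognized duration: {value!r}")
    return total
```

For example, `parse_reset("6m0s")` gives `360.0` seconds, and `parse_reset("12ms")` gives roughly `0.012`.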

However, cumulative calls within a period are trickier: they make the reported reset time grow, and the reset value is then of little use to you.

If you are going to approach the limit, for example in a batch job, a good policy is to keep a FIFO buffer in which you record each call's count, tokens, and timestamp, and don’t expire a call from that record until its evaluation period is up. Then total the unexpired pool, check whether it is near your limit, and hold off on more submissions until the oldest recorded call would exit.
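The FIFO policy could look roughly like the sketch below. It assumes a fixed 60-second evaluation window and limits that you supply yourself; the real server-side formula is undisclosed, so treat this as a conservative client-side estimate. The class name `SlidingWindowBudget` and its methods are hypothetical.

```python
import time
from collections import deque


class SlidingWindowBudget:
    """Client-side FIFO record of recent calls (timestamp, tokens)."""

    def __init__(self, max_requests, max_tokens, window_seconds=60.0):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window_seconds = window_seconds  # assumed evaluation period
        self._calls = deque()  # oldest call first

    def _expire(self, now):
        # Drop calls whose evaluation period is over.
        while self._calls and now - self._calls[0][0] >= self.window_seconds:
            self._calls.popleft()

    def _fits(self, requests, used_tokens, tokens):
        return (requests + 1 <= self.max_requests
                and used_tokens + tokens <= self.max_tokens)

    def wait_time(self, tokens, now=None):
        """Seconds to hold off before a call costing `tokens` would fit."""
        now = time.monotonic() if now is None else now
        self._expire(now)
        requests = len(self._calls)
        used = sum(t for _, t in self._calls)
        if self._fits(requests, used, tokens):
            return 0.0
        # Pretend the oldest calls expire one by one until the new call fits.
        for timestamp, call_tokens in self._calls:
            requests -= 1
            used -= call_tokens
            if self._fits(requests, used, tokens):
                return max(0.0, timestamp + self.window_seconds - now)
        raise ValueError("call exceeds the per-window limit on its own")

    def record(self, tokens, now=None):
        """Append a submitted call to the end of the FIFO."""
        now = time.monotonic() if now is None else now
        self._calls.append((now, tokens))
```

In a batch loop you would call `time.sleep(budget.wait_time(tokens))` before each submission and `budget.record(tokens)` after it. The optional `now` argument exists only to make the logic easy to test with fixed timestamps.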
