I’m coding against the OpenAI API, but I send many completion requests in a short time. How can I check my rate limits without visiting my account page?
Welcome to the OpenAI community @pclnvu1009
Rate limits don’t change (unless you apply for a rate limit increase) and remain enforced at all times.
All a dev has to do is honor the TPM (tokens per minute), RPM (requests per minute) and RPD (requests per day) limits, which they can do by writing code that counts their requests and tokens.
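As a rough illustration of counting requests and tokens yourself, here is a minimal sliding-window sketch. The class name, its methods, and the injectable clock are all made up for this example, and the token count you pass in is whatever estimate you have for the request; nothing here comes from an OpenAI library.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical helper: tracks requests and tokens sent in the last `window` seconds."""

    def __init__(self, rpm, tpm, window=60.0, clock=time.monotonic):
        if rpm < 1 or tpm < 1:
            raise ValueError("rpm and tpm must be at least 1")
        self.rpm = rpm
        self.tpm = tpm
        self.window = window
        self.clock = clock            # injectable so the logic can be tested
        self.events = deque()         # one (timestamp, tokens) entry per request
        self.tokens_in_window = 0

    def _expire(self, now):
        # Forget requests that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def wait_time(self, tokens):
        """Return 0.0 if a request of `tokens` tokens fits now, else seconds to wait."""
        if tokens > self.tpm:
            raise ValueError("request is larger than the per-window token limit")
        now = self.clock()
        self._expire(now)
        fits_requests = len(self.events) < self.rpm
        fits_tokens = self.tokens_in_window + tokens <= self.tpm
        if fits_requests and fits_tokens:
            return 0.0
        # Conservative choice: wait until the oldest request leaves the window,
        # then ask again (one expiry may not free enough room).
        return self.events[0][0] + self.window - now

    def record(self, tokens):
        """Call this right after sending a request."""
        self.events.append((self.clock(), tokens))
        self.tokens_in_window += tokens
```

Before each call you would check `wait_time(estimated_tokens)`, sleep for that long if it is non-zero, check again, and then call `record(...)` once the request is actually sent.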
The response headers report your rate limits. However, that only helps indirectly: unless you make one call at a time and hold off based on the header values, other parallel calls may still be in flight, so the remaining count you read can already be out of date.
x-ratelimit-limit-requests: 200
x-ratelimit-remaining-requests: 199
x-ratelimit-reset-requests: 59.70
Older advice said that “rate limits can be quantized”; it has since been rewritten in a less technical manner: Rate Limit Advice | OpenAI Help Center
How can I see these?
x-ratelimit-limit-requests: 200
x-ratelimit-remaining-requests: 199
x-ratelimit-reset-requests: 59.70
Those are HTTP headers. How you view them will depend on how you are calling the API: it works differently for Python libraries vs. Node.js libraries vs. other mechanisms.
If you’re using curl you can see them by adding the “-i” option:
curl -i https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo",
    "messages": [
      {
        "role": "user",
        "content": "Five names for a pet walrus"
      }
    ]
  }'
For me just now that returned headers that included:
x-ratelimit-limit-requests: 5000
x-ratelimit-limit-tokens: 160000
x-ratelimit-limit-tokens_usage_based: 160000
x-ratelimit-remaining-requests: 4999
x-ratelimit-remaining-tokens: 159976
x-ratelimit-remaining-tokens_usage_based: 159976
x-ratelimit-reset-requests: 12ms
x-ratelimit-reset-tokens: 9ms
x-ratelimit-reset-tokens_usage_based: 9ms
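If you are working in Python rather than curl, a small parser along these lines could turn those headers into numbers you can act on. This is only a sketch: `parse_reset` and `rate_limit_status` are hypothetical helper names, the set of duration units is an assumption based on the values seen in this thread (such as `12ms`), and a bare number like `59.70` is assumed to mean seconds.

```python
import re

# Assumed units; the thread only shows "ms" and a bare number.
_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0, "ms": 0.001}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|h|m|s)")  # "ms" is tried before "m"


def parse_reset(value):
    """Convert a reset value such as '12ms', '1s' or '6m0s' to seconds."""
    value = value.strip()
    try:
        return float(value)  # bare number: assumed to already be seconds
    except ValueError:
        pass
    parts = _PART.findall(value)
    # Reject anything that is not entirely made of number+unit pairs.
    if not parts or "".join(num + unit for num, unit in parts) != value:
        raise ValueError(f"unrecognized duration: {value!r}")
    return sum(float(num) * _UNIT_SECONDS[unit] for num, unit in parts)


def rate_limit_status(headers):
    """Collect the x-ratelimit-* request and token headers into a dict."""
    lowered = {key.lower(): val for key, val in headers.items()}
    status = {}
    for kind in ("requests", "tokens"):
        status[kind] = {
            "limit": int(lowered[f"x-ratelimit-limit-{kind}"]),
            "remaining": int(lowered[f"x-ratelimit-remaining-{kind}"]),
            "reset_seconds": parse_reset(lowered[f"x-ratelimit-reset-{kind}"]),
        }
    return status


# Header values copied from the curl output in this thread.
sample = {
    "x-ratelimit-limit-requests": "5000",
    "x-ratelimit-limit-tokens": "160000",
    "x-ratelimit-remaining-requests": "4999",
    "x-ratelimit-remaining-tokens": "159976",
    "x-ratelimit-reset-requests": "12ms",
    "x-ratelimit-reset-tokens": "9ms",
}
status = rate_limit_status(sample)
print(status["requests"]["remaining"], status["tokens"]["remaining"])
```

Any mapping of header names to string values should work as input, for example the headers object of whatever HTTP client you use to make the call.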
Is it per minute or per day? I assume it is per day; how can I check the per-minute limits? I have an app where users can use their own API key, so I need to know the RPM and TPM, since each user is different.
The rate limit is aggregated across your whole organization and shared within classes of models. All models have request-per-minute or token-per-minute limits that can block your API call if exceeded, and just a few models have daily limits, mostly at the lower organization trust tiers, which are based on past payment.
If you have multiple users, they typically aren’t made aware of the API rate (just as you don’t need to know the 150k tokens per second I can send); their limits are instead personal, based on budgeting. You would need to implement your own system for user management while also backing off the API when requests approach the limit.
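One way to picture that split is a per-user budget kept on your side plus a backoff delay for the shared API limit. The sketch below is an assumption-laden illustration: `UserBudgets`, `backoff_delay`, and the budget numbers are invented for the example, and the backoff shown is plain exponential backoff with jitter, not anything the API prescribes.

```python
import random


class UserBudgets:
    """Hypothetical per-user token budgets, separate from the organization-wide API limit."""

    def __init__(self, default_budget):
        self.default_budget = default_budget
        self.used = {}  # user id -> tokens consumed so far

    def remaining(self, user_id):
        return self.default_budget - self.used.get(user_id, 0)

    def try_spend(self, user_id, tokens):
        """Record the spend and return True, or return False if the user cannot afford it."""
        if tokens > self.remaining(user_id):
            return False
        self.used[user_id] = self.used.get(user_id, 0) + tokens
        return True


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Exponential backoff with jitter: rng() scaled by min(cap, base * 2**attempt)."""
    return rng() * min(cap, base * 2 ** attempt)


budgets = UserBudgets(default_budget=1000)
print(budgets.try_spend("alice", 400))  # True
print(budgets.try_spend("alice", 700))  # False: only 600 tokens left
print(budgets.remaining("alice"))       # 600
```

In an app you would refuse or queue a user's request when `try_spend` returns False, and sleep for `backoff_delay(attempt)` seconds before retrying a call that the API rejected for rate-limit reasons.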