The OpenAI docs (the page “Error codes” which I’m strangely not allowed to link to) only describe the meanings of HTTP status codes, and they explicitly say that some status codes can have several meanings. In particular:
429 - Rate limit reached for requests
429 - You exceeded your current quota, please check your plan and billing details
These are very different things. I would like to know when I can tell users that trying again in a short while might work, which is only in the first case, not the second.
In the case of a rate limit, the API returns an error object containing "code": "rate_limit_exceeded". What’s the value of code when the billing quota is exceeded?
I tried finding out by setting my hard usage limit to a value below my current usage so that requests would be rejected, but the limit was ignored, which fits with the post titled “API key limit does work and exceed the quota” (seriously, I can’t link to other forum posts???).
I guess that the billing error probably has a different value of code, and that would be enough to tell the difference between just these two cases. I’d like to be sure.
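Until the billing value is confirmed, one conservative option is to treat only the known rate-limit code as retryable and everything else under 429 as not retryable. Here is a minimal sketch of that idea. It assumes the error object sits under a top-level "error" key in the JSON response body, and the function name should_retry_soon is hypothetical, not part of any OpenAI library.

```python
import json


def should_retry_soon(status: int, body: str) -> bool:
    """Return True only for an error known to be a transient rate limit.

    Assumption: the response body is JSON shaped like
    {"error": {"code": "..."}}. Anything that does not match is
    treated as not retryable, which covers the billing-quota case
    without needing to know its exact code value.
    """
    if status != 429:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Unparseable body: we cannot tell the two 429 cases apart.
        return False
    error = payload.get("error") if isinstance(payload, dict) else None
    code = error.get("code") if isinstance(error, dict) else None
    return code == "rate_limit_exceeded"
```

With this, a 429 whose code is anything other than "rate_limit_exceeded" results in no "try again shortly" message to the user. That is the safe default if the billing error does turn out to use a different value.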
Besides that, I thought it would be good to read through a list of possible (or at least common) error codes to know what I should be prepared for. A good example is "code": "context_length_exceeded", which occurs when the prompt is too long. If I didn’t know about this problem, I could easily be surprised by it in production despite reasonably thorough testing. The HTTP status code is 400, which isn’t mentioned in the docs at all. So what else might I be missing?
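In the absence of an official list, one way to avoid being surprised is to handle the codes that are known and log every other one, so that new values show up during testing. A rough sketch follows. The table contains only the two codes mentioned in this post, and the names KNOWN_CODES and describe_error are made up for illustration.

```python
import logging
from typing import Optional

logger = logging.getLogger("api_errors")

# Only the two code values mentioned in this post; not an official list.
KNOWN_CODES = {
    "rate_limit_exceeded": "retry after a short wait",
    "context_length_exceeded": "shorten the prompt",
}


def describe_error(status: int, code: Optional[str]) -> str:
    """Map a known code to a handling hint and log anything unrecognized."""
    action = KNOWN_CODES.get(code) if code is not None else None
    if action is None:
        # Surface unknown codes so they can be added to the table later.
        logger.warning("unrecognized error code %r (HTTP %d)", code, status)
        return f"unhandled error (HTTP {status})"
    return action
```

For example, describe_error(400, "context_length_exceeded") returns the hint to shorten the prompt, while an unfamiliar code falls back to a generic message based on the HTTP status.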
I’m surprised to find that Google returns 0 results for "rate_limit_exceeded" "context_length_exceeded", i.e. no page mentions both error codes. So I take it there’s no documentation about the different possible errors. This seems like a strange gap worth filling.
It also makes me worry that this is intentional and that the values of code are not reliable/stable. Should I expect that checking for rate_limit_exceeded or context_length_exceeded might not work correctly in the future? Are we only supposed to rely on the HTTP status codes?