The first image shows my limits from the Limits section in account settings, which says I can do 500 RPM for o1 (the client is Tier 3). The second image shows the documentation, which lists up to 5,000 RPM for o1.
I have a scenario where I may have to make bulk requests to o1 to process thousands of images per second. I'm sure volume won't be an issue since my client's tier upgrades automatically, but I need some clarity on which number to reference to stay compliant with my client's current limits.
Am I goofy and missing something, or is there a mismatch here?
The source of truth is what is reported about your account and its models, not the documentation that lists every tier.
You can verify this in the headers returned when making an API request to o1, which look like: "x-ratelimit-limit-requests": "500", "x-ratelimit-remaining-requests": "499".
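For anyone who wants to check, here's a minimal sketch using the requests library. The x-ratelimit-* header names are the documented ones; the prompt is just a placeholder:

```python
import os
import requests

# Make one chat completion request against o1 and print the
# rate-limit headers that come back on the response.
resp = requests.post(
    "https://api.openai.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
    json={"model": "o1", "messages": [{"role": "user", "content": "ping"}]},
)

for name in (
    "x-ratelimit-limit-requests",
    "x-ratelimit-remaining-requests",
    "x-ratelimit-limit-tokens",
    "x-ratelimit-remaining-tokens",
):
    print(name, "=", resp.headers.get(name))
```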
The tokens-per-minute (TPM) limit will also be a significant impediment at Tier 3, since reasoning on even the smallest input consumes about 1,000 tokens.
The part where I said “the source of truth is…not the documentation”.
Tier 5 indeed maxes out at 1,000 RPM. o1 is not even widely deployed to all API users, so a lower rate limit for those who do have access seems understandable. The only odd thing is the comparison to o1-preview, which everyone can access at 10x higher limits; that just doesn't make sense.
Yes, I wanted to post a similar finding earlier, but the 10,000 RPM for Tier 4 threw me off.
If you're not in a big hurry, one option is to wait for the reply to @PaulBellow's inquiry and find out whether the numbers for Tiers 3 and 5, or for Tier 4, are off.
Otherwise, you can look into the Batch API or the backoff methods that are also detailed in the documentation.
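If you go the backoff route, the usual pattern is to retry on HTTP 429 with an increasing delay. A rough sketch of that approach; the function name, retry count, and base delay here are my own choices, not official recommendations:

```python
import time
import requests

def post_with_backoff(url, max_retries=5, **kwargs):
    """POST with exponential backoff on HTTP 429 (rate limited)."""
    delay = 1.0
    for attempt in range(max_retries):
        resp = requests.post(url, **kwargs)
        if resp.status_code != 429:
            return resp
        # Honor the server's Retry-After header if present,
        # otherwise fall back to our own doubling delay.
        time.sleep(float(resp.headers.get("retry-after", delay)))
        delay *= 2
    return resp  # still rate limited after max_retries attempts
```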
Got it - so you're saying the documentation is unreliable, and I should ignore it completely and rely only on the limits returned in the headers for my tier?
I would just like to know what to expect at the different tiers so I know how to plan my approach for sending API requests.
Whelp… that needs to be fixed!
I use that info all the time, and now I'm seeing it's incorrect.
I literally just used it about 30 min ago to explain to GPT what it was doing wrong.
Nice catch, ssavancvic!
Sorry for all the confusion here. (@PaulBellow and I have been coordinating on this the past couple days.) We’re looking into this and will be back with an update shortly.