I am hitting the following 429 error when using o3:
RateLimitError: Error code: 429 - {'error': {'message': 'Request too large for organization org-<MY_ORG> on tokens per min (TPM): Limit 250000, Requested 258762. The input or output tokens must be reduced in order to run successfully. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}
When I follow the provided URL, it says my limit for this model is 30,000,000, not 250,000 as the error message indicates (my org is on usage tier 5).
Any advice?
o3: 200,000 token context window
Your input tokens, plus the room reserved for max_completion_tokens of output, cannot exceed the model's own context window.
OpenAI likely put the rate limiter and its token estimator in front of the real token encoding, so unserviceable requests can be denied without the cost of fully tokenizing them.
Are you hitting that with a single request that large? Or are you making enough requests in parallel that only the cumulative total across them reaches the limit?
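If it helps, here is a minimal sketch of a pre-flight check you could run before sending a request, so oversized prompts are caught locally instead of by the API. The ~4 characters/token ratio is a crude heuristic for English text (for exact counts you'd use a real tokenizer such as tiktoken); the `MAX_COMPLETION_TOKENS` value is a hypothetical placeholder, not something from your setup.

```python
# Hedged sketch: estimate whether a prompt fits o3's 200k context window
# before calling the API. Uses a rough ~4 chars/token heuristic rather than
# a real tokenizer, so treat results as approximate.

MODEL_CONTEXT_WINDOW = 200_000   # o3 context window (input + output combined)
MAX_COMPLETION_TOKENS = 8_000    # hypothetical output budget you plan to request

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_context(prompt: str,
                 max_completion_tokens: int = MAX_COMPLETION_TOKENS) -> bool:
    """True if the estimated prompt size plus reserved output fits the window."""
    return estimate_tokens(prompt) + max_completion_tokens <= MODEL_CONTEXT_WINDOW

# A ~1,000,000-character prompt (~250k estimated tokens) would not fit:
print(fits_context("x" * 1_000_000))  # False
print(fits_context("Short prompt"))   # True
```

In your case the request weighed in at 258,762 tokens, so a check like this would have flagged it before the 429 ever came back.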
Thank you!
Yes, it’s one large request.
I see now that my request is too large for the model, but I was thrown by the fact that the error message says the limit is 250k when it's actually 200k, and that it reports a TPM limit when it should really be the model's absolute context-length limit. But, again, thanks for untangling this a bit for me!