gpt-4-vision rate limited at 10k TPM, not 150k TPM

I'm getting a rate limit error when using gpt-4-vision that doesn't seem to align with the advertised rate limit. It's reporting a 10k tokens-per-minute limit, but what's advertised is 150,000 tokens per minute. Anyone know what could be going on?

RateLimitError: Error code: 429 - {'error': {'message': 'Rate limit reached for gpt-4-vision-preview in organization org-xxxx on tokens per min (TPM): Limit 10000, Used 4785, Requested 7191. Please try again in 11.856s. Visit to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}


Do you know your billing tier?

Tier 1 rate limits

This is a high-level summary, and there are per-model exceptions to these limits (e.g. some legacy models or models with larger context windows have different rate limits). To view the exact rate limits per model for your account, visit the limits section of your account settings.

| Model | RPM | RPD | TPM | TPD |
| --- | --- | --- | --- | --- |
| gpt-4 | 500 | 10,000 | 10,000 | - |
| gpt-4-1106-preview * | 500 | 10,000 | 150,000 | 500,000 |
| gpt-4-vision-preview * | 20 | 100 | 10,000 | - |
| gpt-3.5-turbo | 3,500 | 10,000 | 40,000 | - |
| text-embedding-ada-002 | 500 | 10,000 | 1,000,000 | - |
| whisper-1 | 50 | - | - | - |
| tts-1 | 50 | - | - | - |
| dall-e-2 | 50 img/min | - | - | - |
| dall-e-3 | 5 img/min | - | - | - |
  • The models gpt-4-1106-preview and gpt-4-vision-preview are currently in preview, with restrictive rate limits that make them suitable for testing and evaluation, but not for production usage. We plan to increase these limits gradually in the coming weeks, with the intention of matching current gpt-4 rate limits once the models graduate from preview. As these models are adopted for production workloads, we expect latency to increase modestly compared to this preview phase.
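While the preview limits stay this low, a common workaround is to catch the 429 and retry with exponential backoff, as the "Please try again in 11.856s" hint in the error suggests. A minimal sketch of the pattern in plain Python; the `RateLimitError` class below is a local stand-in for the SDK's exception (in real code you would catch `openai.RateLimitError` around your API call instead):

```python
import random
import time


class RateLimitError(Exception):
    """Stand-in for the SDK's 429 error (openai.RateLimitError)."""


def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Call fn(), retrying on RateLimitError with exponential backoff + jitter."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the 429
            # wait base_delay * (1, 2, 4, ...) with a little random jitter
            time.sleep(base_delay * (2 ** attempt + random.random()))


# Example: a call that is rate-limited twice, then succeeds.
calls = {"n": 0}

def flaky_request():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("429: tokens per min (TPM) exceeded")
    return "ok"

print(with_backoff(flaky_request, base_delay=0.01))  # prints "ok"
```

Libraries like `tenacity` provide the same retry-with-backoff behavior as a decorator if you'd rather not hand-roll it.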

Ah, you know what, I realize now that I was using my personal account and not my org account. Was able to get past it, thanks!
