Developer-enforced API rate limits not working properly for fine-tuned models

Hi,

I have been using OpenAI’s project limits functionality for one of my projects as follows:

  • whitelisted several models, including gpt-4.1 and some fine-tunes (plus intermediate checkpoints of those fine-tunes)
  • set rate limits for those models, including:
    • gpt-4.1:
      • 300 000 TPM (out of the 30 000 000 available for our Tier 5 org)
      • 100 RPM (out of 10 000 available)
    • *default
      • 25 000 TPM (out of 250 000)
      • 300 RPM (out of 3 000 available)

According to the project limits page, the rate limit I set for gpt-4.1 should also apply to all gpt-4.1 fine-tunes (model names starting with ft:gpt-4.1), which I assume includes intermediate checkpoints of those fine-tunes.

However, when I use one of those intermediate gpt-4.1 checkpoint models, I get the following rate limit exceeded error:

{'error': {'message': 'Rate limit reached for project <redacted> organization <redacted> on tokens per min (TPM): Limit 25000, Used 23255, Requested 5872. Please try again in 9.904s. Visit https://platform.openai.com/account/rate-limits to learn more.', 'type': 'tokens', 'param': None, 'code': 'rate_limit_exceeded'}}

The 25 000 TPM limit this error message refers to does not match the 300 000 TPM limit that should apply. Instead, the API seems to be applying the default/fallback rate limit, which I set to 25 000 TPM.
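To rule out a misreading on my side, here is a minimal sketch (not my production code) that extracts the enforced limit from the error payload above. It assumes the message always contains a `Limit <number>` fragment, which matches the error I received but is not a documented format guarantee:

```python
import re

# Error payload as returned by the API (copied from the failing request above).
error = {
    "error": {
        "message": (
            "Rate limit reached for project <redacted> organization <redacted> "
            "on tokens per min (TPM): Limit 25000, Used 23255, Requested 5872. "
            "Please try again in 9.904s. Visit "
            "https://platform.openai.com/account/rate-limits to learn more."
        ),
        "type": "tokens",
        "param": None,
        "code": "rate_limit_exceeded",
    }
}

def applied_tpm_limit(payload: dict) -> int:
    """Pull the TPM limit the API actually enforced out of the error message."""
    match = re.search(r"Limit (\d+)", payload["error"]["message"])
    if match is None:
        raise ValueError("no 'Limit <n>' fragment found in error message")
    return int(match.group(1))

print(applied_tpm_limit(error))  # 25000 — the *default fallback, not the 300000 set for gpt-4.1
```

So the enforced limit really is the 25 000 TPM fallback, not the per-model limit.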

Am I doing something wrong, or is this a bug?

Any help is appreciated!
