Rate limit discrepancies across fine-tuned models

Hello! I’m wondering if someone can clear up some confusion I have over the fine-tuned models I’m using.

I am currently experimenting with two new fine-tuned models I created. The first is based on gpt-4o-mini-2024-07-18, the other on gpt-4.1-mini-2025-04-14. For context, my organization is on Tier 3.

After jumping from Tier 1 to Tier 3, I no longer had the per-minute rate limit issues I was seeing with the fine-tuned model based on gpt-4o-mini. However, after tuning a new model on gpt-4.1-mini, it seems I’m running into the default rate limits (250,000 TPM and 3,000 RPM) and hitting them almost immediately.
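In case it helps with debugging: each API response includes `x-ratelimit-*` headers that report the limits actually being applied to your key for that model, which should show whether the fine-tuned model is really being capped at the defaults. Here’s a minimal sketch of pulling those fields out of a response-headers mapping — the header names are the documented ones, but the sample values below are made up for illustration:

```python
def extract_rate_limits(headers: dict) -> dict:
    """Pull the rate-limit fields out of a response-headers mapping."""
    keys = [
        "x-ratelimit-limit-requests",
        "x-ratelimit-limit-tokens",
        "x-ratelimit-remaining-requests",
        "x-ratelimit-remaining-tokens",
    ]
    return {k: headers[k] for k in keys if k in headers}

# Fabricated example values, as if taken from response.headers
# after a request to the fine-tuned model:
sample_headers = {
    "x-ratelimit-limit-requests": "3000",
    "x-ratelimit-limit-tokens": "250000",
    "x-ratelimit-remaining-requests": "2999",
    "x-ratelimit-remaining-tokens": "249100",
    "content-type": "application/json",
}
print(extract_rate_limits(sample_headers))
```

Comparing the `limit` headers returned for the gpt-4o-mini-based model versus the gpt-4.1-mini-based one would confirm whether the two are actually being given different limits.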

I’m not sure what’s going on here—do I just have to wait for something to kick in on OpenAI’s side before I’m able to use the fine-tuned model based on gpt-4.1-mini more freely? Why is it hitting the defaults when the rate limit page says both mini models should have the same limits? Any help or insight would be appreciated, thank you!