I have read in the thread over here that GPTs usage is capped at 25 messages per 3 hours.
Maybe this is what you are hitting. Your “GPTs” limit (25) is lower than the GPT-4 limit (40) for the same 3-hour rolling window. I’m not sure how the quota algorithm blends the two, or whether you simply have to switch away from GPTs to plain GPT-4, hit the 40 there, and then fall back to GPT-3.5 (effectively unlimited) until your quota bucket(s) refill with additional requests.
From my experience using the API, the bucket is refilled continuously, so you don’t have to wait a full 3 hours to use it again. Depending on your usage pattern, you may only have to wait 10 minutes or so.
For example, if you spread your usage out relatively evenly, there is not much of a delay. If you burn through it all at once, you will have to wait closer to the full 3 hours. So it is usage-pattern dependent.
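To make the “continuously refilling bucket” idea concrete, here is a minimal token-bucket sketch in Python. The cap (40) and 3-hour window mirror the GPT-4 numbers above, but the refill logic is my assumption about how such a limiter might work, not OpenAI’s actual implementation:

```python
# Token-bucket sketch of a continuously refilling quota.
# CAP and WINDOW mirror the GPT-4 numbers discussed above; the refill
# mechanics are an assumption, not OpenAI's actual implementation.

CAP = 40                      # max requests the bucket can hold
WINDOW = 3 * 60 * 60          # 3-hour window, in seconds
REFILL_RATE = CAP / WINDOW    # tokens trickling back per second

class QuotaBucket:
    def __init__(self, now=0.0):
        self.tokens = float(CAP)
        self.last = now

    def _refill(self, now):
        # Tokens accrue continuously, capped at CAP.
        self.tokens = min(CAP, self.tokens + (now - self.last) * REFILL_RATE)
        self.last = now

    def try_request(self, now):
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = QuotaBucket()
# Burst pattern: spend everything at t=0 -> the 41st request is refused.
burst = [bucket.try_request(0.0) for _ in range(41)]
print(burst.count(True))          # 40 allowed, the last one refused

# But ~10 minutes later a couple of tokens have trickled back in
# (40 per 3 hours is roughly 1 token every 4.5 minutes).
print(bucket.try_request(600.0))  # True
```

This is why even usage feels delay-free while a big burst forces a long wait: the burst drains the bucket to zero, and then you are rate-limited to the slow trickle of the refill.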
This is done to even out the load on the AI servers. As the situation improves (more servers, more efficient models, optimized model designs like the Turbo series), you will see the caps go up.
Or go with the API and go crazy. The API quotas are based on your usage Tier (determined by past payments), but they are generous and virtually unlimited, especially as your Tier gets upgraded over time from Tier 1 to Tier 5.