Benchmarking Rate Limits on Workplace Plan

The changes to rate limits in Codex CLI are a little frustrating. They hit me hard yesterday: I burned through the 5-hour rate limit in about 10 minutes before I figured out what was happening. That got me thinking: what do these limits actually cost compared to straight token-based access?

I ran a simple test this morning reviewing several ~20K specs on gpt-5.4 medium fast. After just five prompts, the 5h quota dropped from 100% to 79%, i.e. 4.2% consumed per review of a 20K document.

Started off here:

5h limit: [████████████████████] 100% left (resets 10:57)

Representative prompt:

review docs/working-notes/investigations/injest_signal_catalog.md

generated a signal catalog for the current injest dags.

am looking to build v2 of the validator based on a similar architecture to the one we
described for ADS.

After the 5th prompt:

5h limit: [████████████████░░░░] 79% left (resets 10:57)
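The quota arithmetic works out as a quick back-of-the-envelope calculation. A minimal sketch using just the two quota readings above:

```python
# Back-of-the-envelope: quota consumed per review, from the two readings above.
start_pct = 100.0   # quota remaining before the first review
end_pct = 79.0      # quota remaining after the fifth review
reviews = 5

consumed_per_review = (start_pct - end_pct) / reviews
reviews_per_window = start_pct / consumed_per_review

print(f"{consumed_per_review:.1f}% of the 5h quota per review")      # 4.2%
print(f"~{reviews_per_window:.0f} reviews before hitting the wall")  # ~24
```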

To see what was really happening under the hood, I benchmarked the exact same prompt + document against a local LLM (Codestral). The numbers came back clean:

slot print_timing: id 0 | task 1193 |
prompt eval time = 1614.52 ms / 742 tokens ( 2.18 ms per token, 459.58 tokens per second)
eval time = 1822.04 ms / 31 tokens ( 58.78 ms per token, 17.01 tokens per second)
total time = 3436.55 ms / 773 tokens
slot release: id 0 | task 1193 | stop processing: n_tokens = 786, truncated = 0
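For anyone who wants to pull the same numbers out of a llama.cpp-style server log, a minimal parser might look like this (the regex is written against the exact log format shown above and may need adjusting for other server versions):

```python
import re

LOG = """\
prompt eval time = 1614.52 ms / 742 tokens ( 2.18 ms per token, 459.58 tokens per second)
eval time = 1822.04 ms / 31 tokens ( 58.78 ms per token, 17.01 tokens per second)
total time = 3436.55 ms / 773 tokens"""

# Capture each "<ms> ms / <n> tokens" pair from the timing lines.
pattern = re.compile(r"=\s*([\d.]+) ms /\s*(\d+) tokens")
timings = [(float(ms), int(n)) for ms, n in pattern.findall(LOG)]

(prompt_ms, prompt_tokens), (eval_ms, output_tokens), (total_ms, total_tokens) = timings
print(prompt_tokens, output_tokens, total_tokens)  # 742 31 773
```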

To put that in plain terms: for this single LLM call, the 20K spec was tokenized down to just 742 input tokens under Codestral's tokenizer.

  • 742 input tokens (prompt eval)
  • 31 output tokens (generation)
  • Total: 773 tokens processed

Even though Codex CLI is agentic and uses more tokens than a single-pass call, burning 4.2% of the 5h quota on a 20K document review feels excessive.

On the published GPT-5.4 API pricing, that exact single-pass review would cost roughly $0.0023 (0.23 cents).
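The per-review API cost is straightforward to reproduce. The $2.50-per-million input price is the one cited later in this post; the $15-per-million output price is my assumption for illustration, since the thread doesn't state it:

```python
# Rough API cost for the single-pass benchmark above.
INPUT_PRICE_PER_M = 2.50     # cited input price, $/million tokens
OUTPUT_PRICE_PER_M = 15.00   # assumed output price, not confirmed in the thread

input_tokens, output_tokens = 742, 31
cost = (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1e6
print(f"${cost:.4f}")  # ~$0.0023
```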

I’m currently on OpenAI’s Business plan at $20 per seat per month ($80 total for our four seats). While there are no extra per-token charges, the 5-hour limit makes it far more expensive in practice than it looks. At my observed rate I can only complete about 23–24 of these reviews before hitting the wall in any 5-hour window, well short of a full workday.

Using the 773-token benchmark as a conservative baseline, that works out to an effective rate of about $31 per million tokens on the Business plan, more than 12x OpenAI's $2.50 per million input token price on the token-based API. The real multiplier is almost certainly higher, because the full agentic workflow (planning, file reads, verification loops) consumes significantly more tokens than this benchmark shows.
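To show how a figure in the ~$31/M range arises: reviews-per-window and tokens-per-review come from the numbers above, while the ~35 usable 5-hour windows per month is my assumption about a plausible usage pattern, not something stated in the post:

```python
# Effective $/million tokens on the subscription, under stated assumptions.
seat_cost_per_month = 20.0
reviews_per_window = 100 / 4.2   # ~23.8 reviews before the quota runs out
tokens_per_review = 773          # conservative single-pass baseline
windows_per_month = 35           # ASSUMED number of usable 5h windows per month

tokens_per_month = windows_per_month * reviews_per_window * tokens_per_review
effective_per_m = seat_cost_per_month / (tokens_per_month / 1e6)
print(f"~${effective_per_m:.0f} per million tokens")  # ~$31/M
```

The headline multiplier is very sensitive to `windows_per_month`; someone who exhausts more (or fewer) windows gets a proportionally different effective rate.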

Bottom line: at this kind of daily volume the token-based plan is dramatically cheaper, with no rate limit I'd ever hit. The subscription feels like a bad deal once you do the math. Kicking myself for paying for the seats two years in advance. Feels like I'm the victim of a bait and switch.

2 Likes

Very interesting analysis, thanks.

I think this is going a bit far. Let's not be naive: we've been through an early phase where customers were given incentives to use these new AI products, and of course they have been subsidised. But the creditors who have lent money to the big AI companies will want it back.

We are in uncharted territory here. I have sympathy for both consumers like yourself and the companies involved. Anthropic is going through exactly the same growing pains at the moment, and it's hard to predict where things will fall as reasoning engines get optimised.

What we do know is just how capable products like Codex can be. The outstanding question is: at what cost? At the moment it looks to be a lot more than, say, a $20-a-month sub.

There’s an estimate floating around that suggests that prior to recent rate limit reductions, Claude Code was offering $5000 of API access for $200 - that clearly was not sustainable. (Though there are debunks out there: No, Anthropic Doesn’t Spend $5,000 Per Claude Code User: A Technical Deep Dive - DEV Community)

There needs to be significant further efficiency wins or basically the average customer will not be able to afford to run tools like codex for very much longer - enjoy it whilst you can.

1 Like

That's a pretty reasonable response, but I'd point you to the fact that they're selling year-long subscriptions for the Workplace plan. When you pay for something up front, it's usually safe to expect it will remain suitable for use during that period.

I thought I was paying for flat-rate access to a frontier model with reasoning capabilities. I'm happy to deal with changes as the platform continues to evolve, but the dramatic reductions to the usage window depart from the original value proposition quite a bit.

As it stands, this just isn't suitable for my use. I've been using it to write specs to hand off to other LLMs, which usually works out pretty well. I'm not sure it fits that role anymore, and I don't like the fact that I can make a financial case for scrapping it.

At the very least, it shouldn't be this easy to show that the Workplace plan is now over 10x more expensive than token-based access.

3 Likes

I hope your analysis gets some staff views!

1 Like