The changes to rate limits in Codex CLI are a little frustrating. They hit me hard yesterday: I burned through the 5-hour rate limit in 10 minutes before I figured out what was happening. That got me thinking: what do these limits actually cost compared to straight token-based API access?
I ran a simple test this morning, reviewing several ~20K specs on gpt-5.4 medium fast. After just five prompts the 5-hour quota dropped from 100% to 79%, roughly 4.2% consumed per review of a 20K document.
Started off here:
5h limit: [████████████████████] 100% left (resets 10:57)
Representative prompt:
review docs/working-notes/investigations/injest_signal_catalog.md
generated a signal catalog for the current injest dags.
am looking to build v2 of the validator based on a similar architecture to the one we
described for ADS.
After the 5th prompt:
5h limit: [████████████████░░░░] 79% left (resets 10:57)
To see what was really happening under the hood, I benchmarked the exact same prompt + document against a local LLM (Codestral). The numbers came back clean:
slot print_timing: id 0 | task 1193 |
prompt eval time = 1614.52 ms / 742 tokens ( 2.18 ms per token, 459.58 tokens per second)
eval time = 1822.04 ms / 31 tokens ( 58.78 ms per token, 17.01 tokens per second)
total time = 3436.55 ms / 773 tokens
slot release: id 0 | task 1193 | stop processing: n_tokens = 786, truncated = 0
To put that in context: for this single LLM call, the 20K spec was tokenized down to just 742 input tokens under its tokenizer.
- 742 input tokens (prompt eval)
- 31 output tokens (generation)
- Total: 773 tokens processed
Even though Codex CLI is agentic and uses more tokens than a single-pass call, burning 4.2% of the 5h quota on a 20K document review feels excessive.
On the published GPT-5.4 API pricing, that exact single-pass review would cost roughly $0.0023 (0.23 cents).
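The arithmetic behind that figure is simple enough to sketch. The $2.50-per-million input price is the one quoted below; the output-token price here is an assumption for illustration, since only the input price appears in my notes:

```python
# Rough single-pass API cost for the benchmarked call.
INPUT_PRICE_PER_M = 2.50    # USD per 1M input tokens (published price)
OUTPUT_PRICE_PER_M = 10.00  # USD per 1M output tokens (assumed)

input_tokens, output_tokens = 742, 31  # from the Codestral benchmark above

cost = (input_tokens * INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000
print(f"${cost:.4f}")  # about $0.0022 per review
```

Even doubling the assumed output price barely moves the total, since the call is dominated by input tokens.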
I’m currently on OpenAI’s Business plan at $20 per seat per month ($80 total for our four seats). While there are no extra per-token charges, the 5-hour limit makes it far more expensive in practice than it looks. At my observed rate I can only complete about 23–24 of these reviews before hitting the wall in any 5-hour window, well short of a full workday.
Using the 773-token benchmark as a conservative baseline, that works out to an effective rate of about $31 per million tokens on the Business plan, more than 12× OpenAI's $2.50-per-million input-token price on the token-based API. The real multiplier is almost certainly higher, because the full agentic workflow (planning, file reads, verification loops) consumes significantly more tokens than this single-pass benchmark shows.
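To make the effective-rate arithmetic concrete: the observed 4.2% quota cost per 773-token review implies a per-window token budget, and the per-seat fee spread over that budget gives a $/M figure. The windows-per-month number is an assumption (roughly one capped 5-hour window per day), not something I measured:

```python
# Effective $/M-token rate of the $20/seat plan, assuming every
# window is spent on reviews like the benchmarked one.
SEAT_PRICE = 20.0          # USD per seat per month
QUOTA_PER_REVIEW = 0.042   # 4.2% of the 5h window per review (observed)
TOKENS_PER_REVIEW = 773    # single-pass benchmark total (conservative)
WINDOWS_PER_MONTH = 35     # assumed: ~1 capped window per day

tokens_per_window = TOKENS_PER_REVIEW / QUOTA_PER_REVIEW  # ~18,400 tokens
monthly_tokens = tokens_per_window * WINDOWS_PER_MONTH
effective_rate = SEAT_PRICE / monthly_tokens * 1_000_000
print(f"${effective_rate:.0f} per million tokens")  # ~$31
```

If you burn fewer windows per month, the effective rate only gets worse; the $31/M figure is close to a best case for my usage pattern.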
Bottom line: for this kind of daily volume the token-based plan is dramatically cheaper and its rate limits are far looser. The subscription feels like a bad deal once you do the math. Kicking myself for paying for the seats two years in advance; it feels like a bait and switch.