I am a long-time Codex user. As of April 28, 2026, the GPT-5.5 model seems to be draining my quota excessively. A single task (splitting a directory into four parts, ~2k lines of code in total) took 25% of my weekly limit, whereas similar tasks typically use only 2–4%. Could there be an error in how the quota is being calculated?
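To put the numbers above in proportion, here is a minimal sketch of the arithmetic. The function name `usage_multiple` is hypothetical, and the only inputs are the percentages quoted in the post; nothing here reflects how the quota is actually computed.

```python
def usage_multiple(observed_pct: float, typical_pct: float) -> float:
    """Return how many times larger the observed usage is than the typical usage."""
    if typical_pct <= 0:
        raise ValueError("typical_pct must be positive")
    return observed_pct / typical_pct


# Figures quoted in the post: 25% for this task, 2-4% for similar tasks.
observed = 25.0
typical_low, typical_high = 2.0, 4.0

low = usage_multiple(observed, typical_high)   # compared with the 4% end
high = usage_multiple(observed, typical_low)   # compared with the 2% end
print(f"{low:.2f}x to {high:.2f}x the usual cost")
```

So the task cost somewhere between about 6 and 12 times what a comparable task usually costs.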
I’m seeing the same pattern.
For context, I’m on Plus and using Codex CLI with GPT-5.5. A recent governed research-infrastructure session consumed my full 5-hour Codex window and roughly 19% of my weekly allowance. The work was not trivial at all: it involved Plan Mode because I needed careful source inspection, generation of manifest-backed Workspace artefacts, CSV validation, and refinement planning, so I did expect meaningful usage. But the depletion still felt disproportionate compared with the actual work completed.
The most frustrating part is the 5-hour limit. My weekly balance still had 81% remaining, but I was blocked because the short-window limit reached 0%. That makes Codex difficult to use for sustained engineering workflows, even when the weekly quota is mostly untouched.
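A rough sketch of how the two meters relate, using only the figures from this session (one full 5-hour window cost about 19% of the weekly allowance). It assumes the two meters scale linearly with each other, which OpenAI has not confirmed. The function name `windows_per_week` is hypothetical.

```python
def windows_per_week(weekly_pct_per_full_window: float) -> float:
    """Number of fully drained 5-hour windows that fit in one weekly allowance.

    Assumes (unconfirmed) that weekly usage grows linearly with 5-hour usage.
    """
    if weekly_pct_per_full_window <= 0:
        raise ValueError("weekly_pct_per_full_window must be positive")
    return 100.0 / weekly_pct_per_full_window


# From the session above: draining one 5-hour window used ~19% of the week.
full_windows = windows_per_week(19.0)
print(f"About {full_windows:.1f} fully drained windows would exhaust the week")
```

Under that assumption the weekly allowance covers only about five exhausted windows, yet each exhausted window blocks work until it resets, however much weekly balance is left.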
What I’m seeing is less “I used a lot, therefore I hit the weekly cap” and more:
- GPT-5.5 Codex appears to burn through the short-window quota very aggressively.
- Plan Mode / xhigh seems especially expensive.
- Large context sessions become risky quickly.
- The 5-hour governor can stop the workflow long before the weekly limit is close to exhausted.
I’m adjusting my workflow by moving more planning and specification work outside Codex, then using Codex only for tightly bounded repo-local execution. But even with that discipline, the current GPT-5.5 quota behaviour feels difficult to predict.
It would be helpful if OpenAI clarified whether GPT-5.5 has materially different quota accounting from earlier Codex models, especially for Plan Mode, xhigh reasoning, cached context, repo inspection, and long-running CLI sessions.
This was via the Codex CLI against my local repo. The prompt was “can you please stage, commit and open a PR in accordance with our comms standard”. It was a fairly normal governed repo workflow: inspect branch state, read the repo GitHub ops runbook, check the change & communications standard requirements, review the diff, stage 24 files, commit, push, create a PR, handle one failed gh pr create attempt (Codex always seems to trip over tmp files and the GitHub API), retry, apply labels, and verify the PR state.
It completed successfully, but it consumed 6% of my 5-hour quota.
It is difficult to understand why a local CLI PR workflow would consume that much of a 5-hour allowance.
This was gpt-5.5 medium.
Yes, I am running into the same situation. I thought I was alone!! But I see many users facing the same issue.
5.5 High drained my 5-hour limit in about 30 minutes last night, and the same thing happened this morning. This is the same work that I was doing with the earlier models on High, so I don’t think it’s something new that I am working on that is causing this. I am switching back to the earlier models until this is resolved.
5.3 Codex was the best model I’ve worked with. But now, even if I switch back to 5.3 Codex, it is not doing anything correctly. 5.5 consumes quota so fast with failure, 5.3 Codex consumes less quota with more failure. Changing model back to 5.3 Codex just causes more repetition for the same task with no result on my side.
Yesterday my engineers drained all 13 business accounts that we have within half a day. Then I allowed them API access.
Take a look at how much one task cost us with the GPT-5.4 model on Medium reasoning.
$8 for 1 TASK!
This is absolutely nuts. We are most probably going to leave Codex. This is becoming way too expensive, and it is not really possible to work fluently anymore.
https://help.openai.com/en/articles/20001106-codex-rate-card
I had no idea this was even updated. This doesn’t seem balanced at all to me.
I’m seeing similar reports from other users. It looks like GPT-5.5 has a much higher ‘reasoning weight’ or a different quota accounting logic compared to previous versions.
To help the community/support narrow this down, could you check a few things:
- Plan Mode vs. Direct: Are you using the new ‘Plan Mode’? It often runs background checks that burn quota faster.
- Context Window: GPT-5.5 might be pulling in more files/context by default now, which inflates the token count.
- Usage Dashboard: Does the ‘Usage’ page in your settings show the same 25% drop, or is it only the UI counter in Codex?
You should definitely submit a ticket through the Help Center. If a task that used 2% now takes 25%, it’s likely a miscalculation in how the 5-hour window handles the high-reasoning tokens of the new model.
I’m seeing Codex in VS Code drain the 5-hour and weekly limits very quickly again around May 3–4.
After the April 28 reset, usage felt reasonable for a few days, but now normal coding prompts are consuming quota much faster again. My workflow has not changed at all: same VS Code setup, same type of prompts, same coding tasks, and no unusually large refactors or intentionally long-running background tasks.
Since I started using Codex, the 5-hour limit used to be enough for a real coding session. Now the same kind of work can consume the limit in about 1 hour, maybe 2 hours at most.
This issue started for me after the new Pro plan was introduced. From the user side, it feels like Plus limits may have been reduced or the quota accounting changed, because the same workflow now consumes much more of the 5-hour and weekly limits than before.
Please check whether there was a new regression or quota accounting change around May 3–4 involving VS Code Codex, GPT-5.5, Plan Mode, background sessions, retries, tool calls, or context refreshes.
The current usage meter does not show enough detail to know which task or session consumed the quota. It would really help to have a per-task/session breakdown and a clear explanation of how the 5-hour and weekly percentages are calculated.
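Until such a breakdown exists, one workaround is to note the meter readings by hand before and after each task and compute the difference. The sketch below is only that: the `TaskReading` class and `summarize` function are hypothetical, nothing here calls a Codex or OpenAI API, and the sample readings are made-up placeholders rather than measurements.

```python
from dataclasses import dataclass


@dataclass
class TaskReading:
    """Meter readings (percent remaining) typed in by hand around one task."""
    label: str
    five_hour_before: float
    five_hour_after: float
    weekly_before: float
    weekly_after: float

    @property
    def five_hour_used(self) -> float:
        return self.five_hour_before - self.five_hour_after

    @property
    def weekly_used(self) -> float:
        return self.weekly_before - self.weekly_after


def summarize(readings: list[TaskReading]) -> list[TaskReading]:
    """Return the tasks ordered from most to least 5-hour quota consumed."""
    return sorted(readings, key=lambda r: r.five_hour_used, reverse=True)


# Placeholder values for illustration only, not real measurements.
log = [
    TaskReading("open PR", 100.0, 94.0, 81.0, 80.0),
    TaskReading("refactor module", 94.0, 70.0, 80.0, 75.0),
]

for r in summarize(log):
    print(f"{r.label}: {r.five_hour_used:.0f}% of 5-hour window, "
          f"{r.weekly_used:.0f}% of weekly")
```

A log like this will not explain why a task was expensive, but it does show which tasks moved the meters and by how much, which is useful detail to attach to a support ticket.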
I haven’t used Codex in two days, including today, yet my quota is still draining while idle. The weekly quota decreased by 1% and the 5-hourly by 4% without any activity.
I encountered the same problem but with much higher consumption without using it: 11%!
It is not reasonable to use Codex anymore. It is better to host your own models on GPU servers for groups, or to move to Claude Code, which is much cheaper and better in general. Codex is good but way too expensive to be reasonable to use…
In December/January, I was able to create an entire game in four days without hitting any limits. Currently, the 5-hour limit runs out within an hour, and after four such sessions, I’ve almost used up my weekly limit in just two days—and I’m using GPT-5.3-codex.
We have experienced similar problems and have switched the team back to 5.4
I saw it as a test to see how much they could extract, since Anthropic was pulling ahead after their snafu with the govt. People were flocking to Claude like bees to honey, so they had to figure out how to extract more. At first I thought, okay, it’s draining pretty quickly, I’ll top up. But then after like 2 asks the $40 I topped up with was gone. Then one week I spent 100 over my plan, and now I just refuse to pay anymore. It's just gamified, and I think they're testing the upper limit of how much they can get back from the consumer (that was my perception of the situation at least).
Because of the rate at which 5.5 (and 5.4) absolutely torches quota, I’ve dropped back to the 5.3 model with careful agents/skills and prompt engineering. The token burn vs. model quality just isn’t worth it. No point in using a whizbang model if it does nothing but burn tokens for minimal output.
I’m not sure what’s going on, but when it starts to use tools the remaining usage drops very quickly. I’m not using anything different, just Playwright and testing tools.
