Title: Hard usage limits + no visibility are breaking agent workflows (Codex + ChatGPT subscription)
I’m running an agent-based workflow (Hermes) using ChatGPT OAuth with the Codex backend, and I’m consistently hitting a major usability issue: hard usage limits with zero visibility or warning.
This isn’t about wanting more quota. It’s about not being able to plan or complete work reliably.
Here’s what’s happening in practice:
- The agent hits:
  HTTP 429: usage_limit_reached
  plan_type: plus
  resets_in_seconds: 8500–15000+
- This results in full lockouts of 2–4+ hours.
- There is no warning beforehand, no indication I’m close to a limit, and no way to estimate whether a task will complete.
- When the limit is hit:
  - The system stops immediately (hard stop)
  - I cannot send a “pause”, “stop”, or “summarize state” command
  - The agent cannot checkpoint progress
  - The job is effectively lost
- This is especially problematic for agents because:
  - One user task may generate many internal calls (planning, tool use, retries)
  - Usage is consumed much faster than expected
  - There is no visibility into how much is being consumed per task
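For anyone who wants to at least detect the lockout cleanly on the client side, here is a minimal Python sketch of parsing that error. The field names (`usage_limit_reached`, `plan_type`, `resets_in_seconds`) are the ones I see in the 429. The JSON nesting under an `error` key and the `type` field are assumptions on my part, and `parse_usage_limit` is a hypothetical helper, not part of Hermes or any official client.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class UsageLimit:
    plan_type: str
    resets_in: timedelta

    def resets_at(self, now: datetime) -> datetime:
        """Wall-clock time at which the limit should reset."""
        return now + self.resets_in


def parse_usage_limit(status: int, body: str) -> Optional[UsageLimit]:
    """Return a UsageLimit if the response is the usage-limit 429, else None."""
    if status != 429:
        return None
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    # ASSUMPTION: fields may sit under "error" or at the top level.
    err = payload.get("error", payload)
    if not isinstance(err, dict) or err.get("type") != "usage_limit_reached":
        return None
    seconds = err.get("resets_in_seconds")
    if not isinstance(seconds, (int, float)):
        return None
    return UsageLimit(
        plan_type=str(err.get("plan_type", "unknown")),
        resets_in=timedelta(seconds=seconds),
    )


if __name__ == "__main__":
    sample = json.dumps(
        {"error": {"type": "usage_limit_reached",
                   "plan_type": "plus",
                   "resets_in_seconds": 8500}}
    )
    limit = parse_usage_limit(429, sample)
    if limit is not None:
        print("locked out until", limit.resets_at(datetime.now(timezone.utc)))
```

For scale: 8500 seconds is roughly 2.4 hours and 15000 seconds is roughly 4.2 hours, which is where the 2–4+ hour lockouts come from. This only tells you how long you are locked out after the fact; it does nothing about the missing warning.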
I’ve also seen inconsistent model usage:
- Even when configured for gpt-5.4-mini, logs show requests hitting gpt-5.4
- There is no transparency into fallback behavior
- This can spike usage unexpectedly and trigger lockouts
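If you want to check whether you are seeing the same mismatch, a small audit over your own request logs can count how often the served model differs from the configured one. This is a rough Python sketch. The record shape (a dict with a `model` key) is an assumption about what your agent logs, and the allowance for a dated suffix such as `-2026-01-15` is also an assumption, so adjust both to your setup.

```python
import re
from collections import Counter
from typing import Iterable, Mapping


def same_model(configured: str, actual: str) -> bool:
    """True if `actual` is the configured model, optionally with a dated suffix.

    ASSUMPTION: served model names may append -YYYY-MM-DD to the configured
    name. A plain prefix check would be wrong here, because "gpt-5.4" is a
    prefix of "gpt-5.4-mini".
    """
    if actual == configured:
        return True
    pattern = re.escape(configured) + r"-\d{4}-\d{2}-\d{2}"
    return re.fullmatch(pattern, actual) is not None


def audit_models(configured: str, records: Iterable[Mapping]) -> Counter:
    """Count logged requests whose model differs from the configured one."""
    mismatches: Counter = Counter()
    for record in records:
        actual = record.get("model")
        if actual is None:
            continue  # record carries no model information
        if not same_model(configured, actual):
            mismatches[actual] += 1
    return mismatches


if __name__ == "__main__":
    # Hypothetical log records, for illustration only.
    logs = [
        {"model": "gpt-5.4-mini"},
        {"model": "gpt-5.4"},
        {"model": "gpt-5.4"},
    ]
    print(audit_models("gpt-5.4-mini", logs))
```

This cannot explain why a heavier model was used. It only gives a count to attach to a bug report.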
What would solve this:
- Usage transparency
  - A visible usage meter or remaining capacity estimate
  - A warning before hitting limits
  - A pre-flight check: “this task may exceed remaining usage”
- Graceful limit handling
  - Let in-progress tasks finish, OR
  - Provide a wind-down buffer instead of a hard stop
  - Allow at least one final request (e.g., summarize progress)
- Dynamic model routing
  - Allow agents to use multiple available models intelligently
  - Use stronger models for reasoning, lighter models for simple steps
  - Automatically downgrade when nearing limits
  - Prevent silent fallback to heavier models
- Agent-aware controls
  - Task size estimation (small / medium / large)
  - Checkpointing during long tasks
  - Ability to block tasks that cannot complete within remaining usage
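Until something like this exists on the service side, the closest client-side approximation I can think of is a local checkpoint plus a crude pre-flight check. The Python sketch below is only that. `TaskState`, `save_checkpoint`, `load_checkpoint`, and `preflight_ok` are hypothetical names, and `call_budget` is a number you have to guess yourself, since the real remaining quota is not exposed.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class TaskState:
    task_id: str
    completed_steps: List[str] = field(default_factory=list)
    calls_used: int = 0


def save_checkpoint(state: TaskState, path: str) -> None:
    """Write state atomically so a hard stop cannot leave a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(asdict(state), handle)
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def load_checkpoint(path: str) -> Optional[TaskState]:
    """Return the last saved state, or None if no checkpoint exists."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as handle:
        return TaskState(**json.load(handle))


def preflight_ok(estimated_calls: int, calls_used: int,
                 call_budget: int, reserve: int = 1) -> bool:
    """Refuse a task that would not fit in the self-imposed budget.

    `reserve` holds back calls for a final "summarize progress" request.
    ASSUMPTION: `call_budget` is a local guess, not a value from the service.
    """
    return calls_used + estimated_calls + reserve <= call_budget


if __name__ == "__main__":
    state = load_checkpoint("task_state.json") or TaskState(task_id="demo")
    if preflight_ok(estimated_calls=5, calls_used=state.calls_used, call_budget=40):
        state.completed_steps.append("plan")
        state.calls_used += 1
        save_checkpoint(state, "task_state.json")
```

Saving after every step means a hard stop costs at most one step instead of the whole job. The pre-flight check is only as good as the budget guess, which is exactly why a real usage meter is needed.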
Right now, the system behaves like a black box with a hard cutoff. That makes agent workflows unreliable because work can be interrupted with no warning and no recovery path.
Again, I’m not asking for unlimited usage, just enough transparency and control to use the system responsibly.
Would love to hear if others are running into the same issue or have found workarounds.