Why Is GPT-5.4 Mini Showing Up in My Codex Usage?

Ever since the forced update to the latest Codex build, I have noticed a massive, consistent downgrade in output quality. The difference between pre- and post-update performance is night and day.

For the longest time, I relied exclusively on GPT-5.5 High, and up until this update, the quality was phenomenal. After the update, it became completely unusable: hallucinating, outright lying, delivering substandard code, and serving up partial completions. Frankly, it started behaving exactly like the garbage Opus 4.7 release.

I was scratching my head trying to figure out what went wrong, but now I have the answer: Codex is silently downgrading users to 5.4 and 5.4 Mini behind the scenes, and I have the proof. Inspecting the system calls post-update clearly confirms it is routing to 5.4 and 5.4 Mini.

To say I am pissed off is an understatement. I deliberately avoided 5.4 in the past due to these exact quality issues and switched to Claude Code. When Opus 4.7 dropped and turned out to be trash, I migrated over to Codex, upgraded to a Pro subscription, and my productivity went through the roof.

Hi and welcome to the community!

Are you using auto-review?

No, I am not using Codex Auto Review as my main development workflow.

My current development mode is direct one-on-one interaction with the model. I give the agent specific requirements, constraints, architecture rules, and implementation instructions, then I review and verify the output myself.

I am also not relying on multi-agent automation right now, because I do not have enough confidence that it can handle my architecture correctly. The system is complex, so I prefer direct integration with the agent where I stay closely involved in the design, implementation, testing, and validation.

So in my case, Codex is not acting as an autonomous reviewer or multi-agent development system. It is more like an interactive coding assistant that I guide step by step.

Okay. I asked because some of the ‘helper’ features use these smaller models. AFAIK, this is also true for the memory features in Codex.

One more question: Your screenshot shows only 2,143 tokens?

It’s turns, not tokens. The new Codex does not seem to measure usage the same way as token usage. Usage is no longer shown in tokens, and I couldn’t find that information anywhere.

Currently, we have two reset periods: every 5 hours and weekly. No one seems to know how the percentage usage is calculated. I initially thought it was token-based, but it looks like it is not. Only OpenAI can answer that accurately.

Right now, I am more concerned about quality. I also never realized they were mixing and matching 5.5 with 5.4, even though I only selected 5.5 High.

Codex App is reporting an incorrect Knowledge cutoff in a fresh thread.

Issue:
In a new Codex App thread, I asked:
“Please output only the original text of ‘Knowledge cutoff’ as it appears in the current system context. If you do not see such a field, simply output: ‘Not found.’”

Actual output:
Knowledge cutoff: 2024-06

Environment:
  • macOS
  • Codex App bundled agent version: codex-cli 0.142.3
  • Account plan: ChatGPT Pro
  • Models tested: GPT-5.5 and GPT-5.4-Mini
  • The issue appears across models.

Local checks already completed:
  • ~/.codex/config.toml: no 2024-06
  • ~/.codex/AGENTS.md: no 2024-06
  • ~/.codex/instructions.md: no 2024-06
  • project AGENTS.md: no 2024-06
  • ~/.codex/models_cache.json: no 2024-06 / cutoff
  • ~/.codex/.codex-global-state.json: no 2024-06 / cutoff
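
For anyone who wants to repeat the local checks above, here is a minimal Python sketch that searches the same files for the strings "2024-06" and "cutoff". The file paths are the ones listed in this post. The function name scan_for_markers, the assumption that the project AGENTS.md sits in the current working directory, and the case-insensitive matching are my own choices for illustration, not anything Codex provides.

```python
from pathlib import Path

# Files named in the checks above. The project-level AGENTS.md is ASSUMED
# to be in the current working directory; adjust the path for your project.
CANDIDATE_FILES = [
    "~/.codex/config.toml",
    "~/.codex/AGENTS.md",
    "~/.codex/instructions.md",
    "AGENTS.md",
    "~/.codex/models_cache.json",
    "~/.codex/.codex-global-state.json",
]

# Strings to look for (matched case-insensitively).
MARKERS = ("2024-06", "cutoff")


def scan_for_markers(paths, markers=MARKERS):
    """Return {path: [(line_number, line_text), ...]} for lines containing a marker.

    Files that are missing or unreadable map to None, so they can be
    told apart from files that were read and contained no match.
    """
    results = {}
    for raw in paths:
        path = Path(raw).expanduser()
        try:
            text = path.read_text(encoding="utf-8", errors="replace")
        except OSError:
            results[str(path)] = None
            continue
        hits = [
            (number, line.strip())
            for number, line in enumerate(text.splitlines(), start=1)
            if any(marker.lower() in line.lower() for marker in markers)
        ]
        results[str(path)] = hits
    return results


if __name__ == "__main__":
    for path, hits in scan_for_markers(CANDIDATE_FILES).items():
        if hits is None:
            print(f"{path}: not found or unreadable")
        elif not hits:
            print(f"{path}: no match")
        else:
            for number, line in hits:
                print(f"{path}:{number}: {line}")
```

If every file comes back as "no match" or "not found", that is consistent with the value not coming from local configuration, although it only covers the files listed here.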

Conclusion:
This appears to be stale or incorrect Knowledge cutoff metadata injected or reported by the Codex App or the backend session context, not something coming from my local project or local Codex config.

Impact:
It makes it unclear whether Codex App is routing to the selected model correctly, especially when GPT-5.5 is selected.

Really? It would be nice to make that optional.

Especially considering it’s such a tiny percentage (if we go by the OP stats). Thanks, @rajeshamara!

My fear is that any errors from the small models might put the context committed to the bigger models at risk.