Why Codex Still Feels Blind — and the Fix Could Redefine AI Coding

Codex is already strong at reading, generating, and modifying code, but it still feels too file-level for real-world software development.

The core gap is this:

Developers do not just need help writing code.
They need help understanding the system they are changing.

Today, too much important context is still missing or manually reconstructed:

  • architecture lives in people’s heads
  • pull requests are reviewed file by file
  • prompts are too generic
  • impact analysis is manual
  • validation is fragmented

This makes AI coding feel powerful, but still not fully reliable at the system level.

I think there is a major opportunity here:

Turn a codebase into an interactive, explainable system — then let AI safely modify it with full context and continuous validation.

Here is the product direction I would love to see:

  1. Codebase Map

An interactive architecture view of the repository.

Not just a file tree, but a navigable map of:

  • features
  • services
  • modules
  • components
  • data layers
  • dependencies
  • ownership boundaries

Clicking a node should explain:

  • what it does
  • which files belong to it
  • what it depends on
  • what depends on it
  • what risks are associated with changing it

Think of it like 3D Google Maps for a codebase.
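To make this concrete, here is one way such a map could be modeled: a minimal sketch assuming nothing beyond an annotated dependency graph. All names, fields, and risk labels are illustrative, not an actual Codex API.

```python
from dataclasses import dataclass, field

@dataclass
class MapNode:
    """One node in the codebase map (feature, service, module, ...)."""
    name: str                                       # e.g. "billing.invoice_approval"
    kind: str                                       # "feature" | "service" | "module" | ...
    files: list[str] = field(default_factory=list)  # files belonging to this node
    depends_on: list[str] = field(default_factory=list)
    owners: list[str] = field(default_factory=list)
    risk_notes: list[str] = field(default_factory=list)

class CodebaseMap:
    def __init__(self):
        self.nodes: dict[str, MapNode] = {}

    def add(self, node: MapNode) -> None:
        self.nodes[node.name] = node

    def dependents_of(self, name: str) -> list[str]:
        """Reverse edges: everything that depends on `name`."""
        return [n.name for n in self.nodes.values() if name in n.depends_on]

    def explain(self, name: str) -> dict:
        """What clicking a node should answer, as structured data."""
        node = self.nodes[name]
        return {
            "what": node.kind,
            "files": node.files,
            "depends_on": node.depends_on,
            "depended_on_by": self.dependents_of(name),
            "owners": node.owners,
            "risks": node.risk_notes,
        }
```

The point of the sketch is that every question in the bullet list above becomes a cheap graph query once the map exists.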

  2. Diff Intelligence

Show git changes on top of that map.

Instead of only seeing changed files, let the user immediately understand:

  • what part of the system changed
  • what other areas may be affected
  • what could break
  • who should review it
  • what remains untouched

This would make PRs far more understandable and much more useful for safe review.
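Under the hood this is reverse-dependency traversal: map changed files onto nodes, then walk the "what depends on this" edges to estimate the blast radius. A hedged sketch, with illustrative inputs rather than a real Codex interface:

```python
from collections import deque

def impact_of_diff(changed_files, node_files, depends_on):
    """
    changed_files: set of paths, e.g. from `git diff --name-only`
    node_files:    {node: set of files belonging to that node}
    depends_on:    {node: set of nodes it depends on}
    Returns (directly_changed, possibly_affected) node sets.
    """
    # 1. Which architectural nodes do the changed files land in?
    changed = {n for n, fs in node_files.items() if fs & changed_files}

    # 2. Invert the dependency edges: who depends on whom.
    dependents = {n: set() for n in node_files}
    for node, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(node)

    # 3. BFS upward from every changed node to find possibly affected areas.
    affected, queue = set(), deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected and dependent not in changed:
                affected.add(dependent)
                queue.append(dependent)
    return changed, affected
```

Everything outside `changed | affected` is the "what remains untouched" part of the review, which is often the most reassuring information a reviewer can get.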

  3. Repo-Aware Prompt Compilation

When a user gives a vague request, Codex should convert that intent into a scoped, architecture-aware implementation plan.

Example:
“Add audit logging to invoice approval”

Codex should infer:

  • which modules are involved
  • where similar patterns already exist
  • which boundaries should be preserved
  • what tests likely need updates
  • what downstream systems may be affected

That would make prompting much more precise and much less generic.
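As a sketch, the "prompt compilation" step could start as simply as matching intent keywords against map metadata. A real system would use embeddings or an LLM for the matching; all node names and fields below are made up for illustration:

```python
def compile_prompt(intent: str, codebase_map: dict) -> dict:
    """
    intent:       free-text request, e.g. "Add audit logging to invoice approval"
    codebase_map: {node: {"keywords": [...], "tests": [...], "downstream": [...]}}
    Returns a scoped, architecture-aware change plan.
    """
    words = set(intent.lower().split())
    # Naive keyword overlap stands in for real semantic matching.
    involved = [
        node for node, meta in codebase_map.items()
        if words & set(meta.get("keywords", []))
    ]
    return {
        "intent": intent,
        "involved_modules": sorted(involved),
        "tests_to_update": sorted(
            t for n in involved for t in codebase_map[n].get("tests", [])
        ),
        "downstream_to_check": sorted(
            d for n in involved for d in codebase_map[n].get("downstream", [])
        ),
    }
```

The output is the interesting part: a plan object the user can inspect and correct before any code is generated.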

  4. Execution + Validation Loop

Codex should work in a visible closed loop:

  • plan the change
  • modify files
  • run validation
  • detect failures
  • attempt repair
  • re-run checks
  • prepare for review

This includes:

  • unit tests
  • integration tests
  • end-to-end tests
  • linting
  • type checks
  • build verification

The shift is from:
“generate and hope”
to
“generate, verify, repair”

That is where trust starts to increase.
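The loop above can be sketched as a small driver. The check runner and repair hook here are placeholders you would wire to real tooling (pytest, linters, type checkers, the build), not an existing Codex feature:

```python
def closed_loop(apply_change, run_checks, attempt_repair, max_rounds=3):
    """
    apply_change():        make the planned edit
    run_checks():          return None on success, or a failure description
                           (e.g. wrap subprocess calls to pytest/linter/build)
    attempt_repair(fail):  try to fix the reported failure
    Returns True when all checks pass, False when the repair budget runs out.
    """
    apply_change()                       # plan already done; modify files
    for _ in range(max_rounds):
        failure = run_checks()           # run validation
        if failure is None:
            return True                  # ready for review
        attempt_repair(failure)          # detect failure, attempt repair
    return False                         # escalate to a human
```

The bounded `max_rounds` matters: an agent that repairs forever is as untrustworthy as one that never verifies.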

  5. Closed-Loop Development Flow

Ideal flow:

Intent → Codebase Map → Change Plan → Code Changes → Validation → Repair Loop → Review with Impact Visualization

This would move Codex from being a code editor assistant to being a system-aware development environment.

Why this matters

The next leap in AI coding is probably not just better code generation.

It is:

  • better system navigation
  • better impact understanding
  • better validation loops
  • better explainability around changes

That would unlock:

  • safer code changes
  • faster onboarding
  • stronger reviews
  • more confidence in repo-aware generation
  • a more real “intent-to-software” workflow

One-line summary:

Turn a codebase into an interactive, explainable system — then let AI safely modify it with full context and continuous validation.

I think this would feel like a natural evolution for Codex, not a disconnected feature.

Curious whether others see the same gap:
the missing layer is not more raw generation, but system-level visibility and safer execution.


Why AI output still feels stilted.

  • devoid of meaningful content
  • over-patterned phrases like “it’s not just” and “why this matters”
  • disconnected from reality
  • an illusion of intelligence, extrapolating language from very little actual input

Much better would be to provide the prompt you sent to the AI, such as "make me some text with terms unrelated to machine learning techniques, such as ‘system navigation’, ‘impact understanding’…". That would show the fragment of real human thought you have.

Sorry to burst your bubble, AI.

Not sure I appreciate the cut & paste from ChatGPT or equivalent, but there is a germ of an idea here.


_j I hope this finds you well, but I’m not here to adhere to some imaginary rules about how I should communicate my ideas. Here is the prompt that might help you understand the vision, and if you need a new assistant vision leader, ring me up.

Here is the refined version of the brainstorming session:

Codex can write code, but it still cannot really see the system.

The biggest remaining gap in AI-assisted software development is not file-level code generation, but system-level understanding.

Developers do not just need help writing code. They need help understanding the system they are changing:

  • where a change belongs
  • what it affects
  • what boundaries it crosses
  • what might break
  • how to validate it safely

Today, too much context is still missing or manually reconstructed:

  • architecture lives in people’s heads
  • PRs are reviewed file by file
  • prompts are too generic
  • impact analysis is manual or fragmented
  • validation is disconnected

This is why AI coding feels powerful but not yet fully reliable at the system level.

The opportunity is to turn a codebase into an interactive, explainable system.

That system should include:

  • a codebase map instead of just a file tree
  • “Google Maps for a codebase”
  • diffs visualized on that map
  • repo-aware prompt compilation from vague intent to a scoped, architecture-aware plan
  • a visible closed loop of plan → change → validate → repair → review

The shift is from:
“generate and hope”
to
“generate, verify, repair”

The core argument is that the next leap in AI coding is not more raw generation, but system visibility, impact awareness, and continuous validation.

This would move Codex from a code-writing assistant to a system-aware development environment and bring AI closer to a real intent-to-software workflow.

It should feel like the natural next evolution of Codex, not a disconnected feature.

P.S. I know this would amplify human redundancy, but it would make Codex the tool it can be.

You write “today, too much context is missing”, with concerns about holistic code base understanding, prompting and human shortcomings, and testing. Have a read.

This cookbook shows how to use OpenAI’s Codex CLI to modernize a legacy repository in a way that is:

  • Understandable to new engineers
  • Auditable for architects and risk teams
  • Repeatable as a pattern across other systems

Gemini 3.1 Pro says: Based on the provided OpenAI Cookbook documentation for Code Modernization, here is a distilled overview of the stepwise tasks the AI is prompted to execute before beginning the actual coding implementation. These preparatory steps (Phases 0 through 3) generate the foundational plan, architecture, and validation documents that will power the hours-long coding session.

Phase 0: Establish Planning Rules

  • Task: Define an opinionated standard for how the AI agent should plan modernization work within the repository without overwhelming the team with process.
  • Prompted Action: Instruct the AI to read the directory structure and refine its planning rules, keeping a skeleton of an “ExecPlan” and adding concrete examples.
  • AI Outputs: .agent/AGENTS.md and .agent/PLANS.md

Phase 1: Project Scoping and Executive Planning

  • Task 1: Select a Pilot. Analyze the legacy codebase to find a realistic, bounded flow for modernization.
    • Prompted Action: Ask the AI to propose 1–2 candidate pilot flows, listing the legacy programs (e.g., COBOL/JCL), the business scenario, and a final recommendation.
    • AI Output: A generated list of candidate pilot flows.
  • Task 2: Create the ExecPlan. Generate the central “home base” orchestrating document for the work.
    • Prompted Action: Instruct the AI to create an ExecPlan following .agent/PLANS.md, scoped to the chosen flow. It must outline four outcomes: inventory, technical report, target design, and a test plan.
    • AI Output: pilot_execplan.md

Phase 2: Legacy Inventory and Discovery

  • Task 1: Document Legacy Behavior. Extract exactly what the legacy code does so human engineers can reason about it without reading the old code.
    • Prompted Action: Instruct the AI to draft an inventory and Modernization Technical Report. This must include involved legacy programs, orchestration jobs, data sets, a text flow diagram, plain-language business logic, the data model, and technical risks.
    • AI Output: pilot_reporting_overview.md
  • Task 2: Align the Plan. Keep the master plan updated.
    • Prompted Action: Instruct the AI to update the ExecPlan to mark the inventory phase as drafted and log any discoveries/surprises.
    • AI Output: Updated pilot_execplan.md

Phase 3: Design, Spec, and Validation Planning

  • Task 1: Draft the Target Design. Outline the modern architecture.
    • Prompted Action: Based on the overview document, ask the AI to draft the target service design (e.g., REST API or batch), the new database model, and an API design overview.
    • AI Output: pilot_reporting_design.md
  • Task 2: Create the API Contract. Establish a language-agnostic anchor for implementation and testing.
    • Prompted Action: Instruct the AI to use the design document to generate a full OpenAPI specification featuring paths, operations, schemas, and constraints.
    • AI Output: modern/openapi/pilot.yaml
  • Task 3: Define the Test Strategy & Scaffolding. Define exactly how the team will prove the new code matches the legacy behavior.
    • Prompted Action: Ask the AI to write a test plan detailing happy paths, edge cases, and a side-by-side comparison strategy. Next, prompt it to use this plan to scaffold an initial test file with placeholder assertions.
    • AI Outputs: pilot_reporting_validation.md and modern/tests/pilot_parity_test.py
  • Task 4: Finalize the Plan for Coding.
    • Prompted Action: Instruct the AI to update the ExecPlan one last time so that the Plan of work, Concrete steps, and Validation sections explicitly point to all the newly created design, spec, and testing files.
    • AI Output: Updated pilot_execplan.md

Transition to Coding:
Once these artifacts are generated, the AI is fully primed. It transitions into Phase 4 (the actual coding challenge), using the rich context of the ExecPlan, Overview, Design, Validation Plan, API Spec, and Test Scaffolding to safely generate, test, and iterate on the modern codebase.

_j I don’t have the energy for squares like you. I just hope someone with an opinion worth processing reads the idea.