Codex as an Autonomous Engineer: Refactoring OpenStack’s Front End into a Unified Streamlit Console

Hi everyone,

I wanted to share a recent project that I’ve just finished driving with Codex. Frankly, it redefined what I thought was possible with an AI engineering assistant.

Over the past few months, I’ve been using Codex to execute a full front-end migration of OpenStack. I took a forked, disconnected clone of what is arguably one of the largest Python-based open-source projects in existence: a public repo carrying over a decade of drift, bloat and technical debt. Then I pointed Codex straight at it.

The Challenge

Take multiple hybrid React/Angular/NPM front-end UIs and plugins and replace them entirely with a single, streamlined Python-only Streamlit console that integrates directly with a FastAPI sidecar and a modern CI/CD toolchain.

What Codex Achieved

In just under two weeks, Codex:

  • Re-architected the entire UI layer into six clean functional consoles:
    01_Provision → Infrastructure bootstrap
    02_Secure → IAM, key and policy management
    03_Observe → Metrics, logging and alerts
    04_Orchestrate → Workflow execution
    05_Recover → Rollback and snapshot management
    06_Assets → Resource and cost overview

  • Removed all Node/NPM dependencies, replacing them with a deterministic, Python-only runtime, shaving over 250k LOC from the repo.

  • Implemented a FastAPI sidecar, complete with request validation, CORS, and async-safe endpoints.

  • Applied strict hygiene and security controls:

    • ruff check → Zero warnings

    • mypy --strict → 100 % typing coverage

    • bandit -q -r src → Clean security audit

    • semgrep --config auto → No policy violations

    • pytest -q → Full test suite passed

How It Worked

Codex wasn’t just following prompts; it behaved like an autonomous engineer inside a CI pipeline.

Each prompt was structured like a development ticket:

  • Scope: File paths, module purpose, expected outputs

  • Constraints: No dependency drift, maintain test isolation, preserve interface contracts

  • Tests to pass: Targeted pytest paths with --maxfail=1

  • Acceptance criteria: CI tools green and zero static-analysis failures
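A ticket along those lines can be modelled as a small structured record. This is only an illustration of the prompt shape, not Codex’s actual input format; the field names follow the bullet points above and everything else is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    """One Codex prompt, structured like a development ticket."""

    scope: list[str]        # file paths, module purpose, expected outputs
    constraints: list[str]  # e.g. no dependency drift, test isolation
    tests: list[str]        # targeted pytest paths
    acceptance: str         # CI-green definition of done

    def render(self) -> str:
        """Render the ticket as the prompt text handed to the agent."""
        lines = [
            "## Scope", *self.scope,
            "## Constraints", *self.constraints,
            "## Tests to pass",
            *(f"pytest {path} --maxfail=1" for path in self.tests),
            "## Acceptance criteria", self.acceptance,
        ]
        return "\n".join(lines)


# Hypothetical example ticket for one console page.
ticket = Ticket(
    scope=["src/console/01_Provision.py: infrastructure bootstrap page"],
    constraints=["no dependency drift", "preserve interface contracts"],
    tests=["tests/test_provision.py"],
    acceptance="all CI tools green, zero static-analysis failures",
)
```

Structuring every prompt the same way is what makes the runs comparable: each ticket carries its own definition of done.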

Codex ran full loops:

  1. Generated the code

  2. Executed pytest

  3. Analysed tracebacks

  4. Applied deterministic fixes

  5. Re-ran tests until clean

The process was iterative, not merely reactive.
By the time Codex submitted a pull request, it had already passed the equivalent of a senior engineer’s pre-merge review and cleared all of GitHub’s automated CI gates.

Code Quality and Architecture

The code it produced wasn’t just functional; it was elegant:

  • Fully typed functions, clear docstrings and consistent async usage.

  • No orphan imports, no circular dependencies and no duplicated logic.

  • Cohesive internal structure with a capsule pattern for modular isolation.

  • Sensible naming conventions, strong abstraction boundaries and zero technical-debt injection.

  • Readable and maintainable — if you showed the output to a human dev team, they’d assume it was the result of weeks of careful refactoring.

Codex produced code that met a heavy-duty quality baseline on every axis at once:
pdm-managed reproducibility, strict Python-only hygiene (no Node or webpack), deterministic CI gating (ruff, black, mypy --strict, pytest, bandit, semgrep, import-linter), ≥ 95 % test coverage, full governance documentation (CONFORMANCE.md, HOWTO.local.md), signed SBOM evidence, and verified NPM-free integrity.

All code was validated autonomously by Codex, with zero manual intervention, and every change was traceable back to documentation. The resulting codebase cleanly passed all automated and manually executed CI checks — the exact opposite of what people call “AI slop”.

The Broader Implication

What impressed me most wasn’t speed; it was semantic comprehension.
Codex didn’t just manipulate files; it understood architectural intent. It could reason across modules, preserve functional boundaries, and refactor while maintaining dependency integrity.

This was the first time I’ve seen an AI system:

  • Absorb an enterprise-scale codebase

  • Execute refactors at human-engineering quality

  • Validate them autonomously through CI

  • And self-correct without intervention

That’s not a co-pilot; that’s a senior engineer operating inside the repo.

Why Share This

I wanted to share this because it shows the positive extreme of what’s possible when Codex is used in structured, disciplined workflows.
It’s not “AI coding snippets”; it’s AI-led systems engineering that’s reproducible, testable and CI-aligned.

Would love to hear whether others have pushed Codex into similar territory: large-scale migrations, complex refactors, or multi-repo orchestration. I suspect this is where the real frontier of agentic development lies.
