Gitea MCP - how to run Codex like a pro - with a local private code repo

First of all, I am sure many of you guys use GitHub for a reason: you want to share code, and that makes perfect sense! Thanks for sharing with the community! However, some of us have different needs. We may want to build something that is not meant for public eyes.

And for that my favorite tool is Gitea. It is your own private Git server, it can be set up in minutes, and with this tutorial you can even avoid a few hours of debugging.

First of all: I assume you have Docker installed on your machine. If that is not the case, install it first.

Step 1:

Use this compose file:

services:
  gitea-db:
    image: mariadb:10.11
    container_name: gitea-db
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: rootpassword
      MYSQL_DATABASE: gitea
      MYSQL_USER: gitea
      MYSQL_PASSWORD: giteapassword
    volumes:
      - ./data/db:/var/lib/mysql

  gitea:
    image: gitea/gitea:latest
    container_name: gitea
    restart: unless-stopped
    depends_on:
      - gitea-db
    environment:
      USER_UID: 1000
      USER_GID: 1000
      GITEA__database__DB_TYPE: mysql
      GITEA__database__HOST: gitea-db:3306
      GITEA__database__NAME: gitea
      GITEA__database__USER: gitea
      GITEA__database__PASSWD: giteapassword
      GITEA__server__DISABLE_SSH: "false"
    volumes:
      - ./data/gitea:/data
    ports:
      - "3000:3000"
      - "2222:22"

save as docker-compose.yml

then type

docker compose up -d

Then open http://localhost:3000, create the admin user “gitea” with the password “giteapassword” (don’t do that on a production instance .. this is for local use only) and then save the configuration.

Then create an access token for the MCP server at http://localhost:3000/user/settings/applications
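
(Optional) You can sanity-check the token before wiring it into the MCP config. A minimal Python sketch, assuming the third-party requests package is installed; the token value is a placeholder:

import requests

GITEA_URL = "http://localhost:3000"
TOKEN = "paste-your-token-here"   # placeholder: use the token you just created

# Gitea's REST API accepts "Authorization: token <value>"
resp = requests.get(
    f"{GITEA_URL}/api/v1/user",
    headers={"Authorization": f"token {TOKEN}"},
    timeout=10,
)
resp.raise_for_status()
print("Token works for user:", resp.json()["login"])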

Copy it and then open VS Code.

Add this to your Codex config.toml:

[mcp_servers.gitea]
command = "docker"
args = [
  "run",
  "--rm",
  "-i",
  "--network",
  "gitea_default",
  "-e",
  "GITEA_HOST=http://gitea:3000",
  "-e",
  "GITEA_ACCESS_TOKEN=____ADD YOUR ACCESS TOKEN  HERE_____",
  "docker.gitea.com/gitea-mcp-server:latest"
]
enabled = true

Create an SSH key:

ssh-keygen -t ed25519 -C "your_email@example.com"

Then take the public key:

cat ~/.ssh/id_ed25519.pub

Then add it here: http://localhost:3000/user/settings/keys

Then add an entry to your ~/.ssh/config file (if it doesn’t exist then create it):

Host localhost
  IdentityFile ~/.ssh/id_ed25519
  IdentitiesOnly yes
  Port 2222
  User gitea

Then you need to restart VS Code, and you can start talking to your Git server via chat.

I like to let it create issues

http://localhost:3000/gitea/base/issues

where I split the code into sections, preferably modules that are loosely coupled.

That means I can run one agent per module without running into problems, e.g. two or more agents working on the same file…

Then I use a split-pane tmux instance, which looks like this:

Actually, that is what it looks like after a run.

Have fun!

9 Likes

Cool. I like the part where you produced a docker-compose.yml and a how-to (for the Docker I must install on my machine).

Curious: are you ultimately still using the ChatGPT Codex?

What about not using Githubbub for a reason, and then not ChatGPT either?

I didn’t have a problem sending API framework code over to ask for a summary, if you’d consider a “you build it” solution…

Briefly describe the technical underpinnings
  • My code is API AI + patches tool + no shell + code constraints = no Git; drop just what the AI should see into “workspace” and select files out of it for “artifact/canvas” (no dozens of tool loops exploring with shell), with the option to approve every file name for read or read/write. Anything touched with a diff is backed up.

At a high level this is a local, console-based “agent loop” that (1) snapshots a curated view of your workspace into the model’s prompt, (2) lets the model propose structured file edits via a tool call, (3) validates and applies those edits safely and atomically, and (4) repeats until the model returns normal assistant text.

What the system is made of

1) Workspace isolation + path safety

  • All file operations are constrained to workspace/.

  • PathValidator.validate_and_resolve() rejects:

    • absolute paths,
    • .. traversal,
    • backslashes (so you don’t accidentally accept Windows-style escapes),
    • reserved/ignored directories like .git, venv, .backups, etc.
  • FileWorkspace maintains an explicit allowlist (_tracked) of files the model is even allowed to “see” and modify. If it’s not tracked, it’s not in the prompt and it can’t be edited.
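
The names below are the ones from this description; the bodies are only a minimal sketch of how such a validator and allowlist might look, not the actual implementation:

from pathlib import Path

WORKSPACE = Path("workspace").resolve()
IGNORED = {".git", "venv", ".backups"}           # reserved/ignored directories

class PathValidator:
    @staticmethod
    def validate_and_resolve(rel_path: str) -> Path:
        # Reject absolute paths and Windows-style separators outright.
        if rel_path.startswith("/") or "\\" in rel_path:
            raise ValueError(f"unsafe path: {rel_path}")
        parts = Path(rel_path).parts
        # Reject traversal and reserved directories before touching the filesystem.
        if ".." in parts or any(p in IGNORED for p in parts):
            raise ValueError(f"forbidden path component: {rel_path}")
        resolved = (WORKSPACE / rel_path).resolve()
        # Belt and braces: the resolved path must stay inside workspace/.
        if resolved != WORKSPACE and WORKSPACE not in resolved.parents:
            raise ValueError(f"escapes workspace: {rel_path}")
        return resolved

class FileWorkspace:
    def __init__(self) -> None:
        self._tracked: set[str] = set()          # explicit allowlist

    def track(self, rel_path: str) -> None:
        PathValidator.validate_and_resolve(rel_path)   # only safe paths can be tracked
        self._tracked.add(rel_path)

    def is_tracked(self, rel_path: str) -> bool:
        return rel_path in self._tracked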

2) “Context window” construction

  • FileWorkspace.build_context_section() embeds tracked file contents into the model instructions between <BEGIN_FILES> / <END_FILES> and labels each file with ===== path.
  • That means the model’s edits are grounded in the exact text currently on disk (for tracked files), not guessed structure.
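
A sketch of how that section could be assembled (assuming tracked files are plain UTF-8 text; in the real code this would be a FileWorkspace method):

def build_context_section(workspace: FileWorkspace) -> str:
    # Embed the exact on-disk content of every tracked file into the instructions.
    parts = ["<BEGIN_FILES>"]
    for rel_path in sorted(workspace._tracked):
        content = PathValidator.validate_and_resolve(rel_path).read_text(encoding="utf-8")
        parts.append(f"===== {rel_path}")
        parts.append(content)
    parts.append("<END_FILES>")
    return "\n".join(parts)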

3) Responses API with server-side conversation state

  • A conversation is created once (/conversations), then each turn calls /responses with:

    • conversation: {id: ...} so state accumulates server-side,
    • store: True so the server retains it.
  • ConversationLifecycle uses a lockfile to prevent “orphaned” conversations:

    • On startup, if conversation.lock exists, it attempts to delete that prior conversation.
    • On exit, it deletes the active conversation and removes the lock.
    • If deletion fails, it keeps the lock so the next run can clean up.
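
A sketch of that lifecycle; the lockfile name comes from the description above, while create_conversation and delete_conversation stand in for whatever client code calls POST /conversations and DELETE /conversations/{id}:

import json
from pathlib import Path

LOCKFILE = Path("conversation.lock")

class ConversationLifecycle:
    def __init__(self, create_conversation, delete_conversation):
        self._create = create_conversation     # () -> conversation id
        self._delete = delete_conversation     # (conversation id) -> None
        self.conversation_id = None

    def start(self) -> str:
        # On startup, try to clean up a conversation left over from a previous run.
        if LOCKFILE.exists():
            stale_id = json.loads(LOCKFILE.read_text())["conversation_id"]
            try:
                self._delete(stale_id)
            except Exception:
                pass   # in this sketch the stale conversation is simply abandoned
        self.conversation_id = self._create()
        LOCKFILE.write_text(json.dumps({"conversation_id": self.conversation_id}))
        return self.conversation_id

    def shutdown(self) -> None:
        # On exit, delete the active conversation and remove the lock.
        try:
            self._delete(self.conversation_id)
            LOCKFILE.unlink(missing_ok=True)
        except Exception:
            pass   # deletion failed: keep the lock so the next run can clean up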

4) Tool-mediated code editing: apply_patch

  • The model can return apply_patch_call items instead of (or before) plain text.

  • Each tool call contains an operation dict like:

    • type: update_file / create_file / delete_file
    • path: a workspace-relative path
    • diff: unified diff text
  • Your host code does not let the model write arbitrary bytes; it only accepts tool calls and then enforces the patch rules.
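
A sketch of how such an operation could be dispatched on the host side; the dict shape follows the bullets above, and the actual diff application is handed to the parser sketched under 5) below:

def handle_operation(op: dict, workspace: FileWorkspace) -> str:
    """Apply one apply_patch operation; the returned string is fed back to the model."""
    path = op["path"]
    if not workspace.is_tracked(path):
        return f"error: file not tracked: {path}"     # model must ask the user to track it
    target = PathValidator.validate_and_resolve(path) # path safety from section 1

    kind = op["type"]
    if kind == "delete_file":
        target.unlink()
        return f"deleted {path}"
    if kind in ("create_file", "update_file"):
        base = target.read_text(encoding="utf-8") if kind == "update_file" else ""
        hunks = UnifiedDiffParser.parse_hunks(op["diff"])   # see 5) below
        new_text = apply_hunks(base, hunks)                 # raises on context mismatch
        target.write_text(new_text, encoding="utf-8")
        return f"applied {len(hunks)} hunk(s) to {path}"
    return f"error: unknown operation type: {kind}"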

5) Unified diff parsing + context validation

  • UnifiedDiffParser.parse_hunks() extracts @@ -old,+new @@ hunk headers and their +, -, and context (space) lines.

  • apply_hunks() applies them linearly and verifies context:

    • For each context ( ) line and deletion (-) line it checks that the current file’s exact line content matches the diff.
    • If anything doesn’t match, it raises an error (prevents “patch drift” or editing the wrong file version).
  • This is basically a minimal, safe “3-way merge avoidance” approach: instead of trying to be clever, it refuses when the base doesn’t match.
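
A minimal sketch of that parse-then-verify approach; the names mirror the description above, and real unified-diff handling has more edge cases than this:

import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")

class UnifiedDiffParser:
    @staticmethod
    def parse_hunks(diff_text: str):
        """Return (old_start, lines) pairs; lines keep their leading +, - or space."""
        hunks, current = [], None
        for line in diff_text.splitlines():
            header = HUNK_HEADER.match(line)
            if header:
                current = (int(header.group(1)), [])
                hunks.append(current)
            elif current is not None and line[:1] in ("+", "-", " "):
                current[1].append(line)
        return hunks

def apply_hunks(base: str, hunks) -> str:
    old = base.splitlines()
    out, cursor = [], 0                          # cursor indexes into old (0-based)
    for old_start, lines in hunks:
        start = max(old_start - 1, 0)            # diff line numbers are 1-based
        out.extend(old[cursor:start])            # copy untouched lines before the hunk
        cursor = start
        for line in lines:
            tag, text = line[0], line[1:]
            if tag == "+":
                out.append(text)                 # insertion: nothing to verify
                continue
            # Context (" ") and deletion ("-") lines must match the file exactly.
            if cursor >= len(old) or old[cursor] != text:
                raise ValueError(f"context mismatch at line {cursor + 1}")
            if tag == " ":
                out.append(text)
            cursor += 1
    out.extend(old[cursor:])                     # copy the untouched tail
    return "\n".join(out) + ("\n" if out else "")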

6) Atomic multi-patch operations

  • PatchApplier.apply_operations() does a two-phase approach:

    1. Prepare in memory for every requested operation (and for updates, backs up current file).
    2. Commit writes/deletes only after all prepares succeed.
  • If preparation fails for any op, none of the operations are committed (so you don’t end up half-edited).

  • Note: commits aren’t fully transactional against mid-commit failures (e.g., disk write error on op #2 after op #1 wrote), but the design does minimize partial application by front-loading validation.
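
The two-phase idea compressed into a sketch, reusing the helpers from the sketches above; the backup location is an assumption:

import shutil

BACKUP_DIR = WORKSPACE / ".backups"              # assumed location, in the ignored set

class PatchApplier:
    def apply_operations(self, operations: list[dict]) -> None:
        prepared = []                            # (target, new_text); None text means delete

        # Phase 1: validate and prepare every operation in memory and back up files
        # that will be updated. Any exception here aborts before anything is written.
        for op in operations:
            target = PathValidator.validate_and_resolve(op["path"])
            if op["type"] == "delete_file":
                prepared.append((target, None))
                continue
            base = target.read_text(encoding="utf-8") if op["type"] == "update_file" else ""
            new_text = apply_hunks(base, UnifiedDiffParser.parse_hunks(op["diff"]))
            if op["type"] == "update_file":
                BACKUP_DIR.mkdir(parents=True, exist_ok=True)
                shutil.copy2(target, BACKUP_DIR / (target.name + ".bak"))
            prepared.append((target, new_text))

        # Phase 2: commit. Not crash-proof mid-commit, but nothing at all is written
        # unless every operation prepared cleanly.
        for target, new_text in prepared:
            if new_text is None:
                target.unlink()
            else:
                target.write_text(new_text, encoding="utf-8")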

7) Interactive file browser for “zero friction”

  • /browse scans the workspace (skipping ignored dirs, huge files, and likely-binary files) and allows tracking by index.
  • Tracking is the user’s explicit consent mechanism: the model is limited to what you track.
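
Roughly what such a scan could look like; the size cap and the binary heuristic are assumptions:

def browse(max_bytes: int = 200_000) -> list[str]:
    """List trackable files so the user can pick them by index."""
    candidates = []
    for path in sorted(WORKSPACE.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(WORKSPACE)
        if any(part in IGNORED for part in rel.parts):
            continue                             # skip .git, venv, .backups, ...
        if path.stat().st_size > max_bytes:
            continue                             # skip huge files
        if b"\x00" in path.read_bytes()[:1024]:
            continue                             # crude likely-binary detection
        candidates.append(str(rel))
    for index, rel in enumerate(candidates):
        print(f"[{index}] {rel}")
    return candidates                            # then e.g. workspace.track(candidates[3])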

What an AI Codex-style model can do inside this framework

Within this architecture, a code-capable model can employ a few reliable methods:

  1. Patch planning from exact context
  • It can read the tracked file snapshot and propose a unified diff that targets precise lines with sufficient surrounding context, so application succeeds deterministically.
  2. Iterative tool loop (“plan → patch → observe → patch …”)
  • Your run_turn() loop feeds tool results back as the next input, so the model can:

    • attempt an edit,
    • see success/failure messages,
    • adjust the diff if the patch didn’t apply,
    • continue until it’s done.
  3. Safe refactoring via small, verifiable hunks
  • Because context mismatches hard-fail, the model tends to make smaller, well-anchored diffs:

    • rename a symbol,
    • extract a helper,
    • adjust a function signature and its callsites,
    • update imports/tests in the same atomic batch.
  4. Multi-file consistency edits
  • Since tool calls can be parallel and you support multiple operations per turn, it can update several tracked files in one atomic commit to keep builds/tests consistent (e.g., update an API and its callers together).
  5. Grounded error correction
  • When a patch fails, your error messages (context mismatch / deletion mismatch / “file not tracked”) give the model actionable feedback:

    • rebase the patch onto the latest content,
    • ask the user to track the needed file,
    • switch from create_file to update_file, etc.

That’s the core: a constrained local editor with an explicit file allowlist, deterministic patch application with validation, and a conversational loop that lets the model iteratively converge on correct edits while keeping the blast radius small.
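
To make point 2 of the capability list concrete: a run_turn()-style loop could look roughly like this, where send_request and the item shapes stand in for whatever the Responses API call actually returns:

def run_turn(user_text: str, lifecycle: ConversationLifecycle,
             workspace: FileWorkspace, send_request) -> str:
    """One user turn: keep looping while the model answers with apply_patch calls."""
    pending = [{"role": "user", "content": user_text}]
    while True:
        output = send_request(lifecycle.conversation_id, pending)   # POST /responses
        pending, final_text = [], ""
        for item in output:
            if item.get("type") == "apply_patch_call":
                result = handle_operation(item["operation"], workspace)   # section 4 sketch
                # Feed success/failure back so the model can adjust and retry.
                pending.append({"type": "apply_patch_call_output",
                                "call_id": item["call_id"],
                                "output": result})
            else:
                final_text += item.get("text", "")
        if not pending:            # no tool calls left: the model returned normal text
            return final_text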

2 Likes

Well, yeah, I know exactly what you are talking about, I guess…

First let it find the corresponding files in the AST of your GraphRAG, right?

… [snip] …

and then there should be a media gateway that can handle thousands of parallel calls and stuff.. so the backend services must be loosely coupled (which they are, because of AMQP…)

It can do this in a loop until the security audit no longer fails any checkboxes…

To answer your question: yes, there will be a time without codex eventually.

2 Likes

https://openai.com/index/unrolling-the-codex-agent-loop/

:star_struck: thanks for that deep dive, Michael Bolin!

That is more of an “ugh”, sounding like “I just learned how transformer AI works from an AI and I’m glad to pass on my ambiguous notions.”

It misses: AI models are autoregressive, and token prediction is generative over a growing context window, conditioned on everything up to the very last prior token produced;
Inference: the model infers a certainty value assignment (logits) over the token dictionary it produces;
Sampling: the random selection from that probability distribution, biased by top_p removing low ranks and by temperature altering the distribution. Not a required component.
Internal prompting: the AI is writing as “assistant” from the start, not finally producing “assistant” as a response.
Tool use: this is an output decision the AI makes, a special sequence and then the addressee, to send to a tool recipient; it is not exclusive of also generating visible language or internal reasoning in the same turn.

Then, factually: you’re telling me tools are internally a developer message?

That is all ancillary to describing the Codex CLI software product, which one need not use on the API at all.

It mentions that the word Codex is extremely overloaded by OpenAI (going back to the Codex model of 2022). Then it does nothing to disambiguate that conflation: the description of what the CLI software does on its own, separate from what the Responses API is doing on the backend with internal instructions for each internal tool, along with internal tools producing pre-prompt or post-prompt messages depending on how much more control needs to be taken away from the developer.

2 Likes