Can Codex Low match Medium on easy tasks?

Have any of you had experience with the Codex “Low” setting?

Usually I run Medium for simpler tasks and High by default (mostly to manage weekly limit usage), but I might be able to squeeze out more weekly usage by assigning “Low” to some tasks.

I haven’t really used “Low” much, so I’m unsure what kinds of tasks it can handle reliably vs. where it starts to break down. Medium and High seem to perform similarly when the task isn’t very challenging; I’m curious whether Low and Medium are likewise on par for “simple” tasks.

Please share your insights/examples.

3 Likes

Welcome to the forum!

While this is not specifically about Codex, it should give you some ideas.

With the Plan Mode moving into beta

Codex CLI 0.90.0 (2026-01-25)

  • Shipped collaboration mode as beta in the TUI, with a clearer plan → execute handoff and simplified mode selection (Coding vs Plan). (#9690, #9712, #9802, #9834)

many coders will use Plan mode with higher reasoning effort, but be cautious using High for now.

Fixing main and make plan mode reasoning effort medium (#9980)

It’s overthinking so much on High that it goes over the context window.

https://github.com/openai/codex/commit/509ff1c643e0b45366fc5d77c78279ca29b71448

Once a solid plan is in place, it’s often effective to drop down to a lower model with less “thinking” enabled to handle the grunt work. In some cases, you can even use the lowest-tier model with thinking turned off—your mileage may vary.
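As a concrete way to set this up, Codex CLI reads its settings from `~/.codex/config.toml`, and profiles let you switch effort per task. A sketch under that assumption (the profile name and model string here are illustrative, not from the original post):

```toml
# ~/.codex/config.toml -- sketch, assuming the documented
# `model_reasoning_effort` and `[profiles.*]` settings.
model = "gpt-5-codex"
model_reasoning_effort = "high"    # default for planning sessions

[profiles.grunt]                   # hypothetical profile name
model = "gpt-5-codex"
model_reasoning_effort = "low"     # for well-scoped execution work
```

You would then start the low-effort session with something like `codex --profile grunt`.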

Tasks such as documentation, housekeeping, regular expressions, file searches, many read-only tasks, and possibly Git commands are good candidates for lower models with reduced thinking. These tasks are generally well within the training coverage of smaller models and tend to be handled reliably.

Another way to get by with a lower model and less thinking is to use skills. If you find yourself repeatedly adding the same instructions to prompts—and they work consistently once in the context window—that’s a strong signal that the logic should be captured as a skill instead.
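In case you haven't built one before: a skill is just a directory containing a SKILL.md whose front matter tells the model when to load it. A minimal sketch (the name and content here are illustrative, not an actual skill from this thread):

```markdown
---
name: commit-message-style
description: Conventions for writing commit messages in this repo.
  Use when committing changes or reviewing Git history.
---

# Commit Message Style

- Subject line: imperative mood, 50 characters or fewer.
- Body: explain *why* the change was made, wrapped at 72 characters.
- Reference the issue number in the footer when one exists.
```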

In my own work with SWI-Prolog, many LLMs struggle to get all of the details right, which initially requires a more capable model. Over time, however, I was able to create skills that address common, repetitive mistakes. Once those skills were in place, I could often safely reduce the level of thinking when they were applied.

I also work primarily on Windows, which introduces its own set of issues that skills can sometimes mitigate. For example, I avoid using the Windows environment variable PATH. Instead, I use a skill that explains how to locate executables via the Windows registry. In my setup, Python is not on PATH, but it is installed and registered. With the appropriate skill, the AI can still locate the correct executable and run Python as needed.
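To make the registry approach concrete: per-machine CPython installs register themselves under `HKLM\SOFTWARE\Python\PythonCore` (per PEP 514), so a skill can tell the model to query that key instead of relying on PATH. A hedged sketch of such a skill instruction (the version number is illustrative):

```markdown
To locate Python on this machine, do NOT rely on PATH. Instead run:

    reg query "HKLM\SOFTWARE\Python\PythonCore\3.12\InstallPath" /ve

The default value of that key is the installation directory; append
`python.exe` to get the interpreter path. Per-user installs register
under HKCU rather than HKLM.
```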

HTH

4 Likes

Great answer - thanks!

Could you give any examples of skills you’ve found particularly helpful for “enabling” lower reasoning (so to speak)?

I.e., I find myself asking the same templated questions: particularly prompts involving planning and iterative bug identification after the coding assistant finishes a PR. I’ve tried moving these to “skills,” but so far I haven’t found the right combination of skills that:
a) Are faster than just queuing up templated questions and
b) Enable me to meaningfully reduce the reasoning effort

Would be helpful to hear what’s worked well for you.

Below is an actual skill I have used about 100 times. It still needs a lot of work, but it currently works well enough that I use it when needed. It’s very specific to my workflow, so do not expect it to work for you as written.

Directory: .. skills\detailed-manual-instructions
File: SKILL.md


---
name: detailed-manual-instructions
description: Patterns for writing detailed, error-resistant manual instructions. Covers environment tables, icon systems, step structure, and checkpoint patterns. CRITICAL FIRST STEP - Always check existing documentation for correct terminal type before writing (e.g., x64 Native Tools vs PowerShell for vcpkg). Use when creating installation guides, testing procedures, or any multi-step manual process where precision reduces user errors.
---

# Detailed Manual Instructions

Write step-by-step manual instructions that minimize user errors through explicit context and verification checkpoints.

## Before You Begin: Terminal Selection Workflow

**CRITICAL**: Before writing any manual instructions, verify the correct terminal type to use.

### Step 1: Check Existing Documentation

If this guide extends or updates existing documentation:

1. **Read the related guide** - Find the terminal type used in similar steps
2. **Use the same terminal** - Maintain consistency unless there's a specific reason to change
3. **Example**: If updating a Windows Sandbox vcpkg guide that uses "x64 Native Tools Command Prompt for VS 2022", use that terminal, NOT PowerShell

### Step 2: Determine Required Terminal

If creating new documentation, consider:

| Scenario | Terminal Type | Why |
|----------|---------------|-----|
| C++ compilation (MSVC) | x64 Native Tools Command Prompt for VS 2022 | Sets up MSVC compiler environment |
| vcpkg operations | x64 Native Tools Command Prompt for VS 2022 | Requires MSVC environment + vcpkg paths |
| General Windows tasks | PowerShell | Standard Windows scripting |
| System administration | PowerShell (Administrator) | Elevated privileges needed |
| Linux/macOS | bash | Standard Unix shell |
| MSYS2 operations | mintty (MSYS2 terminal) | MSYS2-specific environment |

### Step 3: If Unsure

**STOP and ask the user** which terminal to use. Include:
- What operations the commands will perform
- What environment/toolchain is needed
- Reference to related documentation (if any)

**Example question**: "This guide involves [operation]. Should I use [Terminal A] or [Terminal B]? I see the related guide uses [Terminal B]."

## Quick Start

Every step should answer: **Where? What? How? Verify?**

```markdown
### Stage N: Do Something

💡 **Purpose**
Why this stage exists and what it accomplishes.

**Step 1: Action description**

| | |
|---|---|
| 🖥️ **Environment** | HOST or SANDBOX |
| 🔧 **Terminal/App** | Specific terminal or application |
| 👤 **Admin** | Yes/No |
| **Directory** | Working directory (if relevant) |
| ⏱️ **Time** | Expected duration (if > 1 min) |

```cmd
actual command here
```

📝 **Note**
Important context or explanation.

✅ **Checkpoint**: How to verify success
```

## When to Use

- Installation guides for software with multiple components
- Testing procedures that span HOST and SANDBOX environments
- Multi-step processes where wrong context causes errors
- Any procedure where "run this command" is insufficient

## Icon Reference

| Icon | Field | Purpose |
|------|-------|---------|
| 🖥️ | Environment | HOST, SANDBOX, VM name, or platform |
| 🔧 | Terminal/App | Specific terminal type or application |
| 👤 | Admin | Whether elevated privileges required |
| 💡 | Purpose | Why this stage/step exists |
| ⚠️ | Warning | Critical information to prevent errors |
| 📝 | Note | Helpful context or explanation |
| ⏱️ | Time | Expected duration for long operations |
| ✅ | Checkpoint | Verification step |

## Environment Table Format

The environment table eliminates the #1 source of manual instruction errors: **wrong context**.

### Minimal Table (required fields)

```markdown
| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **Terminal** | PowerShell |
| 👤 **Admin** | No |
```

### Extended Table (with optional fields)

```markdown
| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **Terminal** | x64 Native Tools Command Prompt for VS 2022 |
| 👤 **Admin** | **Yes** |
| **Directory** | C:\Shared\installers |
| ⏱️ **Time** | ~20-25 min |
```

### Environment Values

| Value | Meaning |
|-------|---------|
| HOST | User's main Windows machine |
| SANDBOX | Windows Sandbox VM |
| MSYS2 MinGW64 | MSYS2 with MinGW64 toolchain |
| Ubuntu | Ubuntu Linux (VM or native) |

### Terminal Values

Be specific. "PowerShell" vs "PowerShell (Administrator)" matters.

| Value | When to use |
|-------|-------------|
| PowerShell | Standard PowerShell, no elevation |
| PowerShell (Administrator) | Elevated PowerShell for system changes |
| x64 Native Tools Command Prompt for VS 2022 | MSVC compiler environment |
| cmd | Standard Windows command prompt |
| mintty (MSYS2 terminal) | MSYS2 bash shell |
| bash | Linux/macOS terminal |
| File Explorer | GUI file operations |
| Start Menu | GUI application launching |

## Stage Structure

### Stage Header

```markdown
### Stage N: Short Descriptive Title

💡 **Purpose**
One sentence explaining what this stage accomplishes and why it's needed.
```

### Warnings (before steps)

```markdown
⚠️ **Important**
Critical information that could cause failure if missed.
```

### Steps Within a Stage

Each step is numbered and has a clear action verb:

```markdown
**Step 1: Install the package**
**Step 2: Verify installation**
**Step 3: Configure settings**
```

### Notes (after commands)

```markdown
📝 **Note**
Explanation of why the command works this way, or what to expect.
```

### Checkpoints (end of stage)

```markdown
✅ **Checkpoint**: Description of what should be true now
- Specific file exists: `C:\path\to\file`
- Command output shows: `expected output`
- All N tests pass
```

## Common Patterns

### Pattern: Long-Running Operations

```markdown
**Step 2: Install components**

| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **App** | VS Installer GUI |
| 👤 **Admin** | No |
| ⏱️ **Time** | ~2+ hours (SDK installation) |

- Select components...
- Click "Install"

📝 **Note**
The installer may appear stuck during SDK installation - this is normal.
Open Task Manager (Ctrl+Shift+Esc) to verify activity.
```

### Pattern: Conditional Execution

```markdown
📝 **Note**
This may already be set from Stage 7. If so, skip this stage.
```

### Pattern: Verification Commands

```markdown
**Step 3: Verify installation**

| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **Terminal** | PowerShell |
| 👤 **Admin** | No |

```powershell
cmake --version
```

**Expected output:** `cmake version 3.x.x`
```

### Pattern: Troubleshooting Tables

Include 🔧 Terminal/App column so users know where to execute each fix:

```markdown
## Troubleshooting

### Problem Category

| Symptom | Cause | 🔧 Terminal/App | Fix |
|---------|-------|-----------------|-----|
| Error message X | Missing dependency | PowerShell | Install Y first |
| Command not found | Not in PATH | PowerShell (Administrator) | Add to PATH via... |
```

For non-table troubleshooting sections, include environment metadata:

```markdown
### Verifying Configuration

Check if settings were applied successfully:

| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **Terminal** | PowerShell |
| 👤 **Admin** | No |

```powershell
Get-ItemProperty -Path "HKLM:\Path\To\Setting"
```

**Expected**: Should show the configured value.
```

### Pattern: Rollback Procedure

Always include how to undo changes:

```markdown
## Rollback Procedure

💡 **Purpose**
Restore original state if changes cause issues.

**Step 1: Restore backup files**

| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **Terminal** | PowerShell (Administrator) |
| 👤 **Admin** | **Yes** |

```powershell
Copy-Item "file.backup" "file" -Force
```
```

### Pattern: Markdown Line Breaks in Lists

For success criteria or checklist items that need line breaks without paragraph spacing:

**Use two trailing spaces** at the end of each line:

```markdown
## Success Criteria

✅ First criterion accomplished••
✅ Second criterion accomplished••
✅ Third criterion accomplished
```

(Note: `••` represents two trailing spaces - they're invisible in editors)

**IMPORTANT**: The Edit tool strips trailing whitespace automatically. For markdown files requiring trailing spaces:
1. Instruct the user to make the edit manually
2. Explain that two trailing spaces are needed at the end of each line
3. Never attempt to use the Edit tool for this pattern

**Alternative approach** (if trailing spaces are problematic):
- Use blank lines between items (creates more spacing)
- Use bulleted list syntax (`- ✅ Item`)

## Anti-Patterns

### ❌ Missing Context

```markdown
Run the installer.
```

**Problem**: Which installer? Where? As admin?

### ✅ Explicit Context

```markdown
**Step 1: Run the installer**

| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **App** | File Explorer |
| 👤 **Admin** | No |
| **Directory** | C:\Shared\installers |

- Double-click `vs_BuildTools.exe`
```

### ❌ Assuming Terminal State

```markdown
cd C:\project
npm install
```

**Problem**: Previous command may have changed directory.

### ✅ Absolute Paths

```markdown
npm install --prefix C:\project
```

Or explicitly state working directory in environment table.

### ✅ Explicit Directory Change (First Command Pattern)

When a terminal type appears for the first time in a stage, start with an explicit `cd` command to set the working directory. This is more visible than a table entry and matches user expectations.

```markdown
**Step 1: Run the test suite**

| | |
|---|---|
| 🖥️ **Environment** | SANDBOX |
| 🔧 **Terminal** | PowerShell |
| 👤 **Admin** | No |

```powershell
cd C:\project\tests
.\run-tests.ps1
```
```

**Why this works:**
- Users expect to set their location before running commands
- `cd` command is harder to miss than a table entry
- Subsequent steps in the same stage can omit `cd` (assume same directory)
- Use `%USERPROFILE%`, `$HOME`, or other variables when path varies by user

**When to use:**
- First command after opening a new terminal in a stage
- First command in a new stage if changing directories
- After returning from a different environment (e.g., GUI back to terminal)

### ❌ Bundled Commands Without Explanation

```markdown
git clone repo && cd repo && npm install && npm start
```

**Problem**: If one fails, user doesn't know where.

### ✅ Separate Steps with Verification

```markdown
**Step 1: Clone repository**
```cmd
git clone https://github.com/user/repo.git C:\repo
```

**Step 2: Install dependencies**

| | |
|---|---|
| **Directory** | C:\repo |

```cmd
npm install
```
```

## Success Criteria Summary

End with a clear summary table:

```markdown
## Success Criteria Summary

| Test | Platform | Criteria |
|------|----------|----------|
| 1 | Windows/vcpkg | pack_install works, 28 tests pass |
| 2 | Ubuntu | System packages work |
| 3 | macOS | Homebrew packages work |
```

## Files Reference

Reference existing examples:

- [example 1.md](C:\Users\Groot\example 1.md) - example document 1
- [example 2.md](C:\Users\Groot\example 2.md) - example document 2
- [example 3.md](C:\Users\Groot\example 3.md) - example document 3

## Related Skills

- [documentation-organization](../documentation-organization/) - Overall documentation structure
- [swi-prolog-planning-workflow](../swi-prolog-planning-workflow/) - Development planning patterns

Note: a Discourse trick to get the contents of SKILL.md to show as presented: do not use triple backticks as bookends; instead, on separate lines, use `<pre><code>` and `</code></pre>`. Yes, those are HTML tags, and yes, Discourse allows these two to pass through without modification.


If you haven’t seen superpowers yet, it’s worth a look. It’s extremely popular and includes a number of skills that may be relevant to what you’re looking for.

2 Likes

@EricGT Thanks for the warm welcome and useful replies. So basically, you suggest that it’s possible to assign “low” reasoning to a well-defined, simple task or when using a strong SKILL.md. That makes sense; I will continue to explore the reasoning levels to get a better understanding of “safe” assignments.

1 Like

I would agree with that in general.

Also note that

  • Do not overfocus on that suggestion
  • More than one skill will be needed

Here is what my list of skills looked like for one project a few months ago.

2 Likes

Here is another use case I just ran, this time with a more advanced model and thinking enabled. In hindsight, given the nature of the task, a lower-tier model with thinking disabled would likely have worked just as well.

While building with CMake, a few hundred instances of the same error appeared. All of them ultimately traced back to a change in a single C macro.

I used the more advanced model with thinking enabled to perform the analysis that led from the compiler errors back to the specific macro definition. That kind of reasoning step—from noisy build output to the root macro line—is not something I would expect a lower model with thinking disabled to handle reliably.

However, once the initial macro was identified, the remaining work was to document the macro expansion chain. At that point I did not change the model or thinking settings, even though the task had become straightforward: simple read-only searching and documentation of the expansion steps. A lower model without thinking would have been sufficient for that phase.

In practice, I keep a close eye on token usage over a given time window. When I have plenty of tokens available, I tend not to switch models. When tokens are tight, I switch more aggressively, or even fall back to issuing manual commands and doing parts of the work myself to conserve tokens.


Also see


Update: I have to post in this topic because I’m limited to only two replies in a row.

Over the past two days, I have been working on skills and their supporting scripts while also submitting GitHub pull requests. A few hours into the task, I remembered this topic and switched to a lower-capacity model.

The lower model did not produce the strongest answer on the first attempt, nor was it able to complete as much of the work independently. However, this introduced an interesting side effect: I had to actively guide the model toward a better solution, drawing on prior experience and carefully reviewing assumptions and lines of reasoning that were incorrect. In practice, this meant thinking through the problem again rather than simply reviewing a proposed action and approving it. The process was unexpectedly engaging.

As a concrete example, I have a Visual Studio Code setup using multi-root workspaces, with two separate Git repositories open (not in the same workspace). The model incorrectly inferred that one of the workspace folders was itself a Git repository, when in fact only the skills directory was version-controlled. After I verified the repository structure and explicitly provided the full path to the .git directory, the model recognized the error and produced an improved response.

While the revised answer was still not optimal, it represented a clear step toward the final solution.

1 Like