I’ve been working on strategies to spawn subagents more efficiently. Here is one of my use cases.
I usually create an implementation plan and ask the main agent to spawn customized subagents to work through the detailed releases in that plan. Each release is executed in rounds.
Each round has a few stages:
1. The main agent spawns an executor subagent.
2. The executor implements the required changes.
3. Tests are run. If they fail, the task goes back to the executor.
4. The result is reviewed. If the review fails, the task goes back to the first step.
I usually set a clear stop condition or acceptance criterion to avoid long loops. Each agent has its own configuration.
Is this overengineering, given that it resembles the /goal feature? When is it worth using this setup instead of /goal?
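The round described above, with an explicit stop condition, could be sketched roughly like this. The function name `run_release`, the callbacks, and the limits `max_rounds` and `max_test_retries` are hypothetical placeholders for whatever the real agents and budgets are; this is not an actual Codex API.

```python
from typing import Callable, Optional


def run_release(
    execute: Callable[[Optional[str]], str],   # stand-in for the executor subagent
    run_tests: Callable[[str], bool],          # stand-in for the test stage
    review: Callable[[str], bool],             # stand-in for the review stage
    max_rounds: int = 3,                       # assumed stop condition
    max_test_retries: int = 2,                 # assumed stop condition
) -> dict:
    """Run rounds until the review passes or a limit is hit."""
    feedback: Optional[str] = None
    for round_no in range(1, max_rounds + 1):
        # Spawn the executor and let it implement the change.
        change = execute(feedback)

        # Failed tests send the task back to the executor, up to a limit.
        retries = 0
        while not run_tests(change):
            if retries == max_test_retries:
                return {"status": "stopped",
                        "reason": "test retries exhausted",
                        "rounds": round_no}
            retries += 1
            change = execute("tests failed")

        # A failed review restarts the round from the first step.
        if review(change):
            return {"status": "accepted",
                    "reason": "review passed",
                    "rounds": round_no}
        feedback = "review failed"

    return {"status": "stopped",
            "reason": "max rounds reached",
            "rounds": max_rounds}
```

Returning the reason alongside the status means the main agent always knows why the loop ended, not only that it ended.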
How are you managing explicitly spawned subagents? When do you think they are worth the extra cost?
And even for small tasks, such as one subagent to review and another to apply fixes or improvements based on the findings: when is that worth it, compared with leaving everything to the main agent, which can usually handle it too despite the context rot?
The failure mode I keep seeing with explicit subagents is not just context loss, it is missing lifecycle proof. Once a child is spawned, the parent needs a small receipt for requested role, resolved model or tools, task boundary, spend or time budget, and why the child stopped. Otherwise teams either duplicate work because silence looks like failure, or trust output they cannot audit later. The orchestration layer gets calmer once child admission and child completion are both visible artifacts instead of hidden runtime state.
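As a rough illustration of such a receipt, here is a minimal sketch. The class `ChildReceipt`, its field names, and the helper functions are assumptions made for this example; they do not correspond to any existing Codex structure.

```python
import json
from dataclasses import asdict, dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class ChildReceipt:
    # Written at admission time, when the child is spawned.
    child_id: str
    requested_role: str
    resolved_model: str
    tools: List[str]
    task_boundary: str
    token_budget: int
    time_budget_s: int
    # Filled in at completion time; None means the child is still open.
    tokens_spent: Optional[int] = None
    stop_reason: Optional[str] = None


def complete(receipt: ChildReceipt, stop_reason: str, tokens_spent: int) -> ChildReceipt:
    """Return a closed copy of the receipt; the admission record stays untouched."""
    return replace(receipt, stop_reason=stop_reason, tokens_spent=tokens_spent)


def is_closed(receipt: ChildReceipt) -> bool:
    """A child without a stop reason has not reported completion yet."""
    return receipt.stop_reason is not None


def to_artifact(receipt: ChildReceipt) -> str:
    """Serialize the receipt so it can be stored and audited later."""
    return json.dumps(asdict(receipt), sort_keys=True)
```

Writing `to_artifact(...)` once at spawn and once at completion gives the parent two visible records per child, so a missing second record is distinguishable from a failed one.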
I agree, this is a pain point that needs to be managed on top of Codex because subagents can currently get lost.
We recently had a topic about this where it was not even clear whether we were looking at a bug or something else entirely:
With Codex I prefer to spawn subagents for distinct, one-shot tasks only and then have the orchestrator monitor the status until the agent is done and can be despawned.
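The spawn, monitor, despawn pattern can be mimicked with plain threads. This is only an analogy under stated assumptions: `run_one_shot` is a hypothetical helper, the "subagent" is an ordinary Python callable, and a real agent runtime would need its own cancel mechanism.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_one_shot(task: Callable[[], Any], timeout_s: float = 5.0,
                 poll_s: float = 0.05) -> dict:
    """Spawn one worker, poll its status until done or timed out, then release it."""
    pool = ThreadPoolExecutor(max_workers=1)      # "spawn"
    future = pool.submit(task)
    deadline = time.monotonic() + timeout_s
    try:
        while not future.done():                  # "monitor"
            if time.monotonic() >= deadline:
                # cancel() only helps if the task has not started; a thread
                # that is already running cannot be killed from here.
                future.cancel()
                return {"status": "timed_out", "result": None}
            time.sleep(poll_s)
        error = future.exception()
        if error is not None:
            return {"status": "failed", "result": repr(error)}
        return {"status": "done", "result": future.result()}
    finally:
        pool.shutdown(wait=False)                 # "despawn"
```

The point of the sketch is that every exit path returns an explicit status, so the orchestrator never has to interpret silence.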
I wrote a little paper about how to 1. burn tokens more efficiently and 2. keep track of what is done and what is not, even when agents fail… you can even stop, start a new session, and just tell the new turn to check where to start, and it already knows.
Or in short: treat the agents as what they are, stupid little junior devs who have mastered LeetCode-style puzzles but have no real experience.
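One simple way to get that kind of resumability is a progress file that outlives the session. The sketch below assumes a JSON ledger; the file layout, the status values, and the function names are hypothetical and not taken from the paper.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def load_ledger(path: Path, tasks: List[str]) -> Dict[str, str]:
    """Reuse the ledger from a previous session, or start with every task pending."""
    if path.exists():
        return json.loads(path.read_text())
    return {task: "pending" for task in tasks}


def mark(path: Path, ledger: Dict[str, str], task: str, status: str) -> None:
    """Record a status and persist immediately, so a crash loses at most one step."""
    ledger[task] = status
    path.write_text(json.dumps(ledger, indent=2))


def next_task(ledger: Dict[str, str]) -> Optional[str]:
    """Return the first task that is not done, in plan order, or None if all are done."""
    for task, status in ledger.items():
        if status != "done":
            return task
    return None
```

A new session only needs to call `load_ledger` and `next_task` to know where to start, which costs far fewer tokens than replaying the earlier conversation.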
rafa3, I had bad experiences with those tests. At some point the agents started running matrix tests for modules and permissions on every task, which resulted in a massive loss of speed.
I'd rather run the tests manually, say every other sprint. It is like legacy development based on TDD, which was slow as well. Now with agents we can just tell the bots to add the tests as a nice-to-have and use them only once you reach go-live and beyond. And honestly, once you are live you should not use subagents anymore. You need a full review, and then another one, for everything that goes online.
I’d separate exploratory subagents from release-boundary subagents. During exploration, I agree not every child needs a full matrix run; that can turn orchestration into ceremony.
But before anything touches a shared branch, production surface, or customer-visible state, the orchestrator should require a small closeout: files changed, tests run or intentionally skipped, known risks, and why the child stopped. If we describe subagents as junior devs, that is exactly why the receipt matters. The trick is keeping the receipt cheaper than another rerun.
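A cheap way to enforce that closeout is a small check the orchestrator runs before accepting a child's work. This is a sketch under assumptions: the dictionary shape, the key names, and `closeout_problems` are invented for illustration.

```python
from typing import List

# Assumed closeout fields, mirroring the list above.
REQUIRED_FIELDS = ("files_changed", "tests", "known_risks", "stop_reason")


def closeout_problems(closeout: dict) -> List[str]:
    """Return the reasons a closeout is incomplete; an empty list means it passes."""
    problems = [f"missing field: {name}"
                for name in REQUIRED_FIELDS if name not in closeout]

    # Tests must be either run or skipped on purpose, with a stated reason.
    tests = closeout.get("tests")
    if tests is not None and not (tests.get("run") or tests.get("skipped_because")):
        problems.append("tests were neither run nor intentionally skipped")

    # An empty stop reason is as unhelpful as a missing one.
    if "stop_reason" in closeout and not closeout["stop_reason"]:
        problems.append("stop_reason is empty")

    return problems
```

The check involves no model call and no rerun, which keeps the receipt cheaper than repeating the work it describes.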