Built a red-team skill for Codex that tests prompt injection, MCP poisoning, and concealed agent actions

gangj277 · March 28, 2026, 5:40am

I built HackYourAgent, a public skill bundle for red-teaming coding-agent workflows.

The target use case is narrow on purpose: if you are building or shipping agents with Codex, this is meant to be the adversarial pass you run before trusting the workflow.

What it tests:

prompt injection
MCP/tool poisoning
memory poisoning
approval confusion
concealed side effects

What it actually does:
it maps trust boundaries in a repo or staging agent, creates paired control + attack trials, inspects outputs one by one, and saves findings/evidence/regressions under redteam/.

I focused on Codex-native usage instead of building a separate heavyweight eval stack, so the repo includes:

a native Codex skill wrapper
a self-contained installer
seeded vulnerable example targets for RAG injection, MCP poisoning, and concealment

Repo:

What I’d most like feedback on from Codex users:
does the control-vs-attack forensic workflow feel like the right shape for a Codex skill, or is it still too heavy for normal repo use?

Topic		Replies	Views
Tips and Tricks for using Codex Codex community , best-practices	20	17808	March 3, 2026
Context Pack — MCP tool for high-signal context handoff between AI agents Plugins / Actions builders plugin-development	0	150	February 22, 2026
Flow Studio MCP — Power Automate debugging and building skills for Codex Codex mcp	0	98	April 3, 2026
What do you Prompt Codex with, for App Security Passes? Codex CLI	6	221	April 17, 2026
Repo file-based task management in Codex - example solution Codex codex	0	1591	June 4, 2025

Built a red-team skill for Codex that tests prompt injection, MCP poisoning, and concealed agent actions

Related topics