What it is:
Aardvark is a GPT-5–powered autonomous agent that acts like a human security researcher. It continuously scans codebases to find, validate, and help fix vulnerabilities. It’s in private beta.
How it works (pipeline):
- Analysis: Builds a repo-specific threat model by reading the entire codebase.
- Commit scanning: Monitors new commits and back-scans repository history; explains suspected vulnerabilities with annotated code.
- Validation: Reproduces issues in a sandbox to confirm exploitability and cut false positives.
- Patching: Proposes one-click patches via Codex; fixes are attached to each finding for human review.
It integrates with GitHub and existing developer workflows, relying on LLM reasoning and tool use rather than traditional techniques such as fuzzing or software composition analysis (SCA).
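The four-stage pipeline above can be sketched as a simple sequence of functions. Everything here is a hypothetical illustration of the described flow: the names (`ThreatModel`, `Finding`, `scan_commit`, etc.) and the toy heuristic are assumptions, not Aardvark's actual implementation or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Finding:
    commit: str
    description: str
    validated: bool = False   # set by the sandbox-validation stage
    patch: Optional[str] = None  # attached by the patching stage

@dataclass
class ThreatModel:
    repo: str
    notes: List[str] = field(default_factory=list)

def build_threat_model(repo: str) -> ThreatModel:
    # Stage 1 (Analysis): read the codebase, record security-relevant notes.
    return ThreatModel(repo, notes=["handles user input", "parses untrusted files"])

def scan_commit(model: ThreatModel, commit: str, diff: str) -> List[Finding]:
    # Stage 2 (Commit scanning): flag suspicious changes.
    # A real agent reasons over the diff; this toy heuristic just pattern-matches.
    if "strcpy" in diff:
        return [Finding(commit, "possible buffer overflow via strcpy")]
    return []

def validate(finding: Finding) -> Finding:
    # Stage 3 (Validation): reproduce in a sandbox to cut false positives.
    finding.validated = True  # assume reproduction succeeded in this sketch
    return finding

def propose_patch(finding: Finding) -> Finding:
    # Stage 4 (Patching): attach a candidate fix for human review.
    finding.patch = "replace strcpy with a bounded copy"
    return finding

def pipeline(repo: str, commit: str, diff: str) -> List[Finding]:
    model = build_threat_model(repo)
    return [propose_patch(validate(f)) for f in scan_commit(model, commit, diff)]

findings = pipeline("example/repo", "abc123", "strcpy(buf, user_input);")
print(len(findings), findings[0].validated, findings[0].patch is not None)
```

The key design point the sketch preserves is that patches are attached to findings rather than applied automatically, keeping a human in the review loop.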
Impact/Results so far:
- Running for months on OpenAI’s internal repos and external alpha partners; surfaced meaningful, hard-to-trigger issues.
- On “golden” test repos, detected 92% of known/synthetic vulnerabilities (high recall).
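The 92% figure is a recall measurement. As a minimal illustration of the metric (the counts below are invented for the example; only the 92% comes from the source):

```python
# Recall = true positives / (true positives + false negatives),
# i.e. the fraction of seeded vulnerabilities the scanner actually found.
def recall(found: int, missed: int) -> float:
    return found / (found + missed)

# Hypothetical: finding 23 of 25 seeded vulnerabilities yields 92% recall.
print(recall(23, 2))  # → 0.92
```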
Open source stance:
- Already found and responsibly disclosed multiple OSS vulnerabilities; ten received CVE IDs.
- Plans pro-bono scanning for select non-commercial OSS projects.
- Updated outbound disclosure policy to prioritize collaboration over rigid timelines.
Why it matters:
- Software vulnerabilities are a systemic risk (40k+ CVEs reported in 2024; ~1.2% of commits introduce bugs).
- “Defender-first” model: continuous, validated protection with actionable fixes, without slowing development.
Availability:
- Private beta open to select partners and OSS projects (apply to join).
Contributors listed: Akshay Bhat, Andy Nguyen, Dave Aitel, Harold Nguyen, Ian Brelinsky, Tiffany Citra, Xin Hu, Matt Knight.