Built a pytest-style behavioral testing package for agents. It checks tool usage, tool order, step limits, and regression against baselines. Validated on live OpenAI Agents SDK tests

ashutosh-rath02 · April 28, 2026, 6:16am

I built AgentCheck(pygent-test on PyPI), a pytest-style behavioral testing package for AI agents. Instead of checking exact text, it checks behavior: tool usage, tool order, step limits, unsupported success claims, and baseline regressions. I’ve validated it on live OpenAI Agents SDK examples, including a single-tool and multi-tool workflow. I’d love feedback from people testing real agent systems.

Topic		Replies	Views
A solution for the missing agents playground API agents , ai-agents , agents-sdk	0	306	April 9, 2025
I built ContribArena — a live arena for evaluating coding agents through real open-source PRs Community project , ai-agents	1	86	May 27, 2026
Varden - Securing the Agent Execution Layer: Sandboxing Local CLI Tools and Handling Runtime Tool-Call Drift Community project , ai-agents , ai-security	0	68	May 23, 2026
Built a red-team skill for Codex that tests prompt injection, MCP poisoning, and concealed agent actions Codex	0	300	March 28, 2026
How to increase reliability? Let's compile best practices API hallucinations , agents , agents-sdk	4	954	September 1, 2025

Built a pytest-style behavioral testing package for agents. It checks tool usage, tool order, step limits, and regression against baselines. Validated on live OpenAI Agents SDK tests

Related topics