Project Structure
This page explains what each module does and where to find things.
Top Level
agent-vulnerability-research/
├── agent_redteam/ # Main package
├── tests/ # Test suite
├── examples/ # Example scripts
├── docs/ # Documentation (this site)
├── pyproject.toml # Package metadata, dependencies, tool config
├── mkdocs.yml # Documentation site config
├── README.md # Landing page
├── LICENSE # Apache 2.0
├── CONTRIBUTING.md # Contributor guide
├── SECURITY.md # Responsible disclosure policy
└── CODE_OF_CONDUCT.md # Contributor covenant
Package Structure
agent_redteam/core/ — Foundation
The core module contains everything that other modules depend on.
| File | Purpose |
| --- | --- |
| enums.py | All StrEnum types: VulnClass (17 classes), TrustBoundary (7 boundaries), EventType, SignalTier, Severity, StealthLevel, AttackComplexity, ScanProfile, RiskTier |
| models.py | All Pydantic v2 models: AgentTask, Event, AgentTrace, AttackTemplate, Attack, Signal, ScanConfig, ScanResult, CompositeScore, etc. |
| protocols.py | Python Protocol interfaces: AgentAdapter, SignalDetector, ClassScorer, ReportFormatter |
| errors.py | Custom exception hierarchy: AgentRedTeamError, AdapterError, TemplateError, ScanError, BudgetExhaustedError |
| events.py | Simple in-process EventBus for telemetry |
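
Because protocols.py declares its interfaces as Python Protocol classes, an object can satisfy an interface structurally, without inheriting from it. The following is a minimal sketch of that idea only. `AgentAdapterSketch` and its single `run` method are hypothetical stand-ins; the real `AgentAdapter` method set is not described on this page and may differ.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class AgentAdapterSketch(Protocol):
    """Hypothetical stand-in for AgentAdapter; the real interface may differ."""

    async def run(self, task: str) -> str:
        ...


class EchoAgent:
    """Satisfies the protocol structurally, with no inheritance involved."""

    async def run(self, task: str) -> str:
        return f"echo: {task}"


def is_adapter(obj: object) -> bool:
    # runtime_checkable protocols only verify that the method names exist;
    # they do not check signatures or whether a method is a coroutine.
    return isinstance(obj, AgentAdapterSketch)
```

A static type checker gives stronger guarantees than the `isinstance` check, since it also compares signatures.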
agent_redteam/adapters/ — Agent Integration
| File | Purpose |
| --- | --- |
| callable.py | CallableAdapter: wraps any async function, delegates tool execution to EnvironmentRuntime |
| llm.py | LLMAdapter: wraps OpenAI-compatible endpoints with a minimal ReAct loop |
| langchain.py | LangChainAdapter: wraps LangChain AgentExecutor/LangGraph via callbacks; re-exports wrap_tools_with_canaries() |
| openai_agents.py | OpenAIAgentsAdapter: wraps OpenAI Agents SDK via RunHooks |
| http.py | HttpAdapter: wraps any agent exposed over HTTP, with tool-call extraction from structured JSON or free text |
| canary_wrapper.py | CanaryInjector + framework-specific wrappers (wrap_langchain_tools, wrap_openai_agent_tools, wrap_callable_tools) |
| mcp_proxy.py | McpProxyAdapter: stdio MCP proxy with optional description/response injection |
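
To illustrate what "wraps any async function" can look like, here is a rough sketch of an adapter that calls an async agent function and records what happened in a trace. Every name here (`CallableAdapterSketch`, `TraceSketch`, the event labels) is hypothetical, and the sketch leaves out the tool delegation to EnvironmentRuntime that the real CallableAdapter performs.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class TraceSketch:
    """Hypothetical, simplified stand-in for AgentTrace."""

    events: list[tuple[str, str]] = field(default_factory=list)
    final_output: str = ""


class CallableAdapterSketch:
    """Wraps an async function so each call produces a trace."""

    def __init__(self, fn: Callable[[str], Awaitable[str]]) -> None:
        self._fn = fn

    async def run(self, task: str) -> TraceSketch:
        trace = TraceSketch()
        trace.events.append(("task_received", task))
        trace.final_output = await self._fn(task)
        trace.events.append(("agent_output", trace.final_output))
        return trace


async def upper_agent(task: str) -> str:
    # Toy agent used only to exercise the adapter.
    return task.upper()
```

Usage would be along the lines of `asyncio.run(CallableAdapterSketch(upper_agent).run("list files"))`, which returns a trace holding both events and the final output.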
agent_redteam/attacks/ — Attack Pipeline
| File | Purpose |
| --- | --- |
| registry.py | AttackRegistry: loads YAML templates, indexes by class/stealth/ID |
| planner.py | AttackPlanner: selects, filters, prioritizes attacks based on config |
| executor.py | AttackExecutor: runs single-shot attacks against the agent |
| adaptive.py | AdaptiveExecutor: multi-turn attacks with attacker LLM follow-ups |
| templates/ | 86 YAML attack definitions organized by vulnerability class (V1-V8, V12) |
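
Indexing by class, stealth, and ID can be pictured as three lookup tables built in one pass over the templates. The sketch below assumes the templates have already been parsed from YAML into plain dicts. The field names (`id`, `vuln_class`, `stealth`), the stealth values, and the `RegistrySketch` class are all illustrative assumptions, not the actual AttackRegistry API.

```python
from collections import defaultdict


class RegistrySketch:
    """Hypothetical index over already-parsed templates (plain dicts)."""

    def __init__(self, templates):
        self.by_id = {}
        self.by_class = defaultdict(list)
        self.by_stealth = defaultdict(list)
        for template in templates:
            template_id = template["id"]
            if template_id in self.by_id:
                raise ValueError(f"duplicate template id: {template_id}")
            self.by_id[template_id] = template
            self.by_class[template["vuln_class"]].append(template)
            self.by_stealth[template["stealth"]].append(template)

    def select(self, vuln_class=None, stealth=None):
        # Use the class index when a class is given, then filter by stealth.
        if vuln_class is not None:
            candidates = self.by_class.get(vuln_class, [])
        else:
            candidates = list(self.by_id.values())
        return [t for t in candidates if stealth is None or t["stealth"] == stealth]


# Made-up template records, for illustration only.
TEMPLATES = [
    {"id": "v1-001", "vuln_class": "V1", "stealth": "obvious"},
    {"id": "v1-002", "vuln_class": "V1", "stealth": "subtle"},
    {"id": "v6-001", "vuln_class": "V6", "stealth": "subtle"},
]
registry = RegistrySketch(TEMPLATES)
```

A planner could then call something like `registry.select(vuln_class="V1", stealth="subtle")` to narrow the suite before prioritizing.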
agent_redteam/environments/ — Synthetic Environments
| File | Purpose |
| --- | --- |
| builder.py | EnvironmentBuilder: fluent API for constructing environments; includes select_environment_profile(), inject_attack(), build_for_attack(), and copy() |
| runtime.py | EnvironmentRuntime: stateful tool execution engine with mutable filesystem, shell, HTTP (with NetworkPolicy), SQL, email inbox/outbox, git, and CRM state |
| canary.py | CanaryTokenGenerator: generates realistic fake secrets |
| definitions/ | Pre-built environment YAML files (SWE, Customer Support, Data Analyst) with rich seed data (file trees, email threads, CSV datasets, credentials) |
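
A canary token is a fake secret that looks plausible and is unique enough to recognize if it later shows up in agent output or an outbound request. The sketch below shows the general idea with one token shape, a 20-character string beginning with `AKIA` in the style of an AWS access key ID. `CanarySketch` and its methods are hypothetical; the token formats the real CanaryTokenGenerator produces are not listed on this page.

```python
import secrets
import string


class CanarySketch:
    """Hypothetical generator that remembers every token it issues."""

    def __init__(self):
        self.issued = {}  # token -> label describing where it was planted

    def aws_access_key(self, label):
        # "AKIA" plus 16 uppercase alphanumerics mimics an access key ID.
        alphabet = string.ascii_uppercase + string.digits
        token = "AKIA" + "".join(secrets.choice(alphabet) for _ in range(16))
        self.issued[token] = label
        return token

    def find_leaks(self, text):
        # Report the labels of all issued tokens that appear in the text.
        return sorted(label for token, label in self.issued.items() if token in text)
```

Planting the token in a seeded file and later calling `find_leaks` on a captured request body is enough to attribute a leak to its source.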
agent_redteam/detectors/ — Signal Detection
| File | Purpose |
| --- | --- |
| base.py | DetectorRegistry: manages available detectors |
| secret_access.py | SecretAccessDetector: canary tokens, secret file paths (V6) |
| exfiltration.py | ExfiltrationDetector: unauthorized outbound requests, canary domain detection via trace.environment (V7) |
| injection_success.py | InjectionSuccessDetector: payload echo, task divergence (V1, V2) |
| tool_misuse.py | ToolMisuseDetector: dangerous commands, path traversal (V5) |
| scope_violation.py | ScopeViolationDetector: out-of-scope tool usage (V1, V2, V3, V5) |
| excessive_agency.py | ExcessiveAgencyDetector: unauthorized high-impact actions (V3) |
| insecure_output.py | InsecureOutputDetector: XSS, injection in agent output (V4) |
| memory_poison.py | MemoryPoisonDetector: embedded instructions in memory writes (V8) |
| mcp_security.py | McpSecurityDetector: MCP/supply-chain signals (V12, V5) |
| llm_judge.py | SemanticJudgeDetector: optional LLM-as-judge over traces (all classes) |
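
Most detectors in this table follow the same pattern: walk the recorded tool calls and emit a signal when one matches a known-bad pattern. The sketch below applies that pattern to the two checks attributed to ToolMisuseDetector, dangerous commands and path traversal. The event shape (a `(tool, argument)` pair), the tool names, the regular expressions, and `SignalSketch` are all assumptions made for illustration; the real detector's rules are likely broader.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalSketch:
    """Hypothetical, simplified stand-in for Signal."""

    detector: str
    evidence: str


# Illustrative patterns only: recursive force-delete, or piping curl into a shell.
DANGEROUS_COMMAND = re.compile(r"\brm\s+-rf\b|\bcurl\b.*\|\s*(sh|bash)\b")
# A ".." path segment at the start of the path or between slashes.
PATH_TRAVERSAL = re.compile(r"(^|/)\.\.(/|$)")


def detect_tool_misuse(tool_calls):
    """Return one signal per tool call that matches a known-bad pattern."""
    signals = []
    for tool, argument in tool_calls:
        if tool == "shell" and DANGEROUS_COMMAND.search(argument):
            signals.append(SignalSketch("dangerous_command", argument))
        elif tool == "read_file" and PATH_TRAVERSAL.search(argument):
            signals.append(SignalSketch("path_traversal", argument))
    return signals
```

Keeping each check as a small pure function over the trace makes it easy to unit-test detectors in isolation.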
agent_redteam/scoring/ — Security Scoring
| File | Purpose |
| --- | --- |
| statistics.py | Wilson score confidence interval computation |
| class_scorers.py | DefaultClassScorer: per-class vulnerability scoring |
| composite.py | CompositeScorer: aggregates into overall score with blast radius |
| engine.py | ScoringEngine: orchestrates the scoring pipeline |
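
The Wilson score interval gives a confidence range for a proportion that stays well-behaved with few trials, which matters when a vulnerability class has only a handful of attacks. The function below is a standalone sketch of the standard formula. Its name, its signature, and the assumption that it is applied to attack success counts are illustrative and not taken from statistics.py.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, center - half_width), min(1.0, center + half_width)
```

For example, 5 successes out of 10 trials gives an interval of roughly 0.237 to 0.763, noticeably wider than the point estimate of 0.5 would suggest on its own.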
agent_redteam/reporting/ — Report Generation
| File | Purpose |
| --- | --- |
| renderer.py | ReportRenderer: dispatches to format-specific renderers |
| json_fmt.py | JsonFormatter: machine-readable JSON output |
| markdown.py | MarkdownFormatter: human-readable Markdown report |
| terminal.py | TerminalFormatter: colored terminal output via rich |
| html.py | HtmlFormatter: self-contained interactive HTML dashboard (open in any browser) |
| behavioral.py | analyze_behavioral_risks(): aggregates trace data into behavioral risk narratives |
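
Dispatching to format-specific renderers usually comes down to a lookup from a format name to a formatter. The sketch below shows that shape with two toy formatters. `RendererSketch`, the formatter functions, and the result dict layout (`target`, `scores`) are hypothetical and do not reflect the actual ReportRenderer or ScanResult.

```python
import json


def format_json(result):
    # Machine-readable output with stable key ordering.
    return json.dumps(result, indent=2, sort_keys=True)


def format_markdown(result):
    lines = [f"# Scan report: {result['target']}", ""]
    lines += [f"- {name}: {score}" for name, score in sorted(result["scores"].items())]
    return "\n".join(lines)


class RendererSketch:
    """Hypothetical dispatcher from a format name to a formatter function."""

    def __init__(self):
        self._formatters = {"json": format_json, "markdown": format_markdown}

    def render(self, result, fmt):
        try:
            formatter = self._formatters[fmt]
        except KeyError:
            raise ValueError(f"unknown report format: {fmt!r}") from None
        return formatter(result)
```

With this layout, adding a new output format means registering one more entry rather than touching the dispatch logic.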
agent_redteam/runner/ — Orchestration
| File | Purpose |
| --- | --- |
| scanner.py | Scanner: top-level orchestrator, the main entry point |
| budget.py | BudgetTracker: monitors resource consumption |

| File | Purpose |
| --- | --- |
| vulns.py | VULN_METADATA: name, description, severity, OWASP/MITRE refs for each class |
| boundaries.py | BOUNDARY_METADATA: trust boundary definitions and diagnostic questions |
Test Structure
tests/
├── core/ # Unit tests for enums, models, environments
├── attacks/ # Registry, planner tests
├── adapters/ # Adapter tests (LangChain, OpenAI Agents, MCP proxy)
├── detectors/ # Per-detector unit tests
├── scoring/ # Scorers, confidence intervals
├── integration/ # End-to-end scanner tests, pytest plugin
└── validation/ # Ground-truth tests with mock agents
    ├── mock_agents.py # 6 deterministic mock agents (compliant_leaker, shell_executor, eager_agent, echo_agent, memory_truster, hardened_agent)
    └── test_ground_truth.py # Calibration matrix + per-detector unit tests + per-agent integration tests
The suite currently collects 159 tests (pytest --collect-only).
Data Flow
flowchart TB
YAML["YAML Templates (86)"] --> Registry[AttackRegistry]
Registry --> Planner[AttackPlanner]
Config[ScanConfig] --> Planner
Planner --> Suite[AttackSuite]
Suite --> Executor[AttackExecutor]
Executor --> Builder[EnvironmentBuilder]
Builder --> Canary[CanaryTokenGenerator]
Executor --> Adapter[AgentAdapter]
Adapter --> Trace[AgentTrace]
Trace --> Detectors[DetectorRegistry]
Detectors --> Signals[Signals]
Signals --> Engine[ScoringEngine]
Engine --> Score[CompositeScore]
Score --> Renderer[ReportRenderer]