Project Structure¶

This page explains what each module does and where to find things.

Top Level¶

agent-vulnerability-research/
├── agent_redteam/       # Main package
├── tests/               # Test suite
├── examples/            # Example scripts
├── docs/                # Documentation (this site)
├── pyproject.toml       # Package metadata, dependencies, tool config
├── mkdocs.yml           # Documentation site config
├── README.md            # Landing page
├── LICENSE              # Apache 2.0
├── CONTRIBUTING.md      # Contributor guide
├── SECURITY.md          # Responsible disclosure policy
└── CODE_OF_CONDUCT.md   # Contributor covenant

Package Structure¶

`agent_redteam/core/` — Foundation¶

The core module contains everything that other modules depend on.

File	Purpose
`enums.py`	All StrEnum types: `VulnClass` (17 classes), `TrustBoundary` (7 boundaries), `EventType`, `SignalTier`, `Severity`, `StealthLevel`, `AttackComplexity`, `ScanProfile`, `RiskTier`
`models.py`	All Pydantic v2 models: `AgentTask`, `Event`, `AgentTrace`, `AttackTemplate`, `Attack`, `Signal`, `ScanConfig`, `ScanResult`, `CompositeScore`, etc.
`protocols.py`	Python Protocol interfaces: `AgentAdapter`, `SignalDetector`, `ClassScorer`, `ReportFormatter`
`errors.py`	Custom exception hierarchy: `AgentRedTeamError`, `AdapterError`, `TemplateError`, `ScanError`, `BudgetExhaustedError`
`events.py`	Simple in-process `EventBus` for telemetry

`agent_redteam/adapters/` — Agent Integration¶

File	Purpose
`callable.py`	`CallableAdapter` — wraps any async function, delegates tool execution to `EnvironmentRuntime`
`llm.py`	`LLMAdapter` — wraps OpenAI-compatible endpoints with a minimal ReAct loop
`langchain.py`	`LangChainAdapter` — wraps LangChain AgentExecutor/LangGraph via callbacks; re-exports `wrap_tools_with_canaries()`
`openai_agents.py`	`OpenAIAgentsAdapter` — wraps OpenAI Agents SDK via RunHooks
`http.py`	`HttpAdapter` — wraps any agent exposed over HTTP, with tool-call extraction from structured JSON or free text
`canary_wrapper.py`	`CanaryInjector` + framework-specific wrappers (`wrap_langchain_tools`, `wrap_openai_agent_tools`, `wrap_callable_tools`)
`mcp_proxy.py`	`McpProxyAdapter` — stdio MCP proxy with optional description/response injection

`agent_redteam/attacks/` — Attack Pipeline¶

File	Purpose
`registry.py`	`AttackRegistry` — loads YAML templates, indexes by class/stealth/ID
`planner.py`	`AttackPlanner` — selects, filters, prioritizes attacks based on config
`executor.py`	`AttackExecutor` — runs single-shot attacks against the agent
`adaptive.py`	`AdaptiveExecutor` — multi-turn attacks with attacker LLM follow-ups
`templates/`	86 YAML attack definitions organized by vulnerability class (V1-V8, V12)

`agent_redteam/environments/` — Synthetic Environments¶

File	Purpose
`builder.py`	`EnvironmentBuilder` — fluent API for constructing environments; includes `select_environment_profile()`, `inject_attack()`, `build_for_attack()`, and `copy()`
`runtime.py`	`EnvironmentRuntime` — stateful tool execution engine with mutable filesystem, shell, HTTP (with `NetworkPolicy`), SQL, email inbox/outbox, git, and CRM state
`canary.py`	`CanaryTokenGenerator` — generates realistic fake secrets
`definitions/`	Pre-built environment YAML files (SWE, Customer Support, Data Analyst) with rich seed data (file trees, email threads, CSV datasets, credentials)

`agent_redteam/detectors/` — Signal Detection¶

File	Purpose
`base.py`	`DetectorRegistry` — manages available detectors
`secret_access.py`	`SecretAccessDetector` — canary tokens, secret file paths (V6)
`exfiltration.py`	`ExfiltrationDetector` — unauthorized outbound requests, canary domain detection via `trace.environment` (V7)
`injection_success.py`	`InjectionSuccessDetector` — payload echo, task divergence (V1, V2)
`tool_misuse.py`	`ToolMisuseDetector` — dangerous commands, path traversal (V5)
`scope_violation.py`	`ScopeViolationDetector` — out-of-scope tool usage (V1, V2, V3, V5)
`excessive_agency.py`	`ExcessiveAgencyDetector` — unauthorized high-impact actions (V3)
`insecure_output.py`	`InsecureOutputDetector` — XSS, injection in agent output (V4)
`memory_poison.py`	`MemoryPoisonDetector` — embedded instructions in memory writes (V8)
`mcp_security.py`	`McpSecurityDetector` — MCP/supply-chain signals (V12, V5)
`llm_judge.py`	`SemanticJudgeDetector` — optional LLM-as-judge over traces (all classes)

`agent_redteam/scoring/` — Security Scoring¶

File	Purpose
`statistics.py`	Wilson score confidence interval computation
`class_scorers.py`	`DefaultClassScorer` — per-class vulnerability scoring
`composite.py`	`CompositeScorer` — aggregates into overall score with blast radius
`engine.py`	`ScoringEngine` — orchestrates the scoring pipeline

`agent_redteam/reporting/` — Report Generation¶

File	Purpose
`renderer.py`	`ReportRenderer` — dispatches to format-specific renderers
`json_fmt.py`	`JsonFormatter` — machine-readable JSON output
`markdown.py`	`MarkdownFormatter` — human-readable Markdown report
`terminal.py`	`TerminalFormatter` — colored terminal output via rich
`html.py`	`HtmlFormatter` — self-contained interactive HTML dashboard (open in any browser)
`behavioral.py`	`analyze_behavioral_risks()` — aggregates trace data into behavioral risk narratives

`agent_redteam/runner/` — Orchestration¶

File	Purpose
`scanner.py`	`Scanner` — top-level orchestrator, the main entry point
`budget.py`	`BudgetTracker` — monitors resource consumption

`agent_redteam/taxonomy/` — Vulnerability Metadata¶

File	Purpose
`vulns.py`	`VULN_METADATA` — name, description, severity, OWASP/MITRE refs for each class
`boundaries.py`	`BOUNDARY_METADATA` — trust boundary definitions and diagnostic questions

Test Structure¶

tests/
├── core/               # Unit tests for enums, models, environments
├── attacks/            # Registry, planner tests
├── adapters/           # Adapter tests (LangChain, OpenAI Agents, MCP proxy)
├── detectors/          # Per-detector unit tests
├── scoring/            # Scorers, confidence intervals
├── integration/        # End-to-end scanner tests, pytest plugin
└── validation/             # Ground-truth tests with mock agents
    ├── mock_agents.py      # 6 deterministic mock agents (compliant_leaker, shell_executor, eager_agent, echo_agent, memory_truster, hardened_agent)
    └── test_ground_truth.py  # Calibration matrix + per-detector unit tests + per-agent integration tests

The suite currently collects 159 tests (pytest --collect-only).

Data Flow¶

flowchart TB
    YAML["YAML Templates (86)"] --> Registry[AttackRegistry]
    Registry --> Planner[AttackPlanner]
    Config[ScanConfig] --> Planner
    Planner --> Suite[AttackSuite]
    Suite --> Executor[AttackExecutor]
    Executor --> Builder[EnvironmentBuilder]
    Builder --> Canary[CanaryTokenGenerator]
    Executor --> Adapter[AgentAdapter]
    Adapter --> Trace[AgentTrace]
    Trace --> Detectors[DetectorRegistry]
    Detectors --> Signals[Signals]
    Signals --> Engine[ScoringEngine]
    Engine --> Score[CompositeScore]
    Score --> Renderer[ReportRenderer]

Project Structure¶

Top Level¶

Package Structure¶

agent_redteam/core/ — Foundation¶

agent_redteam/adapters/ — Agent Integration¶

agent_redteam/attacks/ — Attack Pipeline¶

agent_redteam/environments/ — Synthetic Environments¶

agent_redteam/detectors/ — Signal Detection¶

agent_redteam/scoring/ — Security Scoring¶

agent_redteam/reporting/ — Report Generation¶

agent_redteam/runner/ — Orchestration¶

agent_redteam/taxonomy/ — Vulnerability Metadata¶