Configuration

ScanConfig

The ScanConfig object controls what gets tested and how. Use factory methods for common profiles or build a custom config.

Quick Scan (Default)

Fast smoke test that runs a small subset of attacks:

config = ScanConfig.quick(
    agent_capabilities=capabilities,
    vuln_classes=[VulnClass.V1_INDIRECT_INJECTION, VulnClass.V6_SECRET_EXPOSURE],
)

Release Gate

Thorough scan suitable for CI/CD pipelines:

config = ScanConfig.release_gate(
    agent_capabilities=capabilities,
)

Deep Red Team

Comprehensive assessment with all attack classes and multiple trials:

config = ScanConfig.deep_red_team(
    agent_capabilities=capabilities,
)

Profile Comparison

| Setting | Quick | Release Gate | Deep Red Team |
|---|---|---|---|
| Max attacks | 15 | 40 | Unlimited |
| Trials per attack | 1 | 2 | 3 |
| Stealth levels | All | All | All |
| Timeout | 5 min | 15 min | 60 min |

Agent Capabilities

Declare what your agent can do so the planner selects relevant attacks:

from agent_redteam.core.models import AgentCapabilities, ToolCapability
from agent_redteam.core.enums import Severity

capabilities = AgentCapabilities(
    tools=[
        ToolCapability(name="file_read"),
        ToolCapability(name="shell"),
        ToolCapability(name="http_request"),
        ToolCapability(name="send_email"),
    ],
    has_internet_access=True,
    has_memory=False,
    data_sensitivity=Severity.HIGH,
)

Capability-Based Attack Selection

| Capability | Enables Classes |
|---|---|
| Any tools | V3 (Excessive Agency), V5 (Tool Misuse) |
| Tool capabilities named mcp, mcp_tool, or mcp_server | V12 (Supply Chain) |
| has_internet_access | V7 (Data Exfiltration) |
| has_memory | V8 (Memory Poisoning) |
| Always enabled | V1, V2, V4, V6 |
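The selection rules in the table can be sketched as a small function. This is an illustration only, not the planner's actual code: `Tool`, `Caps`, and `enabled_classes` are hypothetical stand-ins for the library's types, and class names are shortened to `V1`, `V2`, and so on.

```python
from dataclasses import dataclass, field


# Hypothetical stand-ins for ToolCapability and AgentCapabilities,
# defined here so the sketch runs without the library installed.
@dataclass
class Tool:
    name: str


@dataclass
class Caps:
    tools: list[Tool] = field(default_factory=list)
    has_internet_access: bool = False
    has_memory: bool = False


MCP_TOOL_NAMES = {"mcp", "mcp_tool", "mcp_server"}


def enabled_classes(caps: Caps) -> list[str]:
    """Return the vulnerability classes the table above would enable."""
    classes = {"V1", "V2", "V4", "V6"}  # always enabled
    if caps.tools:  # any declared tool
        classes |= {"V3", "V5"}
    if any(t.name in MCP_TOOL_NAMES for t in caps.tools):
        classes.add("V12")
    if caps.has_internet_access:
        classes.add("V7")
    if caps.has_memory:
        classes.add("V8")
    # Sort numerically so V12 comes after V8.
    return sorted(classes, key=lambda c: int(c[1:]))


caps = Caps(tools=[Tool("file_read"), Tool("shell")], has_internet_access=True)
print(enabled_classes(caps))
```

With these capabilities the agent has tools and internet access but no memory and no MCP tool, so V8 and V12 are left out.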

Blast Radius

Capabilities also determine the blast radius factor (1.0x–3.0x), which scales the final score. An agent with more powerful capabilities is penalized more heavily for the same vulnerability because the potential damage is greater.

| Factor | Capabilities |
|---|---|
| 1.0x | Read-only tools, no internet |
| 1.5x | File write or shell access |
| 2.0x | Internet access + shell |
| 2.5x–3.0x | Internet + email + database + shell |
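As a rough illustration, the tiers above could be read as a lookup from capability flags to a factor range. The function name and its boolean parameters are hypothetical, and the framework's real calculation may weigh capabilities differently. Combinations the table does not list (for example, internet access without shell) fall through to the lowest matching tier here, which is an assumption.

```python
def blast_radius_range(
    *,
    shell: bool = False,
    file_write: bool = False,
    internet: bool = False,
    email: bool = False,
    database: bool = False,
) -> tuple[float, float]:
    """Return the (low, high) factor range for the first matching tier.

    Tiers are checked from most to least powerful. Unlisted combinations
    fall back to a lower tier, which is an assumption of this sketch.
    """
    if internet and email and database and shell:
        return (2.5, 3.0)
    if internet and shell:
        return (2.0, 2.0)
    if file_write or shell:
        return (1.5, 1.5)
    return (1.0, 1.0)


print(blast_radius_range(shell=True, internet=True))
```

The top tier is returned as a range because the table gives 2.5x–3.0x without saying how a value inside it is chosen.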

Budget Configuration

Control resource consumption:

from agent_redteam.core.models import BudgetConfig

budget = BudgetConfig(
    max_attacks=20,         # Maximum number of attacks to run
    max_api_calls=200,      # Maximum LLM API calls
    max_cost_usd=5.0,       # Maximum estimated cost
    max_duration_seconds=600,  # Maximum scan duration
    trials_per_attack=2,    # Repeat each attack N times
)
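When choosing budget values, it can help to estimate how many attack executions the limits imply. The sketch below assumes that max_attacks counts distinct attacks and that each one is repeated trials_per_attack times; the helper names are hypothetical and the library may account for budgets differently.

```python
def worst_case_runs(max_attacks: int, trials_per_attack: int) -> int:
    """Upper bound on attack executions, assuming every attack runs every trial."""
    return max_attacks * trials_per_attack


def per_run_share(limit: float, max_attacks: int, trials_per_attack: int) -> float:
    """Average share of a budget limit available to each execution."""
    return limit / worst_case_runs(max_attacks, trials_per_attack)


# Values taken from the BudgetConfig example above.
runs = worst_case_runs(max_attacks=20, trials_per_attack=2)
seconds_each = per_run_share(600, 20, 2)   # share of max_duration_seconds
api_calls_each = per_run_share(200, 20, 2)  # share of max_api_calls
print(runs, seconds_each, api_calls_each)
```

If the per-execution share looks too small for your agent (for instance, only a few seconds per attack), raise the limits or lower max_attacks or trials_per_attack.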

Trials for confidence

Running multiple trials per attack (two or three) significantly narrows the confidence interval (CI) on scores. A single trial gives a wide CI; three trials give a much tighter bound.
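To see why more trials tighten the interval, consider a per-attack success rate treated as a binomial proportion. The documentation does not state which interval method the scorer uses; the sketch below uses the Wilson score interval purely as an illustration, and `wilson_interval` is a hypothetical helper.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the edges.
    return (max(0.0, center - half), min(1.0, center + half))


# Same observed outcome (no successful attacks) with 1, 2, and 3 trials.
for n in (1, 2, 3):
    low, high = wilson_interval(0, n)
    print(f"trials={n}: width={high - low:.2f}")
```

The interval width shrinks as the number of trials grows, even when the observed outcome stays the same.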

Vulnerability Class Filtering

Test specific classes only:

config = ScanConfig.quick(
    vuln_classes=[
        VulnClass.V1_INDIRECT_INJECTION,
        VulnClass.V2_DIRECT_INJECTION,
        VulnClass.V3_EXCESSIVE_AGENCY,
        VulnClass.V4_CONFUSED_DEPUTY,
        VulnClass.V5_TOOL_MISUSE,
        VulnClass.V6_SECRET_EXPOSURE,
        VulnClass.V7_DATA_EXFILTRATION,
        VulnClass.V8_MEMORY_POISONING,
        VulnClass.V12_SUPPLY_CHAIN,
    ],
)

Omit the vuln_classes parameter to test all classes relevant to your agent's capabilities.

JudgeConfig (LLM-as-judge)

JudgeConfig configures the optional SemanticJudgeDetector. It is not a field on ScanConfig; pass it to Scanner(adapter, config, judge_config=JudgeConfig(...)).

| Field | Type | Description |
|---|---|---|
| base_url | str | OpenAI-compatible API base URL (default: https://api.openai.com/v1) |
| api_key | str | API key for the judge model |
| model | str | Model name (default: gpt-4o-mini) |
| temperature | float | Sampling temperature (default: 0.0) |
| max_tokens | int | Max tokens for judge completion (default: 1024) |
| evaluation_criteria | list[str] | Rubric dimensions (defaults include compliance, output safety, reasoning integrity, scope adherence) |

When judge_config is omitted, only the nine built-in signal detectors run.

Environment Definitions

The framework includes pre-built environment definitions:

| Environment | Description | Use Case |
|---|---|---|
| swe_agent | Software engineering agent with shell, git, file tools | Testing coding assistants |
| customer_support_agent | CRM, email, knowledge base tools | Testing support bots |
| data_analyst_agent | SQL, file I/O, HTTP, shell tools | Testing data agents |

Automatic Environment Selection

The Scanner automatically selects the best environment profile based on the agent's declared tools via select_environment_profile(agent_capabilities). For example, if your agent declares send_email and lookup_customer tools, the framework selects customer_support_agent; if it declares sql_query or db_query, it selects data_analyst_agent; otherwise it defaults to swe_agent.
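The selection rule described above can be sketched as follows. This is not the library's select_environment_profile; `sketch_select_environment` is a hypothetical stand-in that works on plain tool names. It assumes that both send_email and lookup_customer must be present for the support profile, and that the support check runs before the data analyst check; neither detail is confirmed by the description.

```python
from collections.abc import Iterable


def sketch_select_environment(tool_names: Iterable[str]) -> str:
    """Pick an environment profile name from declared tool names."""
    names = set(tool_names)
    # Assumption: both support tools are required, and this check runs first.
    if {"send_email", "lookup_customer"} <= names:
        return "customer_support_agent"
    # Either database-style tool is enough for the data analyst profile.
    if names & {"sql_query", "db_query"}:
        return "data_analyst_agent"
    # Everything else falls back to the software engineering profile.
    return "swe_agent"


print(sketch_select_environment(["file_read", "shell"]))
print(sketch_select_environment(["sql_query", "http_request"]))
```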

At execution time, each attack template's environment_setup is merged into the base profile via EnvironmentBuilder.inject_attack(), producing an isolated per-attack environment with canary secrets, poisoned data, and network rules.

Full Custom Config

config = ScanConfig(
    profile=ScanProfile.RELEASE_GATE,
    agent_capabilities=capabilities,
    vuln_classes=[VulnClass.V1_INDIRECT_INJECTION],
    budget=BudgetConfig(
        max_attacks=30,
        trials_per_attack=3,
        max_duration_seconds=900,
    ),
)