Getting Started

This guide takes you from zero to your first security scan in under 5 minutes.

Prerequisites

  • Python 3.11+
  • An OpenAI-compatible model endpoint (local or remote) — or your own agent function

Installation

# Clone the repository
git clone https://github.com/saichandrapandraju/agent-redteam.git
cd agent-redteam

# Install with HTTP support (needed for LLMAdapter)
pip install -e ".[http]"

# For terminal reports with colors
pip install -e ".[http,rich]"

# For development (adds pytest, mypy, ruff)
pip install -e ".[dev,http,rich]"

Option A: Scan a Model Endpoint

The fastest path — point at any OpenAI-compatible API and scan it immediately.

1. Set your credentials

Create a .env file (or export environment variables):

BASE_URL=http://localhost:8000/v1
API_KEY=your-api-key
MODEL=your-model-name

2. Run the scan

import asyncio
import os
from dotenv import load_dotenv
from agent_redteam import Scanner, ScanConfig
from agent_redteam.adapters import LLMAdapter

load_dotenv()

async def main():
    adapter = LLMAdapter(
        base_url=os.environ["BASE_URL"],
        api_key=os.environ["API_KEY"],
        model=os.environ["MODEL"],
    )
    config = ScanConfig.quick()
    scanner = Scanner(adapter, config)
    result = await scanner.run()

    # Print markdown report
    print(scanner.report(result, format="markdown"))

    # Save JSON for CI integration
    with open("scan_result.json", "w") as f:
        f.write(scanner.report(result, format="json"))

    # Save interactive HTML report
    with open("scan_report.html", "w") as f:
        f.write(scanner.report(result, format="html"))

asyncio.run(main())

3. Read the results

The scan outputs:

  • Overall score (0–100, higher is more secure)
  • Risk tier (CRITICAL / HIGH / MODERATE / LOW)
  • Per-class scores for each vulnerability category tested
  • Findings with severity, evidence, and mitigation guidance

See Understanding Results for a detailed breakdown.
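Since the JSON report is meant for CI, you can gate a pipeline on it. A minimal sketch — the key names `overall_score` and `risk_tier` are assumptions here; check the schema of your generated scan_result.json before relying on them:

```python
def passes_gate(result: dict, minimum: float = 80.0) -> bool:
    """Pass only if the score clears a threshold and the tier is not CRITICAL.

    NOTE: `overall_score` and `risk_tier` are assumed key names; adjust them
    to match the JSON actually emitted by scanner.report(result, format="json").
    """
    score_ok = result.get("overall_score", 0) >= minimum
    tier_ok = result.get("risk_tier") != "CRITICAL"
    return score_ok and tier_ok
```

In CI, load scan_result.json with the standard-library json module and exit nonzero when the gate fails.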

Option B: Scan Your Own Agent

If you have an agent built with any framework, wrap it with CallableAdapter:

from agent_redteam import Scanner, ScanConfig
from agent_redteam.adapters import CallableAdapter

async def my_agent(task, tools, context):
    """Your agent function — receives a task, tools dict, and context."""
    result = tools["file_read"](path=task.instruction)
    return f"File contents: {result}"

adapter = CallableAdapter(my_agent, name="my-agent")
config = ScanConfig.quick()
result = await Scanner(adapter, config).run()

See Scanning Agents for the full adapter guide.

What Happens During a Scan

sequenceDiagram
    participant You
    participant Scanner
    participant Planner
    participant Executor
    participant Agent
    participant Detectors
    participant Scorer

    You->>Scanner: run()
    Scanner->>Planner: plan(config)
    Planner-->>Scanner: AttackSuite
    loop Each Attack
        Scanner->>Executor: execute(attack, agent)
        Executor->>Agent: run(task, environment)
        Agent-->>Executor: AgentTrace
        Executor->>Detectors: analyze(trace)
        Detectors-->>Executor: Signals
    end
    Scanner->>Scorer: score(results)
    Scorer-->>Scanner: CompositeScore
    Scanner-->>You: ScanResult

  1. The planner selects attacks based on your agent's declared capabilities
  2. The scanner picks a base environment profile (e.g., swe_agent, customer_support_agent, data_analyst_agent) from the agent's declared tools; for each attack, the executor then builds an isolated synthetic environment via build_for_attack() — seeding canary secrets, mock tools, and poisoned data
  3. Your agent runs the task, and all its actions are recorded into an AgentTrace (which includes the Environment used for that run)
  4. Detectors analyze the trace for security signals — using both static rules and the trace's attached environment context (e.g., network rules, canary domains) for accurate detection
  5. The scorer aggregates everything into a composite score with confidence intervals

Next Steps