SCOUT

Firmware-to-Exploit Evidence Engine

Drop a firmware blob. Get SARIF findings, CycloneDX SBOM+VEX, and a hash-anchored evidence chain.

Why SCOUT?

Every finding has a hash-anchored evidence chain.

SCOUT does not emit a finding without a file path, byte offset, SHA-256 hash, and rationale. Artifacts are immutable and traceable from firmware blob to final verdict. No black-box scoring.
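As a rough illustration of what a hash-anchored evidence record could look like, the sketch below binds a file path, byte offset, SHA-256 digest, and rationale together and lets a reviewer re-check the digest later. The `EvidenceRef` class, its field names, and the choice to hash only the referenced byte span are assumptions made for this example, not SCOUT's actual schema.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class EvidenceRef:
    """Hypothetical evidence record; field names are illustrative only."""
    path: str        # file path inside the extracted firmware
    offset: int      # byte offset where the evidence span starts
    length: int      # number of bytes covered by the span
    sha256: str      # hex digest of exactly those bytes
    rationale: str   # why this span supports the finding


def make_evidence(path: str, data: bytes, offset: int, length: int, rationale: str) -> EvidenceRef:
    """Hash data[offset:offset+length] and bind the digest to its location."""
    if offset < 0 or length <= 0 or offset + length > len(data):
        raise ValueError("evidence span falls outside the file")
    digest = hashlib.sha256(data[offset:offset + length]).hexdigest()
    return EvidenceRef(path, offset, length, digest, rationale)


def verify_evidence(ref: EvidenceRef, data: bytes) -> bool:
    """Re-hash the same span and compare it with the recorded digest."""
    span = data[ref.offset:ref.offset + ref.length]
    return hashlib.sha256(span).hexdigest() == ref.sha256
```

Because the record is frozen and the digest covers the exact span, any later change to those bytes makes `verify_evidence` return `False`.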

Static-only findings capped at 0.60 -- we don't inflate.

If a finding has not been dynamically validated, its confidence is hard-capped at 0.60. Promotion to `confirmed` requires at least one dynamic verification artifact. Honest confidence beats high numbers.
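The cap described above can be sketched in a few lines. The function name and the `dynamic_artifacts` counter are hypothetical; only the 0.60 ceiling and the "at least one dynamic verification artifact" condition come from the text.

```python
STATIC_CONFIDENCE_CAP = 0.60  # ceiling stated above for static-only findings


def capped_confidence(raw: float, dynamic_artifacts: int) -> float:
    """Clamp to [0, 1], then apply the static-only cap when no dynamic artifact exists."""
    value = max(0.0, min(1.0, raw))
    if dynamic_artifacts < 1:
        return min(value, STATIC_CONFIDENCE_CAP)
    return value
```

A raw score of 0.95 with no dynamic artifact comes out as 0.60, while the same score with one dynamic artifact is left unchanged.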

SARIF + CycloneDX VEX + SLSA provenance -- not another custom format.

Findings export to SARIF 2.1.0 for GitHub Code Scanning and VS Code. The SBOM ships as CycloneDX 1.6 with embedded Vulnerability Exploitability eXchange (VEX) data. Analysis artifacts carry SLSA Level 2 in-toto attestations.

What's New

Feature	Description
SARIF 2.1.0 Export	Standard findings output for GitHub Code Scanning, VS Code SARIF Viewer, and CI/CD integration
CycloneDX VEX	Vulnerability Exploitability eXchange states (exploitable / affected / not_affected) embedded in SBOM
Precise .dynstr Detection	ELF dynamic import table parsing replaces naive byte-scan; FORTIFY_SOURCE coverage detection
40+ SBOM Signatures	wolfSSL, mbedTLS, GoAhead, miniUPnPd, SQLite, U-Boot, lighttpd, and 30+ more (up from 8)
Ghidra Headless Scripts	4 analysis scripts: `decompile_all`, `xref_graph`, `dataflow_trace`, `string_refs`
AFL++ Performance	CMPLOG, persistent mode, NVRAM faker, multi-instance campaigns, `AFL_ENTRYPOINT` support
Reachability-Aware CVE	CVE confidence auto-adjusted by BFS network reachability analysis
SLSA L2 Provenance	in-toto attestation for analysis artifacts, cosign-ready verification
Benchmark Runner	Corpus-based quality measurement with precision / recall / FPR tracking
Quality Gate Overrides	Configurable thresholds via environment variables for CI/CD pipelines
GitHub Actions CI	Automated pytest (Python 3.10-3.12), ruff lint, and pyright type checking on every push/PR
Findings SHA-256 Manifest	`stages/findings/stage.json` now carries per-artifact SHA-256 hashes for full evidence chain coverage
Handoff Validation	`firmware_handoff.json` is validated via `validate_handoff()` before write -- missing keys are caught early
Exploit Stage Isolation	Each exploit stage has independent import error handling; a single missing dependency no longer skips all five
v2.0: 8 New Analysis Stages	Enhanced source detection, semantic classification, taint propagation, FP verification, adversarial triage, PoC refinement, chain construction, C-source identification (34 -> 42 stages)
v2.1: Known CVE Signatures	`known_cve_signatures.py`: 13 CVE patterns (NETGEAR, D-Link, Linksys, ASUS, TP-Link, TRENDnet, Zyxel, Belkin) -- vendor/model/binary matching without SBOM
v2.1: Web Server Auto-Detection	`enhanced_source.py` auto-identifies httpd/lighttpd/boa binaries; HTTP input sources classified as `source_type: "http_input"` for prioritized taint analysis
v2.1: Ghidra Auto-Detection	`./scout` wrapper and `ghidra_bridge.py` probe `/opt/ghidra_`, `/usr/local/ghidra`, `/usr/share/ghidra*` -- `AIEDGE_GHIDRA_HOME` no longer required
v2.0: CLI Modularization	`__main__.py` split from ~4500 lines into 7 focused modules (~660 lines entry point)
v2.0: FirmAE Benchmarking	`benchmark_firmae.sh` for SCOUT vs FirmAE comparison; `unpack_firmae_dataset.sh` for dataset classification
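The "Reachability-Aware CVE" row above mentions BFS network reachability. A minimal sketch of that idea follows, assuming a component graph stored as an adjacency dict and a list of network-facing entry points. The graph shape, function names, and the 0.5 / 0.8 multipliers are illustrative assumptions, not SCOUT's actual values.

```python
from collections import deque


def bfs_hops(graph: dict[str, list[str]], entry_points: list[str]) -> dict[str, int]:
    """Return the minimum hop count from any entry point to each reachable node."""
    hops = {node: 0 for node in entry_points}
    queue = deque(entry_points)
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in hops:          # first visit is the shortest path in BFS
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    return hops


def adjust_cve_confidence(base: float, component: str, hops: dict[str, int]) -> float:
    """Illustrative weighting only: the multipliers are assumptions."""
    if component not in hops:
        return round(base * 0.5, 2)   # no path from the network was found
    if hops[component] == 0:
        return base                   # directly network-facing
    return round(base * 0.8, 2)       # reachable through other components
```

For example, a library that is only loaded by a binary with no network exposure would see its CVE confidence reduced, while the same CVE in a listening daemon keeps its base value.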

How It Works

  1. Drop            2. Analyze              3. Collect               4. Review
  ─────────          ──────────              ──────────               ────────
  firmware.bin  ──>  42-stage pipeline  ──>  SARIF findings      ──>  Web viewer
                     runs automatically      CycloneDX SBOM+VEX      VS Code (SARIF)
                                             Evidence chain           GitHub Code Scanning
                                             SLSA attestation         TUI dashboard

Step 1 -- Point SCOUT at any firmware blob (or pre-extracted rootfs).

Step 2 -- The 42-stage pipeline runs end-to-end: unpacking, profiling, binary analysis, enhanced source detection, semantic classification, C-source identification, SBOM generation, CVE scanning, reachability analysis, taint propagation, FP verification, adversarial triage, security assessment, attack surface mapping, exploit chain construction, PoC refinement, optional Ghidra decompilation, optional AFL++ fuzzing.

Step 3 -- Outputs land in a structured run directory: SARIF 2.1.0 findings, CycloneDX 1.6 SBOM with VEX annotations, hash-anchored evidence chain, SLSA L2 provenance attestation, and executive Markdown report.

Step 4 -- Review results in the built-in web viewer, import SARIF into VS Code or GitHub Code Scanning, query artifacts via MCP server from Claude Code/Desktop, or inspect via TUI dashboard.
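For a quick look at a run's SARIF output without opening a viewer, a small script along these lines can count results per severity level. It relies only on the standard SARIF 2.1.0 layout (`runs[].results[].level`); the helper names are made up for this example, and a result with no explicit `level` is counted as `warning`.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_sarif(sarif: dict) -> Counter:
    """Count results per SARIF level across all runs."""
    counts: Counter = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("level", "warning")] += 1
    return counts


def load_sarif(path: Path) -> dict:
    """Load a SARIF log from disk as a plain dict."""
    return json.loads(path.read_text(encoding="utf-8"))
```

A typical call would be `summarize_sarif(load_sarif(Path("aiedge-runs/<run_id>/stages/findings/sarif.json")))`, with `<run_id>` replaced by a real run directory.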

Quick Start

# Full analysis (all features enabled by default)
./scout analyze firmware.bin

# Deterministic only (no LLM)
./scout analyze firmware.bin --no-llm

# Pre-extracted rootfs (bypasses weak unpacking)
./scout analyze firmware.img --rootfs /path/to/extracted/rootfs

# Analysis-only profile (no exploit chain)
./scout analyze firmware.bin --profile analysis --no-llm

# SARIF export for CI/CD
./scout analyze firmware.bin --no-llm
# -> aiedge-runs/<run_id>/stages/findings/sarif.json

# MCP server for AI agents
./scout mcp --project-id aiedge-runs/<run_id>

# Web viewer
./scout serve aiedge-runs/<run_id> --port 8080

Comparison

Feature	SCOUT	EMBA	FACT	FirmAE
SBOM (CycloneDX 1.6)	Yes + VEX	Yes	No	No
SARIF 2.1.0 Export	Yes	No	No	No
Hash-Anchored Evidence Chain	Yes	No	No	No
SLSA L2 Provenance	Yes	No	No	No
Reachability-Aware CVE	Yes	No	No	No
Confidence Caps (honest scoring)	Yes	No	No	No
Ghidra Headless Integration	Yes	Yes	No	No
AFL++ Fuzzing Pipeline	Yes	No	No	No
3-Tier Emulation	Yes	Partial	No	Yes
MCP Server (AI agent integration)	Yes	No	No	No
LLM Triage + Synthesis	Yes	No	No	No
Web Report Viewer	Yes	Yes	Yes	No
Adversarial FP Reduction	Yes	No	No	No
Taint Propagation (LLM)	Yes	No	No	No
Zero pip Dependencies	Yes	No	No	No

Key Features

	Feature	Description
📦	SBOM & CVE	CycloneDX 1.6 SBOM (40+ signatures) + NVD API 2.0 CVE scanning with VEX and reachability-aware confidence
🔍	Binary Analysis	ELF hardening audit (NX/PIE/RELRO/Canary) + precise `.dynstr` symbol detection + FORTIFY_SOURCE + optional Ghidra headless decompilation
🎯	Attack Surface	Source-to-sink tracing, IPC detection (5 types), credential auto-mapping
🛡️	Security Assessment	X.509 certificate scanning, boot service auditing, filesystem permission checks
🧪	Fuzzing (optional)	AFL++ pipeline with CMPLOG, persistent mode, NVRAM faker, binary scoring, harness generation, crash triage -- requires Docker + AFL++ image
🐛	Emulation	3-tier (FirmAE / QEMU user-mode / rootfs inspection) + GDB remote debugging
🤖	MCP Server	12 tools exposed via Model Context Protocol for Claude Code/Desktop integration
🧠	LLM Drivers	Codex CLI + Claude API + Ollama -- with cost tracking and budget limits
📊	Web Viewer	Glassmorphism dashboard with KPI bar, IPC map, risk heatmap, graph visualization
🔗	Evidence Chain	Hash-anchored artifacts, confidence caps, exploit tiering, verified chain gating
📜	SARIF Export	SARIF 2.1.0 findings for GitHub Code Scanning, VS Code SARIF Viewer, CI/CD
🔒	SLSA Provenance	Level 2 in-toto attestation for analysis artifacts, cosign-ready
📋	Executive Reports	Auto-generated Markdown reports with top risks, SBOM/CVE tables, attack surface
🔄	Firmware Diff	Compare two analysis runs -- filesystem, hardening, and config security changes
📈	Benchmark Runner	Corpus-based quality measurement with precision/recall/FPR tracking
🔌	Cross-Binary IPC Chains	5 IPC types (unix_socket, dbus, shm, pipe, exec_chain); shared `.rodata` string-based cross-binary communication detection
🏷️	Known CVE Signatures	13 built-in CVE patterns (NETGEAR, D-Link, Linksys, ASUS, TP-Link, TRENDnet, Zyxel, Belkin) matched by vendor/model/binary without SBOM

Pipeline (42 Stages)

Firmware --> Unpack --> Profile --> Inventory --> [Ghidra] --> Semantic Classification
    --> SBOM --> CVE Scan --> Reachability --> Endpoints --> Surfaces
    --> Enhanced Source --> C-Source Identification --> Taint Propagation
    --> FP Verification --> Adversarial Triage
    --> Security Assessment --> Graph --> Attack Surface --> Findings
    --> LLM Triage --> LLM Synthesis --> Emulation (3-tier) --> [Fuzzing]
    --> PoC Refinement --> Chain Construction --> Exploit Chain --> PoC --> Verification

New in v2.0: enhanced_source, semantic_classification, taint_propagation, fp_verification, adversarial_triage, poc_refinement, chain_construction, csource_identification.

v2.0 Stage Details:

Stage	Module	Purpose	LLM?	Cost
`enhanced_source`	`enhanced_source.py`	Web server auto-detection + INPUT_APIS scan (21 APIs)	No	$0
`semantic_classification`	`semantic_classifier.py`	3-pass function classifier (static, haiku, sonnet)	Yes	Low
`taint_propagation`	`taint_propagation.py`	HTTP-aware inter-procedural taint with call chain	Yes	Medium
`fp_verification`	`fp_verification.py`	3-pattern FP removal (sanitizer/non-propagating/sysfile)	No	$0
`adversarial_triage`	`adversarial_triage.py`	Advocate/Critic LLM debate for FPR reduction	Yes	Medium
`poc_refinement`	`poc_refinement.py`	Iterative PoC generation from fuzzing seeds (5 attempts)	Yes	Medium
`chain_construction`	`chain_constructor.py`	Same-binary + cross-binary IPC exploit chains	No	$0
`csource_identification`	`csource_identification.py`	HTTP input source identification via static sentinel + QEMU	No	$0

Stages in [brackets] require optional external tools (Ghidra, AFL++/Docker).

Architecture

+------------------------------------------------------------------+
|                      SCOUT (Evidence Engine)                      |
|                                                                   |
|  Firmware --> Unpack --> Profile --> Inventory --> SBOM --> CVE    |
|                                      (+ hardening)  (NVD 2.0)    |
|                                                         |         |
|  --> Security Assessment --> Surfaces --> Reachability --> Find    |
|      (cert/init/fs-perm)                 (BFS graph)              |
|                                                                   |
|  --> [Ghidra] --> LLM Triage --> LLM Synthesis                    |
|  --> Emulation --> [Fuzzing] --> Exploit --> PoC --> Verify        |
|                                                                   |
|  42 stages . stage.json manifests . SHA-256 hashed artifacts      |
|  Outputs: SARIF 2.1.0 + CycloneDX 1.6+VEX + SLSA L2 provenance  |
+------------------------------------------------------------------+
|                   Handoff (firmware_handoff.json)                  |
+------------------------------------------------------------------+
|                    Terminator (Orchestrator)                       |
|  Tribunal --> Validator --> Exploit Dev --> Verified Chain         |
|  (LLM judge)  (emulation)   (lab-gated)    (dynamic evidence)    |
+------------------------------------------------------------------+

Layer	Role	Deterministic?
SCOUT	Evidence production (extraction, profiling, inventory, surfaces, findings)	Yes
Handoff	JSON contract between engine and orchestrator	Yes
Terminator	LLM tribunal, dynamic validation, exploit development, report promotion	No (auditable)

Exploit Promotion Policy

Iron rule: no Confirmed without dynamic evidence.

Level	Requirements	Placement
`dismissed`	Critic rebuttal strong or confidence < 0.5	Appendix only
`candidate`	Confidence 0.5-0.8, evidence exists but chain incomplete	Report (flagged)
`high_confidence_static`	Confidence >= 0.8, strong static evidence, no dynamic	Report (highlighted)
`confirmed`	Confidence >= 0.8 AND >= 1 dynamic verification artifact	Report (top)
`verified_chain`	Confirmed AND PoC reproduced 3x in sandbox, complete chain	Exploit report
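Read as a decision procedure, the table above could be sketched as follows. The function and parameter names are hypothetical, and treating exactly 0.8 as the lower bound of the upper tiers is an assumption about how the overlapping "0.5-0.8" and ">= 0.8" ranges are resolved.

```python
def promotion_level(
    confidence: float,
    dynamic_artifacts: int = 0,
    poc_reproductions: int = 0,
    chain_complete: bool = False,
    critic_rebuttal_strong: bool = False,
) -> str:
    """Map a finding's evidence to one of the five promotion levels in the table."""
    if critic_rebuttal_strong or confidence < 0.5:
        return "dismissed"
    if confidence < 0.8:
        return "candidate"
    if dynamic_artifacts < 1:
        # Iron rule: nothing is confirmed without dynamic evidence.
        return "high_confidence_static"
    if poc_reproductions >= 3 and chain_complete:
        return "verified_chain"
    return "confirmed"
```

Ordering the checks from weakest to strongest keeps the iron rule explicit: the `confirmed` and `verified_chain` branches are unreachable unless `dynamic_artifacts` is at least 1.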

CLI Reference

Command	Description
`./scout analyze <firmware>`	Full firmware analysis pipeline
`./scout analyze-8mb <firmware>`	Truncated 8MB canonical track
`./scout stages <run_dir>`	Rerun specific stages on existing run
`./scout mcp --project-id <id>`	Start MCP stdio server
`./scout serve <run_dir>`	Launch web report viewer
`./scout tui <run_dir>`	Terminal UI dashboard
`./scout ti`	TUI `--interactive` mode (latest run)
`./scout tw <run_dir> -t 2`	TUI `--watch` mode (auto-refresh)
`./scout to`	TUI `--mode once` (latest run)
`./scout t`	TUI latest run (default mode)
`./scout corpus-validate <run_dir>`	Validate corpus manifest
`./scout quality-metrics <run_dir>`	Compute quality metrics
`./scout quality-gate <run_dir>`	Check quality thresholds
`./scout release-quality-gate <run_dir>`	Unified release gate

Exit codes: 0 success, 10 partial, 20 fatal, 30 policy violation
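A CI wrapper might interpret these exit codes roughly as below. The helper names and the `allow_partial` switch are assumptions made for illustration; only the four code values and their meanings come from the line above.

```python
EXIT_CODES = {0: "success", 10: "partial", 20: "fatal", 30: "policy violation"}


def describe_exit(code: int) -> str:
    """Translate a SCOUT exit code into its documented meaning."""
    return EXIT_CODES.get(code, f"unknown ({code})")


def ci_should_fail(code: int, allow_partial: bool = False) -> bool:
    """Decide whether a CI job should fail for the given exit code."""
    if code == 0:
        return False
    if code == 10 and allow_partial:
        return False   # tolerate partial runs only when explicitly allowed
    return True        # fatal, policy violation, or anything undocumented
```

The code itself can be obtained with `subprocess.run(["./scout", "analyze", "firmware.bin", "--no-llm"]).returncode` and passed to either helper.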

Environment Variables

Core

Variable	Default	Description
`AIEDGE_LLM_DRIVER`	`codex`	LLM provider: `codex` / `claude` / `ollama`
`ANTHROPIC_API_KEY`	--	API key for Claude driver
`AIEDGE_OLLAMA_URL`	`http://localhost:11434`	Ollama server URL
`AIEDGE_LLM_BUDGET_USD`	--	LLM cost budget limit
`AIEDGE_PRIV_RUNNER`	--	Privileged command prefix for dynamic stages
`AIEDGE_FEEDBACK_DIR`	`aiedge-feedback`	Terminator feedback directory

SBOM & CVE

Variable	Default	Description
`AIEDGE_NVD_API_KEY`	--	NVD API key (optional, improves rate limits)
`AIEDGE_NVD_CACHE_DIR`	`aiedge-nvd-cache`	Cross-run NVD response cache
`AIEDGE_SBOM_MAX_COMPONENTS`	`500`	Maximum SBOM components
`AIEDGE_CVE_SCAN_MAX_COMPONENTS`	`50`	Maximum components to CVE-scan
`AIEDGE_CVE_SCAN_TIMEOUT_S`	`30`	Per-request NVD API timeout

LLM Timeouts

Variable	Default	Description
`AIEDGE_LLM_CHAIN_TIMEOUT_S`	`180`	LLM synthesis timeout
`AIEDGE_LLM_CHAIN_MAX_ATTEMPTS`	`5`	LLM synthesis max retries
`AIEDGE_AUTOPOC_LLM_TIMEOUT_S`	`180`	Auto-PoC LLM timeout
`AIEDGE_AUTOPOC_LLM_MAX_ATTEMPTS`	`4`	Auto-PoC max retries

Ghidra

Variable	Default	Description
`AIEDGE_GHIDRA_HOME`	--	Ghidra installation path (auto-detected if not set)
`AIEDGE_GHIDRA_MAX_BINARIES`	`20`	Max binaries to analyze
`AIEDGE_GHIDRA_TIMEOUT_S`	`300`	Per-binary analysis timeout

Fuzzing (AFL++)

Variable	Default	Description
`AIEDGE_AFLPP_IMAGE`	`aflplusplus/aflplusplus`	AFL++ Docker image
`AIEDGE_FUZZ_BUDGET_S`	`3600`	Fuzzing time budget (seconds)
`AIEDGE_FUZZ_MAX_TARGETS`	`5`	Max fuzzing target binaries

Emulation

Variable	Default	Description
`AIEDGE_EMULATION_IMAGE`	`scout-emulation:latest`	Tier 1 Docker image
`AIEDGE_FIRMAE_ROOT`	`/opt/FirmAE`	FirmAE installation path
`AIEDGE_QEMU_GDB_PORT`	`1234`	QEMU GDB remote port

MCP & Port Scanning

Variable	Default	Description
`AIEDGE_MCP_MAX_OUTPUT_KB`	`512`	MCP response max size
`AIEDGE_PORTSCAN_TOP_K`	`1000`	Top-K ports to scan
`AIEDGE_PORTSCAN_WORKERS`	`128`	Concurrent scan workers
`AIEDGE_PORTSCAN_BUDGET_S`	`120`	Port scan time budget

Quality Gate Overrides

Variable	Default	Description
`AIEDGE_QG_PRECISION_MIN`	`0.9`	Minimum precision threshold
`AIEDGE_QG_RECALL_MIN`	`0.6`	Minimum recall threshold
`AIEDGE_QG_FPR_MAX`	`0.1`	Maximum false positive rate
`AIEDGE_QG_ABSTAIN_MAX`	`0.25`	Maximum abstention rate
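To show how these overrides might interact with measured metrics, the sketch below reads the four variables with the defaults from the table and checks confusion-matrix counts against them. The function names are hypothetical, and defining the abstention rate as abstained cases over all cases is an assumption; SCOUT's own definition may differ.

```python
import os
from typing import Mapping

# Defaults copied from the table above.
DEFAULTS = {
    "AIEDGE_QG_PRECISION_MIN": 0.9,
    "AIEDGE_QG_RECALL_MIN": 0.6,
    "AIEDGE_QG_FPR_MAX": 0.1,
    "AIEDGE_QG_ABSTAIN_MAX": 0.25,
}


def load_thresholds(env: Mapping[str, str] = os.environ) -> dict[str, float]:
    """Read each threshold from the environment, falling back to the default."""
    return {name: float(env.get(name, default)) for name, default in DEFAULTS.items()}


def gate_failures(tp: int, fp: int, tn: int, fn: int, abstained: int = 0,
                  env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of metrics that violate their threshold (empty list = pass)."""
    limits = load_thresholds(env)
    total = tp + fp + tn + fn + abstained
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    abstain_rate = abstained / total if total else 0.0  # assumed definition

    failures = []
    if precision < limits["AIEDGE_QG_PRECISION_MIN"]:
        failures.append("precision")
    if recall < limits["AIEDGE_QG_RECALL_MIN"]:
        failures.append("recall")
    if fpr > limits["AIEDGE_QG_FPR_MAX"]:
        failures.append("fpr")
    if abstain_rate > limits["AIEDGE_QG_ABSTAIN_MAX"]:
        failures.append("abstain")
    return failures
```

Passing an explicit `env` mapping keeps the check reproducible in tests, while the default reads the live process environment as a CI pipeline would.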

Run Directory Structure

aiedge-runs/<run_id>/
├── manifest.json
├── firmware_handoff.json
├── provenance.intoto.jsonl          # SLSA L2 attestation
├── input/firmware.bin
├── stages/
│   ├── tooling/
│   ├── extraction/
│   ├── firmware_profile/
│   ├── inventory/
│   │   └── binary_analysis.json     # per-binary hardening data
│   ├── sbom/
│   │   ├── sbom.json                # CycloneDX 1.6 + CPE index
│   │   └── vex.json                 # VEX exploitability annotations
│   ├── cve_scan/
│   │   └── cve_scan.json            # NVD API CVE matches
│   ├── reachability/
│   │   └── reachability.json        # BFS reachability classification
│   ├── surfaces/
│   │   └── source_sink_graph.json
│   ├── ghidra_analysis/             # optional
│   ├── findings/
│   │   ├── stage.json                # SHA-256 manifest (evidence chain)
│   │   ├── pattern_scan.json
│   │   ├── credential_mapping.json
│   │   ├── chains.json
│   │   └── sarif.json               # SARIF 2.1.0 export
│   ├── fuzzing/                     # optional
│   │   └── fuzz_results.json
│   └── graph/
│       └── communication_graph.json
└── report/
    ├── report.json
    ├── analyst_digest.json
    └── executive_report.md
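Given this layout, a reviewer could re-check the SHA-256 manifest in `stages/findings/stage.json` with a script along these lines. The manifest shape used here (an `artifacts` list of objects with `path` and `sha256` keys) is purely an assumption for illustration; the real `stage.json` schema should be consulted before relying on it.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_stage_manifest(stage_dir: Path) -> list[str]:
    """Return artifact paths that are missing or whose hash differs from the manifest."""
    manifest = json.loads((stage_dir / "stage.json").read_text(encoding="utf-8"))
    mismatches = []
    # "artifacts", "path" and "sha256" are assumed key names, not a documented schema.
    for entry in manifest.get("artifacts", []):
        target = stage_dir / entry["path"]
        if not target.is_file() or sha256_file(target) != entry["sha256"]:
            mismatches.append(entry["path"])
    return mismatches
```

An empty return value means every listed artifact still matches its recorded digest, for example after calling `verify_stage_manifest(Path("aiedge-runs/<run_id>/stages/findings"))`.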

Verification Scripts

# Evidence chain integrity
python3 scripts/verify_analyst_digest.py --run-dir aiedge-runs/<run_id>
python3 scripts/verify_verified_chain.py --run-dir aiedge-runs/<run_id>

# Report schema compliance
python3 scripts/verify_aiedge_final_report.py --run-dir aiedge-runs/<run_id>
python3 scripts/verify_aiedge_analyst_report.py --run-dir aiedge-runs/<run_id>

# Security invariants
python3 scripts/verify_run_dir_evidence_only.py --run-dir aiedge-runs/<run_id>
python3 scripts/verify_network_isolation.py --run-dir aiedge-runs/<run_id>
python3 scripts/verify_exploit_meaningfulness.py --run-dir aiedge-runs/<run_id>

# SLSA provenance verification
cosign verify-attestation --type slsaprovenance \
  aiedge-runs/<run_id>/provenance.intoto.jsonl

# Quality gates
./scout quality-gate aiedge-runs/<run_id>
./scout release-quality-gate aiedge-runs/<run_id>

# FirmAE benchmarking (1,124 firmware images)
scripts/benchmark_firmae.sh --parallel 8 --time-budget 300 --cleanup
# Options: --dataset-dir, --results-dir, --parallel N, --time-budget S,
#          --stages STAGES, --max-images N, --8mb, --full, --cleanup, --dry-run
scripts/unpack_firmae_dataset.sh                   # FirmAE dataset classifier

Documentation

Document	Purpose
Blueprint	Full pipeline architecture and design rationale
Status	Current implementation status
Artifact Schema	Profiling + inventory artifact contracts
Adapter Contract	Terminator-SCOUT handoff protocol
Report Contract	Report structure and governance rules
Analyst Digest	Digest schema and verdict semantics
Verified Chain	Evidence requirements for verified chains
Duplicate Gate	Cross-run duplicate suppression rules
Determinism Policy	Replay gate rules and relaxation policy
Quality SLO	Precision, recall, FPR thresholds
Runbook	Operator flow for digest-first review
8MB Track Runbook	8MB truncated track operator guide
Known CVE Ground Truth	Known CVE ground truth for validation
Upgrade Plan v2	Full v2.0 upgrade plan with appendices
LLM Agent Roadmap	LLM integration roadmap and strategy

Security & Ethics

Authorized environments only.

SCOUT is intended for use in controlled environments with proper authorization:

Contracted security audits -- vendor-coordinated firmware assessments
Vulnerability research -- responsible disclosure with coordinated timelines
CTF and training -- designated targets in lab environments

Dynamic validation runs in network-isolated sandbox containers. Exploit profile and lab attestation are enabled by default. No weaponized payloads are included.

Contributing

Contributions are welcome. Before submitting a pull request:

Read Blueprint for architecture context
Run `pytest -q` -- all tests must pass
Lint with `ruff check src/` -- zero lint violations
Type-check with `pyright src/` -- zero type errors
Follow the existing stage protocol (see `Stage` in `src/aiedge/stage.py`)
Zero pip dependencies -- stdlib only for core modules

CI runs these checks automatically on every push and pull request via GitHub Actions.

For new pipeline stages, see the "Adding a New Pipeline Stage" section in CLAUDE.md.

License

MIT

Built for the security research community. Not for unauthorized access.

github.com/R00T-Kim/SCOUT
