Add Autonomous Legal War Game — adversarial multi-agent M&A stress-tester by fmalik100 · Pull Request #116 · docker/welcome-to-docker

fmalik100 · 2026-03-10T21:43:01Z

Summary

Adds legal_warroom/ — a standalone Python tool that stress-tests legal documents (M&A agreements, NDAs, etc.) using an adversarial multi-agent simulation
Multi-round automatic agent interaction — a Plaintiff Agent (Red Team) attacks each clause, a Defense Agent (Blue Team) hardens the language, then the Plaintiff re-attacks the hardened rewrite; this loops until convergence or a configurable round cap
Dual provider support — runs on Anthropic (claude-opus-4-6 with adaptive thinking) or 100% locally via Ollama (no API costs) with any compatible model (qwen2.5:14b, llama3.1:8b, etc.)
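
The attack, harden, re-attack cycle described above can be sketched roughly as follows. This is a minimal illustration, not the code in warroom/loop/adversarial.py: the names `run_adversarial_loop`, `LoopResult`, `stub_attack`, and `stub_harden` are hypothetical, and the stub agents stand in for the LLM-backed Plaintiff and Defense. The defaults (3 rounds, convergence at severity 2 or lower) follow the commit message in this PR.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical signatures: an attack returns (max_severity, findings);
# a hardening step returns the rewritten clause text.
AttackFn = Callable[[str], Tuple[int, List[str]]]
HardenFn = Callable[[str, List[str]], str]


@dataclass
class LoopResult:
    final_clause: str
    severity_trajectory: List[int] = field(default_factory=list)
    converged: bool = False


def run_adversarial_loop(
    clause: str,
    attack: AttackFn,
    harden: HardenFn,
    max_rounds: int = 3,
    convergence_threshold: int = 2,
) -> LoopResult:
    """Attack the current clause, harden it, and repeat until the reported
    severity drops to the threshold or the round cap is reached."""
    result = LoopResult(final_clause=clause)
    for _ in range(max_rounds):
        severity, findings = attack(result.final_clause)
        result.severity_trajectory.append(severity)
        if severity <= convergence_threshold:
            result.converged = True
            break
        result.final_clause = harden(result.final_clause, findings)
    return result


# Stub agents so the sketch runs offline; real agents would call an LLM.
def stub_attack(clause: str) -> Tuple[int, List[str]]:
    severity = max(1, 4 - clause.count("[hardened]"))
    return severity, [f"ambiguity at severity {severity}"]


def stub_harden(clause: str, findings: List[str]) -> str:
    return clause + " [hardened]"


demo = run_adversarial_loop("Seller shall indemnify Buyer.", stub_attack, stub_harden)
print(demo.severity_trajectory, demo.converged)  # prints: [4, 3, 2] True
```

Passing the agents in as plain callables keeps the loop independent of any provider, which is one plausible way to let the Plaintiff and Defense run on different backends.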

Architecture

legal_warroom/
├── main.py                          CLI (typer) — --provider, --model, --rounds, --parallel, --html
├── warroom/
│   ├── providers/
│   │   ├── base.py                  LLMProvider protocol + make_provider() factory
│   │   ├── anthropic_p.py           Anthropic SDK backend
│   │   └── ollama_p.py              Ollama backend (OpenAI-compatible API)
│   ├── agents/
│   │   ├── plaintiff.py             Red Team — hunts ambiguity, liability gaps, edge cases
│   │   └── defense.py               Blue Team — precision redrafting, preserves business intent
│   ├── loop/
│   │   └── adversarial.py           Multi-round loop with convergence detection
│   ├── models/schemas.py            Pydantic structured outputs + IterativeSegmentReport
│   ├── document/processor.py        PDF/TXT ingestion, section-aware segmentation
│   ├── orchestrator.py              Sequential / parallel segment processing
│   └── report/generator.py          Terminal + JSON + HTML report output
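
To make the providers/ layer more concrete, here is one way an `LLMProvider` protocol and a `make_provider()` factory could fit together. Only those two names come from the tree above; the `complete()` method, the `EchoProvider` stand-in, and the `DEFAULT_MODELS` table (including the choice of default model per provider) are assumptions made for illustration, and no real SDK is called.

```python
from typing import Dict, Optional, Protocol


class LLMProvider(Protocol):
    """Structural interface: any object with a matching complete() qualifies.
    The method name and signature are assumptions, not taken from the PR."""

    def complete(self, system: str, prompt: str) -> str: ...


class EchoProvider:
    """Offline stand-in. A real backend would call the Anthropic SDK or an
    Ollama endpoint here instead of echoing the prompt."""

    def __init__(self, name: str, model: str) -> None:
        self.name = name
        self.model = model

    def complete(self, system: str, prompt: str) -> str:
        return f"[{self.name}:{self.model}] {prompt}"


# Assumed defaults for illustration; the PR only says the default
# "varies by provider".
DEFAULT_MODELS: Dict[str, str] = {
    "anthropic": "claude-opus-4-6",
    "ollama": "qwen2.5:14b",
}


def make_provider(provider: str, model: Optional[str] = None) -> LLMProvider:
    """Return a backend for the given provider name, falling back to the
    provider's default model when none is given."""
    if provider not in DEFAULT_MODELS:
        raise ValueError(f"unknown provider: {provider!r}")
    return EchoProvider(provider, model or DEFAULT_MODELS[provider])


plaintiff_llm = make_provider("ollama", "llama3.1:8b")
defense_llm = make_provider("anthropic")
print(plaintiff_llm.complete("You are the Red Team.", "Attack clause 4.2"))
```

Because the protocol is structural, each agent only needs an object with the right method, so mixing providers per agent does not require a shared base class.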

Test plan

python main.py sample.txt --max-segments 1 — single segment dry run (Anthropic)
python main.py sample.txt --provider ollama --model llama3.1:8b --max-segments 1 — Ollama dry run
python main.py sample.txt --rounds 2 --html — multi-round with HTML report
python main.py sample.txt --parallel --max-segments 3 — parallel processing
Verify JSON report written to output/
Verify exit code 2 when CRITICAL vulnerabilities remain
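
The exit-code check in the last step might be backed by logic along these lines. This is a sketch under stated assumptions: the PR does not say which severity counts as CRITICAL, so `CRITICAL_SEVERITY = 5` (the top of the 1-5 scale) and the helper name `exit_code_for` are hypothetical, as is the use of 0 for a clean run.

```python
from typing import Iterable

# Assumption: only the top of the 1-5 severity scale counts as CRITICAL.
CRITICAL_SEVERITY = 5


def exit_code_for(final_severities: Iterable[int]) -> int:
    """Return 2 if any segment still carries a CRITICAL finding after the
    last round, otherwise 0."""
    return 2 if any(s >= CRITICAL_SEVERITY for s in final_severities) else 0


print(exit_code_for([2, 1, 2]))  # prints: 0
print(exit_code_for([2, 5, 1]))  # prints: 2
```

A non-zero code for unresolved critical findings lets a CI job fail on the tool's return status without parsing the JSON report.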

🤖 Generated with Claude Code

…tester

Implements a full Red Team / Blue Team adversarial pipeline for M&A contract stress-testing using claude-opus-4-6 with adaptive thinking and Pydantic structured outputs.

- warroom/models/schemas.py: Pydantic schemas for AttackVector, PlaintiffAnalysis, DefenseAnalysis, and SegmentReport with computed risk scores
- warroom/document/processor.py: PDF/TXT ingestion with section-header-aware segmentation, falling back to word-count chunking
- warroom/agents/plaintiff.py: Red Team agent that hunts ambiguity, indemnification gaps, liability exposure, and black-swan edge cases (severity 1-5)
- warroom/agents/defense.py: Blue Team agent that redrafts precisely to neutralise each attack vector while preserving business intent
- warroom/orchestrator.py: Drives the pipeline sequentially or in parallel via a thread pool; rich progress display
- warroom/report/generator.py: Terminal summary, JSON, and self-contained HTML report generation
- main.py: Typer CLI with --parallel, --html, --max-segments, and --words flags

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
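
The segmentation strategy named for warroom/document/processor.py (split on section headers, fall back to word-count chunks) could look something like the sketch below. The header pattern, the function names `segment` and `chunk_by_words`, and the 300-word default are all assumptions for illustration; the actual processor may recognise different headings and sizes.

```python
import re
from typing import List

# Hypothetical header pattern: "ARTICLE I", "Section 2.1", or "3." at the
# start of a line. A real contract parser would likely need more cases.
HEADER_RE = re.compile(
    r"^(?:ARTICLE\s+\w+|Section\s+\d+(?:\.\d+)*|\d+\.)\s",
    re.MULTILINE | re.IGNORECASE,
)


def chunk_by_words(text: str, words_per_chunk: int) -> List[str]:
    """Fallback: split the text into fixed-size runs of words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


def segment(text: str, words_per_chunk: int = 300) -> List[str]:
    """Split on section headers when at least two are found; otherwise
    fall back to word-count chunking."""
    starts = [m.start() for m in HEADER_RE.finditer(text)]
    if len(starts) < 2:
        return chunk_by_words(text, words_per_chunk)
    if starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble before the first header
    bounds = starts + [len(text)]
    pieces = (text[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    return [p for p in pieces if p]


contract = (
    "Section 1 Definitions\nTerms apply.\n"
    "Section 2 Indemnity\nSeller indemnifies Buyer.\n"
)
for part in segment(contract):
    print(repr(part))
```

Requiring at least two headers before trusting the header split is a design choice in this sketch, so that a stray numbered line does not suppress the fallback.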

Agents now automatically interact across multiple rounds: Plaintiff re-attacks the Defense's hardened clause each round, Defense re-patches, until convergence or max rounds is reached. No human intervention required between rounds.

Provider abstraction:
- warroom/providers/base.py: LLMProvider protocol + make_provider() factory
- warroom/providers/anthropic_p.py: Anthropic SDK (claude-opus-4-6, adaptive thinking)
- warroom/providers/ollama_p.py: Ollama via OpenAI-compatible endpoint (100% local, free)

Multi-round adversarial loop:
- warroom/loop/adversarial.py: Plaintiff attacks current clause → Defense hardens → Plaintiff re-attacks hardened clause → repeat until convergence or max_rounds
- Convergence detection: stops early when severity drops to threshold (default ≤2)
- Severity trajectory tracked per segment (e.g. 4 → 3 → 2 shows convergence)

Schema updates (warroom/models/schemas.py):
- AdversarialRound: captures one Red→Blue exchange
- IterativeSegmentReport: full round history, severity_trajectory, risk_reduction, initial_risk_score, converged property

CLI updates (main.py):
- --provider anthropic|ollama (default: anthropic)
- --model <name> (default varies by provider)
- --plaintiff-provider / --plaintiff-model (per-agent overrides for mixing providers)
- --defense-provider / --defense-model
- --ollama-url (default: http://localhost:11434/v1)
- --rounds (default: 3)
- --convergence (severity threshold, default: 2)

Report updates: terminal + JSON + HTML now show per-round breakdowns, severity trajectories, risk reduction stats, and convergence status.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
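
As a rough picture of the schema additions in this commit, the derived fields on `IterativeSegmentReport` could be computed from the round history as shown here. The PR uses Pydantic models; this sketch substitutes stdlib dataclasses so it runs without dependencies. The field names `round_number`, `max_severity`, `hardened_clause`, `segment_id`, and `convergence_threshold` are hypothetical, and treating the risk score as the round's maximum severity is an assumption, since the PR does not spell out how risk scores are computed.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AdversarialRound:
    """One Red -> Blue exchange (field names are illustrative)."""
    round_number: int
    max_severity: int       # highest severity the Plaintiff reported
    hardened_clause: str    # the Defense's rewrite for this round


@dataclass
class IterativeSegmentReport:
    segment_id: str
    rounds: List[AdversarialRound] = field(default_factory=list)
    convergence_threshold: int = 2

    @property
    def severity_trajectory(self) -> List[int]:
        return [r.max_severity for r in self.rounds]

    @property
    def initial_risk_score(self) -> int:
        # Assumption: risk score equals the first round's maximum severity.
        return self.rounds[0].max_severity if self.rounds else 0

    @property
    def risk_reduction(self) -> int:
        if not self.rounds:
            return 0
        return self.rounds[0].max_severity - self.rounds[-1].max_severity

    @property
    def converged(self) -> bool:
        return bool(self.rounds) and (
            self.rounds[-1].max_severity <= self.convergence_threshold
        )


report = IterativeSegmentReport(
    segment_id="seg-1",
    rounds=[
        AdversarialRound(1, 4, "draft v2"),
        AdversarialRound(2, 3, "draft v3"),
        AdversarialRound(3, 2, "draft v3"),
    ],
)
print(report.severity_trajectory, report.risk_reduction, report.converged)
# prints: [4, 3, 2] 2 True
```

Deriving the trajectory and convergence flag from the stored rounds, instead of storing them separately, keeps the report consistent with its own history.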

fmalik100 and others added 2 commits March 10, 2026 17:17

fmalik100 wants to merge 2 commits into docker:main from fmalik100:claude/strange-sinoussi
