An agentic software development lifecycle framework that maps AI agents to enterprise SDLC roles — so organizations can adopt agentic coding incrementally, with every persona accountable to a human.
Built on Deep Agents (runtime), Superpowers (TDD methodology), and the A2A Protocol (agent communication).
Enterprise engineering organizations want to adopt agentic coding but face three blockers:

- **No governance model.** Existing frameworks don't map to the roles and approval chains that regulated industries require. When an AI agent writes code, who approved the spec it implemented? Who reviewed the architecture? Who signed off on the test plan?
- **All-or-nothing adoption.** Most agentic tools assume full autonomy. Teams can't start with "agents draft, humans decide" and gradually increase autonomy as trust builds.
- **No observability.** When agents produce artifacts across a multi-step workflow, there's no structured trace showing what happened, what was approved, and where things went wrong.
Every agent maps to a traditional software development role. The human in that role owns the output.
| Persona | Human Owner | What it does |
|---|---|---|
| Product Manager | PM / Product Owner | Generates PRDs, prioritizes backlogs, writes user stories |
| Architect | Tech Lead | Produces technical specs, evaluates tech debt |
| Developer | Dev Team | Implements code via TDD with subagent-driven execution |
| QA | QA Lead | Designs tests, analyzes results, validates quality |
| Scrum Leader | Engineering Manager | Plans sprints, tracks workflows, orchestrates execution |
| Stakeholder Proxy | Product Owner | Simulates stakeholder feedback, generates executive updates |
A CTO can look at this system and see their org chart. That's the point.
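The persona-to-owner mapping above can be sketched as a simple registry. This is an illustrative sketch only; the class and function names here are hypothetical, not the actual superagents API:

```python
from dataclasses import dataclass

# Hypothetical persona registry mirroring the table above.
# Names and fields are illustrative, not the real superagents types.
@dataclass(frozen=True)
class Persona:
    name: str
    human_owner: str
    capabilities: tuple[str, ...]

REGISTRY = {
    "product_manager": Persona(
        "Product Manager", "PM / Product Owner",
        ("generate_prd", "prioritize_backlog", "write_user_stories")),
    "architect": Persona(
        "Architect", "Tech Lead",
        ("write_tech_spec", "evaluate_tech_debt")),
    "developer": Persona("Developer", "Dev Team", ("implement_tdd",)),
    "qa": Persona("QA", "QA Lead", ("design_tests", "validate_quality")),
}

def owner_of(persona_key: str) -> str:
    """Every artifact is accountable to the human owner of its persona."""
    return REGISTRY[persona_key].human_owner
```

The point of the indirection: any artifact a persona emits can be traced back to one accountable human.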
Teams configure how much agents do versus how much humans do:
Level 1 — Assist. Agents draft every artifact. Humans review and approve each one before the next phase begins. Implementation is human-led with agent pair programming.
Level 2 — Hybrid. Agents own planning artifacts autonomously. Humans approve at phase boundaries (e.g., approve the story batch, then agents TDD each story). Human code review at story completion.
Level 3 — Autonomous. Agents execute the full workflow. Humans approve at epic/sprint boundaries. Automated quality gates (test pass rate, review scores) determine proceed/block.
The policy layer intercepts every agent-to-agent handoff and enforces the configured level. Moving from Level 1 to Level 2 is a config change, not a rewrite.
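A minimal sketch of that gate check, assuming handoffs are identified by `from->to` strings. This is illustrative only, not the actual PolicyEngine API:

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    ASSIST = 1      # humans approve every artifact
    HYBRID = 2      # humans approve at phase boundaries
    AUTONOMOUS = 3  # automated quality gates; humans approve at epic/sprint boundaries

# Hypothetical set of handoffs that count as phase boundaries.
PHASE_BOUNDARY_HANDOFFS = {"product_manager->architect", "architect->developer"}

def requires_human_approval(level: AutonomyLevel, handoff: str) -> bool:
    """Intercept a persona-to-persona handoff and decide if a human gate applies."""
    if level == AutonomyLevel.ASSIST:
        return True                      # every handoff is gated
    if level == AutonomyLevel.HYBRID:
        return handoff in PHASE_BOUNDARY_HANDOFFS
    return False                         # Level 3: quality gates decide, not humans
```

Because the decision lives in one function of the policy layer, raising the autonomy level is a single configuration change rather than a rewrite of any persona.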
Every persona action, skill execution, handoff, and approval gate decision emits OpenTelemetry spans with structured attributes. This isn't bolted on — it's the first code in the project.
trace: sdlc_workflow
└── persona.product_manager
    ├── skill.prd_generator
    ├── approval_gate.prd_review
    │   ├── approval.outcome = approved
    │   └── gate_duration_ms = 1200
    └── handoff.product_manager_to_architect
        └── artifact.type = prd
This gives teams the data to justify moving from Level 1 to Level 2: "Here's the approval rejection rate. Here's the defect rate. Here's the time savings."
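The span shape above can be mimicked with a small stand-in emitter. In the real project these would be OpenTelemetry spans; the dict-based recorder below is only a sketch of the same structured-attribute schema:

```python
import time
from contextlib import contextmanager

# Stand-in span recorder (illustrative, not the project's otel instrumentation).
# Spans are appended on exit, so children land before their parent.
SPANS: list[dict] = []

@contextmanager
def span(name: str, **attributes):
    record = {"name": name, "attributes": attributes}
    start = time.monotonic()
    try:
        yield record
    finally:
        record["duration_ms"] = round((time.monotonic() - start) * 1000, 1)
        SPANS.append(record)

# Re-create the trace sketched above.
with span("persona.product_manager"):
    with span("approval_gate.prd_review", outcome="approved"):
        pass
    with span("handoff.product_manager_to_architect", artifact_type="prd"):
        pass
```

Once every gate decision carries attributes like `approval.outcome`, aggregating rejection rates per persona is a query over the span store, not a log-grepping exercise.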
All implementation follows the Superpowers methodology — not as a suggestion, but as a mandatory workflow:
- Brainstorm before writing code.
- Plan in small tasks (2–5 minutes each).
- RED-GREEN-REFACTOR — write the failing test first. Always.
- Subagent review — two-stage (spec compliance, then code quality).
- Finish cleanly — verify tests pass, present merge options.
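The RED-GREEN step, in miniature. The feature and function names here are hypothetical, purely to show the ordering the methodology mandates:

```python
from datetime import date, timedelta

# RED: write the failing test first. At this point next_due does not exist,
# so running the test fails. (Hypothetical feature, for illustration only.)
def test_recurrence_next_due():
    assert next_due("2024-01-01", every_days=7) == "2024-01-08"

# GREEN: the minimal implementation that makes the test pass.
def next_due(start: str, every_days: int) -> str:
    return (date.fromisoformat(start) + timedelta(days=every_days)).isoformat()

# REFACTOR: with the test green, clean up safely (nothing to do here).
test_recurrence_next_due()  # passes
```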
Personas communicate via the Agent2Agent Protocol — the open standard for agent interoperability. Each persona publishes an Agent Card describing its capabilities. Handoffs carry typed metadata (artifact path, context summary, trace ID). This means personas can eventually run as independent services, not just in-process — enabling multi-team, multi-framework agent collaboration.
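The typed handoff metadata might look like the following. Field names are a sketch inferred from the description above, not the exact A2A Protocol schema:

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical handoff payload (illustrative field names, not the A2A spec).
@dataclass(frozen=True)
class Handoff:
    from_persona: str
    to_persona: str
    artifact_path: str
    artifact_type: str
    context_summary: str
    trace_id: str

    def to_json(self) -> str:
        # A plain serializable payload is what lets personas later run as
        # independent services instead of in-process objects.
        return json.dumps(asdict(self))

h = Handoff("product_manager", "architect", "output/prd.md", "prd",
            "PRD for the recurring-tasks feature", "trace-123")
payload = json.loads(h.to_json())
```

Carrying the `trace_id` in every handoff is what stitches cross-persona work into one observable workflow trace.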
┌─────────────────────────────────────────────────────────────┐
│ Autonomy Policy Layer │
│ (Level 1: assist → Level 2: hybrid → Level 3: auto) │
├─────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌───────────┐ │
│ │ PM │ │ Architect│ │ Dev │ │ QA │ │
│ │ Persona │ │ Persona │ │ Persona │ │ Persona │ │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └─────┬─────┘ │
│ │ │ │ │ │
│ ┌────▼─────┐ ┌────▼─────┐ ┌────▼─────┐ ┌─────▼─────┐ │
│ │ PM Skills│ │ Arch │ │Superpwr │ │ Layered │ │
│ │ │ │ Skills │ │TDD Cycle │ │ Testing │ │
│ │ │ │ │ │ │ │ │ │
│ │ │ │ │ │ │ │ │ │
│ └──────────┘ └──────────┘ └──────────┘ └───────────┘ │
│ │
├─────────────────────────────────────────────────────────────┤
│ A2A Protocol (handoffs + discovery) │
├─────────────────────────────────────────────────────────────┤
│ Deep Agents SDK (orchestration runtime) │
├─────────────────────────────────────────────────────────────┤
│ OpenTelemetry (traces, spans, metrics) │
└─────────────────────────────────────────────────────────────┘
superagents/
├── libs/
│ ├── superagents/ # SDK (extended from Deep Agents)
│ │ └── superagents/
│ │ └── telemetry/ # OpenTelemetry instrumentation
│ ├── sdlc/ # SDLC integration package
│ │ └── src/superagents_sdlc/
│ │ ├── brainstorm/ # LangGraph brainstorm subgraph (HITL)
│ │ ├── personas/ # PM, Architect, Developer, QA persona facades
│ │ ├── skills/ # PM, engineering, QA skills + LLM abstraction
│ │ ├── policy/ # Autonomy policy engine + approval gates
│ │ ├── handoffs/ # A2A-shaped handoff transport + registry
│ │ ├── workflows/ # Pipeline orchestrator + narrative writer
│ │ └── cli.py # Standalone CLI (superagents-sdlc command)
│ ├── cli/ # Terminal UI (Textual, Deep Agents)
│ ├── harbor/ # Evaluation/benchmark framework
│ └── partners/ # Integration packages
├── .github/ # CI/CD
└── CLAUDE.md # Development standards
Active development. The SDLC integration package (libs/sdlc) implements four core personas, eleven skills, an autonomy policy engine, A2A-shaped handoff system, pipeline orchestrator with automated QA retry, prompt caching, phased code plan generation, interactive CLI with human-in-the-loop, and a LangGraph brainstorm subgraph. 345 tests, all passing.
| Phase | What | Tests |
|---|---|---|
| 1 | OpenTelemetry instrumentation (spans for personas, skills, handoffs, approval gates) | 15 |
| 2 | BasePersona ABC, BaseSkill contract, PolicyEngine, A2A handoff system, PersonaRegistry | 41 |
| 3 | PM persona with LLM abstraction (PrdGenerator, PrioritizationEngine, UserStoryWriter) | 30 |
| 4 | Architect + Developer personas (TechSpecWriter, ImplementationPlanner, CodePlanner) | 43 |
| 5 | QA persona (SpecComplianceChecker, ValidationReportGenerator) | 30 |
| 6 | Executable plan format (plan parser, structured QA input, Superpowers format) | 9 |
| 7 | Pipeline orchestrator (PipelineOrchestrator with named workflow methods) | 17 |
| 8 | Standalone CLI + AnthropicLLMClient (argparse, --stub, --json, streaming, rate-limit retry) | 19 |
| 9 | QA feedback loop (FindingsRouter, automated single-retry pass with cascade) | 23 |
| 10 | Interactive CLI mode (human approval gates, free-text revision, NarrativeWriter) | 17 |
| 11 | LangGraph brainstorm subgraph (HITL interrupt/resume, 6-section design brief) | 29 |
| 12 | Brief-to-pipeline integration (--brief, --codebase-context flags) | 9 |
| 13 | Pipeline hardening (model assignment tuning, NEEDS WORK/FAILED calibration, rich skill summaries) | 10 |
| 14 | Prompt caching (Anthropic cache_control breakpoints, stable prefix across pipeline calls) | 5 |
| 15 | Phased code plan generation (per-phase LLM calls, phased revision of flagged phases only) | 8 |
- Harbor evaluation framework (automated artifact quality scoring)
- Deep Agents TUI integration
- Python 3.12+
- uv
git clone https://github.com/mhosner/superagents.git
cd superagents/libs/sdlc
# Install the SDLC package (editable, with test deps)
uv sync --group test
# Optional: install Anthropic SDK for real LLM calls
uv sync --group test --extra anthropic

# Brainstorm a feature interactively (builds a design brief)
uv run superagents-sdlc brainstorm "Add recurring tasks" \
--codebase-context ./CLAUDE.md \
--output-dir ./output
# Run the full pipeline with the brief
uv run superagents-sdlc idea-to-code "Add recurring tasks" \
--brief ./output/design_brief.md \
--output-dir ./output/pipeline -i
# Non-interactive pipeline (fire-and-forget)
uv run superagents-sdlc idea-to-code "Add dark mode" --output-dir ./output
# With stub responses (no API key needed, for testing)
uv run superagents-sdlc idea-to-code "Add dark mode" --output-dir ./output --stub

Four subcommands:
superagents-sdlc brainstorm <idea> --output-dir <dir> # Interactive brainstorm → design brief
superagents-sdlc idea-to-code <idea> --output-dir <dir> # Full: PM → Arch → Dev → QA
superagents-sdlc spec-from-prd <prd> --user-stories <s> --output-dir <dir> # Skip PM
superagents-sdlc plan-from-spec --plan <p> --spec <s> --output-dir <dir> # Skip PM + Arch

Key flags:
- `-i/--interactive` — Human approval gate after QA (approve/revise/quit)
- `--brief <path>` — Feed a design brief into the pipeline (from brainstorm)
- `--codebase-context <path>` — Codebase description for better artifacts
- `--stub` — Use canned responses instead of the Anthropic API
- `--json` — Dump PipelineResult as JSON to stdout
cd libs/sdlc
uv run --group test pytest tests/ # Run unit tests (no network)
uv run --group test ruff check src/ tests/ # Lint
uv run --group test ruff format src/ tests/ # Format

All development follows the Superpowers TDD methodology. See CLAUDE.md for standards.
Superagents wouldn't exist without these projects:
- Deep Agents — The SDK and agent harness this project is forked from.
- Superpowers by Jesse Vincent — The TDD-first agentic development methodology that is the engineering backbone of this project.
- BMAD Method — The persona-driven agile AI framework that inspired the SDLC role mapping and adoption gradient.
- MySecond.ai Skills — The PM skill definitions ported into the persona layer.
- A2A Protocol — The open standard for agent-to-agent communication.
MIT