Forge is a Rust workspace implementing a spec-first software factory stack. It orchestrates multi-stage AI workflows using directed graphs defined in Graphviz DOT syntax — nodes are tasks (LLM calls, human reviews, tool executions, conditional branches), and edges define the flow between them.
Upstream spec references:
- Attractor ecosystem source: https://github.com/strongdm/attractor
- Factory vision: https://factory.strongdm.ai/
- Rust toolchain (nightly 1.95.0+): `rustup install nightly`
- An LLM API key (for real execution): `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`
```sh
cargo build

# Dry-run with mock backend (no API key needed)
cargo run -p forge-cli -- run --dot-file examples/01-linear-foundation.dot --backend mock

# Real execution with an LLM backend
export OPENAI_API_KEY="sk-..."
cargo run -p forge-cli -- run --dot-file examples/01-linear-foundation.dot --backend agent
```

A Forge pipeline is a `.dot` file — a directed graph where:
- Nodes are tasks (LLM prompts, human gates, tool calls, parallel fan-outs)
- Edges are transitions with optional conditions and weights
- Attributes configure behavior (retries, goals, model selection)
The engine walks the graph from the start node to the exit node, executing each handler and selecting the next edge based on the outcome.
```dot
digraph HelloWorld {
  graph [goal="Write a hello world script"]
  start [shape=Mdiamond]
  write [shape=box, prompt="Write a hello world Python script for: $goal"]
  exit [shape=Msquare]
  start -> write -> exit
}
```

Save as `hello.dot` and run:
```sh
cargo run -p forge-cli -- run --dot-file hello.dot --backend mock
```

| Shape | Type | What It Does |
|---|---|---|
| `Mdiamond` | Start | Entry point (every pipeline needs exactly one) |
| `Msquare` | Exit | Exit point (every pipeline needs exactly one) |
| `box` | Codergen (LLM) | Calls an LLM with a prompt. Default for plain nodes. |
| `hexagon` | Wait.Human (HITL) | Pauses for human input — presents edge labels as choices |
| `diamond` | Conditional | Pass-through; the engine evaluates edge conditions |
| `component` | Parallel | Fans out to multiple branches concurrently |
| `tripleoctagon` | Fan-In | Consolidates parallel branch results |
| `parallelogram` | Tool | Executes a shell command |
| `house` | Manager Loop | Supervisor polling loop for child pipelines |
Edges can have conditions that control routing based on the previous node's outcome:
```dot
implement -> review [condition="outcome=success"]
implement -> retry [condition="outcome=fail"]
```

Conditions support `=`, `!=`, `&&`, and context variable lookups (`context.key=value`).
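A sketch combining these operators — the `context.priority` key is hypothetical, used only to illustrate the syntax:

```dot
triage -> hotfix  [condition="outcome=success && context.priority=high"]
triage -> backlog [condition="outcome=success && context.priority!=high"]
triage -> retry   [condition="outcome=fail"]
```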
Mark critical nodes with `goal_gate=true` — the pipeline won't exit successfully unless all goal gates have succeeded:
```dot
implement [shape=box, prompt="Write the code", goal_gate=true]
test [shape=box, prompt="Run tests", goal_gate=true]
```

If a goal gate hasn't succeeded when the engine reaches the exit node, it routes to `retry_target` instead.
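A sketch of a gated loop. Note that where `retry_target` is declared (here, on the gated nodes) is an assumption — consult `spec/03-attractor-spec.md` for the authoritative placement:

```dot
digraph GatedShip {
  graph [goal="Ship tested code"]
  start [shape=Mdiamond]
  // retry_target placement is an assumption, not confirmed by this README
  implement [shape=box, prompt="Write the code", goal_gate=true, retry_target=implement]
  test [shape=box, prompt="Run tests", goal_gate=true, retry_target=implement]
  exit [shape=Msquare]
  start -> implement -> test -> exit
}
```

If either gate is still unsatisfied when the engine reaches `exit`, it would route back to `implement` instead of finishing.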
Use hexagon nodes to pause for human decisions. The outgoing edge labels become the choices:
```dot
review [shape=hexagon, label="Review"]
review -> approve [label="[A] Approve"]
review -> revise [label="[R] Request Changes"]
```

Configure which LLM model handles which nodes using a CSS-like stylesheet:
```dot
graph [model_stylesheet="
  * { llm_model: gpt-4o; llm_provider: openai; }
  .critical { llm_model: claude-opus-4-6; llm_provider: anthropic; }
  #final_review { reasoning_effort: high; }
"]
```

Specificity order: `*` (universal) < shape name < `.class` < `#node_id`.
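For the `.critical` selector to match, a node needs a matching class. The `class` attribute shown here is an assumption based on the CSS analogy, not confirmed by this README:

```dot
// class="critical" is an assumed attribute name for attaching stylesheet classes
security_audit [shape=box, class="critical", prompt="Audit the patch for: $goal"]
final_review [shape=box, prompt="Final sign-off for: $goal"]
```

Under that assumption, `security_audit` would resolve to the `.critical` rule's model, while `final_review` would pick up `reasoning_effort: high` from the `#final_review` rule and the universal model otherwise.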
```sh
cargo run -p forge-cli -- run [OPTIONS]
```

Pipeline source (one required):

- `--dot-file <PATH>` — Path to a `.dot` file
- `--dot-source <STRING>` — Inline DOT source

Options:

- `--backend <MODE>` — Execution backend (default: `agent`)
  - `agent` — Forge agent with LLM provider from environment
  - `mock` — Dry-run mode, no LLM calls
  - `claude-code` — Claude Code CLI adapter
  - `codex-cli` — Codex CLI adapter
  - `gemini-cli` — Gemini CLI adapter
- `--interviewer <MODE>` — Human-in-the-loop mode (default: `auto`)
  - `auto` — Console if interactive terminal, else auto-approve
  - `console` — Always prompt in terminal
  - `queue` — Use pre-loaded answers from `--human-answer`
- `--human-answer <STRING>` — Pre-loaded answer (repeatable, for `queue` mode)
- `--run-id <ID>` — Custom run identifier
- `--logs-root <PATH>` — Root directory for artifacts
- `--event-json` — Output events as JSON lines
- `--no-stream-events` — Disable event streaming
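Putting the HITL flags together — a usage sketch; `review.dot` is a placeholder for any pipeline with a hexagon gate whose edge labels include `[A] Approve`:

```sh
# Non-interactive run: pre-load the human answer instead of prompting
cargo run -p forge-cli -- run \
  --dot-file review.dot \
  --backend mock \
  --interviewer queue \
  --human-answer "[A] Approve" \
  --event-json
```

`--human-answer` is repeatable, one per expected prompt; in `queue` mode the answers are presumably consumed in order.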
```sh
cargo run -p forge-cli -- resume --checkpoint <PATH> --dot-file <PATH> [OPTIONS]
```

Same options as `run`, plus:

- `--checkpoint <PATH>` — Path to checkpoint file (required)
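A run-then-resume sketch; the checkpoint path is a placeholder, since this README doesn't state where checkpoints are written (presumably under `--logs-root`):

```sh
cargo run -p forge-cli -- run --dot-file hello.dot --backend mock --run-id demo-1 --logs-root ./runs
# After an interruption, resume from the saved checkpoint (path is a placeholder):
cargo run -p forge-cli -- resume --checkpoint <PATH-to-checkpoint> --dot-file hello.dot --backend mock
```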
```sh
cargo run -p forge-cli -- inspect-checkpoint --checkpoint <PATH> [--json]
```

A simple plan → summarize flow:
```dot
digraph LinearFoundation {
  graph [goal="Generate a basic implementation plan"]
  start [shape=Mdiamond, label="Start"]
  plan [shape=box, prompt="Create a short implementation plan for: $goal"]
  summarize [shape=box, prompt="Summarize the plan in 3 bullet points"]
  exit [shape=Msquare, label="Exit"]
  start -> plan -> summarize -> exit
}
```

A pipeline with human approval and a revision loop:
```dot
digraph HitlReviewGate {
  graph [goal="Ship a safe patch with human review"]
  start [shape=Mdiamond]
  implement [shape=box, prompt="Draft the patch for: $goal"]
  review_gate [shape=hexagon, label="Review Outcome"]
  refine [shape=box, prompt="Apply reviewer feedback and improve the patch"]
  ship [shape=box, prompt="Prepare final release notes"]
  exit [shape=Msquare]
  start -> implement -> review_gate
  review_gate -> ship [label="[A] Approve"]
  review_gate -> refine [label="[R] Request Changes"]
  refine -> review_gate
  ship -> exit
}
```

Three investigation branches run in parallel, merge results, then a human picks the direction:
```dot
digraph ParallelTriageFanIn {
  graph [goal="Diagnose issue, compare options, and pick a direction"]
  start [shape=Mdiamond]
  triage [shape=component, label="Parallel Triage", join_policy="all_success"]
  investigate_logs [shape=box, prompt="Analyze logs and summarize likely root causes"]
  inspect_tests [shape=box, prompt="Summarize failing tests and breakpoints"]
  review_recent_changes [shape=box, prompt="Identify risky recent changes"]
  merge_findings [shape=tripleoctagon, label="Merge Findings"]
  decision_gate [shape=hexagon, label="Choose Direction"]
  fast_fix [shape=box, prompt="Create a minimal-risk fast fix plan"]
  deep_fix [shape=box, prompt="Create a full remediation plan"]
  exit [shape=Msquare]
  start -> triage
  triage -> investigate_logs
  triage -> inspect_tests
  triage -> review_recent_changes
  investigate_logs -> merge_findings
  inspect_tests -> merge_findings
  review_recent_changes -> merge_findings
  merge_findings -> decision_gate
  decision_gate -> fast_fix [label="[F] Fast Fix"]
  decision_gate -> deep_fix [label="[D] Deep Fix"]
  fast_fix -> exit
  deep_fix -> exit
}
```

See `examples/` for runnable versions of these pipelines.
| Crate | Path | Purpose |
|---|---|---|
| `forge-llm` | `crates/forge-llm` | Unified multi-provider LLM client (OpenAI + Anthropic) |
| `forge-agent` | `crates/forge-agent` | Coding agent loop (session state machine, tools, provider profiles) |
| `forge-attractor` | `crates/forge-attractor` | DOT pipeline parser, graph IR, execution engine, and node handlers |
| `forge-cli` | `crates/forge-cli` | CLI binary for running, resuming, and inspecting pipelines |
| `forge-cxdb-runtime` | `crates/forge-cxdb-runtime` | CXDB runtime integration (persistence contracts, typed store, fakes) |
| `forge-cxdb` | `crates/forge-cxdb` | Vendored CXDB client SDK (binary protocol, TLS, reconnect) |
Forge has three tiers of tests. All `#[ignore]`d tests fail hard if their prerequisites are missing — no silent skipping.
```sh
cargo test                              # 647 tests, all deterministic
cargo test -p forge-attractor --tests   # attractor integration tests
cargo test -p forge-llm                 # LLM client tests
cargo test -p forge-agent               # agent tests
cargo test -p forge-cli --tests         # CLI smoke tests
cargo test -p forge-cxdb-runtime        # CXDB runtime tests (fakes)
```

These need local services or CLIs installed, but NOT API keys.
```sh
# CXDB tests — need running CXDB server on :9009/:9010
cargo test -p forge-cxdb-runtime --test live -- --ignored   # 7 tests
cargo test -p cxdb --test integration -- --ignored          # 2 tests

# CLI agent e2e — need claude/codex/gemini CLIs (OAuth, no API keys)
cargo test -p forge-llm --test cli_agent_e2e -- --ignored   # 9 tests

# Full-stack e2e pipeline — DOT file → CLI agent → JSONL events → disk artifacts → CXDB
cargo test -p forge-cli --test e2e_pipeline -- --ignored    # 8 tests
```

The e2e pipeline tests exercise the complete stack end-to-end: parse a real DOT file,
run it through forge-cli with a real CLI agent backend (Claude Code, Codex, Gemini),
verify JSONL event streams, check disk artifacts (manifest.json, status.json,
prompt.md, response.md), and validate CXDB persistence records. Each test runs
in an isolated git sandbox in /tmp — no project files are touched.
```sh
cargo test -p forge-llm --test openai_live -- --ignored         # needs OPENAI_API_KEY
cargo test -p forge-llm --test anthropic_live -- --ignored      # needs ANTHROPIC_API_KEY
cargo test -p forge-agent --test openai_live -- --ignored       # needs OPENAI_API_KEY
cargo test -p forge-agent --test anthropic_live -- --ignored    # needs ANTHROPIC_API_KEY
cargo test -p forge-attractor --test live -- --ignored          # needs OPENAI_API_KEY
```

| Variable | Purpose |
|---|---|
| `OPENAI_API_KEY` | OpenAI provider authentication |
| `ANTHROPIC_API_KEY` | Anthropic provider authentication |
| `OPENAI_BASE_URL` | Override OpenAI API endpoint |
| `ANTHROPIC_BASE_URL` | Override Anthropic API endpoint |
| `FORGE_CXDB_PERSISTENCE` | CXDB persistence mode (`off` or `required`) |
| `FORGE_CXDB_BINARY_ADDR` | CXDB binary protocol address |
| `FORGE_CXDB_HTTP_BASE_URL` | CXDB HTTP base URL |
| `FORGE_CLAUDE_BIN` | Path to Claude Code CLI binary |
| `FORGE_CODEX_BIN` | Path to Codex CLI binary |
| `FORGE_GEMINI_BIN` | Path to Gemini CLI binary |
Source-of-truth specs live in `spec/`:

- `spec/00-vision.md` — Vision, principles, techniques
- `spec/01-unified-llm-spec.md` — Unified LLM client
- `spec/02-coding-agent-loop-spec.md` — Coding agent loop
- `spec/03-attractor-spec.md` — Attractor pipeline orchestration
- `spec/04-cxdb-integration-spec.md` — CXDB persistence integration
- `spec/05-factory-control-plane-spec.md` — Factory control-plane
- `spec/06-unified-agent-provider-spec.md` — Unified agent provider
See CONTRIBUTING.md and AGENTS.md for coding standards, test expectations, and spec-alignment requirements.