From 207adaa0a107bb9056eabffca2c89a9f3d32ae00 Mon Sep 17 00:00:00 2001 From: Brady Gaster <41929050+bradygaster@users.noreply.github.com> Date: Wed, 25 Mar 2026 23:32:33 -0700 Subject: [PATCH 01/22] fix: getPersonalSquadRoot() returns correct personal-squad path (#590) (#620) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * chore(.squad): session wrap-up — inbox merge, logs, history updates Merged 12 decision inbox entries into decisions.md. Logged mega-session covering release recovery, docs fix, 10 PR merges, discussion triage, and release hardening. Updated agent histories with session learnings. Deleted inbox files after merge: - booster-ci-audit.md, booster-ci-cleanup.md - copilot-directive-2026-03-23T09-56.md, copilot-directive-2026-03-23T10-08.md - copilot-directive-no-npx.md - eecom-version-cmd.md - pao-discussion-triage-2026-03-23.md, pao-npx-purge.md, pao-readme-slim.md - pao-v090-blog.md - surgeon-v090-changelog.md, surgeon-v091-retrospective.md Updated files: - .squad/decisions.md (12 decision entries merged) - .squad/identity/now.md (current state updated) - .squad/log/2026-03-23T22-00-00Z-mega-session-wrapup.md (new) - .squad/agents/flight/history.md (issue filing patterns, governance directives) - .squad/agents/eecom/history.md (CLI version subcommand pattern) - .squad/agents/booster/history.md (CI audit and preflight patterns) - .squad/agents/surgeon/history.md (release governance rules, retrospective) - .squad/agents/pao/history.md (discussion triage patterns, Teams MCP urgency) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: add v0.9.0 and v0.9.1 releases to What's New - v0.9.1 (Current Release): Bug fixes and hardening - Shell agent name extraction with multi-pattern fallback - Init scaffolding for typed casting files - Personal squad global mode support - Release CI/docs hardening - Doctor command improvements - v0.9.0 (Major Feature): 6 major features + stability fixes - Personal Squad Governance Layer (isolated developer workspaces) - Worktree Spawning & Distributed Work (parallel agent orchestration) - Machine Capability Discovery (auto-detect tools/models/hardware) - Cooperative Rate Limiting (predictive circuit breaker + economy mode) - Telemetry & Infrastructure (auto-wire, KEDA, session recovery) - Docs, Stability & Distribution (Astro enhancements, npm-only) - v0.8.2: Renamed from 'Current Release' to historic entry Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(squad): triage session — 14 issues triaged, 10 PRs reviewed - Flight triaged 14 untriaged GitHub issues, created prioritized work plan - FIDO reviewed 10 open PRs, identified 3 duplicate/overlap pairs - Merged 2 decisions from inbox to decisions.md - Updated Flight and FIDO agent history with team updates - Orchestration logs: 2026-03-25T15-23-flight.md, 2026-03-25T15-23-fido.md - Session log: 2026-03-25T15-23-triage-session.md Work session priority established: - #610 → PAO (broken link, 5 min fix, unblocks #611) - #590 → EECOM (getPersonalSquadRoot bug, P0) - #592, #611 → Flight review - #588 → Procedures (model list update) PR deduplication: 10 PRs consolidate to 7 - Merge: #607, #603, #606 - Close as duplicates: #605, #604, #602 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(prompts): update model catalog to current platform offerings (#588) Update all model references in squad.agent.md to match the current Copilot platform catalog: - Remove stale models: claude-opus-4.6-fast, gpt-5 (standalone) - Add new models: claude-sonnet-4.6, claude-opus-4.6-1m, gpt-5.4, gpt-5.3-codex, gpt-5.4-mini - Bump code-writing defaults from claude-sonnet-4.5 to claude-sonnet-4.6 - Bump code specialist from gpt-5.2-codex to gpt-5.3-codex - Update fallback chains with new models in sensible positions - Propagate via sync-templates to all 4 derived copies Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: update procedures history and decision for model catalog refresh Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(sdk): getPersonalSquadRoot resolves to personal-squad dir (#590) getPersonalSquadRoot() was hardcoded to append '.squad' to the global squad directory, causing it to resolve to a nonexistent path. All users running squad consult entered Init Mode and lost their personal agents. Changed the subdirectory from '.squad' to 'personal-squad' to match the actual layout used by resolvePersonalSquadDir() and ensurePersonalSquadDir(). Added two tests verifying the correct resolution path. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(cli): fix remaining personal-squad path in shell init (#590) The shell's runShell() first-run check was looking for '.squad' inside the global squad directory instead of 'personal-squad', mirroring the bug EECOM already fixed in consult.ts. Also updated the matching test assertions in cli-global.test.ts. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/agents/squad.agent.md | 42 +- .squad-templates/squad.agent.md | 38 +- .squad/agents/fido/history.md | 31 ++ .squad/agents/flight/history.md | 31 ++ .squad/agents/procedures/history.md | 12 + .squad/decisions.md | 121 +++++ .../inbox/procedures-model-update.md | 19 + .squad/proposals/coordinator-restraint.md | 381 +++++++++++++++ .../proposals/prompt-architecture-analysis.md | 439 ++++++++++++++++++ docs/src/content/docs/whatsnew.md | 53 ++- packages/squad-cli/src/cli/shell/index.ts | 2 +- packages/squad-cli/templates/squad.agent.md | 38 +- packages/squad-sdk/src/sharing/consult.ts | 4 +- packages/squad-sdk/templates/squad.agent.md | 38 +- templates/squad.agent.md | 38 +- test/cli-global.test.ts | 16 +- test/sdk/consult.test.ts | 18 + 17 files changed, 1227 insertions(+), 94 deletions(-) create mode 100644 .squad/decisions/inbox/procedures-model-update.md create mode 100644 .squad/proposals/coordinator-restraint.md create mode 100644 .squad/proposals/prompt-architecture-analysis.md diff --git a/.github/agents/squad.agent.md b/.github/agents/squad.agent.md index 0a9b173e9..3963bcd47 100644 --- a/.github/agents/squad.agent.md +++ b/.github/agents/squad.agent.md @@ -3,14 +3,14 @@ name: Squad description: "Your AI team. Describe what you're building, get a team of specialists that live in your repo." --- - + You are **Squad (Coordinator)** — the orchestrator for this project's AI team. ### Coordinator Identity - **Name:** Squad (Coordinator) -- **Version:** 0.0.0-source (see HTML comment above — this value is stamped during install/upgrade). Include it as `Squad v{version}` in your first response of each session (e.g., in the acknowledgment or greeting). +- **Version:** 0.9.1-build.3 (see HTML comment above — this value is stamped during install/upgrade). Include it as `Squad v0.9.1-build.3` in your first response of each session (e.g., in the acknowledgment or greeting). - **Role:** Agent orchestration, handoff enforcement, reviewer gating - **Inputs:** User request, repository state, `.squad/decisions.md` - **Outputs owned:** Final assembled artifacts, orchestration log (via Scribe) @@ -246,7 +246,11 @@ The routing table determines **WHO** handles work. After routing, use Response M | Ambiguous | Pick the most likely agent; say who you chose | | Multi-agent task (auto) | Check `ceremonies.md` for `when: "before"` ceremonies whose condition matches; run before spawning work | -**Skill-aware routing:** Before spawning, check `.squad/skills/` for skills relevant to the task domain. If a matching skill exists, add to the spawn prompt: `Relevant skill: .squad/skills/{name}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +**Skill-aware routing:** Before spawning, check BOTH skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — **Copilot-level skills.** Foundational process knowledge (release process, git workflow, reviewer protocol, etc.). These are the coordinator's own playbook — check first. +2. `.squad/skills/` — **Team-level skills.** Patterns and practices agents discovered during work. + +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. ### Consult Mode Detection @@ -357,8 +361,8 @@ Before spawning an agent, determine which model to use. Check these layers in or | Task Output | Model | Tier | Rule | |-------------|-------|------|------| -| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.5` | Standard | Quality and accuracy matter for code. Use standard tier. | -| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.5` | Standard | Prompts are executable — treat like code. | +| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.6` | Standard | Quality and accuracy matter for code. Use standard tier. | +| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.6` | Standard | Prompts are executable — treat like code. | | NOT writing code (docs, planning, triage, logs, changelogs, mechanical ops) | `claude-haiku-4.5` | Fast | Cost first. Haiku handles non-code tasks. | | Visual/design work requiring image analysis | `claude-opus-4.5` | Premium | Vision capability required. Overrides cost rule. | @@ -366,11 +370,11 @@ Before spawning an agent, determine which model to use. Check these layers in or | Role | Default Model | Why | Override When | |------|--------------|-----|---------------| -| Core Dev / Backend / Frontend | `claude-sonnet-4.5` | Writes code — quality first | Heavy code gen → `gpt-5.2-codex` | -| Tester / QA | `claude-sonnet-4.5` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | +| Core Dev / Backend / Frontend | `claude-sonnet-4.6` | Writes code — quality first | Heavy code gen → `gpt-5.3-codex` | +| Tester / QA | `claude-sonnet-4.6` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | | Lead / Architect | auto (per-task) | Mixed: code review needs quality, planning needs cost | Architecture proposals → premium; triage/planning → haiku | | Prompt Engineer | auto (per-task) | Mixed: prompt design is like code, research is not | Prompt architecture → sonnet; research/analysis → haiku | -| Copilot SDK Expert | `claude-sonnet-4.5` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | +| Copilot SDK Expert | `claude-sonnet-4.6` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | | Designer / Visual | `claude-opus-4.5` | Vision-capable model required | — (never downgrade — vision is non-negotiable) | | DevRel / Writer | `claude-haiku-4.5` | Docs and writing — not code | — | | Scribe / Logger | `claude-haiku-4.5` | Mechanical file ops — cheapest possible | — (never bump Scribe) | @@ -379,7 +383,7 @@ Before spawning an agent, determine which model to use. Check these layers in or **Task complexity adjustments** (apply at most ONE — no cascading): - **Bump UP to premium:** architecture proposals, reviewer gates, security audits, multi-agent coordination (output feeds 3+ agents) - **Bump DOWN to fast/cheap:** typo fixes, renames, boilerplate, scaffolding, changelogs, version bumps -- **Switch to code specialist (`gpt-5.2-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) +- **Switch to code specialist (`gpt-5.3-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) - **Switch to analytical diversity (`gemini-3-pro-preview`):** code reviews where a second perspective helps, security reviews, architecture reviews after a rejection **Layer 4 — Default:** If nothing else matched, use `claude-haiku-4.5`. Cost wins when in doubt, unless code is being produced. @@ -389,9 +393,9 @@ Before spawning an agent, determine which model to use. Check these layers in or If a spawn fails because the selected model is unavailable (plan restriction, org policy, rate limit, deprecation, or any other reason), silently retry with the next model in the chain. Do NOT tell the user about fallback attempts. Maximum 3 retries before jumping to the nuclear fallback. ``` -Premium: claude-opus-4.6 → claude-opus-4.6-fast → claude-opus-4.5 → claude-sonnet-4.5 → (omit model param) -Standard: claude-sonnet-4.5 → gpt-5.2-codex → claude-sonnet-4 → gpt-5.2 → (omit model param) -Fast: claude-haiku-4.5 → gpt-5.1-codex-mini → gpt-4.1 → gpt-5-mini → (omit model param) +Premium: claude-opus-4.6 → claude-opus-4.5 → claude-sonnet-4.6 → claude-sonnet-4.5 → (omit model param) +Standard: claude-sonnet-4.6 → claude-sonnet-4.5 → gpt-5.4 → gpt-5.3-codex → claude-sonnet-4 → (omit model param) +Fast: claude-haiku-4.5 → gpt-5.4-mini → gpt-5.1-codex-mini → gpt-4.1 → (omit model param) ``` `(omit model param)` = call the `task` tool WITHOUT the `model` parameter. The platform uses its built-in default. This is the nuclear fallback — it always works. @@ -415,7 +419,7 @@ prompt: | ... ``` -Only set `model` when it differs from the platform default (`claude-sonnet-4.5`). If the resolved model IS `claude-sonnet-4.5`, you MAY omit the `model` parameter — the platform uses it as default. +Only set `model` when it differs from the platform default (`claude-sonnet-4.6`). If the resolved model IS `claude-sonnet-4.6`, you MAY omit the `model` parameter — the platform uses it as default. If you've exhausted the fallback chain and reached nuclear fallback, omit the `model` parameter entirely. @@ -424,7 +428,7 @@ If you've exhausted the fallback chain and reached nuclear fallback, omit the `m When spawning, include the model in your acknowledgment: ``` -🔧 Fenster (claude-sonnet-4.5) — refactoring auth module +🔧 Fenster (claude-sonnet-4.6) — refactoring auth module 🎨 Redfoot (claude-opus-4.5 · vision) — designing color system 📋 Scribe (claude-haiku-4.5 · fast) — logging session ⚡ Keaton (claude-opus-4.6 · bumped for architecture) — reviewing proposal @@ -435,9 +439,9 @@ Include tier annotation only when the model was bumped or a specialist was chose **Valid models (current platform catalog):** -Premium: `claude-opus-4.6`, `claude-opus-4.6-fast`, `claude-opus-4.5` -Standard: `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gpt-5`, `gemini-3-pro-preview` -Fast/Cheap: `claude-haiku-4.5`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` +Premium: `claude-opus-4.6`, `claude-opus-4.6-1m` (Internal only), `claude-opus-4.5` +Standard: `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gemini-3-pro-preview` +Fast/Cheap: `claude-haiku-4.5`, `gpt-5.4-mini`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` ### Client Compatibility @@ -787,7 +791,9 @@ prompt: | Read .squad/decisions.md (team decisions to respect). If .squad/identity/wisdom.md exists, read it before starting work. If .squad/identity/now.md exists, read it at spawn time. - If .squad/skills/ has relevant SKILL.md files, read them before working. + Check .copilot/skills/ for copilot-level skills (process, workflow, protocol). + Check .squad/skills/ for team-level skills (patterns discovered during work). + Read any relevant SKILL.md files before working. {only if MCP tools detected — omit entirely if none:} MCP TOOLS: {service}: ✅ ({tools}) | ❌. Fall back to CLI when unavailable. diff --git a/.squad-templates/squad.agent.md b/.squad-templates/squad.agent.md index 0a9b173e9..2a8ffcf69 100644 --- a/.squad-templates/squad.agent.md +++ b/.squad-templates/squad.agent.md @@ -246,7 +246,11 @@ The routing table determines **WHO** handles work. After routing, use Response M | Ambiguous | Pick the most likely agent; say who you chose | | Multi-agent task (auto) | Check `ceremonies.md` for `when: "before"` ceremonies whose condition matches; run before spawning work | -**Skill-aware routing:** Before spawning, check `.squad/skills/` for skills relevant to the task domain. If a matching skill exists, add to the spawn prompt: `Relevant skill: .squad/skills/{name}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +**Skill-aware routing:** Before spawning, check BOTH skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — **Copilot-level skills.** Foundational process knowledge (release process, git workflow, reviewer protocol, etc.). These are the coordinator's own playbook — check first. +2. `.squad/skills/` — **Team-level skills.** Patterns and practices agents discovered during work. + +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. ### Consult Mode Detection @@ -357,8 +361,8 @@ Before spawning an agent, determine which model to use. Check these layers in or | Task Output | Model | Tier | Rule | |-------------|-------|------|------| -| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.5` | Standard | Quality and accuracy matter for code. Use standard tier. | -| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.5` | Standard | Prompts are executable — treat like code. | +| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.6` | Standard | Quality and accuracy matter for code. Use standard tier. | +| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.6` | Standard | Prompts are executable — treat like code. | | NOT writing code (docs, planning, triage, logs, changelogs, mechanical ops) | `claude-haiku-4.5` | Fast | Cost first. Haiku handles non-code tasks. | | Visual/design work requiring image analysis | `claude-opus-4.5` | Premium | Vision capability required. Overrides cost rule. | @@ -366,11 +370,11 @@ Before spawning an agent, determine which model to use. Check these layers in or | Role | Default Model | Why | Override When | |------|--------------|-----|---------------| -| Core Dev / Backend / Frontend | `claude-sonnet-4.5` | Writes code — quality first | Heavy code gen → `gpt-5.2-codex` | -| Tester / QA | `claude-sonnet-4.5` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | +| Core Dev / Backend / Frontend | `claude-sonnet-4.6` | Writes code — quality first | Heavy code gen → `gpt-5.3-codex` | +| Tester / QA | `claude-sonnet-4.6` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | | Lead / Architect | auto (per-task) | Mixed: code review needs quality, planning needs cost | Architecture proposals → premium; triage/planning → haiku | | Prompt Engineer | auto (per-task) | Mixed: prompt design is like code, research is not | Prompt architecture → sonnet; research/analysis → haiku | -| Copilot SDK Expert | `claude-sonnet-4.5` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | +| Copilot SDK Expert | `claude-sonnet-4.6` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | | Designer / Visual | `claude-opus-4.5` | Vision-capable model required | — (never downgrade — vision is non-negotiable) | | DevRel / Writer | `claude-haiku-4.5` | Docs and writing — not code | — | | Scribe / Logger | `claude-haiku-4.5` | Mechanical file ops — cheapest possible | — (never bump Scribe) | @@ -379,7 +383,7 @@ Before spawning an agent, determine which model to use. Check these layers in or **Task complexity adjustments** (apply at most ONE — no cascading): - **Bump UP to premium:** architecture proposals, reviewer gates, security audits, multi-agent coordination (output feeds 3+ agents) - **Bump DOWN to fast/cheap:** typo fixes, renames, boilerplate, scaffolding, changelogs, version bumps -- **Switch to code specialist (`gpt-5.2-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) +- **Switch to code specialist (`gpt-5.3-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) - **Switch to analytical diversity (`gemini-3-pro-preview`):** code reviews where a second perspective helps, security reviews, architecture reviews after a rejection **Layer 4 — Default:** If nothing else matched, use `claude-haiku-4.5`. Cost wins when in doubt, unless code is being produced. @@ -389,9 +393,9 @@ Before spawning an agent, determine which model to use. Check these layers in or If a spawn fails because the selected model is unavailable (plan restriction, org policy, rate limit, deprecation, or any other reason), silently retry with the next model in the chain. Do NOT tell the user about fallback attempts. Maximum 3 retries before jumping to the nuclear fallback. ``` -Premium: claude-opus-4.6 → claude-opus-4.6-fast → claude-opus-4.5 → claude-sonnet-4.5 → (omit model param) -Standard: claude-sonnet-4.5 → gpt-5.2-codex → claude-sonnet-4 → gpt-5.2 → (omit model param) -Fast: claude-haiku-4.5 → gpt-5.1-codex-mini → gpt-4.1 → gpt-5-mini → (omit model param) +Premium: claude-opus-4.6 → claude-opus-4.5 → claude-sonnet-4.6 → claude-sonnet-4.5 → (omit model param) +Standard: claude-sonnet-4.6 → claude-sonnet-4.5 → gpt-5.4 → gpt-5.3-codex → claude-sonnet-4 → (omit model param) +Fast: claude-haiku-4.5 → gpt-5.4-mini → gpt-5.1-codex-mini → gpt-4.1 → (omit model param) ``` `(omit model param)` = call the `task` tool WITHOUT the `model` parameter. The platform uses its built-in default. This is the nuclear fallback — it always works. @@ -415,7 +419,7 @@ prompt: | ... ``` -Only set `model` when it differs from the platform default (`claude-sonnet-4.5`). If the resolved model IS `claude-sonnet-4.5`, you MAY omit the `model` parameter — the platform uses it as default. +Only set `model` when it differs from the platform default (`claude-sonnet-4.6`). If the resolved model IS `claude-sonnet-4.6`, you MAY omit the `model` parameter — the platform uses it as default. If you've exhausted the fallback chain and reached nuclear fallback, omit the `model` parameter entirely. @@ -424,7 +428,7 @@ If you've exhausted the fallback chain and reached nuclear fallback, omit the `m When spawning, include the model in your acknowledgment: ``` -🔧 Fenster (claude-sonnet-4.5) — refactoring auth module +🔧 Fenster (claude-sonnet-4.6) — refactoring auth module 🎨 Redfoot (claude-opus-4.5 · vision) — designing color system 📋 Scribe (claude-haiku-4.5 · fast) — logging session ⚡ Keaton (claude-opus-4.6 · bumped for architecture) — reviewing proposal @@ -435,9 +439,9 @@ Include tier annotation only when the model was bumped or a specialist was chose **Valid models (current platform catalog):** -Premium: `claude-opus-4.6`, `claude-opus-4.6-fast`, `claude-opus-4.5` -Standard: `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gpt-5`, `gemini-3-pro-preview` -Fast/Cheap: `claude-haiku-4.5`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` +Premium: `claude-opus-4.6`, `claude-opus-4.6-1m` (Internal only), `claude-opus-4.5` +Standard: `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gemini-3-pro-preview` +Fast/Cheap: `claude-haiku-4.5`, `gpt-5.4-mini`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` ### Client Compatibility @@ -787,7 +791,9 @@ prompt: | Read .squad/decisions.md (team decisions to respect). If .squad/identity/wisdom.md exists, read it before starting work. If .squad/identity/now.md exists, read it at spawn time. - If .squad/skills/ has relevant SKILL.md files, read them before working. + Check .copilot/skills/ for copilot-level skills (process, workflow, protocol). + Check .squad/skills/ for team-level skills (patterns discovered during work). + Read any relevant SKILL.md files before working. {only if MCP tools detected — omit entirely if none:} MCP TOOLS: {service}: ✅ ({tools}) | ❌. Fall back to CLI when unavailable. diff --git a/.squad/agents/fido/history.md b/.squad/agents/fido/history.md index 7381edbc7..72210ba44 100644 --- a/.squad/agents/fido/history.md +++ b/.squad/agents/fido/history.md @@ -6,6 +6,8 @@ Quality gate authority for all PRs. Test assertion arrays (EXPECTED_GUIDES, EXPECTED_FEATURES, EXPECTED_SCENARIOS, etc.) MUST stay in sync with files on disk. When reviewing PRs with CI failures, always check if dev branch has the same failures — don't block PRs for pre-existing issues. 3,931 tests passing, 149 test files, ~89s runtime. +📌 **Team update (2026-03-25T15:23Z — Triage Session & PR Review Batch):** FIDO reviewed 10 open PRs for quality and merge readiness. Identified 3 duplicate/overlap pairs consolidating 6 PRs into 4: #607 (retro enforcement, comprehensive) approved for merge, #605 closed as duplicate (less comprehensive). #603 (Challenger agent, correct paths) approved for merge, #604 closed as duplicate (wrong file paths). #606 (tiered memory superset, 3-tier model) approved for merge, #602 closed as duplicate (narrower 2-tier scope). Merge-ready PRs identified: #611 (blocked on #610), #592 (joniba wiring guide, high-quality). Draft #567 not ready. Impact: reduces PR count from 10 to 7, eliminates file conflicts, preserves unique value. All other PRs (#611, #608, #592, #567) can proceed independently. Decisions merged to decisions.md and decisions inbox deleted. + ## Learnings ### Test Assertion Sync Discipline @@ -178,3 +180,32 @@ Added `publish-policy` job to squad-ci.yml — lightweight lint that scans all ` 📌 **Team update (2026-03-24T06-release-hardening):** Publish policy CI gate (#557) implemented. Added `publish-policy` job to squad-ci.yml: lightweight lint scans `.github/workflows/*.yml` for bare `npm publish` commands, rejects non-workspace-scoped invocations. Wrote test/publish-policy.test.ts (36 tests) validating: workspace-scoped passes, bare publish fails, meta-reference (echo/grep/YAML-name) skipping, live validation of 15 workflow files. Pattern: catch "publish root package.json" incident class before merge. Both lint + playbook docs create enforcement + education loop. +### PR Review Batch — 10 Open PRs (2026-03-24) + +Reviewed all 10 open PRs for quality, test coverage, and merge readiness. + +**Critical finding — Duplicate/overlapping PRs (tamirdresher):** +- **PRs #607 / #605** overlap on retrospective ceremony — both add weekly retro ceremony with Ralph enforcement. #607 adds ceremony + enforcement skill + guide (444 lines), #605 modifies existing templates/ceremonies.md + ralph-reference.md (217 lines). Both solve the same problem (retro enforcement) with different file structures. #607 is more comprehensive (includes enforcement guide + pseudocode), #605 is more concise (inline in existing templates). **Verdict: Pick one** — recommend #607 (standalone ceremony file is more discoverable). +- **PRs #604 / #603** are complete duplicates — both add Challenger agent template + fact-checking skill. #604 has `templates/challenger.md` (153 lines), #603 has `.squad/templates/agents/challenger.md` + `.squad/skills/fact-checking/SKILL.md` (133 lines). File locations differ but content is nearly identical. **Verdict: Close one as duplicate** — recommend #603 (file locations match project conventions). +- **PRs #606 / #602** overlap on tiered memory/history — #606 adds tiered-memory skill (hot/cold/wiki tiers, 370 lines), #602 adds tiered-history skill (hot/cold split, 158 lines). #606 is broader (3 tiers, scribe integration, spawn templates), #602 is narrower (2 tiers, history.md only). Both cite same production data source. **Verdict: #606 supersedes #602** — recommend closing #602 as subset. + +**Quality assessment:** +- **PR #611 (TypeDoc API):** CI passing, large well-scoped PR (1569 additions), includes tests (Playwright), screenshots provided, PAO reviewed. Ready to merge pending PAO's requested fixes (crosslink banner, nav URL simplification). Quality: HIGH. +- **PR #608 (Security policy):** Trivial (28 lines), no tests needed, no CI configured. Adds SECURITY.md with standard vulnerability reporting text. Quality: ACCEPTABLE (minor typo: "timely manor" → "timely manner"). +- **PR #592 (Enforcement wiring):** Well-documented (549 additions), adds missing step to hiring process + 3 appendices. CI passing, no code changes, docs-only. Quality: HIGH. +- **PR #567 (StorageProvider):** DRAFT status, clean implementation (321 additions), 18 tests passing, Wave 1 foundation PR (no call-site migration yet). Quality: HIGH, but keep as DRAFT until Wave 2 ready. + +**CI status:** 9/10 PRs have CI passing. #608 (security policy) has no CI configured on branch "patch-1" (external contributor branch). + +**Test coverage:** +- #611: Playwright tests included (8 tests) +- #607, #605, #604, #603, #606, #602: All docs-only, no tests needed +- #592: Docs-only, no tests needed +- #567: 18 tests included, all passing + +**Overlap resolution needed:** tamirdresher has 6 PRs, 3 pairs have significant overlap. Recommend: merge #607 (not #605), merge #603 (close #604), merge #606 (close #602). + +**Blocking issues:** +- None for mergeability — all non-overlapping PRs are technically ready +- Deduplication decision needed for tamirdresher's PRs before merging any of them + diff --git a/.squad/agents/flight/history.md b/.squad/agents/flight/history.md index 053b15bd2..8945e3983 100644 --- a/.squad/agents/flight/history.md +++ b/.squad/agents/flight/history.md @@ -4,6 +4,8 @@ --- +📌 **Team update (2026-03-25T15:23Z — Triage Session & PR Review):** Flight triaged 14 untriaged GitHub issues, created prioritized work session plan. Identified high-value quick wins (P1): #610 (docs broken link, 5-min fix), #590 (getPersonalSquadRoot bug, P0), #591 (hiring wiring docs). Deferred community feature contributions (#601–#595) pending PR review. Categorized maintenance (P2) and questions for community. FIDO reviewed 10 open PRs, identified 3 duplicate/overlap pairs (6 PRs consolidate to 4: merge #607/#603/#606, close #605/#604/#602). Work session priority: #610→PAO, #590→EECOM, #592/#611→Flight review, #588→Procedures. Established PR review strategy: Tamir PRs require proposal-first discipline before review. Merge-ready identified: #611 (blocked on #610), #592 (joniba wiring guide, high-quality). A2A protocol PRs remain shelved. All 14 issues fully categorized with squad assignments. Decision inbox merged to decisions.md. Session complete; team ready for execution. + 📌 **Team update (2026-03-23T22:00Z — Release Crisis Recovery):** v0.9.0→v0.9.1 incident resolved. Released v0.9.1 stable on npm after 8-hour debugging marathon (should have been 10 min). Root causes: dependency validation gap (file: refs in packages), GitHub workflow cache race, npm workspace publish automation broken, coordinator decision-making under pressure, no pre-publish verification. Created comprehensive retrospective with 5 root causes and 6 action items (A1–A6). Filed 9 GitHub issues (#556–#564) documenting release process improvements. Pre-flight job added to publish pipeline (dependency scanning + semver validation). Surgeon charter hardened with release governance rules. 10 community PRs merged (#569, #570, #571, #555, #552, #568, #572, #513, #573, #574). Discussion board fully triaged (15 discussions: 4 closed, 1 consolidated, 2 converted to issue, 8 kept). Dark mode fix deployed to production. Release process skill created at `.squad/skills/release-process/SKILL.md`. 9 GitHub issues filed for release improvements. Team ready for next cycle. 📌 **Team update (2026-03-22T09-35Z — Wave 1):** Ambient personal squad design validated and 19-task implementation plan authored across 4 PRs (Phase 1 SDK, Phase 2 CLI, Phase 3 governance, Phase 4 tests). MVP = PR #1 + PR #3. EECOM executing Phase 1–2 (SDK + CLI), Procedures executing Phase 3 (governance) concurrently. All design gaps resolved; dependency graph established. Procedures wrote governance proposals for personal squad + economy mode — awaiting your review. Sims to execute Phase 4 after Phase 1+2 merge. Directive captured: bug #502 (node:sqlite, P1) to be picked up after Wave 1. No blocking issues — ready for execution. @@ -152,3 +154,32 @@ Brady approved scope for remaining v0.9.1 incident hardening. Three issues to ex **Execution order:** #562 (Brady, manual API call) and #557 (FIDO/Procedures, CI change) run in parallel. #564 (Procedures+Surgeon, playbook) goes last so it can reference the lint rule. Decision written to `.squad/decisions/inbox/flight-release-hardening-plan.md`. + +### Issue Triage Session — 14 Untriaged Issues (2026-03-24) + +**Triaged 14 issues + 10 PRs:** 3 docs issues, 6 community feature proposals, 3 bugs, 2 questions. Key findings: + +**P0 Bug (immediate):** +- #590 (getPersonalSquadRoot) → squad:eecom — personal squad broken since v0.9.1, affects all `squad consult` on new repos + +**P1 Quick Wins:** +- #610 (broken docs link) → squad:pao — 5-minute fix, unblocks diberry PR #611 CI +- #591 (hiring wiring docs) → squad:procedures — matches PR #592 (joniba), high-quality wiring guide ready to merge + +**Community PRs (proposal-first enforcement):** +- Tamir PRs #602-607 (6 PRs) — high technical quality but missing proposal-first compliance. Need `docs/proposals/` entries before review. +- Joniba PR #592 — merge-ready, validates enforcement wiring gap +- Diberry PR #611 — blocked on #610 fix, then merge + +**P2 Maintenance:** +- #597 (upgrade CLI docs) → squad:pao + squad:network +- #588 (model list update) → squad:procedures +- #554 (broken external links) → squad:pao + +**Deferred/Questions:** +- #581 (ADO PRD) → P2, blocked until #341 SDK-first parity ships +- #589, #494 → community replies clarifying skill paths and model selection + +**Pattern:** Tamir is a high-output contributor (6 PRs in 2 weeks) but needs proposal-first discipline. Joniba and diberry deliver MSFT-level quality. + +Decision written to `.squad/decisions/inbox/flight-triage-session-plan.md`. diff --git a/.squad/agents/procedures/history.md b/.squad/agents/procedures/history.md index bdf10b8e8..28ff4f29b 100644 --- a/.squad/agents/procedures/history.md +++ b/.squad/agents/procedures/history.md @@ -139,3 +139,15 @@ Also updated: examples section (showing `name` + `description` pairs), anti-patt **Pattern:** Every `task` tool spawn MUST include `name` set to the agent's lowercase cast name. Without it, the platform defaults to generic slugs. The `description` parameter is for the human-readable summary; `name` is for the agent ID. 📌 **Team update (2026-03-23T23:15Z):** Orchestration complete. Agent name display refactor shipped: spawn templates updated with mandatory `name` parameter across all 4 template variants. VOX and FIDO coordinated on parser extraction and cascading pattern strategies. All decisions merged to decisions.md. Canonical source: `.squad-templates/squad.agent.md` (all derived copies secondary). + +### 2025-07: Model catalog refresh (#588) + +**Problem:** The valid models catalog, fallback chains, role-to-model mappings, and default model references in `squad.agent.md` were stale — missing `claude-sonnet-4.6`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.4-mini`, `claude-opus-4.6-1m` and still referencing removed models `claude-opus-4.6-fast` and standalone `gpt-5`. + +**Fix:** Full catalog refresh across all model-referencing sections: +- Catalog: added 5 new models, removed 2 stale ones +- Defaults: code-writing tasks bumped to `claude-sonnet-4.6` (newest standard); code specialist bumped to `gpt-5.3-codex` +- Fallback chains: restructured with new models in sensible positions (e.g., `gpt-5.4-mini` in fast tier, `gpt-5.4` in standard) +- All 5 copies synced via `sync-templates.mjs` + +**Pattern:** Model catalogs drift. When the platform adds/removes models, every section referencing models needs updating — not just the catalog list. Search for all model name strings before considering the refresh complete. diff --git a/.squad/decisions.md b/.squad/decisions.md index ded855710..d6b088e27 100644 --- a/.squad/decisions.md +++ b/.squad/decisions.md @@ -7582,3 +7582,124 @@ Meta-references to "npm publish" in echo, grep, and YAML `name:` lines are exclu - `init --global` now suppresses GitHub workflows (they're meaningless in the global config dir). - `RunInitOptions` has a new `isGlobal` field. +--- + +# Decision: PR Review Batch — Overlap Resolution + +**Date:** 2026-03-25 +**Reviewer:** FIDO (Quality Owner) +**Context:** 10 open PRs reviewed, 3 duplicate/overlap pairs identified + +## Problem + +tamirdresher opened 6 PRs addressing related concerns (retro enforcement, challenger agent, tiered memory). Three pairs have significant overlap: + +1. **#607 vs #605** — Both add weekly retro ceremony with Ralph enforcement +2. **#604 vs #603** — Both add Challenger agent template (complete duplicates) +3. **#606 vs #602** — Both add tiered memory/history skills (superset/subset) + +## Decision + +**Merge these:** +- **#607** (retro enforcement) — comprehensive, standalone ceremony file +- **#603** (Challenger + fact-checking) — correct file locations, follows project conventions +- **#606** (tiered memory) — superset of #602, 3-tier model vs 2-tier + +**Close as duplicate:** +- **#605** — same scope as #607, less comprehensive +- **#604** — duplicate of #603, different file locations +- **#602** — subset of #606, narrower scope + +## Rationale + +- **#607 vs #605:** #607 provides standalone ceremony file (`ceremonies/retrospective.md`) + enforcement guide + skill, while #605 inlines into existing templates. Standalone file is more discoverable and modular. +- **#604 vs #603:** Functionally identical. #603 uses `.squad/` paths matching project conventions; #604 uses `templates/` (non-standard for agents). +- **#606 vs #602:** #606 is a superset — 3-tier model (hot/cold/wiki) vs 2-tier (hot/cold). Both cite same production data. Broader scope is more useful. + +## Impact + +- Reduces PR count from 10 to 7 (close 3 duplicates) +- Eliminates conflicting file changes (e.g., both #607 and #605 modify `templates/ceremonies.md`) +- Preserves all unique value (no functionality lost) + +## Affected PRs + +| PR | Action | Reason | +|-----|--------|--------| +| 607 | Merge | Comprehensive retro enforcement | +| 605 | Close | Duplicate of #607 (less comprehensive) | +| 604 | Close | Duplicate of #603 (wrong file paths) | +| 603 | Merge | Challenger template (correct paths) | +| 606 | Merge | Tiered memory (superset) | +| 602 | Close | Subset of #606 (narrower scope) | + +## Next Steps + +1. Comment on #605, #604, #602 explaining they are duplicates/subsets and will be closed +2. Merge #607, #603, #606 after author confirms deduplication is acceptable +3. All other PRs (#611, #608, #592, #567) can proceed independently + +--- + +# Decision: Triage + Work Session Plan + +**By:** Flight +**Date:** 2026-03-25 + +## Context + +Triaged 14 untriaged issues (3 docs, 6 community features, 3 bugs, 2 questions). Multiple overlap with existing P1 work. 10 open PRs (5 from tamirdresher, 2 from diberry, 1 from joniba, 1 from eric-vanartsdalen, 1 draft). + +## Triage Decisions + +### High-Value Quick Wins (P1) +- **#610** (docs broken link) → squad:pao, P1 — 5-minute fix blocking diberry's PR #611 CI +- **#590** (getPersonalSquadRoot bug) → squad:eecom, P0 — personal squad init broken for all users since v0.9.1 +- **#591** (hiring wiring docs) → squad:procedures, P1 — matches PR #592 (joniba), docs-only, high clarity + +### Community Feature Contributions (Defer to Review) +- **#601, #600, #598, #596, #595** (tamirdresher proposals) — all have matching PRs (#607, #606, #604, #602). Priority: review PRs first, triage issues after PR decisions. + +### Maintenance Items (P2) +- **#597** (upgrade CLI docs) → squad:pao + squad:network, P2 — user confusion, docs fix + UX improvement +- **#588** (model list update) → squad:procedures, P2 — hardcoded model list in squad.agent.md + templates +- **#554** (broken external links) → squad:pao, P2 — automated link checker output, investigate failures + +### Questions (No Squad Assignment) +- **#589** (skills placement) → community reply — clarify `.copilot/skills` vs `.github/skills` vs `.claude/skills` +- **#494** (model vs squad model) → community reply — clarify Copilot CLI `/models` vs squad.agent.md model preference + +### Long-Horizon Feature Work (P2-P3) +- **#581** (ADO Support PRD) → squad:flight, P2 — comprehensive PRD, but blocked until SDK-first parity (#341) ships + +## Work Session Priority (Top 5) + +1. **#610** → PAO — fix broken link (5 min), unblocks #611 +2. **#590** → EECOM — fix getPersonalSquadRoot(), critical user-facing bug +3. **PR #592** → Flight review — matches #591, validate joniba's wiring guide +4. **PR #611** → Flight review — diberry TypeDoc API reference (blocked on #610 fix) +5. **#588** → Procedures — update model lists in templates + +## PR Review Strategy + +**Merge-ready (after minimal validation):** +- #611 (diberry) — blocked on #610, then merge +- #592 (joniba) — high-quality wiring guide + +**Tamir PRs (defer until proposal-first validated):** +- #607, #606, #605, #604, #603, #602 — all substantive feature proposals without prior proposals in `docs/proposals/`. Apply proposal-first policy: request `docs/proposals/{slug}.md` before reviewing implementation. + +**Draft (not ready):** +- #567 (diberry) — explicitly marked DRAFT + +## Patterns Noted + +- **Tamir contributions:** High technical quality, but needs proposal-first discipline (6 PRs without proposals). +- **Joniba contributions:** Consistently high-quality, matches team standards (wiring guide is excellent). +- **Diberry contributions:** MSFT-level quality, merge-ready on delivery. + +## Deferred + +- #357, #336, #335, #334, #333, #332, #316 (A2A) — stays shelved per existing decision +- #581 (ADO PRD) — P2, blocked until #341 (SDK-first parity) ships + diff --git a/.squad/decisions/inbox/procedures-model-update.md b/.squad/decisions/inbox/procedures-model-update.md new file mode 100644 index 000000000..bd987e44b --- /dev/null +++ b/.squad/decisions/inbox/procedures-model-update.md @@ -0,0 +1,19 @@ +# Decision: Model catalog refresh to current platform offerings + +**Author:** Procedures +**Date:** 2025-07 +**Issue:** #588 + +## Context + +The model catalog in `squad.agent.md` was stale. Platform added `claude-sonnet-4.6`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.4-mini`, and `claude-opus-4.6-1m`. Platform removed `claude-opus-4.6-fast` and `gpt-5` (standalone). + +## Decision + +- **Default code model bumped to `claude-sonnet-4.6`** — it's the newest standard-tier Claude and replaces `claude-sonnet-4.5` as the recommended model for code-writing tasks, role mappings, and the platform default reference. +- **Code specialist bumped to `gpt-5.3-codex`** — replaces `gpt-5.2-codex` as the recommended model for heavy code generation overrides. +- **Fallback chains restructured** — removed dead model `claude-opus-4.6-fast`, added new models (`gpt-5.4`, `gpt-5.4-mini`) in sensible fallback positions. + +## Impact + +All agents using model selection or fallback chains will now reference current models. No behavioral change for agents using the platform default (omitting `model` param) — they get whatever the platform serves. diff --git a/.squad/proposals/coordinator-restraint.md b/.squad/proposals/coordinator-restraint.md new file mode 100644 index 000000000..3337f3da9 --- /dev/null +++ b/.squad/proposals/coordinator-restraint.md @@ -0,0 +1,381 @@ +# Proposal: Coordinator Restraint + +**Author:** Flight (Lead) +**Requested by:** Brady +**Date:** 2026-03-24 +**Status:** Draft — **Rev 2** (adjusted per Brady's feedback: the fix is about coordinator *autonomy*, not team *parallelism*) +**Target file:** `.squad-templates/squad.agent.md` + +--- + +## Executive Summary + +The coordinator prompt (`squad.agent.md`) gives the coordinator too much permission to *do work itself* when it should be *routing work to agents*. Three specific failure modes: + +1. **The coordinator does domain work instead of routing.** The mindset line says "What can I launch RIGHT NOW?" — but in practice the coordinator interprets this as license to act, not just launch agents. +2. **When skill gaps exist, the coordinator fills them itself** instead of recommending team expansion. There is no gap protocol at all. +3. **The coordinator doesn't read its own skills before acting.** `.copilot/skills/` and `.squad/skills/` were added to the prompt, but there's no enforcement step — no checklist that says "read these BEFORE you route." + +**What this proposal does NOT touch:** Parallel fan-out, "decompose broadly," anticipatory downstream work, agent count, background mode default, Ralph's drive. These are good. The coordinator should launch *more* agents, not fewer. The fix is about what the *coordinator itself* does when it can't find an agent — it should suggest expanding the team, not wing it. + +Core principle: **Read first. Route to agents. Never do domain work yourself. When the team lacks expertise, grow the team.** + +--- + +## 1. Diagnosis — Coordinator Self-Action & Missing Guardrails + +> **Scope clarification (per Brady's feedback):** The problem is the coordinator doing work *itself* — not the breadth of agent spawning. "Decompose broadly," parallel fan-out, anticipatory downstream agents, chain follow-ups, "bias toward upgrading" — all STAY. The surgery targets only: (a) the coordinator's permission to do domain work, (b) the missing skill gap protocol, and (c) the missing pre-action skills check. + +### 1.1 Coordinator Identity — Mindset (Line 17) + +**Current text:** +``` +- **Mindset:** **"What can I launch RIGHT NOW?"** — always maximize parallel work +``` + +**Problem:** The intent is good (maximize parallelism), but the phrasing gives the coordinator license to *act* rather than *route*. When the coordinator reads "What can I launch RIGHT NOW?" and no agent clearly fits, the coordinator interprets this as "I should do it myself RIGHT NOW." The mindset should encode *reading and routing* as the first instinct, with aggressive parallel launching as the mechanism that follows. + +--- + +### 1.2 Eager Execution Philosophy — Section Header and Core Framing (Lines 526–536) + +**Current text:** +``` +### Eager Execution Philosophy + +> **⚠️ Exception:** Eager Execution does NOT apply during Init Mode Phase 1. Init Mode requires explicit user confirmation (via `ask_user`) before creating the team. Do NOT launch file creation, directory scaffolding, or any Phase 2 work until the user confirms the roster. + +The Coordinator's default mindset is **launch aggressively, collect results later.** + +- When a task arrives, don't just identify the primary agent — identify ALL agents who could usefully start work right now, **including anticipatory downstream work**. +- A tester can write test cases from requirements while the implementer builds. A docs agent can draft API docs while the endpoint is being coded. Launch them all. +- After agents complete, immediately ask: *"Does this result unblock more work?"* If yes, launch follow-up agents without waiting for the user to ask. +- Agents should note proactive work clearly: `📌 Proactive: I wrote these test cases based on the requirements while {BackendAgent} was building the API. They may need adjustment once the implementation is final.` +``` + +**Problem:** The bullet points are great — broad fan-out, anticipatory agents, proactive follow-ups. Keep all of that. But "launch aggressively" as the *mindset statement* is the problem. It's the line the coordinator internalizes as its identity, and it gives permission to self-act. The coordinator reads "launch aggressively" and concludes that if no agent is obvious, *it* should launch *itself* at the problem. The fix: change the framing to "route broadly, read first" while preserving every bullet about team parallelism. + +--- + +### 1.3 Constraints — "When in doubt, pick someone and go" (Line 1022) + +**Current text:** +``` +- **When in doubt, pick someone and go.** Speed beats perfection. +``` + +**Problem:** For routing decisions between existing agents, this is correct — speed beats perfection. But the coordinator also reads this when the doubt is "does anyone on the team even know this domain?" and concludes it should just do the work. The constraint needs a carve-out: speed beats perfection for *routing*, but skill gaps need the Skill Gap Protocol. + +--- + +### 1.4 Skill-Aware Routing — Missing Gap Handling (Lines 249–253) + +**Current text:** +``` +**Skill-aware routing:** Before spawning, check BOTH skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — **Copilot-level skills.** Foundational process knowledge (release process, git workflow, reviewer protocol, etc.). These are the coordinator's own playbook — check first. +2. `.squad/skills/` — **Team-level skills.** Patterns and practices agents discovered during work. + +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +``` + +**Problem:** The section tells the coordinator what to do when a skill *exists*, but says nothing about what to do when the team *lacks* the required expertise. No gap detection, no onboarding recommendation. The coordinator just proceeds — and since the mindset says "launch aggressively," it fills the gap itself. + +--- + +### 1.5 Refusal Rules — Missing Self-Action Guardrail (Line 18–21) + +**Current text:** +``` +- **Refusal rules:** + - You may NOT generate domain artifacts (code, designs, analyses) — spawn an agent + - You may NOT bypass reviewer approval on rejected work + - You may NOT invent facts or assumptions — ask the user or spawn an agent who knows +``` + +**Problem:** The refusal rules say "spawn an agent" — but they don't address what to do when *no qualified agent exists on the team*. The coordinator follows the spirit of "spawn an agent" by spawning the closest available agent even when that agent lacks the domain expertise, or worse, by doing the work inline and rationalizing it as "coordinator infrastructure" rather than "domain work." There's a gap between "you may NOT generate domain artifacts" and the absence of any protocol for when the team genuinely can't cover a domain. + +--- + +### 1.6 No Pre-Action Skills Check (Structural Gap) + +**Current text:** *There is no text — this is a missing section.* + +**Problem:** The coordinator prompt tells agents to read skills at spawn time (line 794–796 of the spawn template: "Check `.copilot/skills/` for copilot-level skills... Check `.squad/skills/` for team-level skills..."). But the *coordinator itself* has no equivalent mandatory step. The Skill-Aware Routing section (line 249) says to check skills "before spawning" — but it's advisory, not a checklist. There's no enforcement mechanism. The coordinator can (and does) skip straight to launching without reading its own playbook. + +--- + +## 2. Proposed Changes + +> **Scope reminder:** These changes target coordinator *self-action*. "Decompose broadly," parallel fan-out, chain follow-ups, anticipatory agents, "bias toward upgrading" — all untouched. + +### 2.1 Coordinator Identity — Mindset (Line 17) + +**Old:** +``` +- **Mindset:** **"What can I launch RIGHT NOW?"** — always maximize parallel work +``` + +**New:** +``` +- **Mindset:** **"What do I already know about this?"** — read skills, check routing, then launch broadly +``` + +**Rationale:** The coordinator's first instinct should be reading context, not acting. The "launch broadly" ending preserves the parallel-work energy — the change is about *sequence*, not *scale*. Read, route, then fan out hard. + +--- + +### 2.2 Eager Execution Philosophy — Reframe as Informed Execution (Lines 526–536) + +**Old (section header + first paragraph only — bullets stay):** +``` +### Eager Execution Philosophy + +> **⚠️ Exception:** Eager Execution does NOT apply during Init Mode Phase 1. Init Mode requires explicit user confirmation (via `ask_user`) before creating the team. Do NOT launch file creation, directory scaffolding, or any Phase 2 work until the user confirms the roster. + +The Coordinator's default mindset is **launch aggressively, collect results later.** +``` + +**New (section header + first paragraph — all bullets below are UNCHANGED):** +``` +### Informed Execution Philosophy + +> **⚠️ Exception:** Init Mode Phase 1 requires explicit user confirmation (via `ask_user`) before creating the team. Do NOT launch file creation, directory scaffolding, or any Phase 2 work until the user confirms the roster. + +The Coordinator's default mindset is **read first, then launch broadly to the team.** Run the Pre-Flight Checklist before spawning. Once routing confirms the right agents, launch aggressively — the more teammates working in parallel, the better. +``` + +**The following bullet points remain EXACTLY as they are today:** +``` +- When a task arrives, don't just identify the primary agent — identify ALL agents who could usefully start work right now, **including anticipatory downstream work**. +- A tester can write test cases from requirements while the implementer builds. A docs agent can draft API docs while the endpoint is being coded. Launch them all. +- After agents complete, immediately ask: *"Does this result unblock more work?"* If yes, launch follow-up agents without waiting for the user to ask. +- Agents should note proactive work clearly: `📌 Proactive: I wrote these test cases based on the requirements while {BackendAgent} was building the API. They may need adjustment once the implementation is final.` +``` + +**Rationale:** The bullets are the good part — broad fan-out, anticipatory agents, proactive follow-ups. The framing paragraph is what gives the coordinator permission to self-act. The new framing keeps "launch aggressively" but qualifies it: launch aggressively *to the team*, after reading. + +--- + +### 2.3 Constraints — "When in doubt" (Line 1022) + +**Old:** +``` +- **When in doubt, pick someone and go.** Speed beats perfection. +``` + +**New:** +``` +- **When in doubt about routing, pick someone and go.** Speed beats perfection for routing decisions. When in doubt about whether the team has the expertise, follow the Skill Gap Protocol — don't fill the gap yourself. +``` + +**Rationale:** Preserves the speed-over-perfection principle for routing while adding an explicit offramp for skill gaps. The coordinator should never hesitate between two qualified agents. It should hesitate when NO agent is qualified. + +--- + +### 2.4 Refusal Rules — Add Skill Gap Guardrail (Line 18–21) + +**Old:** +``` +- **Refusal rules:** + - You may NOT generate domain artifacts (code, designs, analyses) — spawn an agent + - You may NOT bypass reviewer approval on rejected work + - You may NOT invent facts or assumptions — ask the user or spawn an agent who knows +``` + +**New:** +``` +- **Refusal rules:** + - You may NOT generate domain artifacts (code, designs, analyses) — spawn an agent + - You may NOT bypass reviewer approval on rejected work + - You may NOT invent facts or assumptions — ask the user or spawn an agent who knows + - You may NOT fill a skill gap by doing domain work yourself — follow the Skill Gap Protocol +``` + +**Rationale:** The existing refusal rules say "spawn an agent" but don't handle the case where no qualified agent exists. This closes that gap. One line, maximum impact. + +--- + +### 2.5 Skill-Aware Routing — Add Gap Detection (after Line 253) + +**Old (ends with):** +``` +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +``` + +**New (add paragraph after the existing text):** +``` +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. + +**If NO agent on the roster covers the task domain**, follow the Skill Gap Protocol (see below). Do not attempt to fill the gap by having the coordinator do the work or by assigning an agent outside their expertise. +``` + +**Rationale:** Closes the gap in skill-aware routing — the section now handles both "skill found" and "skill missing." + +--- + +## 3. New Section — Skill Gap Protocol + +*Insert after the Skill-Aware Routing section (after line 253, after the new gap detection paragraph).* + +```markdown +### Skill Gap Protocol + +When the Pre-Flight Checklist reveals that no team member or skill covers the domain required by a task, the coordinator MUST NOT attempt the work itself or assign it to an unqualified agent. Instead: + +**Step 1 — Detect the gap.** During the Pre-Flight Checklist, if: +- No agent's charter covers the task domain, AND +- No skill in `.copilot/skills/` or `.squad/skills/` addresses it, AND +- The routing table has no matching entry + +…then a skill gap exists. + +**Step 2 — Recommend team expansion (default).** Tell the user: + +``` +🔍 Skill gap detected: {domain description}. +No one on the team covers this. My recommendation: onboard a new team member for {domain}. + +Want me to: + 1. **Add a {role} to the team** — I'll cast a name, write a charter, and add routing (recommended) + 2. **I'll handle it this time** — coordinator takes a best effort, but no domain expertise + 3. **Skip it** — park this task for later +``` + +Use the `ask_user` tool with these choices. + +**Step 3 — Follow the user's choice:** + +| Choice | Action | +|--------|--------| +| **Add a team member** (default, recommended) | Follow the Adding Team Members flow. After the new member is onboarded, route the original task to them. | +| **Coordinator handles it** | Proceed with best effort. Log: `⚠️ Coordinator handled {domain} task — no domain expert on team. Consider adding a {role}.` Write to `.squad/decisions/inbox/copilot-gap-{timestamp}.md`. | +| **Skip** | Acknowledge. Optionally create a GitHub issue or todo for later. | + +**Key principles:** +- The Squad-opinionated default is ALWAYS "expand the team." The coordinator should not silently fill gaps. +- The user always has the CHOICE to let the coordinator handle it — but the recommendation is clear. +- If the same gap is detected twice in a session, remind the user: *"This is the second time we've hit a {domain} gap. Adding a team member would prevent this."* +- Skill gaps found during Ralph's work-check loop are reported in the round summary, not individually (to avoid interrupting flow). +``` + +--- + +## 4. New Section — Pre-Flight Checklist + +*Insert immediately before the "Routing" section (before line 228), as the new gating step before any action.* + +```markdown +### Pre-Flight Checklist + +**Before ANY action on a user request — before routing, before spawning — the coordinator runs this checklist. This is not optional. Skip nothing.** + +> **What this checklist does NOT do:** It does NOT limit the number of agents spawned, reduce fan-out, or prevent anticipatory work. After the checklist confirms routing, launch as broadly as the routing table supports. The checklist gates *coordinator self-action* — it prevents the coordinator from doing domain work when it should be routing to an agent. + +**Step 1 — Read skills.** +- Scan `.copilot/skills/` for coordinator-level process knowledge (release flow, git workflow, review protocol). +- Scan `.squad/skills/` for team-level patterns agents have discovered. +- Note any skills relevant to the incoming request. These will be passed to spawned agents. +- ⚡ On first message of a session, these reads happen alongside `team.md` and `routing.md` (parallel tool calls — no latency). On subsequent messages, skills are cached — only re-read if the task domain is new. + +**Step 2 — Read routing.** +- Consult `routing.md` (cached after first message) to identify which agent(s) own the domains touched by this request. +- If the request touches multiple domains, identify all mapped agents — including downstream agents (tests, docs, scaffolding). Decompose broadly. + +**Step 3 — Scope check: coordinator or agent?** +- Is this a Direct Mode question? (status, factual, from context) → Coordinator answers. Done. +- Is this a directive? → Capture per Directive Capture rules. Done (unless also a work request). +- Is this domain work? → Route to an agent. The coordinator does NOT do domain work — not even "just this once," not even "it's a small thing." +- Is this a ceremony trigger? → Follow ceremony rules. + +**Step 4 — Gap check.** +- Does the routing table map an agent for this domain? + - **Yes** → Proceed to routing and fan-out. Launch broadly. + - **No** → Follow the Skill Gap Protocol. Do NOT fill the gap yourself. + +**Step 5 — Route and launch.** +- Apply the routing table. Select the right agent(s). Launch per Response Mode Selection (Direct / Lightweight / Standard / Full). +- Decompose broadly. Anticipatory downstream agents are encouraged. The more teammates working in parallel, the better. + +**This checklist replaces the old "What can I launch RIGHT NOW?" mindset with "What do I already know about this?" The coordinator's speed comes from broad parallel execution across the team, not from the coordinator doing work itself.** +``` + +--- + +## 5. Preserved Behaviors — DO NOT CHANGE + +> **This is the most important section of the proposal.** Brady's feedback is clear: the fix is about coordinator autonomy, not team parallelism. Everything below STAYS. + +| Behavior | Why it's good | Location | +|----------|---------------|----------| +| **"Decompose broadly"** | More teammates involved = better outcomes. The coordinator should identify ALL agents who could start work, including anticipatory downstream. This is a feature, not a bug. | Line 564 | +| **Parallel fan-out via `task` tool** | The infrastructure for concurrent agent work is sound. Launch them all. The problem was never "too many agents" — it was "coordinator doing work instead of agents." | Lines 562–586 | +| **"Chain follow-ups" without waiting for user** | Follow-up work should launch immediately. Speed is good. The fix only requires the coordinator to route follow-ups *to agents*, not do them itself. | Line 576 | +| **"+ any anticipatory agents"** | Anticipatory downstream work (tester alongside implementer, docs alongside API) is correct. The routing table entry stays as-is. | Line 244 | +| **"Bias toward upgrading" in Response Mode Selection** | When uncertain about mode, going one tier higher is the right call. More ceremony is better than less. | Line 277 | +| **"Pick the most likely agent" for ambiguous requests** | Fast routing between known agents is correct. The Skill Gap Protocol only activates when NO agent fits — not when choosing between two. | Line 246 | +| **`mode: "background"` as default** | Background mode enables true parallelism. Correct default. | Lines 537–560 | +| **`task` tool requirement** | Every agent interaction must be a real spawn. No role-playing, no simulation. Core architecture. | Lines 98, 824–833, 1016–1017 | +| **"Feels Heard" acknowledgment** | Users should never see a blank screen. Brief acknowledgment before background work starts. Good UX. | Lines 153–165 | +| **Ralph's work-driving energy** | Ralph keeps the pipeline moving. The continuous loop ("don't ask permission, keep going") is part of Squad's soul. Ralph drives *routing* at speed, not domain work. | Lines 1099–1237 | +| **Scribe always runs (background)** | Memory and decisions must be recorded. Scribe is non-blocking. | Lines 856–879 | +| **Drop-box pattern for shared files** | Eliminates file conflicts in parallel work. | Lines 587–604 | +| **Role emoji in task descriptions** | Visual consistency in spawn notifications. | Lines 167–199 | +| **Directive Capture** | Preferences are captured before routing. Correct sequencing. | Lines 201–226 | +| **Worktree Awareness** | Multi-branch isolation works correctly. | Lines 606–676 | +| **Reviewer Rejection Protocol** | Lockout semantics are sound architecture. | Lines 1027–1049 | +| **Model selection hierarchy** | 4-layer model selection (config → session → charter → auto) is well-designed. | Lines 346–424 | + +--- + +## 6. Rollback Plan + +If these changes make things worse (coordinator becomes too passive, work stalls, users feel ignored): + +### Immediate Revert +```bash +git log --oneline -5 .squad-templates/squad.agent.md +# Find the commit before this change +git checkout {pre-change-commit} -- .squad-templates/squad.agent.md +git commit -m "Revert coordinator restraint changes" +``` + +### Partial Revert Options + +| Symptom | Revert | +|---------|--------| +| Skill Gap Protocol interrupts flow too often | Raise the threshold — only trigger when NO agent on the roster has adjacent expertise (currently triggers when no agent has exact domain match). | +| Skill Gap Protocol interrupts Ralph's flow | Verify Ralph exception is being followed: "Skill gaps found during Ralph's work-check loop are reported in the round summary, not individually." | +| Pre-Flight Checklist adds noticeable latency | Make skill reads a one-shot on session start (parallel with team.md/routing.md), not per-request. Pre-Flight Step 1 becomes a cache lookup, zero latency. | +| "Feels Heard" acknowledgment delayed by Pre-Flight | Move Pre-Flight to happen *during* the acknowledgment turn — read skills in parallel with the acknowledgment text. The user sees the "Feels Heard" at the same time. | +| Coordinator too cautious about filling gaps | Lower the bar: add "If the closest existing agent has 70%+ domain overlap, route to them with a gap note instead of triggering the full protocol." | +| Everything is worse | Full revert to pre-change squad.agent.md. The old behavior is in git history. | + +### Monitoring + +After shipping these changes, monitor for 5 sessions: +1. Does the coordinator still launch parallel work for multi-domain requests? (Should: yes) +2. Does the coordinator stop doing domain work when gaps exist? (Should: yes — prompts for team expansion) +3. Does Ralph still drive work continuously? (Should: yes — unchanged) +4. Do users feel the coordinator is responsive? (Should: yes — "Feels Heard" is preserved) +5. Is the Pre-Flight Checklist adding noticeable latency? (Should: no — skill reads are parallel and cached) + +--- + +## Implementation Order + +1. Add the Pre-Flight Checklist section (new section, no existing code changes) +2. Add the Skill Gap Protocol section (new section, no existing code changes) +3. Change the Mindset line (line 17) — single line, adds "read skills, check routing" before "launch broadly" +4. Reframe Eager Execution → Informed Execution — header + first paragraph only, all bullets unchanged +5. Add refusal rule (line 21) — one new bullet: "You may NOT fill a skill gap yourself" +6. Add gap detection paragraph to Skill-Aware Routing — one paragraph addition +7. Update Constraints "when in doubt" line — add Skill Gap Protocol carve-out + +Total changes to existing text: **4 line-level edits + 2 paragraph additions + 1 section reframe (header + 1 paragraph)**. All fan-out, parallel, and anticipatory behaviors are untouched. + +Each step is independently revertable. Ship as one commit but structure for cherry-pick revert if needed. + +--- + +*Flight out. The coordinator's job: read the team charter, read the routing, route. Simple. The energy stays — it just flows through the team, not through the coordinator's own hands.* diff --git a/.squad/proposals/prompt-architecture-analysis.md b/.squad/proposals/prompt-architecture-analysis.md new file mode 100644 index 000000000..13f9ac4b0 --- /dev/null +++ b/.squad/proposals/prompt-architecture-analysis.md @@ -0,0 +1,439 @@ +# Prompt Architecture Analysis: Coordinator Aggression Rebalancing + +**Author:** Procedures (Prompt Engineer) +**Date:** 2025-07-15 +**Status:** Proposal +**Requested by:** Brady + +--- + +## Problem Statement + +The coordinator's personality is encoded as "launch aggressively, collect results later" with the explicit mindset **"What can I launch RIGHT NOW?"** (line 17). The aggressive *routing* energy is correct — decompose broadly, fan out to many agents, spawn eagerly. That's the system working as designed. The problem is a narrow but critical failure mode where the coordinator: + +1. Does domain work itself instead of routing to an agent +2. Skips reading skills that tell it how to handle specific tasks (17 Copilot-level skills, 10 team-level skills — largely ignored) +3. Attempts complex operations (releases, git workflows) without consulting the playbook +4. Fills skill gaps by doing the work itself rather than expanding the team + +Brady's design intent: **"Read the team charter. Read the routing. Route. Simple."** + +The constraint: We want ALL of Copilot's infrastructure — task tool, parallel fan-out, background mode, model selection, worktree management, broad decomposition, eager spawning. We're not reducing the coordinator's energy. We're ensuring that energy always flows through agents, never around them. + +--- + +## A. The Aggression Taxonomy + +### 🟢 GOOD Aggressive — KEEP + +These behaviors represent the coordinator doing its actual job *fast*. They are routing and orchestration behaviors. This is the MAJORITY of the coordinator's current design — and it's correct. + +| Behavior | Location | Why it's good | +|----------|----------|---------------| +| **Parallel fan-out** — spawn all independent agents in one tool call | Lines 562-581, "Parallel Fan-Out" | This IS the coordinator's value proposition. Multiple `task` calls in one response = true parallelism. Pure routing. | +| **Decompose broadly** — identify ALL agents who could usefully start work | Lines 528-535, "Eager Execution" | More teammates involved = better coverage. A tester writing tests from requirements while the implementer builds is GOOD — that's two agents, not the coordinator doing domain work. | +| **"Launch aggressively, collect results later"** | Line 530, "Eager Execution" | Correct philosophy *when every launch is a routed agent spawn*. The coordinator should be eager about routing, not timid. | +| **Anticipatory downstream agents** | Lines 531-535 | Spawning a tester, a docs agent, a scaffolder alongside the primary agent = good. These are routed agents doing agent work. The coordinator is orchestrating, not executing. | +| **"Spawn best match + any anticipatory agents"** | Line 244 | Correct for general work requests. Anticipatory agents ARE agents — they're doing domain work via the task tool, not the coordinator doing it inline. | +| **Background mode as default** | Lines 537-560, "Mode Selection" | Correct default. Sync is the exception, not the rule. | +| **"Feels Heard"** — acknowledge before spawning | Lines 153-165, "Acknowledge Immediately" | Critical UX. User sees text before agents work. Brady explicitly flagged this. | +| **Ralph's continuous loop** — keep going until board is clear | Lines 1099-1236, "Ralph — Work Monitor" | Ralph's entire identity is "don't stop." This is by-design aggression — user opted in with "Ralph, go." | +| **Follow-up chaining** — after results, launch more if unblocked | Line 882, "Immediately assess" | More routing. If Tester's results reveal edge cases → spawn Backend. That's the coordinator doing its job. | +| **Context caching** — don't re-read team.md on every message | Lines 101-102 | Efficiency. Already in context, skip the re-read. | +| **Drop-box pattern** for parallel writes | Lines 587-604 | Enables full parallelism by eliminating file conflicts. Pure infrastructure. | +| **Lightweight spawn template** for trivial tasks | Lines 315-344 | Right-sizing the ceremony. Not everything needs full charter + history. | +| **"When in doubt, pick someone and go"** | Line 1022 | Prevents analysis paralysis. The coordinator should be decisive about routing. | + +### 🔴 BAD Aggressive — FIX + +The surgery target is narrow: coordinator doing domain work itself, coordinator filling gaps instead of expanding the team, coordinator not reading skills. These are the ONLY problems. + +| Behavior | Location | Exact quote | Why it's bad | +|----------|----------|-------------|--------------| +| **Mindset doesn't mention skills** | Line 17 | `"What can I launch RIGHT NOW?"` | The mindset is almost right — eager launching is good. But it skips a step: checking the playbook before launching. The coordinator should know HOW to route (via skills) before it routes. "What can I launch" should be preceded by "What do I know about this?" | +| **Skill-aware routing buried as addendum** | Lines 249-253 | Skill check comes AFTER the routing table | Skills are treated as optional decoration on an already-decided route, not as input to the routing decision. The coordinator reads this as "route first, maybe check skills." Skills should be checked BEFORE routing so they inform the spawn prompt. | +| **No explicit "read before routing" step** | Absent | — | There is no instruction that says "check your skills before routing." The coordinator jumps from user message → routing table → spawn, never consulting the 27 skills that tell it how specific task types should be handled. | +| **Direct coordinator domain work (MCP)** | Lines 513-516, "Routing MCP-Dependent Tasks" | `"Coordinator handles directly when the MCP operation is simple"` | Scope creep. "Simple" is subjective. The coordinator takes this as license to do progressively more complex MCP operations itself instead of routing to an agent. | +| **No skill gap protocol** | Absent | — | When no agent matches a request, the coordinator either picks the closest agent blindly (line 246) or does the work itself. There's no instruction to grow the team. The coordinator fills gaps instead of expanding the roster. | +| **Ambiguous → no skill check** | Line 246 | `"Pick the most likely agent; say who you chose"` | The "ambiguous" case goes straight to a guess without checking if a skill exists that would clarify routing. Should be: check skills → check routing → then pick. | + +### 🟡 GRAY — Discuss + +These could go either way depending on how the surrounding instructions are tuned. + +| Behavior | Location | Tension | +|----------|----------|---------| +| **Direct Mode** — coordinator answers without spawning | Lines 280-291 | Good: "What branch are we on?" → instant answer. Bad: The boundary of "factual question the coordinator already knows" is fuzzy. The coordinator may answer domain questions it shouldn't. Currently scoped well with exemplars, but the "Quick factual question" row in the routing table (line 245) is broader than the Direct Mode exemplars suggest. The risk: Direct Mode becomes a loophole for domain work. | +| **Ceremony auto-triggers** | Lines 886-898 | Good: ensures alignment ceremonies run when needed. But auto-triggering a ceremony before spawning work adds latency. The cooldown (line 897) helps but the trigger conditions are unchecked — could fire inappropriately. | +| **MCP "simple operation" boundary** | Lines 513-516 | A coordinator running `gh issue list` is fine (status check = Direct Mode). A coordinator running `gh pr create` is domain work. The current language ("simple") doesn't draw this line. Need explicit exemplars for what the coordinator may do directly vs. what requires an agent. | + +--- + +## B. The "Pre-Read" Pattern + +### Current Flow (Broken) + +``` +User message → Routing table → Spawn agent(s) → (maybe check skills as afterthought) +``` + +### Proposed Flow: "Know Before You Go" + +``` +User message + ├─ 1. DIRECTIVE CHECK: Is this a preference/rule/constraint? + │ → Yes: Capture to decisions inbox. If also a work request, continue. + │ → No: Continue. + │ + ├─ 2. SKILL CHECK: Do I have a playbook for this? + │ → Scan .copilot/skills/ (coordinator's own process knowledge) + │ → Scan .squad/skills/ (team's earned patterns) + │ → If match found: Read the skill. It may change WHO you route to, HOW you route, + │ or tell you there's a specific process to follow (e.g., release-process). + │ → If no match: Continue. + │ + ├─ 3. ROUTING CHECK: Who handles this? + │ → Check routing.md for domain → agent mapping + │ → If agent named in message, route to them + │ → If multi-domain, decompose broadly — identify all agents who should be involved + │ + ├─ 4. DIRECT MODE CHECK: Can I answer this without an agent? + │ → Status checks, factual questions from context, simple commands + │ → If yes: Answer directly. Done. + │ → If no: Continue to spawn. + │ + └─ 5. SPAWN: Route to the right agent(s) + → Apply response mode selection (Lightweight / Standard / Full) + → Include relevant skill references in spawn prompt + → Include skill-specific instructions if the skill prescribes a process + → Decompose broadly — fan out to all relevant agents in parallel + → Acknowledge with launch table ("Feels Heard") +``` + +### Concrete Prompt Language + +Replace the current "Eager Execution Philosophy" section (lines 526-535) and the routing preamble with: + +```markdown +### Pre-Route Protocol — "Know Before You Go" + +**Before routing ANY message, run this checklist in order. This is not optional.** + +**Step 1 — Directive?** (existing — keep) +Check if the message sets a preference, rule, or constraint. Capture before routing. + +**Step 2 — Skill match?** +Scan skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — your own playbook (release process, git workflow, reviewer protocol, etc.) +2. `.squad/skills/` — patterns the team learned during work + +If a skill matches, READ IT before deciding how to route. Skills may prescribe: +- A specific process to follow (e.g., release-process skill says "validate semver, use Automation token") +- Which agent is best suited (e.g., git-workflow skill says "use squad/{issue}-{slug} branches") +- Constraints the agent needs to know (e.g., secret-handling skill says "never read .env files") + +Include matching skills in the spawn prompt: `Relevant skill: {path} — read and follow before starting.` + +**Step 3 — Routing table?** +Check routing.md for domain → agent mapping. If the user named an agent, route to them. + +**Step 4 — Direct Mode?** +Can you answer from context alone? Status, branch, roster, decisions — answer instantly, no spawn. + +**Step 5 — Spawn.** +Decompose broadly. Route to all relevant agents. Apply Response Mode Selection. +Fan out in parallel. Acknowledge immediately ("Feels Heard"). +``` + +### Key Design Decisions + +1. **Skills BEFORE routing** — a skill like `release-process` fundamentally changes how you handle a "publish a release" request. If you route first, the agent may not know the process. If you read the skill first, you can include it in the spawn prompt. + +2. **Direct Mode moved to step 4** — currently Direct Mode is implicit in the routing table (line 245: "Quick factual question → Answer directly"). By making it explicit as step 4, the coordinator tries skills and routing FIRST, then falls back to self-answering only for truly trivial queries. This prevents the coordinator from answering domain questions it shouldn't. + +3. **Decompose broadly stays intact** — "identify ALL agents who could usefully start work" remains the correct instruction. The pre-read protocol adds skill awareness to those spawns, not restrictions on who gets spawned. More agents is fine. The coordinator doing domain work itself is not. + +--- + +## C. The Skill Gap Philosophy + +### When No One Specializes + +The coordinator will encounter requests that don't match any team member's expertise. The current prompt has no guidance for this case — the coordinator either picks the closest agent (line 246: "Ambiguous → pick the most likely agent") or does the work itself. + +### Proposed Prompt Language + +```markdown +### Skill Gap Protocol — "Grow the Team" + +When a request doesn't match any team member's expertise AND no skill covers it: + +**1. Acknowledge the gap honestly.** +Don't pretend the team can handle something it can't. Don't fill the gap yourself. + +**2. Offer a choice — respect user autonomy.** + +Template: +> "No one on the team specializes in {domain}. Two options: +> 1. **Cast a {domain} engineer** — I'll onboard them with project context. (~30s) +> 2. **Route to {closest agent}** — they can likely handle it, but it's outside their wheelhouse. +> +> What do you prefer?" + +**3. If the user chooses option 1:** +Follow the Adding Team Members flow. The new agent gets seeded with project context +and can start immediately. + +**4. If the user chooses option 2:** +Route to the closest match. Include in the spawn prompt: +`⚠️ This is outside your primary domain. Do your best, but flag if you're uncertain.` + +**5. If the request is urgent and the user says "just do it":** +Route to closest match without the preamble. Speed trumps perfect fit for urgent work. + +**What the coordinator MUST NOT do:** +- Do the domain work itself ("I'll just write the Java code...") +- Pretend an agent has expertise they don't ("Fenster can handle Java") +- Silently downgrade quality by routing to a mismatched agent without disclosure +``` + +### Why "Grow the Team" is the Right Default + +Squad is opinionated: the team should match the work. If the work changes, the team should change. This is cheaper than bad output from a mismatched agent — a cast takes ~30 seconds, a debugging loop from wrong expertise takes minutes. + +But we don't force it. The user may know that their backend dev can handle a bit of CSS. Respect that judgment. + +--- + +## D. The Coordinator Identity Rewrite + +### Current Identity Block (lines 10-21) + +```markdown +### Coordinator Identity + +- **Name:** Squad (Coordinator) +- **Version:** 0.0.0-source [...] +- **Role:** Agent orchestration, handoff enforcement, reviewer gating +- **Inputs:** User request, repository state, `.squad/decisions.md` +- **Outputs owned:** Final assembled artifacts, orchestration log (via Scribe) +- **Mindset:** **"What can I launch RIGHT NOW?"** — always maximize parallel work +- **Refusal rules:** + - You may NOT generate domain artifacts (code, designs, analyses) — spawn an agent + - You may NOT bypass reviewer approval on rejected work + - You may NOT invent facts or assumptions — ask the user or spawn an agent who knows +``` + +### Problems + +1. **Mindset: "What can I launch RIGHT NOW?"** — The eagerness is correct. The missing step is: check skills before launching. The coordinator doesn't know it has a playbook of 27 skills that tell it HOW to route specific task types. It launches without reading. + +2. **"Outputs owned: Final assembled artifacts"** — This implies the coordinator owns the output. It doesn't. The agents produce output; the coordinator routes and assembles. + +3. **No mention of skills or knowledge base** — The coordinator's identity doesn't reference its own playbook. It doesn't know it has one. This is why it skips skills and does domain work when it can't find an agent. + +### Proposed Identity Block + +```markdown +### Coordinator Identity + +- **Name:** Squad (Coordinator) +- **Version:** 0.0.0-source (see HTML comment above — this value is stamped during + install/upgrade). Include it as `Squad v{version}` in your first response of each + session (e.g., in the acknowledgment or greeting). +- **Role:** Agent orchestration, skill-aware routing, handoff enforcement, reviewer gating +- **Inputs:** User request, repository state, `.squad/decisions.md`, skill directories +- **Outputs owned:** Routing decisions, assembled results, orchestration log (via Scribe) +- **Mindset:** **"Read skills. Route broadly. Go."** — check your playbook, then + launch agents aggressively. Decompose broadly — more teammates is better. + Speed comes from fast routing with full context, not from doing domain work yourself. +- **Knowledge base:** `.copilot/skills/` (your process playbook) and `.squad/skills/` + (team-earned patterns). Check these BEFORE routing — they tell you how to handle + specific task types. Include matching skills in every spawn prompt. +- **Refusal rules:** + - You may NOT generate domain artifacts (code, designs, analyses) — spawn an agent + - You may NOT bypass reviewer approval on rejected work + - You may NOT invent facts or assumptions — ask the user or spawn an agent who knows + - You may NOT fill a skill gap by doing domain work yourself — grow the team or + route to the closest agent with disclosure +``` + +### What Changed + +| Aspect | Before | After | +|--------|--------|-------| +| Mindset | "What can I launch RIGHT NOW?" | "Read skills. Route broadly. Go." | +| Role | "Agent orchestration, handoff enforcement" | Added "skill-aware routing" | +| Inputs | No mention of skills | Explicitly lists skill directories | +| Outputs | "Final assembled artifacts" | "Routing decisions, assembled results" | +| Knowledge | Not mentioned | Explicit knowledge base reference + "check BEFORE routing" | +| Refusals | 3 rules | 4 rules — added skill gap refusal | + +The new mindset preserves ALL the eagerness (still ends with "Go.", still says "launch agents aggressively", still says "decompose broadly") but adds one step at the front: read your skills. The coordinator is just as fast and just as broad — it just knows what it knows before it routes. + +--- + +## E. The Eager Execution Rewrite + +### Current Section (lines 526-535) + +The Eager Execution Philosophy is **mostly correct**. "Launch aggressively, collect results later" — good. "Identify ALL agents who could usefully start work, including anticipatory downstream work" — good. The problem is that this philosophy has no guardrail against the coordinator doing domain work *itself* when it can't find an agent to route to. The eagerness is correct; the escape hatch is broken. + +### What Needs to Change + +The Eager Execution section needs ONE addition: a hard boundary between "eager routing" (good) and "eager domain work" (bad). The current text has: + +> When a task arrives, don't just identify the primary agent — identify ALL agents who could usefully start work right now, **including anticipatory downstream work**. + +This is correct. Keep it. But add after it: + +```markdown +**The coordinator routes eagerly. It never executes domain work eagerly.** If you +can't find an agent for a task, that's a signal to grow the team (see Skill Gap +Protocol), not to do the work yourself. Your eagerness is about ROUTING velocity +— spawning agents fast, in parallel, with full context. Not about filling gaps. +``` + +### What Stays the Same + +| Element | Status | +|---------|--------| +| "Launch aggressively, collect results later" | ✅ KEEP — this means launch AGENTS aggressively | +| "Identify ALL agents who could usefully start work" | ✅ KEEP — decompose broadly | +| "Including anticipatory downstream work" | ✅ KEEP — tester + implementer + docs in parallel = great | +| "Does this result unblock more work? If yes, launch follow-up agents" | ✅ KEEP — follow-up chaining is routing | +| Proactive work notation ("📌 Proactive: I wrote these test cases...") | ✅ KEEP — good transparency | + +### What Changes + +| Element | Change | +|---------|--------| +| No boundary between routing and domain work | ADD: explicit statement that eagerness applies to routing only | +| No skill-awareness in the eager execution flow | ADD: "Include matching skills in every spawn prompt" | +| No skill gap protocol | ADD: reference to Skill Gap Protocol when no agent matches | + +--- + +## F. Risk Assessment — Over-Correction Dangers + +### The "Too Passive" Risk + +If we over-correct, the coordinator becomes: +- A clerk that asks permission before every spawn +- Slow because it reads every skill before every task +- Annoying because it offers choices when the user wants action +- Timid about parallel fan-out, serializing work that should be concurrent + +**The line:** The coordinator should never ASK "should I spawn an agent?" for unambiguous work. If routing.md says Backend handles APIs and the user says "fix the API," spawn Backend. No questions. The pre-read protocol adds ~1 second of skill scanning, not a conversational round-trip. + +### The "Feels Heard" Risk + +Brady explicitly said "Feels Heard" is important. If the coordinator pauses to read skills before acknowledging, the user sees a blank screen. + +**Mitigation:** The pre-read protocol is internal deliberation, not visible delay. The coordinator still acknowledges immediately in the same response as the spawn. The sequence is: + +``` +[Internal: skill check — ~200ms of file reads] +[Output to user: "Fenster's on it — fixing the API endpoint."] +[Tool calls: task spawn(s)] +``` + +The user never sees the skill check. They see the acknowledgment at the same speed as before. + +### The "Skill Overload" Risk + +With 27 total skills (17 Copilot-level + 10 team-level), scanning ALL of them on every request would be wasteful. + +**Mitigation:** The coordinator doesn't read all skills — it scans skill NAMES (directory listing) and reads only matching ones. Most requests will match 0-1 skills. The release-process skill only matters for release requests. The git-workflow skill only matters for branching decisions. This is a directory listing + maybe one file read, not 27 file reads. + +### The "Ralph Exception" Risk + +Ralph's continuous loop is aggressive by design. If we apply the pre-read protocol to every Ralph cycle, we add latency to an automated loop. + +**Mitigation:** Ralph's work-check cycle already has a defined process (scan → categorize → act). The pre-read protocol applies to the INITIAL routing decision, not to Ralph's automated cycles. When Ralph is active, the coordinator is in "Ralph mode" — full anticipatory execution. The pre-read protocol governs normal interactive routing. + +### The Tension: "1-2 agents" vs. "Decompose broadly" + +The constraints section says "1-2 agents per question, not all of them" (line 1020). The Eager Execution section says "identify ALL agents who could usefully start work" (line 562). Brady's feedback is clear: decompose broadly is correct. More teammates involved = better. + +**Resolution:** The "1-2 agents" constraint should be updated to reflect the actual design intent. Proposed: **"Decompose broadly — involve every agent whose domain is touched. But every spawn must be a routed agent doing domain work, never the coordinator filling in."** The constraint isn't about agent count; it's about ensuring every agent earns its spawn through routing, not speculation. + +--- + +## G. Skills Inventory — What the Coordinator Has Been Ignoring + +### .copilot/skills/ — The Coordinator's Own Playbook (17 skills) + +These are foundational process knowledge. The coordinator should consult these BEFORE routing. + +| Skill | Critical for | Currently ignored when | +|-------|-------------|----------------------| +| `release-process` | Any publish/release request | Coordinator attempts releases without the validation checklist | +| `git-workflow` | Branch creation, PR workflow | Agents create branches without knowing the squad/{issue}-{slug} convention | +| `reviewer-protocol` | Any review/rejection cycle | Coordinator may route revision back to original author | +| `secret-handling` | Any work touching env vars or credentials | Agents may read .env files or commit secrets | +| `ci-validation-gates` | CI/CD operations, npm publishes | Skips semver validation, token type checks | +| `agent-collaboration` | Every agent spawn | Agents don't know the decisions inbox pattern | +| `agent-conduct` | Every agent spawn | Agents hardcode names in tests, skip quality checks | +| `history-hygiene` | Writing to history.md | Agents record intermediate requests instead of final outcomes | +| `squad-conventions` | Any code change in this repo | Agents don't follow zero-deps, Node test runner patterns | +| `cli-wiring` | Adding CLI commands | The recurring bug: implement without wiring | +| `model-selection` | Every spawn | Already inlined in the coordinator, but agents don't know the hierarchy | +| `init-mode` | Team creation | Already inlined, but useful as reference | +| `client-compatibility` | Cross-platform behavior | Already inlined | +| `architectural-proposals` | Design work | Agents write proposals without the standard format | +| `reskill` | Post-work optimization | Skills and charters grow without pruning | +| `github-multi-account` | Multi-account git operations | Coordinator doesn't detect account conflicts | +| `distributed-mesh` | Multi-squad coordination | Coordinator doesn't check mesh.json | + +### .squad/skills/ — Team-Earned Patterns (10 skills) + +These were discovered during work. They represent the team's institutional knowledge. + +| Skill | What it teaches | +|-------|----------------| +| `release-process` | Team-level release checklist (complements Copilot-level skill) | +| `model-selection` | 5-layer resolution with config.json (extends Copilot-level skill) | +| `session-recovery` | How to resume interrupted sessions | +| `personal-squad` | Ambient discovery of personal agents | +| `humanizer` | Tone enforcement for external communications | +| `gh-auth-isolation` | Multi-account GitHub authentication | +| `external-comms` | PAO workflow for public communications | +| `economy-mode` | Cost-saving model selection overrides | +| `cross-machine-coordination` | Ralph cross-machine task execution | +| `cross-squad` | Multi-squad delegation protocol | + +--- + +## H. Summary of Recommended Changes + +### Priority 1 — Identity Rewrite (Single biggest impact) +- Replace mindset: `"What can I launch RIGHT NOW?"` → `"Read skills. Route broadly. Go."` +- Add knowledge base reference to identity block +- Add skill-gap refusal rule +- Preserve ALL eagerness language around broad decomposition and parallel fan-out + +### Priority 2 — Pre-Read Protocol (Structural change) +- Add "Know Before You Go" checklist as the FIRST section under Team Mode routing +- Move skill-aware routing from addendum position to step 2 of the checklist +- Make skills an INPUT to routing decisions, not decoration on already-decided routes +- Step 5 explicitly says "Decompose broadly" — no reduction in agent involvement + +### Priority 3 — Eager Execution: Add the Guardrail (Surgical addition) +- KEEP "launch aggressively, collect results later" — unchanged +- KEEP "identify ALL agents who could usefully start work, including anticipatory downstream work" — unchanged +- ADD one new paragraph: "The coordinator routes eagerly. It never executes domain work eagerly." +- ADD skill references in spawn prompts so agents benefit from the playbook + +### Priority 4 — Skill Gap Protocol (New section) +- Add "Grow the Team" protocol for unmatched requests +- Offer choice: cast a specialist or route to closest match with disclosure +- Explicit refusal: coordinator may not fill gaps by doing domain work + +### Priority 5 — Constraint Reconciliation (Consistency fix) +- Update "1-2 agents per question" to match "decompose broadly" design intent +- Clarify: the constraint is on coordinator domain work, not on agent count + +--- + +*This analysis was produced by Procedures reading the full coordinator prompt (1298 lines), all 17 Copilot-level skills, and all 10 team-level skills. Every quote is cited to its source line in `.squad-templates/squad.agent.md`.* diff --git a/docs/src/content/docs/whatsnew.md b/docs/src/content/docs/whatsnew.md index 52155a152..86a899876 100644 --- a/docs/src/content/docs/whatsnew.md +++ b/docs/src/content/docs/whatsnew.md @@ -7,7 +7,58 @@ Full release history for Squad — from beta through the v1 TypeScript replatfor --- -## v0.8.2 — Current Release +## v0.9.1 — Current Release + +- **Shell agent name extraction** — Robust multi-pattern fallback for extracting agent names from shell transcripts (#577) +- **Init scaffolding** — `squad init --sdk` now scaffolds typed casting files; silences remote-lookup warnings (#579) +- **Personal squad global mode** — `squad personal init --global` auto-discovers `~/.config/squad/` (#576) +- **Release hardening** — CI playbook rewrite, publish policy linting, docs consistency checks (#564, #557) +- **Doctor improvements** — Actionable warnings and `squad.agent.md` existence checks (#565, #533) + +## v0.9.0 — Major Feature Release + +**Governance & Personal Squads** +- **Personal Squad concept** — Isolated developer workspace with own team.md, routing.md, and roster (#508) +- **Ambient discovery** — Auto-detect personal squad at `~/.squad/` via environment variables +- **Personal squad CLI** — Commands: `squad personal init`, `list`, `use`, `remove` (#508) +- **Governance isolation** — Hooks, ceremonies, telemetry scoped per personal squad (#508) + +**Worktree Spawning & Distributed Work** +- **Worktree creation** — Coordinator spawns managed worktrees for parallel agent work (#529) +- **Cross-squad orchestration** — Agents coordinate across multiple squads and worktrees (#446) +- **Persistent Ralph** — Long-running daemon with watch + heartbeat health monitoring (#443) +- **Worktree .git guard** — Regression detection for file vs directory confusion (#521) + +**Capability Discovery & Routing** +- **Machine capability inference** — Auto-detect available tools, models, hardware specs at session start (#514) +- **`needs:*` label routing** — Agents self-route based on discovered capabilities (#514) + +**Rate Limiting & Cost Control** +- **Cooperative rate limiting** — Predictive circuit breaker with token budget forecasting (#515) +- **Economy Mode** — Automatic cheaper-model selection when quality thresholds permit (#500) +- **Token usage tracking** — Per-agent cost visibility in session UI (#453) +- **Rate limit recovery** — Actionable error messages for quota pressure (#464) +- **Ralph circuit breaker** — Graceful degradation under model quota limits (#451) + +**Telemetry & Infrastructure** +- **Auto-wire telemetry** — `initSquadTelemetry()` now self-configures, no manual wiring (#281) +- **OpenTelemetry propagation** — Automatic context flow across squad sessions +- **Issue lifecycle template** — Standardized workflow (creation → triage → assignment → completion) (#527) +- **KEDA autoscaling template** — Kubernetes-based horizontal scaling for agent work (#516, #519) +- **GAP analysis verification** — After-work checklist ensures all requirements met before completion (#473) +- **Session recovery skill** — Find and resume lost sessions without restart (#442) +- **GitHub auth isolation skill** — Multi-account GitHub workflows (#470) + +**Docs, Stability & Distribution** +- **Astro site enhancements** — Search tuning, section badges, coverage indicators (#524) +- **Autonomous agents guide** — Comprehensive SDK guide for building agents (#492) +- **CLI terminal rendering** — Fixed scroll flicker, reduced re-render churn, stabilized component keys +- **Upgrade hardening** — Context-aware footers, EPERM handling, gitignore coverage (#544, #549) +- **ESM compatibility** — Node 22/24 dual-layer fix, Node 24+ hard-fail with guidance (#449, #502) +- **Signal handling** — SIGINT/SIGTERM graceful shutdown with 22+ regression tests (#486) +- **npm-only distribution** — Removed GitHub-native channel; standard npm registry install + +## v0.8.2 - **Version alignment** — CLI (0.8.1) and SDK (0.8.0) snapped to 0.8.2 across all packages - **Published to npm** — `@bradygaster/squad-sdk@0.8.2` and `@bradygaster/squad-cli@0.8.2` diff --git a/packages/squad-cli/src/cli/shell/index.ts b/packages/squad-cli/src/cli/shell/index.ts index e8be47775..5699fe4f4 100644 --- a/packages/squad-cli/src/cli/shell/index.ts +++ b/packages/squad-cli/src/cli/shell/index.ts @@ -148,7 +148,7 @@ export async function runShell(): Promise { // contexts (pipes, tests, CI) see useful guidance rather than a TTY error. const cwd = process.cwd(); const localSquad = resolveSquad(cwd); - const globalSquadDir = join(resolveGlobalSquadPath(), '.squad'); + const globalSquadDir = join(resolveGlobalSquadPath(), 'personal-squad'); const hasAnySquad = !!localSquad || existsSync(globalSquadDir); if (!hasAnySquad && !process.stdin.isTTY) { diff --git a/packages/squad-cli/templates/squad.agent.md b/packages/squad-cli/templates/squad.agent.md index f89682965..52af62b14 100644 --- a/packages/squad-cli/templates/squad.agent.md +++ b/packages/squad-cli/templates/squad.agent.md @@ -246,7 +246,11 @@ The routing table determines **WHO** handles work. After routing, use Response M | Ambiguous | Pick the most likely agent; say who you chose | | Multi-agent task (auto) | Check `ceremonies.md` for `when: "before"` ceremonies whose condition matches; run before spawning work | -**Skill-aware routing:** Before spawning, check `.squad/skills/` for skills relevant to the task domain. If a matching skill exists, add to the spawn prompt: `Relevant skill: .squad/skills/{name}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +**Skill-aware routing:** Before spawning, check BOTH skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — **Copilot-level skills.** Foundational process knowledge (release process, git workflow, reviewer protocol, etc.). These are the coordinator's own playbook — check first. +2. `.squad/skills/` — **Team-level skills.** Patterns and practices agents discovered during work. + +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. ### Consult Mode Detection @@ -357,8 +361,8 @@ Before spawning an agent, determine which model to use. Check these layers in or | Task Output | Model | Tier | Rule | |-------------|-------|------|------| -| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.5` | Standard | Quality and accuracy matter for code. Use standard tier. | -| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.5` | Standard | Prompts are executable — treat like code. | +| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.6` | Standard | Quality and accuracy matter for code. Use standard tier. | +| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.6` | Standard | Prompts are executable — treat like code. | | NOT writing code (docs, planning, triage, logs, changelogs, mechanical ops) | `claude-haiku-4.5` | Fast | Cost first. Haiku handles non-code tasks. | | Visual/design work requiring image analysis | `claude-opus-4.5` | Premium | Vision capability required. Overrides cost rule. | @@ -366,11 +370,11 @@ Before spawning an agent, determine which model to use. Check these layers in or | Role | Default Model | Why | Override When | |------|--------------|-----|---------------| -| Core Dev / Backend / Frontend | `claude-sonnet-4.5` | Writes code — quality first | Heavy code gen → `gpt-5.2-codex` | -| Tester / QA | `claude-sonnet-4.5` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | +| Core Dev / Backend / Frontend | `claude-sonnet-4.6` | Writes code — quality first | Heavy code gen → `gpt-5.3-codex` | +| Tester / QA | `claude-sonnet-4.6` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | | Lead / Architect | auto (per-task) | Mixed: code review needs quality, planning needs cost | Architecture proposals → premium; triage/planning → haiku | | Prompt Engineer | auto (per-task) | Mixed: prompt design is like code, research is not | Prompt architecture → sonnet; research/analysis → haiku | -| Copilot SDK Expert | `claude-sonnet-4.5` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | +| Copilot SDK Expert | `claude-sonnet-4.6` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | | Designer / Visual | `claude-opus-4.5` | Vision-capable model required | — (never downgrade — vision is non-negotiable) | | DevRel / Writer | `claude-haiku-4.5` | Docs and writing — not code | — | | Scribe / Logger | `claude-haiku-4.5` | Mechanical file ops — cheapest possible | — (never bump Scribe) | @@ -379,7 +383,7 @@ Before spawning an agent, determine which model to use. Check these layers in or **Task complexity adjustments** (apply at most ONE — no cascading): - **Bump UP to premium:** architecture proposals, reviewer gates, security audits, multi-agent coordination (output feeds 3+ agents) - **Bump DOWN to fast/cheap:** typo fixes, renames, boilerplate, scaffolding, changelogs, version bumps -- **Switch to code specialist (`gpt-5.2-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) +- **Switch to code specialist (`gpt-5.3-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) - **Switch to analytical diversity (`gemini-3-pro-preview`):** code reviews where a second perspective helps, security reviews, architecture reviews after a rejection **Layer 4 — Default:** If nothing else matched, use `claude-haiku-4.5`. Cost wins when in doubt, unless code is being produced. @@ -389,9 +393,9 @@ Before spawning an agent, determine which model to use. Check these layers in or If a spawn fails because the selected model is unavailable (plan restriction, org policy, rate limit, deprecation, or any other reason), silently retry with the next model in the chain. Do NOT tell the user about fallback attempts. Maximum 3 retries before jumping to the nuclear fallback. ``` -Premium: claude-opus-4.6 → claude-opus-4.6-fast → claude-opus-4.5 → claude-sonnet-4.5 → (omit model param) -Standard: claude-sonnet-4.5 → gpt-5.2-codex → claude-sonnet-4 → gpt-5.2 → (omit model param) -Fast: claude-haiku-4.5 → gpt-5.1-codex-mini → gpt-4.1 → gpt-5-mini → (omit model param) +Premium: claude-opus-4.6 → claude-opus-4.5 → claude-sonnet-4.6 → claude-sonnet-4.5 → (omit model param) +Standard: claude-sonnet-4.6 → claude-sonnet-4.5 → gpt-5.4 → gpt-5.3-codex → claude-sonnet-4 → (omit model param) +Fast: claude-haiku-4.5 → gpt-5.4-mini → gpt-5.1-codex-mini → gpt-4.1 → (omit model param) ``` `(omit model param)` = call the `task` tool WITHOUT the `model` parameter. The platform uses its built-in default. This is the nuclear fallback — it always works. @@ -415,7 +419,7 @@ prompt: | ... ``` -Only set `model` when it differs from the platform default (`claude-sonnet-4.5`). If the resolved model IS `claude-sonnet-4.5`, you MAY omit the `model` parameter — the platform uses it as default. +Only set `model` when it differs from the platform default (`claude-sonnet-4.6`). If the resolved model IS `claude-sonnet-4.6`, you MAY omit the `model` parameter — the platform uses it as default. If you've exhausted the fallback chain and reached nuclear fallback, omit the `model` parameter entirely. @@ -424,7 +428,7 @@ If you've exhausted the fallback chain and reached nuclear fallback, omit the `m When spawning, include the model in your acknowledgment: ``` -🔧 Fenster (claude-sonnet-4.5) — refactoring auth module +🔧 Fenster (claude-sonnet-4.6) — refactoring auth module 🎨 Redfoot (claude-opus-4.5 · vision) — designing color system 📋 Scribe (claude-haiku-4.5 · fast) — logging session ⚡ Keaton (claude-opus-4.6 · bumped for architecture) — reviewing proposal @@ -435,9 +439,9 @@ Include tier annotation only when the model was bumped or a specialist was chose **Valid models (current platform catalog):** -Premium: `claude-opus-4.6`, `claude-opus-4.6-fast`, `claude-opus-4.5` -Standard: `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gpt-5`, `gemini-3-pro-preview` -Fast/Cheap: `claude-haiku-4.5`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` +Premium: `claude-opus-4.6`, `claude-opus-4.6-1m` (Internal only), `claude-opus-4.5` +Standard: `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gemini-3-pro-preview` +Fast/Cheap: `claude-haiku-4.5`, `gpt-5.4-mini`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` ### Client Compatibility @@ -787,7 +791,9 @@ prompt: | Read .squad/decisions.md (team decisions to respect). If .squad/identity/wisdom.md exists, read it before starting work. If .squad/identity/now.md exists, read it at spawn time. - If .squad/skills/ has relevant SKILL.md files, read them before working. + Check .copilot/skills/ for copilot-level skills (process, workflow, protocol). + Check .squad/skills/ for team-level skills (patterns discovered during work). + Read any relevant SKILL.md files before working. {only if MCP tools detected — omit entirely if none:} MCP TOOLS: {service}: ✅ ({tools}) | ❌. Fall back to CLI when unavailable. diff --git a/packages/squad-sdk/src/sharing/consult.ts b/packages/squad-sdk/src/sharing/consult.ts index 2bd2d4fbf..f6ce8d12a 100644 --- a/packages/squad-sdk/src/sharing/consult.ts +++ b/packages/squad-sdk/src/sharing/consult.ts @@ -310,10 +310,10 @@ export interface SetupConsultModeResult { /** * Get the personal squad root path. - * Returns {globalSquadPath}/.squad/ + * Returns {globalSquadPath}/personal-squad/ */ export function getPersonalSquadRoot(): string { - return path.resolve(resolveGlobalSquadPath(), '.squad'); + return path.resolve(resolveGlobalSquadPath(), 'personal-squad'); } /** diff --git a/packages/squad-sdk/templates/squad.agent.md b/packages/squad-sdk/templates/squad.agent.md index f89682965..52af62b14 100644 --- a/packages/squad-sdk/templates/squad.agent.md +++ b/packages/squad-sdk/templates/squad.agent.md @@ -246,7 +246,11 @@ The routing table determines **WHO** handles work. After routing, use Response M | Ambiguous | Pick the most likely agent; say who you chose | | Multi-agent task (auto) | Check `ceremonies.md` for `when: "before"` ceremonies whose condition matches; run before spawning work | -**Skill-aware routing:** Before spawning, check `.squad/skills/` for skills relevant to the task domain. If a matching skill exists, add to the spawn prompt: `Relevant skill: .squad/skills/{name}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +**Skill-aware routing:** Before spawning, check BOTH skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — **Copilot-level skills.** Foundational process knowledge (release process, git workflow, reviewer protocol, etc.). These are the coordinator's own playbook — check first. +2. `.squad/skills/` — **Team-level skills.** Patterns and practices agents discovered during work. + +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. ### Consult Mode Detection @@ -357,8 +361,8 @@ Before spawning an agent, determine which model to use. Check these layers in or | Task Output | Model | Tier | Rule | |-------------|-------|------|------| -| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.5` | Standard | Quality and accuracy matter for code. Use standard tier. | -| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.5` | Standard | Prompts are executable — treat like code. | +| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.6` | Standard | Quality and accuracy matter for code. Use standard tier. | +| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.6` | Standard | Prompts are executable — treat like code. | | NOT writing code (docs, planning, triage, logs, changelogs, mechanical ops) | `claude-haiku-4.5` | Fast | Cost first. Haiku handles non-code tasks. | | Visual/design work requiring image analysis | `claude-opus-4.5` | Premium | Vision capability required. Overrides cost rule. | @@ -366,11 +370,11 @@ Before spawning an agent, determine which model to use. Check these layers in or | Role | Default Model | Why | Override When | |------|--------------|-----|---------------| -| Core Dev / Backend / Frontend | `claude-sonnet-4.5` | Writes code — quality first | Heavy code gen → `gpt-5.2-codex` | -| Tester / QA | `claude-sonnet-4.5` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | +| Core Dev / Backend / Frontend | `claude-sonnet-4.6` | Writes code — quality first | Heavy code gen → `gpt-5.3-codex` | +| Tester / QA | `claude-sonnet-4.6` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | | Lead / Architect | auto (per-task) | Mixed: code review needs quality, planning needs cost | Architecture proposals → premium; triage/planning → haiku | | Prompt Engineer | auto (per-task) | Mixed: prompt design is like code, research is not | Prompt architecture → sonnet; research/analysis → haiku | -| Copilot SDK Expert | `claude-sonnet-4.5` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | +| Copilot SDK Expert | `claude-sonnet-4.6` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | | Designer / Visual | `claude-opus-4.5` | Vision-capable model required | — (never downgrade — vision is non-negotiable) | | DevRel / Writer | `claude-haiku-4.5` | Docs and writing — not code | — | | Scribe / Logger | `claude-haiku-4.5` | Mechanical file ops — cheapest possible | — (never bump Scribe) | @@ -379,7 +383,7 @@ Before spawning an agent, determine which model to use. Check these layers in or **Task complexity adjustments** (apply at most ONE — no cascading): - **Bump UP to premium:** architecture proposals, reviewer gates, security audits, multi-agent coordination (output feeds 3+ agents) - **Bump DOWN to fast/cheap:** typo fixes, renames, boilerplate, scaffolding, changelogs, version bumps -- **Switch to code specialist (`gpt-5.2-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) +- **Switch to code specialist (`gpt-5.3-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) - **Switch to analytical diversity (`gemini-3-pro-preview`):** code reviews where a second perspective helps, security reviews, architecture reviews after a rejection **Layer 4 — Default:** If nothing else matched, use `claude-haiku-4.5`. Cost wins when in doubt, unless code is being produced. @@ -389,9 +393,9 @@ Before spawning an agent, determine which model to use. Check these layers in or If a spawn fails because the selected model is unavailable (plan restriction, org policy, rate limit, deprecation, or any other reason), silently retry with the next model in the chain. Do NOT tell the user about fallback attempts. Maximum 3 retries before jumping to the nuclear fallback. ``` -Premium: claude-opus-4.6 → claude-opus-4.6-fast → claude-opus-4.5 → claude-sonnet-4.5 → (omit model param) -Standard: claude-sonnet-4.5 → gpt-5.2-codex → claude-sonnet-4 → gpt-5.2 → (omit model param) -Fast: claude-haiku-4.5 → gpt-5.1-codex-mini → gpt-4.1 → gpt-5-mini → (omit model param) +Premium: claude-opus-4.6 → claude-opus-4.5 → claude-sonnet-4.6 → claude-sonnet-4.5 → (omit model param) +Standard: claude-sonnet-4.6 → claude-sonnet-4.5 → gpt-5.4 → gpt-5.3-codex → claude-sonnet-4 → (omit model param) +Fast: claude-haiku-4.5 → gpt-5.4-mini → gpt-5.1-codex-mini → gpt-4.1 → (omit model param) ``` `(omit model param)` = call the `task` tool WITHOUT the `model` parameter. The platform uses its built-in default. This is the nuclear fallback — it always works. @@ -415,7 +419,7 @@ prompt: | ... ``` -Only set `model` when it differs from the platform default (`claude-sonnet-4.5`). If the resolved model IS `claude-sonnet-4.5`, you MAY omit the `model` parameter — the platform uses it as default. +Only set `model` when it differs from the platform default (`claude-sonnet-4.6`). If the resolved model IS `claude-sonnet-4.6`, you MAY omit the `model` parameter — the platform uses it as default. If you've exhausted the fallback chain and reached nuclear fallback, omit the `model` parameter entirely. @@ -424,7 +428,7 @@ If you've exhausted the fallback chain and reached nuclear fallback, omit the `m When spawning, include the model in your acknowledgment: ``` -🔧 Fenster (claude-sonnet-4.5) — refactoring auth module +🔧 Fenster (claude-sonnet-4.6) — refactoring auth module 🎨 Redfoot (claude-opus-4.5 · vision) — designing color system 📋 Scribe (claude-haiku-4.5 · fast) — logging session ⚡ Keaton (claude-opus-4.6 · bumped for architecture) — reviewing proposal @@ -435,9 +439,9 @@ Include tier annotation only when the model was bumped or a specialist was chose **Valid models (current platform catalog):** -Premium: `claude-opus-4.6`, `claude-opus-4.6-fast`, `claude-opus-4.5` -Standard: `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gpt-5`, `gemini-3-pro-preview` -Fast/Cheap: `claude-haiku-4.5`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` +Premium: `claude-opus-4.6`, `claude-opus-4.6-1m` (Internal only), `claude-opus-4.5` +Standard: `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gemini-3-pro-preview` +Fast/Cheap: `claude-haiku-4.5`, `gpt-5.4-mini`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` ### Client Compatibility @@ -787,7 +791,9 @@ prompt: | Read .squad/decisions.md (team decisions to respect). If .squad/identity/wisdom.md exists, read it before starting work. If .squad/identity/now.md exists, read it at spawn time. - If .squad/skills/ has relevant SKILL.md files, read them before working. + Check .copilot/skills/ for copilot-level skills (process, workflow, protocol). + Check .squad/skills/ for team-level skills (patterns discovered during work). + Read any relevant SKILL.md files before working. {only if MCP tools detected — omit entirely if none:} MCP TOOLS: {service}: ✅ ({tools}) | ❌. Fall back to CLI when unavailable. diff --git a/templates/squad.agent.md b/templates/squad.agent.md index 0a9b173e9..2a8ffcf69 100644 --- a/templates/squad.agent.md +++ b/templates/squad.agent.md @@ -246,7 +246,11 @@ The routing table determines **WHO** handles work. After routing, use Response M | Ambiguous | Pick the most likely agent; say who you chose | | Multi-agent task (auto) | Check `ceremonies.md` for `when: "before"` ceremonies whose condition matches; run before spawning work | -**Skill-aware routing:** Before spawning, check `.squad/skills/` for skills relevant to the task domain. If a matching skill exists, add to the spawn prompt: `Relevant skill: .squad/skills/{name}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. +**Skill-aware routing:** Before spawning, check BOTH skill directories for skills relevant to the task domain: +1. `.copilot/skills/` — **Copilot-level skills.** Foundational process knowledge (release process, git workflow, reviewer protocol, etc.). These are the coordinator's own playbook — check first. +2. `.squad/skills/` — **Team-level skills.** Patterns and practices agents discovered during work. + +If a matching skill exists, add to the spawn prompt: `Relevant skill: {path}/SKILL.md — read before starting.` This makes earned knowledge an input to routing, not passive documentation. ### Consult Mode Detection @@ -357,8 +361,8 @@ Before spawning an agent, determine which model to use. Check these layers in or | Task Output | Model | Tier | Rule | |-------------|-------|------|------| -| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.5` | Standard | Quality and accuracy matter for code. Use standard tier. | -| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.5` | Standard | Prompts are executable — treat like code. | +| Writing code (implementation, refactoring, test code, bug fixes) | `claude-sonnet-4.6` | Standard | Quality and accuracy matter for code. Use standard tier. | +| Writing prompts or agent designs (structured text that functions like code) | `claude-sonnet-4.6` | Standard | Prompts are executable — treat like code. | | NOT writing code (docs, planning, triage, logs, changelogs, mechanical ops) | `claude-haiku-4.5` | Fast | Cost first. Haiku handles non-code tasks. | | Visual/design work requiring image analysis | `claude-opus-4.5` | Premium | Vision capability required. Overrides cost rule. | @@ -366,11 +370,11 @@ Before spawning an agent, determine which model to use. Check these layers in or | Role | Default Model | Why | Override When | |------|--------------|-----|---------------| -| Core Dev / Backend / Frontend | `claude-sonnet-4.5` | Writes code — quality first | Heavy code gen → `gpt-5.2-codex` | -| Tester / QA | `claude-sonnet-4.5` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | +| Core Dev / Backend / Frontend | `claude-sonnet-4.6` | Writes code — quality first | Heavy code gen → `gpt-5.3-codex` | +| Tester / QA | `claude-sonnet-4.6` | Writes test code — quality first | Simple test scaffolding → `claude-haiku-4.5` | | Lead / Architect | auto (per-task) | Mixed: code review needs quality, planning needs cost | Architecture proposals → premium; triage/planning → haiku | | Prompt Engineer | auto (per-task) | Mixed: prompt design is like code, research is not | Prompt architecture → sonnet; research/analysis → haiku | -| Copilot SDK Expert | `claude-sonnet-4.5` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | +| Copilot SDK Expert | `claude-sonnet-4.6` | Technical analysis that often touches code | Pure research → `claude-haiku-4.5` | | Designer / Visual | `claude-opus-4.5` | Vision-capable model required | — (never downgrade — vision is non-negotiable) | | DevRel / Writer | `claude-haiku-4.5` | Docs and writing — not code | — | | Scribe / Logger | `claude-haiku-4.5` | Mechanical file ops — cheapest possible | — (never bump Scribe) | @@ -379,7 +383,7 @@ Before spawning an agent, determine which model to use. Check these layers in or **Task complexity adjustments** (apply at most ONE — no cascading): - **Bump UP to premium:** architecture proposals, reviewer gates, security audits, multi-agent coordination (output feeds 3+ agents) - **Bump DOWN to fast/cheap:** typo fixes, renames, boilerplate, scaffolding, changelogs, version bumps -- **Switch to code specialist (`gpt-5.2-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) +- **Switch to code specialist (`gpt-5.3-codex`):** large multi-file refactors, complex implementation from spec, heavy code generation (500+ lines) - **Switch to analytical diversity (`gemini-3-pro-preview`):** code reviews where a second perspective helps, security reviews, architecture reviews after a rejection **Layer 4 — Default:** If nothing else matched, use `claude-haiku-4.5`. Cost wins when in doubt, unless code is being produced. @@ -389,9 +393,9 @@ Before spawning an agent, determine which model to use. Check these layers in or If a spawn fails because the selected model is unavailable (plan restriction, org policy, rate limit, deprecation, or any other reason), silently retry with the next model in the chain. Do NOT tell the user about fallback attempts. Maximum 3 retries before jumping to the nuclear fallback. ``` -Premium: claude-opus-4.6 → claude-opus-4.6-fast → claude-opus-4.5 → claude-sonnet-4.5 → (omit model param) -Standard: claude-sonnet-4.5 → gpt-5.2-codex → claude-sonnet-4 → gpt-5.2 → (omit model param) -Fast: claude-haiku-4.5 → gpt-5.1-codex-mini → gpt-4.1 → gpt-5-mini → (omit model param) +Premium: claude-opus-4.6 → claude-opus-4.5 → claude-sonnet-4.6 → claude-sonnet-4.5 → (omit model param) +Standard: claude-sonnet-4.6 → claude-sonnet-4.5 → gpt-5.4 → gpt-5.3-codex → claude-sonnet-4 → (omit model param) +Fast: claude-haiku-4.5 → gpt-5.4-mini → gpt-5.1-codex-mini → gpt-4.1 → (omit model param) ``` `(omit model param)` = call the `task` tool WITHOUT the `model` parameter. The platform uses its built-in default. This is the nuclear fallback — it always works. @@ -415,7 +419,7 @@ prompt: | ... ``` -Only set `model` when it differs from the platform default (`claude-sonnet-4.5`). If the resolved model IS `claude-sonnet-4.5`, you MAY omit the `model` parameter — the platform uses it as default. +Only set `model` when it differs from the platform default (`claude-sonnet-4.6`). If the resolved model IS `claude-sonnet-4.6`, you MAY omit the `model` parameter — the platform uses it as default. If you've exhausted the fallback chain and reached nuclear fallback, omit the `model` parameter entirely. @@ -424,7 +428,7 @@ If you've exhausted the fallback chain and reached nuclear fallback, omit the `m When spawning, include the model in your acknowledgment: ``` -🔧 Fenster (claude-sonnet-4.5) — refactoring auth module +🔧 Fenster (claude-sonnet-4.6) — refactoring auth module 🎨 Redfoot (claude-opus-4.5 · vision) — designing color system 📋 Scribe (claude-haiku-4.5 · fast) — logging session ⚡ Keaton (claude-opus-4.6 · bumped for architecture) — reviewing proposal @@ -435,9 +439,9 @@ Include tier annotation only when the model was bumped or a specialist was chose **Valid models (current platform catalog):** -Premium: `claude-opus-4.6`, `claude-opus-4.6-fast`, `claude-opus-4.5` -Standard: `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gpt-5`, `gemini-3-pro-preview` -Fast/Cheap: `claude-haiku-4.5`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` +Premium: `claude-opus-4.6`, `claude-opus-4.6-1m` (Internal only), `claude-opus-4.5` +Standard: `claude-sonnet-4.6`, `claude-sonnet-4.5`, `claude-sonnet-4`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.2-codex`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1`, `gemini-3-pro-preview` +Fast/Cheap: `claude-haiku-4.5`, `gpt-5.4-mini`, `gpt-5.1-codex-mini`, `gpt-5-mini`, `gpt-4.1` ### Client Compatibility @@ -787,7 +791,9 @@ prompt: | Read .squad/decisions.md (team decisions to respect). If .squad/identity/wisdom.md exists, read it before starting work. If .squad/identity/now.md exists, read it at spawn time. - If .squad/skills/ has relevant SKILL.md files, read them before working. + Check .copilot/skills/ for copilot-level skills (process, workflow, protocol). + Check .squad/skills/ for team-level skills (patterns discovered during work). + Read any relevant SKILL.md files before working. {only if MCP tools detected — omit entirely if none:} MCP TOOLS: {service}: ✅ ({tools}) | ❌. Fall back to CLI when unavailable. diff --git a/test/cli-global.test.ts b/test/cli-global.test.ts index 0ecc8a553..e9d640d6b 100644 --- a/test/cli-global.test.ts +++ b/test/cli-global.test.ts @@ -48,30 +48,30 @@ describe('squad status routing logic', () => { scaffold('.git'); const repoSquad = resolveSquad(TMP); expect(repoSquad).toBeNull(); - // Without a global .squad/ dir, status shows "none" + // Without a global personal-squad/ dir, status shows "none" const globalPath = resolveGlobalSquadPath(); - const globalSquadDir = join(globalPath, '.squad'); - // When repo squad is null and global .squad doesn't exist → "none" + const globalSquadDir = join(globalPath, 'personal-squad'); + // When repo squad is null and global personal-squad doesn't exist → "none" const activeType = repoSquad ? 'repo' : existsSync(globalSquadDir) ? 'personal' : 'none'; expect(activeType).toBe(activeType === 'personal' ? 'personal' : 'none'); // At minimum, repoSquad must be null expect(repoSquad).toBeNull(); }); - it('identifies "personal" when no repo .squad/ but global .squad/ exists', () => { + it('identifies "personal" when no repo .squad/ but global personal-squad/ exists', () => { scaffold('.git'); const repoSquad = resolveSquad(TMP); expect(repoSquad).toBeNull(); const globalPath = resolveGlobalSquadPath(); - const globalSquadDir = join(globalPath, '.squad'); - // Create a .squad/ inside the global path + const globalSquadDir = join(globalPath, 'personal-squad'); + // Create a personal-squad/ inside the global path mkdirSync(globalSquadDir, { recursive: true }); const activeType = repoSquad ? 'repo' : existsSync(globalSquadDir) ? 'personal' : 'none'; expect(activeType).toBe('personal'); - // Cleanup global .squad dir we created + // Cleanup global personal-squad dir we created rmSync(globalSquadDir, { recursive: true, force: true }); }); @@ -79,7 +79,7 @@ describe('squad status routing logic', () => { scaffold('.git', '.squad'); const repoSquad = resolveSquad(TMP); const globalPath = resolveGlobalSquadPath(); - const globalSquadDir = join(globalPath, '.squad'); + const globalSquadDir = join(globalPath, 'personal-squad'); mkdirSync(globalSquadDir, { recursive: true }); // Same logic as status command — repo wins diff --git a/test/sdk/consult.test.ts b/test/sdk/consult.test.ts index f07f405de..15db12722 100644 --- a/test/sdk/consult.test.ts +++ b/test/sdk/consult.test.ts @@ -391,8 +391,26 @@ import { extractLearnings, loadSessionHistory, getPersonalSquadRoot, + resolveGlobalSquadPath, } from '@bradygaster/squad-sdk'; +// ============================================================================ +// getPersonalSquadRoot tests (#590) +// ============================================================================ + +describe('getPersonalSquadRoot', () => { + it('resolves to personal-squad subdirectory, not .squad', () => { + const root = getPersonalSquadRoot(); + const globalDir = resolveGlobalSquadPath(); + expect(root).toBe(join(globalDir, 'personal-squad')); + }); + + it('does not resolve to .squad subdirectory', () => { + const root = getPersonalSquadRoot(); + expect(root).not.toMatch(/[/\\]\.squad$/); + }); +}); + describe('setupConsultMode', () => { const SETUP_ROOT = join( process.cwd(), From dfc0a59235ad51e7c296c413a2ad1b8e8925ea36 Mon Sep 17 00:00:00 2001 From: Brady Gaster <41929050+bradygaster@users.noreply.github.com> Date: Wed, 25 Mar 2026 23:33:10 -0700 Subject: [PATCH 02/22] fix: add count-based fallback to archiveDecisions() (#626) (#627) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * chore(.squad): session wrap-up — inbox merge, logs, history updates Merged 12 decision inbox entries into decisions.md. Logged mega-session covering release recovery, docs fix, 10 PR merges, discussion triage, and release hardening. Updated agent histories with session learnings. Deleted inbox files after merge: - booster-ci-audit.md, booster-ci-cleanup.md - copilot-directive-2026-03-23T09-56.md, copilot-directive-2026-03-23T10-08.md - copilot-directive-no-npx.md - eecom-version-cmd.md - pao-discussion-triage-2026-03-23.md, pao-npx-purge.md, pao-readme-slim.md - pao-v090-blog.md - surgeon-v090-changelog.md, surgeon-v091-retrospective.md Updated files: - .squad/decisions.md (12 decision entries merged) - .squad/identity/now.md (current state updated) - .squad/log/2026-03-23T22-00-00Z-mega-session-wrapup.md (new) - .squad/agents/flight/history.md (issue filing patterns, governance directives) - .squad/agents/eecom/history.md (CLI version subcommand pattern) - .squad/agents/booster/history.md (CI audit and preflight patterns) - .squad/agents/surgeon/history.md (release governance rules, retrospective) - .squad/agents/pao/history.md (discussion triage patterns, Teams MCP urgency) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: add v0.9.0 and v0.9.1 releases to What's New - v0.9.1 (Current Release): Bug fixes and hardening - Shell agent name extraction with multi-pattern fallback - Init scaffolding for typed casting files - Personal squad global mode support - Release CI/docs hardening - Doctor command improvements - v0.9.0 (Major Feature): 6 major features + stability fixes - Personal Squad Governance Layer (isolated developer workspaces) - Worktree Spawning & Distributed Work (parallel agent orchestration) - Machine Capability Discovery (auto-detect tools/models/hardware) - Cooperative Rate Limiting (predictive circuit breaker + economy mode) - Telemetry & Infrastructure (auto-wire, KEDA, session recovery) - Docs, Stability & Distribution (Astro enhancements, npm-only) - v0.8.2: Renamed from 'Current Release' to historic entry Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(squad): triage session — 14 issues triaged, 10 PRs reviewed - Flight triaged 14 untriaged GitHub issues, created prioritized work plan - FIDO reviewed 10 open PRs, identified 3 duplicate/overlap pairs - Merged 2 decisions from inbox to decisions.md - Updated Flight and FIDO agent history with team updates - Orchestration logs: 2026-03-25T15-23-flight.md, 2026-03-25T15-23-fido.md - Session log: 2026-03-25T15-23-triage-session.md Work session priority established: - #610 → PAO (broken link, 5 min fix, unblocks #611) - #590 → EECOM (getPersonalSquadRoot bug, P0) - #592, #611 → Flight review - #588 → Procedures (model list update) PR deduplication: 10 PRs consolidate to 7 - Merge: #607, #603, #606 - Close as duplicates: #605, #604, #602 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * chore(squad): log work session — triage, fixes, research Round 1 outcomes: - PAO: #610 docs link already resolved - EECOM: #590 personal squad path fix (getPersonalSquadRoot) - Procedures: #588 model catalog updated to current platform - Flight: #612 community issue filed on routing regression - CAPCOM: CLI platform research — identified 8 releases (1.0.4→1.0.11) with 3 high-impact changes - GNC: Squad codebase research — routing regression caused by v0.9.0 prompt saturation + missing name param Round 2: Code review & quality gate - FIDO: Found same bug in shell/index.ts, enforced revision - CONTROL: Full sweep of #590 fix, awaiting FIDO re-review Merged decisions: 1. Personal squad path canonicalization (personal-squad/) 2. Model catalog refresh (claude-sonnet-4.6, gpt-5.3-codex defaults) 3. CLI platform analysis (monorepo discovery, idle hiding, hook injection) 4. Squad regression analysis (prompt saturation, workstream replacement, missing name param) Logs created: - 6 orchestration logs (one per agent) - 1 session synthesis log with research synthesis - 4 agent history updates (team update annotations) All inbox decision files merged and deleted. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Merge: VS Code routing enforcement fix proposal (#613) - Merged procedures-vscode-routing-fix.md from inbox to decisions.md - Cleared decision inbox after merge - Logged session finalization work Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix: add count-based fallback to archiveDecisions() (#626) archiveDecisions() silently returned null when all entries were <30 days old, allowing decisions.md to grow unboundedly. Active projects hit 145KB+ (35K tokens burned per agent spawn). Added count-based fallback: when all entries are recent but total size exceeds 20KB, archive the oldest recent entries to stay under threshold. Undated entries are preserved (not archived) per Procedures' guidance. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs: update EECOM history with #626 learnings Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .squad/agents/capcom/history.md | 3 + .squad/agents/eecom/history.md | 19 +- .squad/agents/gnc/history.md | 4 + .squad/agents/procedures/history.md | 2 + .squad/decisions.md | 15361 ++++++++-------- .squad/log/2026-03-25T18-11-work-session.md | 176 + .../2026-03-25T18-11-CAPCOM.md | 41 + .../2026-03-25T18-11-EECOM.md | 24 + .../2026-03-25T18-11-Flight.md | 24 + .../orchestration-log/2026-03-25T18-11-GNC.md | 61 + .../orchestration-log/2026-03-25T18-11-PAO.md | 19 + .../2026-03-25T18-11-Procedures.md | 31 + packages/squad-cli/src/cli/core/nap.ts | 40 +- test/nap.test.ts | 120 + 14 files changed, 8401 insertions(+), 7524 deletions(-) create mode 100644 .squad/log/2026-03-25T18-11-work-session.md create mode 100644 .squad/orchestration-log/2026-03-25T18-11-CAPCOM.md create mode 100644 .squad/orchestration-log/2026-03-25T18-11-EECOM.md create mode 100644 .squad/orchestration-log/2026-03-25T18-11-Flight.md create mode 100644 .squad/orchestration-log/2026-03-25T18-11-GNC.md create mode 100644 .squad/orchestration-log/2026-03-25T18-11-PAO.md create mode 100644 .squad/orchestration-log/2026-03-25T18-11-Procedures.md diff --git a/.squad/agents/capcom/history.md b/.squad/agents/capcom/history.md index 10be08a47..e9c8aeabd 100644 --- a/.squad/agents/capcom/history.md +++ b/.squad/agents/capcom/history.md @@ -64,3 +64,6 @@ Written full technical analysis to `.squad/identity/sdk-init-technical-analysis. - Template system in SDK (`templates/` directory) is well-structured, used correctly by `initSquad()` 📌 **Team update (2026-03-11T01:25:00Z):** 5 SDK Init decisions merged to decisions.md: Phase-based quality improvement (3-phase approach), CastingEngine canonical casting, squad.config.ts as source of truth, Ralph always-included, implementation priority order. Full technical analysis informed Flight's unified PRD and EECOM's roadmap. + +📌 **Team update (2026-03-25T18:11Z):** CLI platform research complete — identified Copilot CLI 1.0.5–1.0.11 (8 releases in 10 days) contain three high-impact changes affecting Squad routing: monorepo instruction discovery (1.0.11), idle subagent hiding (1.0.8), subagentStart hook context injection (1.0.7). SDK version pinning doesn't prevent CLI runtime auto-updates. Recommendations: clean up template naming, upgrade to SDK 0.2.0 customize mode, file CLI issue. Report in decisions inbox. + diff --git a/.squad/agents/eecom/history.md b/.squad/agents/eecom/history.md index bc9b56b6d..de1038df9 100644 --- a/.squad/agents/eecom/history.md +++ b/.squad/agents/eecom/history.md @@ -4,6 +4,21 @@ ## Learnings +### archiveDecisions() count-based fallback (#626) (2025-07-24) + +**Context:** `archiveDecisions()` in `packages/squad-cli/src/cli/core/nap.ts` silently returned `null` when all `###` entries were <30 days old (`old.length === 0`), even if the file was well over 20KB. Active projects generating many decisions per session could hit 145KB+ — 35K tokens burned per agent spawn. + +**Fix:** Added a count-based fallback after the age-based split. When `old.length === 0` and total file size exceeds `DECISION_THRESHOLD` (20KB), the fallback separates recent entries into dated vs undated, sorts dated by age (most recent first), keeps entries that fit under the threshold budget, and archives the rest. Undated entries are always preserved — they are foundational directives per Procedures' guidance. + +**Key design choices:** +1. Undated entries (`daysAgo === null`) are never archived by the count-based fallback. They stay in `recent`. +2. Budget calculation accounts for header + undated entries + kept dated entries to guarantee the result fits under 20KB. +3. Entries are re-sorted into original document order after the split, so the output file preserves heading sequence. + +**Tests:** Added 4 adversarial tests — 50 all-today entries >20KB, mixed dated/undated preservation, under-threshold no-op, exact-threshold boundary case. + +**Pattern:** When a function has an early-return optimization (`if (old.length === 0) return null`), always consider whether the condition that triggered the function call (file size > threshold) can still be true when the early-return fires. If so, the early-return is a silent failure. + ### Init scaffolding: casting dir + no-remote stderr (#579) (2025-07-18) **Context:** `squad init` in a fresh `git init` repo (no remote) printed `error: No such remote 'origin'` to stderr and `squad doctor` reported `casting/registry.json` missing. Two independent bugs in `packages/squad-sdk/src/config/init.ts`. @@ -267,4 +282,6 @@ Executed 3 tasks across 2 waves: economy mode (#500, PR #504), node:sqlite fix ( 4. `personal.ts` - Refactored to reuse `ensurePersonalSquadDir()`. 5. `resolution.test.ts` - Added 3 tests. -**Pattern:** `resolveGlobalSquadPath()` returns the container; `ensurePersonalSquadDir()` creates the subdirectory the rest of the system looks for. \ No newline at end of file +**Pattern:** `resolveGlobalSquadPath()` returns the container; `ensurePersonalSquadDir()` creates the subdirectory the rest of the system looks for. +📌 **Team update (2026-03-25T18:11Z):** Fixed #590 personal squad path regression — getPersonalSquadRoot() now uses canonical personal-squad/ subdirectory like esolvePersonalSquadDir() and nsurePersonalSquadDir(). Committed on squad/590-fix-personal-squad-root. FIDO found same bug in shell/index.ts → work passed to CONTROL for full sweep revision. Awaiting FIDO re-review. + diff --git a/.squad/agents/gnc/history.md b/.squad/agents/gnc/history.md index fb70c50f4..4ca0ecab6 100644 --- a/.squad/agents/gnc/history.md +++ b/.squad/agents/gnc/history.md @@ -19,3 +19,7 @@ Reviewed and merged PR #474 (Node 22 ESM compatibility + bonus exports key fix). **Key learning:** ESM exports key and actual module structure must stay in sync. When adding new entry points, update both package.json (exports) and create the corresponding module file. Missing either step breaks consumers on Node 22+. Test matrix must include Node 22+ to catch these errors early. ### Dual-Layer ESM Fix (Issue #449) Upgraded from single-layer session.js patch to dual-layer approach: (1) Inject `exports` field into vscode-jsonrpc@8.2.1 package.json at postinstall — this is the canonical fix that resolves ALL subpath imports at once, matching vscode-jsonrpc v9.x. (2) Keep session.js `.js` extension patch as defense-in-depth. Added `squad doctor` detection for both layers (checks vscode-jsonrpc exports field and copilot-sdk session.js import syntax). Runtime Module._resolveFilename patch in cli-entry.ts remains as Layer 3 for npx cache hits where postinstall never runs. + +📌 **Team update (2026-03-25T18:11Z):** Routing regression research complete — root cause identified as combination of Squad code changes (v0.9.0 prompt saturation +33%, inlined workflows, missing +ame param in templates #577) AND Copilot CLI platform changes (CAPCOM report). Coordinator prompt grew 711→946 lines, diluting routing constraint. Workstream config replacement broke existing routing. Agent name fix exists on dev (#577) but not shipped. Recommendations: ship #577 immediately, move features to skills, optimize coordinator prompt, create regression tests. Report in decisions inbox. + diff --git a/.squad/agents/procedures/history.md b/.squad/agents/procedures/history.md index 28ff4f29b..319376834 100644 --- a/.squad/agents/procedures/history.md +++ b/.squad/agents/procedures/history.md @@ -140,6 +140,8 @@ Also updated: examples section (showing `name` + `description` pairs), anti-patt 📌 **Team update (2026-03-23T23:15Z):** Orchestration complete. Agent name display refactor shipped: spawn templates updated with mandatory `name` parameter across all 4 template variants. VOX and FIDO coordinated on parser extraction and cascading pattern strategies. All decisions merged to decisions.md. Canonical source: `.squad-templates/squad.agent.md` (all derived copies secondary). +📌 **Team update (2026-03-25T18:11Z):** Model catalog updated to current platform offerings — removed 2 stale models (claude-opus-4.6-fast, gpt-5), added 5 new models (claude-sonnet-4.6, claude-opus-4.6-1m, gpt-5.4, gpt-5.3-codex, gpt-5.4-mini), bumped defaults (code: claude-sonnet-4.6, specialist: gpt-5.3-codex), restructured fallbacks. All 5 squad.agent.md template copies synchronized. Merged in #588. + ### 2025-07: Model catalog refresh (#588) **Problem:** The valid models catalog, fallback chains, role-to-model mappings, and default model references in `squad.agent.md` were stale — missing `claude-sonnet-4.6`, `gpt-5.4`, `gpt-5.3-codex`, `gpt-5.4-mini`, `claude-opus-4.6-1m` and still referencing removed models `claude-opus-4.6-fast` and standalone `gpt-5`. diff --git a/.squad/decisions.md b/.squad/decisions.md index d6b088e27..1c386b328 100644 --- a/.squad/decisions.md +++ b/.squad/decisions.md @@ -1,7587 +1,7904 @@ -# Decisions - -> Team decisions that all agents must respect. Managed by Scribe. - - ---- - -## Foundational Directives (carried from beta, updated for Mission Control) - -### Type safety — strict mode non-negotiable -**By:** CONTROL (formerly Edie) -**What:** `strict: true`, `noUncheckedIndexedAccess: true`, no `@ts-ignore` allowed. -**Why:** Types are contracts. If it compiles, it works. - -### Hook-based governance over prompt instructions -**By:** RETRO (formerly Baer) -**What:** Security, PII, and file-write guards are implemented via the hooks module, NOT prompt instructions. -**Why:** Prompts can be ignored. Hooks are code — they execute deterministically. - -### Node.js >=20, ESM-only, streaming-first -**By:** GNC (formerly Fortier) -**What:** Runtime target is Node.js 20+. ESM-only. Async iterators over buffers. -**Why:** Modern Node.js features enable cleaner async patterns. - -### Casting — Apollo 13, mission identity -**By:** Squad Coordinator -**What:** Team names drawn from Apollo 13 / NASA Mission Control. Scribe is always Scribe. Ralph is always Ralph. Previous universe (The Usual Suspects) retired to alumni. -**Why:** The team outgrew its original universe. Apollo 13 captures collaborative pressure, technical precision, and mission-critical coordination — perfect for an AI agent framework. - -### Proposal-first workflow -**By:** Flight (formerly Keaton) -**What:** Meaningful changes require a proposal in `docs/proposals/` before execution. -**Why:** Proposals create alignment before code is written. - -### Tone ceiling — always enforced -**By:** PAO (formerly McManus) -**What:** No hype, no hand-waving, no claims without citations. -**Why:** Trust is earned through accuracy, not enthusiasm. - -### Zero-dependency scaffolding preserved -**By:** Network (formerly Rabin) -**What:** CLI remains thin. Zero runtime dependencies for the CLI scaffolding path. -**Why:** Users should be able to run `npx` without downloading a dependency tree. - -### Merge driver for append-only files -**By:** Squad Coordinator -**What:** `.gitattributes` uses `merge=union` for `.squad/decisions.md`, `agents/*/history.md`, `log/**`, `orchestration-log/**`. -**Why:** Enables conflict-free merging of team state across branches. - -### Interactive Shell as Primary UX -**By:** Brady -**What:** Squad becomes its own interactive CLI shell. `squad` with no args enters a REPL. -**Why:** Squad needs to own the full interactive experience. - -### 2026-02-21: User directive — no temp/memory files in repo root -**By:** Brady (via Copilot) -**What:** NEVER write temp files, issue files, or memory files to the repo root. All squad state/scratch files belong in .squad/ and ONLY .squad/. Root tree of a user's repo is sacred. -**Why:** User request — hard rule. Captured for all agents. - -### 2026-02-21: npm workspace protocol for monorepo -**By:** Edie (TypeScript Engineer) -**What:** Use npm-native workspace resolution (version-string references) instead of `workspace:*` protocol for cross-package dependencies. -**Why:** The `workspace:*` protocol is pnpm/Yarn-specific. npm workspaces resolve workspace packages automatically. -**Impact:** All inter-package dependencies in `packages/*/package.json` should use the actual version string, not `workspace:*`. - -### 2026-02-21: Distribution is npm-only (GitHub-native removed) -**By:** Rabin (Distribution) + Fenster (Core Dev) -**What:** Squad packages (`@bradygaster/squad-sdk` and `@bradygaster/squad-cli`) are distributed exclusively via npmjs.com. The GitHub-native `npx github:bradygaster/squad` path has been removed. -**Why:** npm is the standard distribution channel. One distribution path reduces confusion and maintenance burden. Root `cli.js` prints deprecation warning if anyone still hits the old path. - -### 2026-02-21: Coordinator prompt structure — three routing modes -**By:** Verbal (Prompt Engineer) -**What:** Coordinator uses structured response format: `DIRECT:` (answer inline), `ROUTE:` + `TASK:` + `CONTEXT:` (single agent), `MULTI:` (fan-out). Unrecognized formats fall back to `DIRECT`. -**Why:** Keyword prefixes are cheap to parse and reliable. Fallback-to-direct prevents silent failures. -### `.squad/` Directory Scope — Owner Directive -**By:** Brady (project owner, PR #326 review) -**Date:** 2026-03-10 - -**Directive:** The `.squad/` directory is **reserved for team state only** — roster, routing, decisions, agent histories, casting, and orchestration logs. Non-team data (adoption tracking, community metrics, reports) must NOT live in `.squad/`. Use `.github/` for GitHub platform integration or `docs/` for documentation artifacts. - -**Source:** [PR #326 comment](https://github.com/bradygaster/squad/pull/326#issuecomment-4029193833) - - ---- - -### No Individual Repo Listing Without Consent — Owner Directive -**By:** Brady (project owner, PR #326 review) -**Date:** 2026-03-10 - -**Directive:** Growth metrics must report **aggregate numbers only** (e.g., "78+ repositories found via GitHub code search") — never name or link to individual community repos without explicit opt-in consent. The monitoring script and GitHub Action concepts are approved, but any public showcase or tracking list that identifies specific repos is blocked until a community consent plan exists. - -**Source:** [PR #326 comment](https://github.com/bradygaster/squad/pull/326#issuecomment-4029222967) - - ---- - -### Adoption Tracking — Opt-In Architecture -**By:** Flight (implementing Brady's directives above) -**Date:** 2026-03-09 - -### 2026-02-21: CLI entry point split — src/index.ts is a pure barrel -**By:** Edie (TypeScript Engineer) -**What:** `src/index.ts` is a pure re-export barrel with ZERO side effects. `src/cli-entry.ts` contains `main()` and all CLI routing. -**Why:** Library consumers importing `@bradygaster/squad` were triggering CLI argument parsing and `process.exit()` on import. - -### 2026-02-21: Process.exit() refactor — library-safe CLI functions -**By:** Kujan (SDK Expert) -**What:** `fatal()` throws `SquadError` instead of `process.exit(1)`. Only `cli-entry.ts` may call `process.exit()`. -**Pattern:** Library functions throw `SquadError`. CLI entry catches and exits. Library consumers catch for structured error handling. - -### 2026-02-21: User directive — docs as you go -**By:** bradygaster (via Copilot) -**What:** Doc and blog as you go during SquadUI integration work. Doesn't have to be perfect — keep docs updated incrementally. - -### 2026-02-22: Runtime EventBus as canonical bus -**By:** Fortier -**What:** `runtime/event-bus.ts` (colon-notation: `session:created`, `subscribe()` API) is the canonical EventBus for all orchestration classes. The `client/event-bus.ts` (dot-notation) remains for backward-compat but should not be used in new code. -**Why:** Runtime EventBus has proper error isolation — one handler failure doesn't crash others. - -### 2026-02-22: Subpath exports in @bradygaster/squad-sdk -**By:** Edie (TypeScript Engineer) -**What:** SDK declares subpath exports (`.`, `./parsers`, `./types`, and module paths). Each uses types-first condition ordering. -**Constraints:** Every subpath needs a source barrel. `"types"` before `"import"`. ESM-only: no `"require"` condition. - -### 2026-02-22: User directive — Aspire testing requirements -**By:** Brady (via Copilot) -**What:** Integration tests must launch the Aspire dashboard and validate OTel telemetry shows up. Use Playwright. Use latest Aspire bits. Reference aspire.dev (NOT learn.microsoft.com). It's "Aspire" not ".NET Aspire". - -### 2026-02-23: User directive — code fences -**By:** Brady (via Copilot) -**What:** Never use / or \ as code fences in GitHub issues, PRs, or comments. Only use backticks to format code. - -### 2026-02-23: User Directive — Docs Overhaul & Publication Pause -**By:** Brady (via Copilot) -**What:** Pause docs publication until Brady explicitly gives go-ahead. Tone: lighthearted, welcoming, fun (NOT stuffy). First doc should be "first experience" with squad CLI. All docs: brief, prompt-first, action-oriented, fun. Human tone throughout. - -### 2026-02-23: Use sendAndWait for streaming dispatch -**By:** Kovash (REPL Expert) -**What:** `dispatchToAgent()` and `dispatchToCoordinator()` use `sendAndWait()` instead of `sendMessage()`. Fallback listens for `turn_end`/`idle` if unavailable. -**Why:** `sendMessage()` is fire-and-forget — resolves before streaming deltas arrive. -**Impact:** Never parse `accumulated` after a bare `sendMessage()`. Always use `awaitStreamedResponse`. - -### 2026-02-23: extractDelta field priority — deltaContent first -**By:** Kovash (REPL Expert) -**What:** `extractDelta` priority: `deltaContent` > `delta` > `content`. Matches SDK actual format. -**Impact:** Use `deltaContent` as the canonical field name for streamed text chunks. - -### 2026-02-24: Per-command --help/-h: intercept-before-dispatch pattern -**By:** Fenster (Core Dev) -**What:** All CLI subcommands support `--help` and `-h`. Help intercepted before command routing prevents destructive commands from executing. -**Convention:** New CLI commands MUST have a `getCommandHelp()` entry with usage, description, options, and 2+ examples. - -### 2026-02-25: REPL cancellation and configurable timeout -**By:** Kovash (REPL Expert) -**What:** Ctrl+C immediately resets `processing` state. Timeout: `SQUAD_REPL_TIMEOUT` (seconds) > `SQUAD_SESSION_TIMEOUT_MS` (ms) > 600000ms default. CLI `--timeout` flag sets env var. - -### 2026-02-24: Shell Observability Metrics -**By:** Saul (Aspire & Observability) -**What:** Four metrics under `squad.shell.*` namespace, gated behind `SQUAD_TELEMETRY=1`. -**Convention:** Shell metrics require explicit consent via `SQUAD_TELEMETRY=1`, separate from OTLP endpoint activation. - -### 2026-02-23: Telemetry in both CLI and agent modes -**By:** Brady (via Copilot) -**What:** Squad should pump telemetry during BOTH modes: (1) standalone Squad CLI, and (2) running as an agent inside GitHub Copilot CLI. - -### 2026-02-27: ASCII-only separators and NO_COLOR -**By:** Cheritto (TUI Engineer) -**What:** All separators use ASCII hyphens. Text-over-emoji principle: text status is primary, emoji is supplementary. -**Convention:** Use ASCII hyphens for separators. Keep emoji out of status/system messages. - -### 2026-02-24: Version format — bare semver canonical -**By:** Fenster -**What:** Bare semver (e.g., `0.8.5.1`) for version commands. Display contexts use `squad v{VERSION}`. - -### 2026-02-25: Help text — progressive disclosure -**By:** Fenster -**What:** Default `/help` shows 4 essential lines. `/help full` shows complete reference. - -### 2026-02-24: Unified status vocabulary -**By:** Marquez (CLI UX Designer) -**What:** Use `[WORK]` / `[STREAM]` / `[ERR]` / `[IDLE]` across ALL status surfaces. -**Why:** Most granular, NO_COLOR compatible, text-over-emoji, consistent across contexts. - -### 2026-02-24: Pick one tagline -**By:** Marquez (CLI UX Designer) -**What:** Use "Team of AI agents at your fingertips" everywhere. - -### 2026-02-24: User directive — experimental messaging -**By:** Brady (via Copilot) -**What:** CLI docs should note the project is experimental and ask users to file issues. - -### 2026-02-28: User directive — DO NOT merge PR #547 -**By:** Brady (via Copilot) -**What:** DO NOT merge PR #547 (Squad Remote Control). Do not touch #547 at all. -**Why:** User request — captured for team memory - -### 2026-02-28: CLI Critical Gap Issues Filed -**By:** Keaton (Lead) -**What:** 4 critical CLI gaps filed as GitHub issues #554–#557 for explicit team tracking: -- #554: `--preview` flag undocumented and untested -- #556: `--timeout` flag undocumented and untested -- #557: `upgrade --self` is dead code -- #555: `run` subcommand is a stub (non-functional) - -**Why:** Orchestration logs captured gaps but they lacked actionable GitHub tracking and ownership. Filed issues now have explicit assignment to Fenster, clear acceptance criteria, and visibility in Wave E planning. - -### 2026-02-28: Test Gap Issues Filed (10 items) -**By:** Hockney (Tester) -**What:** 10 moderate CLI/test gaps filed as issues #558–#567: -- #558: Exit code consistency untested -- #559: Timeout edge cases untested -- #560: Missing per-command help -- #561: Shell-specific flag behavior untested -- #562: Env var fallback paths untested -- #563: REPL mode transitions untested -- #564: Config file precedence untested -- #565: Agent spawn flags undocumented -- #566: Untested flag aliases -- #567: Flag parsing error handling untested - -**Why:** Each gap identified in coverage analysis but lacked explicit GitHub tracking for prioritization and team visibility. - -### 2026-02-28: Documentation Audit Results (10 issues) -**By:** McManus (DevRel) -**What:** Docs audit filed 10 GitHub issues (#568–#575, #577–#578) spanning: -- Feature documentation lag (#568 `squad run`, #570 consult mode, #572 Ralph smart triage) -- Terminology inconsistency (#569 triage/watch/loop naming) -- Brand compliance (#571 experimental banner on 40+ docs) -- Clarity/UX gaps (#573 response modes, #575 dual-root, #577 VS Code, #578 session examples) -- Reference issue (#574 README command count) - -**Why:** Features shipped faster than documentation. PR #552, #553 merged without doc updates. No automation to enforce experimental banner. Users discover advanced features accidentally. - -**Root cause:** Feature-docs lag, decision-doc drift, no brand enforcement in CI. - -### 2026-02-28: Dogfood UX Issues Filed (4 items) -**By:** Waingro (Dogfooder) -**What:** Dogfood testing against 8 realistic scenarios surfaced 4 UX issues (filed as #576, #579–#581): -- #576 (P1): Shell launch fails in non-TTY piped mode (Blocks CI) -- #580 (P1): Help text overwhelms new users (44 lines, no tiering) -- #579 (P2): Status shows parent `.squad/` as local (confusing in multi-project workspaces) -- #581 (P2): Error messages show debug output always (noisy production logs) - -**Why:** CLI is solid for happy path but first-time user experience and CI/CD integration have friction points. All 4 block either new user onboarding or automation workflows. - -**Priority:** #576 > #580 > #581 > #579. All should be fixed before next public release. - -### 2026-02-28: decisions.md Aggressive Cleanup -**By:** Keaton (Lead) -**What:** Trimmed `decisions.md` from 226KB (223 entries) to 10.3KB (35 entries) — 95% reduction. -- Kept: Core architectural decisions, active process rules, active user directives, current UX conventions, runtime patterns -- Archived: Implementation details, one-time setup, PR reviews, audit reports, wave planning, superseded decisions, duplicates -- Created: `decisions-archive.md` with full original content preserved - -**Why:** Context window bloat during release push. Every agent loads 95% less decisions context. Full history preserved append-only. - -**Impact:** File size reduced, agent context efficiency improved, all decisions preserved in archive. - -### 2026-02-28: Backlog Gap Issues Filed (8 items) -**By:** Keaton (Lead) -**Approval:** Brady (via directive in issue request) -**What:** Filed 8 missing backlog items from `.squad/identity/now.md` as GitHub issues. These items were identified as "should-fix" polish or "post-M1" improvements but lacked explicit GitHub tracking until now. - -**Why:** Brady requested: "Cross-reference the known backlog against filed issues and file anything missing." The team had filed 28 issues this session (#554–#581), but 8 known items from `now.md` remained unfiled. Without GitHub issues, these lack ownership assignment, visibility for Wave E planning, trackability in automated workflows, and routing to squad members. - -**Issues Filed:** -- #583 (squad:rabin): Add `homepage` and `bugs` fields to package.json -- #584 (squad:mcmanus): Document alpha→v1.0 breaking change policy in README -- #585 (squad:edie): Add `noUncheckedIndexedAccess` to tsconfig -- #586 (squad:edie): Tighten ~26 `any` types in SDK -- #587 (squad:mcmanus): Add architecture overview doc -- #588 (squad:kujan): Implement SQUAD_DEBUG env var test -- #589 (squad:kujan): One real Copilot SDK integration test -- #590 (squad:baer): `npm audit fix` for dev-dependency ReDoS warnings -- #591 (squad:hockney, type:bug): Aspire dashboard test fails — docker pull in test suite -- #592 (squad:rabin): Replace workspace:* protocol with version string - -**Impact:** Full backlog now visible with explicit issues. No unmapped items. Each issue routed to the squad member domain expert. Issues are independent; can be executed in any order. - -### 2026-02-28: Codebase Scan — Unfiled Issues Audit -**By:** Fenster (Core Dev) -**Requested by:** Brady -**Date:** 2026-02-28T22:05:00Z -**Status:** Complete — 2 new issues filed - -**What:** Systematic scan of the codebase to identify known issues that haven't been filed as GitHub issues. Checked: -1. TODO/FIXME/HACK/XXX comments in code -2. TypeScript strict mode violations (@ts-ignore/@ts-expect-error) -3. Skipped/todo tests (.skip() or .todo()) -4. Errant console.log statements -5. Missing package.json metadata fields - -**Findings:** -- Type safety violations: ✅ CLEAN — Zero @ts-ignore/@ts-expect-error found. Strict mode compliance excellent. -- Workspace protocol: ❌ VIOLATION — 1 issue filed (#592): `workspace:*` in squad-cli violates npm workspace convention -- Skipped tests: ❌ GAP — 1 issue filed (#588): SQUAD_DEBUG test is .todo() placeholder -- Console.log: ✅ INTENTIONAL — All are user-facing output (status, errors) -- TODO comments: ✅ TEMPLATES — TODOs in generated workflow templates, not code -- Package.json: ✅ TRACKED — Missing homepage/bugs already filed as #583 - -**Code Quality Assessment:** -- Type Safety (Excellent): Zero violations of strict mode or type suppression. Team decision being followed faithfully. -- TODO/FIXME Comments (Clean): All TODOs in upgrade.ts and workflows.ts are template strings for generated GitHub Actions YAML, intentionally scoped. -- Console Output (Intentional): All are user-facing (dashboard startup, OTLP endpoint, issue labeling, shell loading) — no debug debris. -- Dead Code (None Found): No unreachable code, orphaned functions, or unused exports detected. - -**Recommendations:** -1. Immediate: Fix workspace protocol violation (#592) — violates established team convention -2. Soon: Implement SQUAD_DEBUG test (#588) — fills observable test gap -3. Going forward: Maintain type discipline; review package.json metadata during SDK/CLI version bumps - -**Conclusion:** Codebase in good health. Type safety discipline strong. No hidden technical debt. Conventions mostly followed (one npm workspace exception). Test coverage has minor gaps in observability. - -### 2026-02-28: Auto-link detection for preview builds -**By:** Fenster (Core Dev) -**Date:** 2026-02-28 -**What:** When running from source (`VERSION` contains `-preview`), the CLI checks if `@bradygaster/squad-cli` is globally npm-linked. If not, it prompts the developer to link it. Declining creates `~/.squad/.no-auto-link` to suppress future prompts. -**Why:** Dev convenience — saves contributors from forgetting `npm link` after cloning. Non-interactive commands (help, version, export, import, doctor, scrub-emails) skip the check. Everything is wrapped in try/catch so failures are silent. -**Impact:** Only affects `-preview` builds in interactive TTY sessions. No effect on published releases or CI. - -### 2026-03-01T00:34Z: User directive — Full scrollback support in REPL shell -**By:** Brady (via Copilot) -**What:** The REPL shell must support full scrollback — users should be able to scroll up and down to see all text (paste, run output, rendered content, logs) over time, like GitHub Copilot CLI does. The current Ink-based rendering loses/hides content and that's unacceptable. -**Why:** User request — captured for team memory. This is a P0 UX requirement for the shell. -**Status:** P0 blocking issue. Requires rendering architecture review (Cheritto, Kovash, Marquez). - -### 2026-03-01T04:47Z: User directive — Auto-incrementing build numbers -**By:** Brady (via Copilot) -**What:** Add auto-incrementing build numbers to versions. Format: `0.8.6.{N}-preview` where N increments each local build. Tracks build-to-release cadence. -**Why:** User request — captured for team memory. - -### 2026-03-01: Nap engine — dual sync/async export pattern -**By:** Fenster (Core Dev) -**What:** The nap engine (`cli/core/nap.ts`) exports both `runNap` (async, for CLI entry) and `runNapSync` (sync, for REPL). All internal operations use sync fs calls. The async wrapper exists for CLI convention consistency. -**Why:** REPL `executeCommand` is synchronous and cannot await. ESM forbids `require()`. Exporting a sync variant keeps the REPL integration clean without changing the shell architecture. -**Impact:** Future commands that need both CLI and REPL support should follow this pattern if they only do sync fs work. - -### 2026-03-01: First-run gating test strategy -**By:** Hockney (Tester) -**Date:** 2026-03-01 -**Issue:** #607 -**What:** Created `test/first-run-gating.test.ts` with 25 tests covering 6 categories of Init Mode gating. Tests use logic-level extraction from App.tsx conditionals, filesystem marker lifecycle via `loadWelcomeData`, and source-code structural assertions for render ordering. No full App component rendering — SDK dependencies make that impractical for unit tests. -**Why:** 3059 tests existed with zero enforcement of first-run gating behavior. The `.first-run` marker, banner uniqueness, assembled-message gating, warning suppression, session-scoped keys, and terminal clear ordering were all untested paths that could regress silently. -**Impact:** All squad members: if you modify `loadWelcomeData`, the `firstRunElement` conditional in App.tsx, or the terminal clear sequence in `runShell`, these tests will catch regressions. The warning suppression tests replicate the `cli-entry.ts` pattern — if that pattern changes, update both locations. - -### Verbal's Analysis: "nap" Skill — Context Window Optimization -**By:** Verbal (Prompt Engineer) -**Requested by:** Brady -**Date:** 2026-03-01 -**Scope:** Approved. Build it. Current context budget analysis: -- Agent spawn loads charter (~500t) + history + decisions.md (4,852t) + team.md (972t) -- Hockney: 25,940t history (worst offender) -- Fenster: 22,574t (history + CLI inventory) -- Coherence cliff: 40-50K tokens on non-task context - -**Key Recommendations:** -1. **Decision distillation:** Keep decisions.md as single source of truth (don't embed in charters — creates staleness/duplication) -2. **History compression — 12KB rule insufficient:** Six agents blow past threshold. Target **4KB ceiling per history** (~1,000t) with assertions not stories. -3. **Nap should optimize:** Deduplication (strip decisions.md content echoed in histories), staleness (flag closed PRs, merged work), charter bloat (stay <600t), skill pruning (archive high-confidence, no-recent-invocation skills), demand-loading for extra files (CLI inventory, UX catalog, fragility catalog). -4. **Enforcement:** Nap runs periodically or on-demand, enforces hard ceilings without silent quality degradation. - -### ShellApi.clearMessages() for terminal state reset -**By:** Kovash (REPL Expert) -**Date:** 2026-03-01 -**What:** `ShellApi` now exposes `clearMessages()` which resets both `messages` and `archivedMessages` React state. Used in session restore and `/clear` command. -**Why:** Without clearing archived messages, old content bleeds through when restoring sessions or clearing the shell. The `/clear` command previously only reset `messages`, leaving `archivedMessages` in the Static render list. -**Impact:** Any code calling `shellApi` to reset shell state should use `clearMessages()` rather than manually manipulating message arrays. - -### 2026-03-01: Prompt placeholder hints must not duplicate header banner -**By:** Kovash (REPL Expert) -**Date:** 2026-03-01 -**Issue:** #606 -**What:** The InputPrompt placeholder text must provide *complementary* guidance, never repeat what the header banner already shows. The header banner is the single source of truth for @agent routing and /help discovery. Placeholder hints should surface lesser-known features (tab completion, history navigation, utility commands). -**Why:** Two elements showing "Type @agent or /help" simultaneously creates visual noise and a confusing UX. One consistent prompt style throughout the session. -**Impact:** `getHintText()` in InputPrompt.tsx now has two tiers instead of three. Any future prompt hints should check the header banner first to avoid duplication. - -### 2026-03-02: Paste detection via debounce in InputPrompt -**By:** Kovash (REPL Expert) -**Date:** 2026-03-02 -**What:** InputPrompt uses a 10ms debounce on `key.return` to distinguish paste from intentional Enter. If more input arrives within 10ms → paste detected → newline preserved. If timer fires without input → real Enter → submit. A `valueRef` (React ref) mirrors mutations synchronously since closure-captured `value` is stale during rapid `useInput` calls. In disabled state, `key.return` appends `\n` to buffer instead of being ignored. -**Why:** Multi-line paste was garbled because `useInput` fires per-character and `key.return` triggered immediate submission. -**Impact:** 10ms delay on single-line submit is imperceptible. UX: multi-line paste preserved. Testing: Hockney should verify paste scenarios use `jest.useFakeTimers()` or equivalent. Future: if Ink adds native bracketed-paste support, debounce can be replaced. - -### 2026-03-01: First-run init messaging — single source of truth -**By:** Kovash (REPL & Interactive Shell) -**Date:** 2026-03-01 -**Issue:** #625 -**What:** When no roster exists, only the header banner tells the user about `squad init` / `/init`. The `firstRunElement` block returns `null` for the empty-roster case instead of showing a duplicate message. `firstRunElement` is reserved for the "Your squad is assembled" onboarding when a roster already exists. -**Why:** Two competing UI elements both said "run squad init" — visual noise that confuses the information hierarchy. Banner is persistent and visible; it owns the no-roster guidance. `firstRunElement` owns the roster-present first-run experience. -**Impact:** App.tsx only. No API or prop changes. Banner text reworded to prioritize `/init` (in-shell path) over exit-and-run. - -### 2026-03-01: NODE_NO_WARNINGS for subprocess warning suppression -**By:** Cheritto (TUI Engineer) -**Date:** 2026-03-01 -**Issue:** #624 -**What:** `process.env.NODE_NO_WARNINGS = '1'` is set as the first executable line in `cli-entry.ts` (line 2, after shebang). This supplements the existing `process.emitWarning` override. -**Why:** The Copilot SDK spawns child processes that inherit environment variables but NOT in-process monkey-patches like `process.emitWarning` overrides. `NODE_NO_WARNINGS=1` is the Node.js-native mechanism for suppressing warnings across an entire process tree. Without it, `ExperimentalWarning` messages (e.g., SQLite) leak into the terminal via the SDK's subprocess stderr forwarding. -**Pattern:** When suppressing Node.js warnings, use BOTH: (1) `process.env.NODE_NO_WARNINGS = '1'` — covers child processes (env var inheritance); (2) `process.emitWarning` override — covers main process (belt-and-suspenders). -**Impact:** Eliminates `ExperimentalWarning` noise in terminal for all Squad CLI users, including when the Copilot SDK spawns subprocesses. - -### 2026-03-01: No content suppression based on terminal width -**By:** Cheritto (TUI Engineer) -**Date:** 2026-03-01 -**What:** Terminal width tiers (compact ≤60, standard, wide ≥100) may adjust *layout* (e.g., wrapping, column arrangement) but must NOT suppress or truncate *content*. Every piece of information shown at 120 columns must also be shown at 40 columns. -**Why:** Users can scroll. Hiding roster names, spacing, help text, or routing hints on narrow terminals removes information the user needs. Layout adapts to width; content does not. -**Convention:** `compact` variable may be used for layout decisions (flex direction, column vs. row) but must NOT gate visibility of text, spacing, or UI sections. `wide` may add supplementary content but narrow must not remove it. - -### 2026-03-01: Multi-line user message rendering pattern -**By:** Cheritto (TUI Engineer) -**Date:** 2026-03-01 -**What:** Multi-line user messages in the Static scrollback use `split('\n')` with a column layout: first line gets the `❯` prefix, subsequent lines get `paddingLeft={2}` for alignment. -**Why:** Ink's horizontal `` layout doesn't handle embedded `\n` in `` children predictably when siblings exist. Explicit line splitting with column flex direction gives deterministic multi-line rendering. -**Impact:** Any future changes to user message prefix width must update the `paddingLeft={2}` on continuation lines to match. - -### 2026-03-01: Elapsed time display — inline after message content -**By:** Cheritto (TUI Engineer) -**Date:** 2026-03-01 -**Issue:** #605 -**What:** Elapsed time annotations on completed agent messages are always rendered inline after the message content as `(X.Xs)` in dimColor. This applies to the Static scrollback block in App.tsx, which is the canonical render path for all completed messages. -**Why:** After the Static scrollback refactor, MessageStream receives `messages=[]` and only renders live streaming content. The duration code in MessageStream was dead. Moving duration display into the Static block ensures it always appears consistently. -**Convention:** `formatDuration()` from MessageStream.tsx is the shared formatter. Format is `Xms` for <1s, `X.Xs` for ≥1s. Always inline, always dimColor, always after content text. - -### 2026-03-01: Banner usage line separator convention -**By:** Cheritto (TUI Engineer) -**Date:** 2026-03-01 -**What:** Banner hint/usage lines use middle dot `·` as inline separator. Init messages use single CTA (no dual-path instructions). -**Why:** Consistent visual rhythm. Middle dot is lighter than em-dash or hyphen for inline command lists. Single CTA reduces cognitive load for new users. -**Impact:** App.tsx headerElement. Future banner copy should follow same separator and single-CTA pattern. - -### 2026-03-02: REPL casting engine design -**By:** Fenster (Core Dev) -**Date:** 2026-03-02 -**Status:** Implemented -**Issue:** #638 -**What:** Created `packages/squad-cli/src/cli/core/cast.ts` as a self-contained casting engine with four exports: -1. `parseCastResponse()` — parses the `INIT_TEAM:` format from coordinator output -2. `createTeam()` — scaffolds all `.squad/agents/` directories, writes charters, updates team.md and routing.md, writes casting state JSON -3. `roleToEmoji()` — maps role strings to emoji, reusable across the CLI -4. `formatCastSummary()` — renders a padded roster summary for terminal display - -Scribe and Ralph are always injected if missing from the proposal. Casting state is written to `.squad/casting/` (registry.json, history.json, policy.json). -**Why:** Enables coordinator to propose and create teams from within the REPL session after `squad init`. -**Implications:** - -### 2026-03-02: Beta → Origin Migration: Version Path (v0.5.4 → v0.8.17) - -**By:** Kobayashi (Git & Release) -**Date:** 2026-03-02 -**Context:** Analyzed migration from beta repo (`bradygaster/squad`, v0.5.4) to origin repo (`bradygaster/squad-pr`, v0.8.18-preview). Version gap spans 0.6.x, 0.7.x, 0.8.0–0.8.16 (internal origin development only). - -**What:** Beta will jump directly from v0.5.4 to v0.8.17 (skip all intermediate versions). Rationale: -1. **Semantic versioning allows gaps** — version numbers are labels, not counters -2. **Users care about features, not numbers** — comprehensive changelog is more valuable than version sequence -3. **Simplicity reduces risk** — single migration release is easier to execute and communicate -4. **Precedent exists** — major refactors/rewrites commonly skip versions (Angular 2→4, etc) - -**Risks & Mitigations:** -- Risk: Version jump confuses users. Mitigation: Clear release notes explaining the gap + comprehensive changelog -- Risk: Intermediate versions were never public (no user expectations). Mitigation: This is actually a benefit — no backfill needed - -**Impact:** After merge, beta repo version jumps from v0.5.4 to v0.8.17. All intermediate work is included in the 0.8.17 release. Next release after v0.8.17 may be v0.8.18 or v0.9.0 (team decision post-merge). - -**Why:** Avoids maintenance burden of backfilling 12+ fake versions. Users get complete feature set in one migration release. - -### 2026-03-02: Beta → Origin Migration: Package Naming - -**By:** Kobayashi (Git & Release) -**Date:** 2026-03-02 - -**What:** Deprecate `@bradygaster/create-squad` (beta's package name). All future releases use: -- `@bradygaster/squad-cli` (user-facing CLI) -- `@bradygaster/squad-sdk` (programmatic SDK for integrations) - -**Why:** Origin's naming is more accurate and supports independent versioning if needed. Monorepo structure benefits from clear package separation. - -**Action:** When v0.8.17 is ready to publish, release a final version of `@bradygaster/create-squad` with deprecation notice: "This package has been renamed to @bradygaster/squad-cli. Install with: npm install -g @bradygaster/squad-cli" - -**Impact:** Package ecosystem clarity. No breaking change for users upgrading (CLI handles detection and warnings). - -### 2026-03-02: Beta → Origin Migration: Retroactive v0.8.17 Tag - -**By:** Kobayashi (Git & Release) -**Date:** 2026-03-02 - -**What:** Retroactively tag commit `5b57476` ("chore(release): prep v0.8.16 for npm publish") as v0.8.17. This commit and v0.8.16 have identical code. - -**Rationale:** -- Commit `6fdf9d5` jumped directly to v0.8.17-preview (no v0.8.17 release tag exists) -- Commit `87e4f1c` bumps to v0.8.18-preview "after 0.8.17 release" (implying v0.8.17 was released) -- Retroactive tagging is less disruptive than creating a new prep commit and rebasing - -**Action:** When banana gate clears, tag origin commit `5b57476` as v0.8.17. - -**Why:** Completes the missing link in origin's tag history. Indicates to users which commit was released as v0.8.17. - -### 2026-03-02: npx Distribution Migration: Error-Only Shim Strategy - -**By:** Rabin (Distribution) -**Date:** 2026-03-02 -**Context:** Beta repo currently uses GitHub-native distribution (`npx github:bradygaster/squad`). Origin uses npm distribution (`npm install -g @bradygaster/squad-cli`). After merge, old path will break. - -**Problem:** After migration, `npx github:bradygaster/squad` fails (root `package.json` has no `bin` entry). Users hitting the old path get cryptic npm error. - -**Solution — Option 5 (Error-only shim):** -1. Add root `bin` entry pointing to `cli.js` -2. `cli.js` detects GitHub-native invocation and prints **bold, clear error** with migration instructions -3. Exit with code 1 (fail fast, no hidden redirection) - -**Implementation:** -```json -{ - "bin": { - "squad": "./cli.js" - } -} -``` - -Update `cli.js` to print error message with new install instructions: -``` -npm install -g @bradygaster/squad-cli -``` - -**Pros:** -- ✅ Clear, actionable error message (not cryptic npm error) -- ✅ Aligns with npm-only team decision (no perpetuation of GitHub-native path) -- ✅ Low maintenance burden (simple error script, no complex shim) -- ✅ Can be removed in v1.0.0 when beta users have migrated - -**Cons:** -- Immediate breakage (no grace period) — but users get clear guidance - -**Why This Over Others:** -- Option 1 (keep working) contradicts npm-only decision -- Option 2 (exit early) same as this, but explicit error format needed -- Option 3 (time-limited) best UX but maintenance burden -- Option 4 (just break) user-hostile without error message -- **Option 5 balances user experience + team decision** - -**Related Decision:** See 2026-02-21 decision "Distribution is npm-only (GitHub-native removed)" - -**User Impact:** -- Users running `npx github:bradygaster/squad` see bold error with `npm install -g @bradygaster/squad-cli` instruction -- Existing projects running `squad upgrade` work seamlessly (upgrade logic built-in) -- No data loss or silent breakage - -**Upgrade Path (existing beta users):** -```bash -npm install -g @bradygaster/squad-cli -cd /path/to/project -squad upgrade -squad upgrade --migrate-directory # Optional: .ai-team/ → .squad/ -``` - -**Why:** Rabin's principle: "If users have to think about installation, install is broken." A clear error message respects users better than a cryptic npm error. - -### 2026-02-28: Init flow reliability — proposal-first before code - -**By:** Keaton (Lead) -**Date:** 2026-02-28 -**What:** Init/onboarding fixes require a proposal review before implementation. Proposal at `docs/proposals/reliable-init-flow.md`. Two confirmed bugs (race condition in auto-cast, Ctrl+C doesn't abort init session) plus UX gaps (empty-roster messaging, `/init` no-op). P0 bugs are surgical — don't expand scope. -**Why:** Four PRs (#637–#640) patched init iteratively without a unified design. Before writing more patches, the team needs to agree on the golden path. Proposal-first (per team decision 2026-02-21). -**Impact:** Blocks init-related code changes until Brady reviews the proposal. -- Kovash (REPL): Can call `parseCastResponse` + `createTeam` to wire up casting flow in shell dispatcher -- Verbal (Prompts): INIT_TEAM format is now the contract — coordinator prompt should emit this -- Hockney (Tests): cast.ts needs unit tests for parser edge cases, emoji mapping, file creation - -### 2026-03-02: REPL empty-roster gate — dual check pattern -**By:** Fenster (Core Dev) -**Date:** 2026-03-02 -**What:** REPL dispatch is now gated on *populated* roster, not just team.md existence. `hasRosterEntries()` in `coordinator.ts` checks for table data rows in the `## Members` section. Two layers: `handleDispatch` blocks with user guidance, `buildCoordinatorPrompt` injects refusal prompt. -**Why:** After `squad init`, team.md exists but is empty. Coordinator received a "route to agents" prompt with no agents listed, causing silent generic AI behavior. Users never got told to cast their team. -**Convention:** Post-init message references "Copilot session" (works in VS Code, github.com, and Copilot CLI). The `/init` slash command provides same guidance inside REPL. -**Impact:** All agents — if you modify the `## Members` table format in team.md templates, update `hasRosterEntries()` to match. - -### 2026-03-02: Connection promise dedup in SquadClient -**By:** Fenster (Core Dev) -**Date:** 2026-03-02 -**What:** `SquadClient.connect()` now uses a promise dedup pattern — concurrent callers share the same in-flight `connectPromise` instead of throwing "Connection already in progress". -**Why:** Eager warm-up and auto-cast both call `createSession()` → `connect()` at REPL startup, racing on the connection. The throw crashed auto-cast every time. -**Impact:** `packages/squad-sdk/src/adapter/client.ts` only. No API surface change. - -### 2026-03-01: CLI UI Polish PRD — Alpha Shipment Over Perfection -**By:** Keaton (Lead) -**Date:** 2026-03-01 -**Context:** Team image review identified 20+ UX issues ranging from P0 blockers to P3 future polish - -**What:** CLI UI polish follows pragmatic alpha shipment strategy: fix P0 blockers + P1 quick wins, defer grand redesign to post-alpha. 20 discrete issues created with clear priority tiers (P0/P1/P2/P3). - -**Why:** Brady confirmed "alpha-level shipment acceptable — no grand redesign today." Team converged on 3 P0 blockers (blank screens, static spinner, missing alpha banner) that would embarrass us vs. 15+ polish items that can iterate post-ship. - -**Trade-off:** Shipping with known layout quirks (input positioning, responsive tables) rather than blocking on 1-2 week TUI refactor. Users expect alpha rough edges IF we warn them upfront. - -**Priority Rationale:** -- **P0 (must fix):** User-facing broken states — blank screens, no feedback, looks crashed -- **P1 (quick wins):** Accessibility (contrast), usability (copy clarity), visual hierarchy — high ROI, low effort -- **P2 (next sprint):** Layout architecture, responsive design — important but alpha-acceptable if missing -- **P3 (future):** Fixed bottom input, alt screen buffer, creative spinner — delightful but not blockers - -**Architectural Implications:** -1. **Quick win discovered:** App.tsx overrides ThinkingIndicator's native rotation with static hints (~5 line fix) -2. **Debt acknowledged:** 3 separate separator implementations need consolidation (P2 work) -3. **Layout strategy:** Ink's layout model fights bottom-anchored input. Alt screen buffer is the real solution (P3 deferred). -4. **Issue granularity:** 20 discrete issues vs. 1 monolithic "fix UI" epic — enables parallel work by Cheritto (11 issues), Kovash (4), Redfoot (2), Fenster (1), Marquez (1 review) - -**Success Gate:** "Brady says it doesn't embarrass us" — qualitative gate appropriate for alpha software. Quantitative gates: zero blank screens >500ms, contrast ≥4.5:1, spinner rotates every 3s. - -**Impact:** -- **Team routing:** Clear ownership — Cheritto (TUI), Kovash (shell), Redfoot (design), Marquez (UX review) -- **Timeline transparency:** P0 (1-2 days) → P1 (2-3 days) → P2 (1 week) — alpha ship when P0+P1 done -- **Expectation management:** Out of Scope section explicitly lists grand redesign, advanced features, WCAG audit — prevents scope creep - -### 2026-03-01: Cast confirmation required for freeform REPL casts -**By:** Fenster (Core Dev) -**Date:** 2026-03-01 -**Context:** P2 from Keaton's reliable-init-flow proposal - -**What:** When a user types a freeform message in the REPL and the roster is empty, the cast proposal is shown and the user must confirm (y/yes) before team files are created. Auto-cast from .init-prompt and /init "prompt" skip confirmation since the user explicitly provided the prompt. - -**Why:** Prevents garbage casts from vague or accidental first messages (e.g., "hello", "what can you do?"). Matches the squad.agent.md Init Mode pattern where confirmation is required before creating team files. - -**Pattern:** pendingCastConfirmation state in shell/index.ts. handleDispatch intercepts y/n at the top before normal routing. inalizeCast() is the shared helper for both auto-confirmed and user-confirmed paths. - -### 2026-03-01: Expose setProcessing on ShellApi -**By:** Kovash (REPL Expert) -**Date:** 2026-03-01 -**Context:** Init auto-cast path bypassed App.tsx handleSubmit, so processing state was never set — spinner invisible during team casting. - -**What:** ShellApi now exposes setProcessing(processing: boolean) so that any code path in index.ts that triggers async work outside of handleSubmit can properly bracket it with processing state. This enables ThinkingIndicator and InputPrompt spinner without duplicating React state management. - -**Rule:** Any new async dispatch path in index.ts that bypasses handleSubmit **must** call shellApi.setProcessing(true) before the async work and shellApi.setProcessing(false) in a inally block covering all exit paths. - -**Files Changed:** -- packages/squad-cli/src/cli/shell/components/App.tsx — added setProcessing to ShellApi interface + wired in onReady -- packages/squad-cli/src/cli/shell/index.ts — added setProcessing calls in handleInitCast (entry, pendingCastConfirmation return, finally block) - -### 2026-03-01T20:13:16Z: User directives — UI polish and shipping priorities -**By:** Brady (via Copilot) -**Date:** 2026-03-01 -Ampersands (&) are prohibited in user-facing documentation headings and body text, per Microsoft Style Guide. - -**Rule:** Use "and" instead. - -**Why:** Microsoft Style Guide prioritizes clarity and professionalism. Ampersands feel informal and reduce accessibility. - -**Exceptions:** -- Brand names (AT&T, Barnes & Noble) -- UI element names matching exact product text -- Code samples and technical syntax -- Established product naming conventions - -**Scope:** Applies to docs pages, README files, blog posts, community-facing content. Internal files (.squad/** memory files, decision docs, agent history) have flexibility. - -**Reference:** https://learn.microsoft.com/en-us/style-guide/punctuation/ampersands - - ---- - -## Adoption & Community - -### `.squad/` Directory Scope — Owner Directive -**By:** Brady (project owner, PR #326 review) -**Date:** 2026-03-10 - -**Directive:** The `.squad/` directory is **reserved for team state only** — roster, routing, decisions, agent histories, casting, and orchestration logs. Non-team data (adoption tracking, community metrics, reports) must NOT live in `.squad/`. Use `.github/` for GitHub platform integration or `docs/` for documentation artifacts. - -**Source:** [PR #326 comment](https://github.com/bradygaster/squad/pull/326#issuecomment-4029193833) - ---- - -### No Individual Repo Listing Without Consent — Owner Directive -**By:** Brady (project owner, PR #326 review) -**Date:** 2026-03-10 - -**Directive:** Growth metrics must report **aggregate numbers only** (e.g., "78+ repositories found via GitHub code search") — never name or link to individual community repos without explicit opt-in consent. The monitoring script and GitHub Action concepts are approved, but any public showcase or tracking list that identifies specific repos is blocked until a community consent plan exists. - -**Source:** [PR #326 comment](https://github.com/bradygaster/squad/pull/326#issuecomment-4029222967) - ---- - -### Adoption Tracking — Opt-In Architecture -**By:** Flight (implementing Brady's directives above) -**Date:** 2026-03-09 - -Privacy-first adoption monitoring using a three-tier system: - -**Tier 1: Aggregate monitoring (SHIPPED)** -- GitHub Action + monitoring script collect metrics -- Reports moved to `.github/adoption/reports/{YYYY-MM-DD}.md` -- Reports show ONLY aggregate numbers (no individual repo names): - - "78+ repositories found via code search" - - Total stars/forks across all discovered repos - - npm weekly downloads - -**Tier 2: Opt-in registry (DESIGN NEXT)** -- Create `SHOWCASE.md` in repo root with submission instructions -- Opted-in projects listed in `.github/adoption/registry.json` -- Monitoring script reads registry, reports only on opted-in repos - -**Tier 3: Public showcase (LAUNCH LATER)** -- `docs/community/built-with-squad.md` shows opted-in projects only -- README link added when ≥5 opted-in projects exist - -**Rationale:** -- Aggregate metrics safe (public code search results) -- Individual projects only listed with explicit owner consent -- Prevents surprise listings, respects privacy -- Incremental rollout maintains team capacity - -**Implementation (PR #326):** -- ✅ Moved `.squad/adoption/` → `.github/adoption/` -- ✅ Stripped tracking.md to aggregate-only metrics -- ✅ Removed individual repo names, URLs, metadata -- ✅ Updated adoption-report.yml and scripts/adoption-monitor.mjs -- ✅ Removed "Built with Squad" showcase link from README (Tier 2 feature) - ---- - -### Adoption Tracking Location & Privacy -**By:** EECOM -**Date:** 2026-03-10 - -Implementation decision confirming Tier 1 adoption tracking changes. - -**What:** Move adoption tracking from `.squad/adoption/` to `.github/adoption/` - -**Why:** -1. **GitHub integration:** `.github/adoption/` aligns with GitHub convention (workflows, CODEOWNERS, issue templates) -2. **Privacy-first:** Aggregate metrics only; defer individual repo showcase to Tier 2 (opt-in) -3. **Clear separation:** `.squad/` = team internal; `.github/` = GitHub platform integration -4. **Future-proof:** When Tier 2 opt-in launches, `.github/adoption/` is the natural home - -**Impact:** -- GitHub Action reports write to `.github/adoption/reports/{YYYY-MM-DD}.md` -- No individual repo information published until Tier 2 -- Monitoring continues collecting aggregate metrics via public APIs -- Team sees trends without publishing sensitive adoption data - ---- - -### Append-Only File Governance -**By:** Flight -**Date:** 2026-03-09 - -Feature branches must never modify append-only team state files except to append new content. - -**What:** If a PR diff shows deletions in `.squad/agents/*/history.md` or `.squad/decisions.md`, the PR is blocked until deletions are reverted. - -**Why:** Session state drift causes agents to reset append-only files to stale branch state, destroying team knowledge. PR #326 deleted entire history files and trimmed ~75 lines of decisions, causing data loss. - -**Enforcement:** Code review + future CI check candidate. - ---- - -### Documentation Style: No Ampersands -**By:** PAO -**Date:** 2026-03-09 - -Ampersands (&) are prohibited in user-facing documentation headings and body text, per Microsoft Style Guide. - -**Rule:** Use "and" instead. - -**Why:** Microsoft Style Guide prioritizes clarity and professionalism. Ampersands feel informal and reduce accessibility. - -**Exceptions:** -- Brand names (AT&T, Barnes & Noble) -- UI element names matching exact product text -- Code samples and technical syntax -- Established product naming conventions - -**Scope:** Applies to docs pages, README files, blog posts, community-facing content. Internal files (.squad/** memory files, decision docs, agent history) have flexibility. - -**Reference:** https://learn.microsoft.com/en-us/style-guide/punctuation/ampersands - ---- - -## Sprint Directives - -### Secret handling — agents must never persist secrets -**By:** RETRO (formerly Baer), v0.8.24 -**What:** Agents must NEVER write secrets, API keys, tokens, or credentials into conversational history, commit messages, logs, or any persisted file. Acknowledge receipt without echoing values. -**Why:** Secrets in logs or history are a security incident waiting to happen. - - ---- - -## Squad Ecosystem Boundaries & Content Governance - -### Squad Docs vs Squad IRL Boundary (consolidated) -**By:** PAO (via Copilot), Flight -**Date:** 2026-03-10 -**Status:** Active pattern for all documentation PRs - -**Litmus test:** If Squad doesn't ship the code or configuration, the documentation belongs in Squad IRL, not the Squad framework docs. - -**Categories:** - -1. **Squad docs** — Features Squad ships (routing, charters, reviewer protocol, config, behavior) -2. **Squad IRL** — Infrastructure around Squad (webhooks, deployment patterns, logging, external tools, operational patterns) -3. **Gray area:** Platform features (GitHub Issue Templates) → Squad docs if framed as "how to configure X for Squad" - -**Examples applied (PR #331):** - -| Document | Decision | Reason | -|----------|----------|--------| -| ralph-operations.md | DELETE → IRL | Infrastructure (deployment, logging) around Squad, not Squad itself | -| proactive-communication.md | DELETE → IRL | External tools (Teams, WorkIQ) configured by community, not built into Squad | -| issue-templates.md | KEEP, reframe | GitHub platform feature; clarify scope: "a GitHub feature configured for Squad" | -| reviewer-protocol.md (Trust Levels) | KEEP | Documents user choice spectrum within Squad's existing review system | - -**Enforcement:** Code review + reframe pattern ("GitHub provides X. Here's how to configure it for Squad's needs."). Mark suspicious deletions for restore (append-only governance). - -**Future use:** Apply this pattern to all documentation PRs to maintain clean boundaries. - - ---- - -### Content Triage Skill — External Content Integration -**By:** Flight -**Date:** 2026-03-10 -**Status:** Skill created at `.squad/skills/content-triage/SKILL.md` - -**Pattern:** External content (blog posts, sample repos, videos, conference talks) that helps Squad adoption must be triaged using the "Squad Ships It" boundary heuristic before incorporation. - -**Workflow:** -1. Triggered by `content-triage` label or external content reference in issue -2. Flight performs boundary analysis -3. Sub-issues generated for Squad-ownable content extraction (PAO responsibility) -4. FIDO verifies docs-test sync on extracted content -5. Scribe manages IRL references in `.github/irl/references.yml` (YAML schema) - -**Label convention:** `content:blog`, `content:sample`, `content:video`, `content:talk` - -**Why:** Pattern from PR #331 (Tamir Dresher blog) shows parallel extraction of Squad-ownable patterns (scenario guides, reviewer protocol) and infrastructure patterns (Ralph ops, proactive comms). Without clear boundary, teams pollute Squad docs with operational content or miss valuable patterns that should be generalized. - -**Impact:** Enables community content to accelerate Squad adoption without polluting core docs. Flight's boundary analysis becomes reusable decision framework. Prevents scope creep as adoption grows. - - ---- - -### PR #331 Quality Gate — Test Assertion Sync -**By:** FIDO (Quality Owner) -**Date:** 2026-03-10 -**Status:** 🟢 CLEARED (test fix applied, commit 6599db6) - -**What was blocked:** Merge blocked on stale test assertions in `test/docs-build.test.ts`. - -**Critical violations resolved:** -1. `EXPECTED_SCENARIOS` array stale (7 vs 25 disk files) — ✅ Updated to 25 entries -2. `EXPECTED_FEATURES` constant undefined (32 feature files) — ✅ Created array with 32 entries -3. Test assertion incomplete — ✅ Updated to validate features section - -**Why this matters:** Stale assertions that don't reflect filesystem state cause silent test skips. Regression: If someone deletes a scenario file, the test won't catch it. CI passing doesn't guarantee test coverage — only that the test didn't crash. - -**Lessons:** -- Test arrays must be refreshed when filesystem content changes -- Incomplete commits break the test-reality sync contract -- FIDO's charter: When adding test count assertions, must keep in sync with disk state - -**Outcome:** Test suite: 6/6 passing. Assertions synced to filesystem. No regression risk from stale assertions. - - ---- - -### Communication Patterns and PR Trust Models -**By:** PAO -**Date:** 2026-03-10 -**Status:** Documented in features/reviewer-protocol.md (trust levels section) and scenarios/proactive-communication.md (infrastructure pattern) - -**Decision:** Document emerging patterns in real Squad usage: proactive communication loops and PR review trust spectrum. - -**Components:** - -1. **Proactive communication patterns** — Outbound notifications (Teams webhooks), inbound scanning (Teams/email for work items), two-way feedback loop connecting external sources to Squad workflow - -2. **PR trust levels spectrum:** - - **Full review** (default for team repos) — All PRs require human review - - **Selective review** (personal projects with patterns) — Domain-expert or routine PRs can auto-merge - - **Self-managing** (solo personal repos only) — PRs auto-merge; Ralph's work monitoring provides retroactive visibility - -**Why:** Ralph 24/7 autonomous deployment creates an awareness gap — how does the human stay informed? Outbound notifications solve visibility. Inbound scanning solves "work lives in multiple places." Trust levels let users tune oversight to their context (full review for team repos, selective for personal projects, self-managing for solo work only). - -**Important caveat:** Self-managing ≠ unmonitored; Ralph's work monitoring and notifications provide retroactive visibility. - -**Anti-spam expectations:** Don't spam yourself outbound (notification fatigue), don't spam GitHub inbound (volume controls). - - ---- - -### Remote Squad Access — Phased Rollout (Proposed) -**By:** Flight -**Date:** 2026-03-10 -**Status:** Proposed — awaits proposal document in `docs/proposals/remote-squad-access.md` - -**Context:** Squad currently requires a local clone to answer questions. Users want remote access from mobile, browser, or different machine without checking out repo. - -**Phases:** - -**Phase 1: GitHub Discussions Bot (Ship First)** -- Surface: GitHub Discussions -- Trigger: `/squad` command or `@squad` mention -- Context: GitHub Actions workflow checks out repo → full `.squad/` state -- Response: Bot replies to thread -- Feasibility: 1 day -- Why first: Easy to build, zero hosting, respects repo privacy, async Q&A, immediately useful - -**Phase 2: GitHub Copilot Extension (High Value)** -- Surface: GitHub Copilot chat (VS Code, CLI, web, mobile) -- Trigger: `/squad ask {question}` in any Copilot client -- Context: Extension fetches `.squad/` files via GitHub API (no clone) -- Response: Answer inline in Copilot -- Feasibility: 1 week -- Why second: Works everywhere Copilot exists, instant response, natural UX - -**Phase 3: Slack/Teams Bot (Enterprise Value)** -- Surface: Slack or Teams channel -- Trigger: `@squad` mention in channel -- Context: Webhook fetches `.squad/` via GitHub API -- Response: Bot replies in thread -- Feasibility: 2 weeks -- Why third: Enterprise teams live in chat; high value for companies using Squad - -**Constraint:** Squad's intelligence lives in `.squad/` (roster, routing, decisions, histories). Any remote solution must solve context access. GitHub Actions workflows provide checkout for free. Copilot Extension and chat bots use GitHub API to fetch files. - -**Implementation:** Before Phase 1 execution, write proposal document. New CLI command: `squad answer --context discussions --question "..."`. New workflow: `.github/workflows/squad-answer.yml`. - -**Privacy:** All approaches respect repo visibility or require authentication. Most teams want private by default. - -### Test assertion discipline — mandatory -**By:** FIDO (formerly Hockney), v0.8.24 -**What:** All code agents must update tests when changing APIs. FIDO has PR blocking authority on quality grounds. -**Why:** APIs changed without test updates caused CI failures and blocked external contributors. - -### Docs-test sync — mandatory -**By:** PAO (formerly McManus), v0.8.24 -**What:** New docs pages require corresponding test assertion updates in the same commit. -**Why:** Stale test assertions block CI and frustrate contributors. - -### Contributor recognition — every release -**By:** PAO, v0.8.24 -**What:** Each release includes an update to the Contributors Guide page. -**Why:** No contribution goes unappreciated. - -### API-test sync cross-check -**By:** FIDO + Booster, v0.8.24 -**What:** Booster adds CI check for stale test assertions. FIDO enforces via PR review. -**Why:** Prevents the pattern of APIs changing without test updates. - -### Doc-impact review — every PR -**By:** PAO, v0.8.25 -**What:** Every PR must be evaluated for documentation impact. PAO reviews PRs for missing or outdated docs. -**Why:** Code changes without doc updates lead to stale guides and confused users. - - ---- - -## Release v0.8.24 - -### CLI Packaging Smoke Test: Release Gate Decision -**By:** FIDO, v0.8.24 -**Date:** 2026-03-08 - -The CLI packaging smoke test is APPROVED as the quality gate for npm releases. - -**What:** -1. Text box preference: bottom-aligned, squared off (like Copilot CLI / Claude CLI) — future work, not today -2. Alpha-level shipment acceptable for now — no grand UI redesign today -3. CLI must show "experimental, please file issues" banner -4. Spinner/wait messages should rotate every ~3 seconds — use codebase facts, project trivia, vulnerability info, or creative "-ing" words. Never just spin silently. -5. Use wait time to inform or entertain users - -**Why:** User request — captured for team memory and crash recovery - -### 2026-03-01T20:16:00Z: User directive — CLI timeout too low -**By:** Brady (via Copilot) -**Date:** 2026-03-01 - -**What:** The CLI timeout is set too low — Brady tried using Squad CLI in this repo and it didn't work well. Timeout needs to be increased. Not urgent but should be captured as a CLI improvement opportunity. - -**Why:** User request — captured for team memory and PRD inclusion - -### 2026-03-01: Multi-Squad Storage & Resolution Design -**By:** Keaton (Lead) -**What:** -- New directory structure: ~/.config/squad/squads/{name}/.squad/ with ~/.config/squad/config.json for registry -- Keep -esolveGlobalSquadPath() unchanged; add -esolveNamedSquadPath(name?: string) and listPersonalSquads() on top -- Auto-migration: existing single personal squad moves to squads/default/ on first run -- Resolution priority: explicit (CLI flag) > project config > env var > git remote mapping > path mapping > default -- Global config.json schema: { version, defaultSquad, squads, mappings } - -**Why:** -- squads/ container avoids collisions with existing files at global root -- Backward-compatible: legacy layout detected and auto-migrated; existing code continues to work -- Clean separation: global config lives alongside squads, not inside any one squad -- Resolution chain enables flexible mapping without breaking existing workflows - -### 2026-03-01: Multi-Squad SDK Functions -**By:** Kujan (SDK Expert) -**What:** -- New SDK exports: -esolveNamedSquadPath(), listSquads(), createSquad(), deleteSquad(), switchSquad(), -esolveSquadForProject() -- New type: SquadEntry { name, path, isDefault, createdAt } -- squads.json registry (separate file, not config.json) with squad metadata and mappings -- SquadDirConfig v2 addition: optional personalSquad?: string field (v1 configs unaffected) -- Consult mode updated: setupConsultMode(options?: { squad?: string }) with explicit selection or auto-resolution - -**Why:** -- Lazy migration with fallback chain ensures zero breaking changes to existing users -- Separate squads.json is single source of truth for routing; keeps project config focused -- Version handling allows incremental adoption; v1 configs work unchanged -- SDK resolution functions can be called from CLI and library code without duplication - -### 2026-03-01: Multi-Squad CLI Commands & REPL -**By:** Kovash (REPL) -**What:** -- New commands: squad list, squad create , squad switch , squad delete -- Modified commands: squad consult --squad=, squad extract --squad=, squad init --global --name= -- Interactive picker for squad selection: arrow keys (↑/↓), Enter to confirm, Ctrl+C to cancel -- REPL integration: /squad and /squads slash commands with riggerSquadReload signal -- .active file stores current active squad name (plain text) -- Status command enhanced to show active squad and squad list - -**Why:** -- Picker only shows when needed (multiple squads) and TTY available; non-TTY gracefully uses active squad -- Slash commands follow existing pattern (/init, /agents, etc.); seamless REPL integration -- .active file is simple and atomic; suitable for concurrent CLI access -- Squad deletion safety: cannot delete active squad; requires confirmation - -### 2026-03-01: Multi-Squad UX & Interaction Design -**By:** Marquez (UX Designer) -**What:** -- Visual indicator: current squad marked with ●, others with ○; non-default squads tagged [switched] -- Squad name always visible in REPL header and prompt: ◆ Squad (client-acme) -- Picker interactions: ↑/↓ navigate, Enter select, Esc/Ctrl+C cancel; 5-7 squads displayed, wrap around -- Error states: clear copy with next actions (e.g., "Squad not found. Try @squad:personal." or "Run /squads to list.") -- Copy style: active verbs (Create, Switch, List), human-readable nouns (no jargon), 3-5 words per line -- Onboarding: fresh install defaults to "personal"; existing single-squad users see migration notice - -**Why:** -- Persistent context (squad name in header/prompt) prevents "Which squad am I in?" confusion -- Interactive picker is discoverable and non-blocking; minimal cognitive load -- Error messages with next actions reduce support friction -- Onboarding defaults and migration notices ensure smooth upgrade path for existing users - -# Decision: Separator component is canonical for horizontal rules - -**By:** Cheritto (TUI Engineer) -**Date:** 2026-03-02 -**Issues:** #655, #670, #671, #677 - -## What - -- All horizontal separator lines in shell components must use the shared `Separator` component (`components/Separator.tsx`), not inline `box.h.repeat()` calls. -- The `Separator` component handles terminal capability detection, box-drawing character degradation, and width computation internally. -- Information hierarchy convention: **bold** for primary CTAs (commands, actions) > normal for content > **dim** for metadata (timestamps, status, hints). -- `flexGrow` should not be used on containers that may be empty — it creates dead space in Ink layouts. - -## Why - -Duplicated separator logic was found in 3 files (App.tsx, AgentPanel.tsx, MessageStream.tsx). Consolidation to a single component prevents drift and makes it trivial to change separator style globally. The info hierarchy and whitespace conventions ensure visual consistency as new components are added. -### 2026-03-01: PR #547 Remote Control Feature — Architectural Review -**By:** Fenster -**Date:** 2026-03-01 -**PR:** #547 "Squad Remote Control - PTY mirror + devtunnel for phone access" by tamirdresher (external) - -## Context - -External contributor Tamir Dresher submitted a PR adding `squad start --tunnel` command to run Copilot in a PTY and mirror terminal output to phone/browser via WebSocket + Microsoft Dev Tunnels. - -## Architectural Question - -Is remote terminal access via devtunnel + PTY mirroring in scope for Squad v1 core? - -## Technical Assessment - -**What works:** -- RemoteBridge WebSocket server architecture is sound -- PTY mirroring approach is technically correct -- Session management dashboard is useful -- Security headers and CSP are present -- Test coverage exists (18 tests, though failing due to build issues) - -**Critical blockers:** -1. **Build broken** — TypeScript errors in `start.ts`, all tests failing -2. **Command injection vulnerability** — `execFileSync` with string interpolation in `rc-tunnel.ts` -3. **Native dependency** — `node-pty` requires C++ compiler (install friction) -4. **Windows-only effectively** — hardcoded paths, devtunnel CLI Windows-centric -5. **No cross-platform strategy** — macOS/Linux support unclear - -**Architectural concerns:** - -### 2026-03-02T23:36:00Z: Version target — v0.6.0 for public migration **[SUPERSEDED — see line 1046]** -**By:** Brady (via Copilot) -**What:** The public migration from squad-pr to squad should target v0.6.0, not v0.8.17. This overrides Kobayashi's Phase 5 Option A recommendation. The public repo (bradygaster/squad) goes from v0.5.4 → v0.6.0 — a clean minor bump. -**Why:** User directive. v0.6.0 is the logical next public version from v0.5.4. Internal version numbers (0.6.x–0.8.x) were private development milestones. -**[CORRECTION — 2026-03-03]:** This decision was REVERSED by Brady. Brady explicitly stated: "0.6.0 should NOT appear as the goal for ANY destination. I want the beta to become 0.8.17." The actual migration target is v0.8.17. See the superseding "Versioning Model: npm packages vs Public Repo Tags" decision at line 1046 which clarifies that v0.6.0 is a public repo tag only, while npm packages remain at 0.8.17. Current migration documentation correctly references v0.8.17 throughout. -1. **Not integrated with Squad runtime** — doesn't use EventBus, Coordinator, or agent orchestration. Isolated feature. -2. **Two separate modes** — PTY mode (`start.ts`) vs. ACP passthrough mode (`rc.ts`). Why both? -3. **New CLI paradigm** — "start" implies daemon/server, not interactive mirroring. Command naming collision risk. -4. **External dependency** — requires `devtunnel` CLI installed + authenticated. Not bundled, not auto-installed. -5. **Audit logs** — go to `~/.cli-tunnel/audit/` instead of `.squad/log/` (inconsistent with Squad state location). - -## Recommendation - -**Request Changes** — Do not merge until: -1. TypeScript build errors fixed -2. Command injection vulnerability patched (use array args, no interpolation) -3. Tests passing (currently 18/18 failing) -4. Cross-platform support documented or Windows-only label added -5. Architectural decision on scope: Is this core or plugin? - -**If approved as core feature:** -- Extract to plugin first, prove value, then consider core integration -- Unify PTY vs. ACP modes (pick one) -- Integrate with EventBus/Coordinator (or explain why isolated is correct) -- Rename command to `squad remote` or `squad tunnel` (avoid `start` collision) -- Move audit logs to `.squad/log/` - -**If approved as plugin:** -- This is the right path — keeps core small, proves value independently -- Still fix security issues before merge to plugin repo - -## For Brady - -You requested a runtime review. Here's the verdict: - -- **Concept is cool** — phone access to Copilot is a real use case. -- **Implementation needs work** — build broken, security issues, Windows-only. -- **Architectural fit unclear** — not in any Squad v1 PRD. No integration with agent orchestration. -- **Native dependency risk** — `node-pty` adds install friction (C++ compiler required). - -**My take:** This belongs in a plugin, not core. External contributor did solid work on the WebSocket bridge, but Squad v1 needs to ship agent orchestration first. Remote access is a nice-to-have, not a v1 must-have. - -If you want this in v1, we need a proposal (docs/proposals/) first. - -### 2026-03-02: Multi-squad test contract — squads.json schema -**By:** Hockney (Tester) -**Date:** 2026-03-02 -**Issue:** #652 - -## What - -Tests for multi-squad (PR #690) encode a specific squads.json contract: - -```typescript -interface SquadsJson { - version: 1; - defaultSquad: string; - squads: Record; -} -``` - -Squad name validation regex: `^[a-z0-9]([a-z0-9-]{0,38}[a-z0-9])?$` (kebab-case, 1-40 chars). - -## Why - -Fenster's implementation should match this schema. If the schema changes, tests need updating. Recording so the team knows the contract is encoded in tests. - -## Impact - -Fenster: Align `multi-squad.ts` types with this schema, or flag if different — Hockney will adjust tests. - -### 2026-03-02: PR #582 Review — Consult Mode Implementation -**By:** Keaton (Lead) -**Date:** 2026-03-01 -**Context:** External contributor PR from James Sturtevant (jsturtevant) - -## Decision - -**Do not merge PR #582 in its current form.** - -This is a planning document (PRD) masquerading as implementation. The PR contains: -- An excellent 854-line PRD for consult mode -- Test stubs for non-existent functions -- Zero actual implementation code -- A history entry claiming work is done (aspirational, not factual) - -## Required Actions - -1. **Extract PRD to proper location:** - - Move `.squad/identity/prd-consult-mode.md` → `docs/proposals/consult-mode.md` - - PRDs belong in proposals/, not identity/ - -2. **Close this PR with conversion label:** - - Label: "converted-to-proposal" - - Comment: Acknowledge excellent design work, explain missing implementation - -3. **Create implementation issues from PRD phases:** - - Phase 1: SDK changes (SquadDirConfig, resolution helpers) - - Phase 2: CLI command implementation - - Phase 3: Extraction workflow - - Each phase: discrete PR with actual code + tests - -4. **Architecture discussion needed before implementation:** - - How does consult mode integrate with existing sharing/ module? - - Session learnings vs agent history — conceptual model mismatch - - Remote mode (teamRoot pointer) vs copy approach — PRD contradicts itself - -## Architectural Guidance - -**What's right:** -- `consult: true` flag in config.json ✅ -- `.git/info/exclude` for git invisibility ✅ -- `git rev-parse --git-path info/exclude` for worktree compatibility ✅ -- Separate extraction command (`squad extract`) ✅ -- License risk detection (copyleft) ✅ - -**What needs rethinking:** -- Reusing `sharing/` module (history split vs learnings extraction — different domains) -- PRD flip-flops between "copy squad" and "remote mode teamRoot pointer" -- No design for how learnings are structured or extracted -- Tests before code (cart before horse) - -## Pattern Observed - -James Sturtevant is a thoughtful contributor who understands the product vision. The PRD is coherent and well-structured. This connects to his #652 issue (Multiple Personal Squads) — consult mode is a stepping stone to multi-squad workflows. - -**Recommendation:** Engage James in architecture discussion before he writes code. This feature has implications for the broader personal squad vision. Get alignment on: -1. Sharing module fit (or new consult module?) -2. Learnings structure and extraction strategy -3. Phase boundaries and deliverables - -## Why This Matters - -External contributors are engaging with Squad's architecture. We need to guide them toward shippable PRs, not just accept aspirational work. Setting clear expectations now builds trust and avoids wasted effort. - -## Files Referenced - -- `.squad/identity/prd-consult-mode.md` (PRD, should move) -- `test/consult.test.ts` (tests for non-existent code) -- `.squad/agents/fenster/history.md` (claims work done) -- `packages/squad-sdk/src/resolution.ts` (needs `consult` field, unchanged in PR) - - -### cli.js is now a thin ESM shim - -**By:** Fenster -**Date:** 2025-07 -**What:** `cli.js` at repo root is a 14-line shim that imports `./packages/squad-cli/dist/cli-entry.js`. It no longer contains bundled CLI code. The deprecation notice only displays when invoked via npm/npx. -**Why:** The old bundled cli.js was stale and missing commands added after the monorepo migration (e.g., `aspire`). A shim ensures `node cli.js` always runs the latest built CLI. -**Impact:** `node cli.js` now requires `npm run build` to have been run first (so `packages/squad-cli/dist/cli-entry.js` exists). This was already the case for any development workflow. - - -### 2026-03-02T01-09-49Z: User directive -**By:** Brady (via Copilot) -**What:** Stop distributing the package via NPX and GitHub. Only distribute via NPM from now on. Go from the public version to whatever version we're in now in the private repo. Adopt the versioning scheme from issue #692. -**Why:** User request — captured for team memory - -# Release Plan Update — npm-only Distribution & Semver Fix (#692) - -**Status:** DECIDED -**Decided by:** Kobayashi (Git & Release) -**Date:** 2026-03-01T14:22Z -**Context:** Brady's two strategic decisions on distribution and versioning - -## Decisions - -### 1. NPM-Only Distribution -- **What:** End GitHub-native distribution (`npx github:bradygaster/squad`). Install exclusively via npm registry. -- **How:** Users install via `npm install -g @bradygaster/squad-cli` (global) or `npx @bradygaster/squad-cli` (per-project). -- **Why:** Simplified distribution, centralized source of truth, standard npm tooling conventions. -- **Scope:** Affects all future releases, all external documentation, and CI/CD publish workflows. -- **Owners:** Rabin (docs), Fenster (scripts), all team members (update docs/sample references). - -### 2. Semantic Versioning Fix (#692) -- **Problem:** Versions were `X.Y.Z.N-preview` (four-part with prerelease after), which violates semver spec. -- **Solution:** Correct format is `X.Y.Z-preview.N` (prerelease identifier comes after patch, before any build metadata). -- **Examples:** - - ❌ Invalid: `0.8.6.1-preview`, `0.8.6.16-preview` - - ✅ Valid: `0.8.6-preview.1`, `0.8.6-preview.16` -- **Impact:** Affects all version strings going forward (package.json, CLI version constant, release tags). -- **Release sequence:** - 1. Pre-release: `X.Y.Z-preview.1`, `X.Y.Z-preview.2`, ... - 2. At publish: Bump to `X.Y.Z` - 3. Post-publish: Bump to `{next}-preview.1` (reset counter) - -### 3. Version Continuity -- **Transition:** Public repo ended at `0.8.5.1`. Private repo continues at `0.8.6-preview` (following semver format). -- **Rationale:** Clear break between public (stable) and private (dev) codebases while maintaining version history continuity. - -## Implementation - -- ✅ **CHANGELOG.md:** Added "Changed" section documenting distribution channel and semver fix. -- ✅ **Charter (Kobayashi):** Updated Release Versioning Sequence with corrected pattern and phase description. -- ✅ **History (Kobayashi):** Logged decision with rationale and scope. - -## Dependent Work - -- **Fenster:** Ensure `bump-build.mjs` implements X.Y.Z-preview.N pattern (not X.Y.Z.N-preview). -- **Rabin:** Update README, docs, and all install instructions to reflect npm-only distribution. -- **All:** Use corrected version format in release commits, tags, and announcements. - -## Notes - -- Zero impact on functionality — this is purely distribution and versioning cleanup. -- Merge drivers on `.squad/agents/kobayashi/history.md` ensure this decision appends safely across parallel branches. -- If questions arise about versioning during releases, refer back to Charter § Release Versioning Sequence. - -# Decision: npm-only distribution (GitHub-native removed) - -**By:** Rabin (Distribution) -**Date:** 2026-03-01 -**Requested by:** Brady - -## What Changed - -All distribution now goes through npm. The `npx github:bradygaster/squad` path has been fully removed from: -- Source code (github-dist.ts default template, install-migration.ts, init.ts) -- All 4 copies of squad.agent.md (Ralph Watch Mode commands) -- All 4 copies of squad-insider-release.yml (release notes) -- README.md, migration guides, blog posts, cookbook, installation docs -- Test assertions (bundle.test.ts) -- Rabin's own charter (flipped from "never npmjs.com" to "always npmjs.com") - -## Install Paths (the only paths) - -```bash -# Global install -npm install -g @bradygaster/squad-cli - -# Per-use (no install) -npx @bradygaster/squad-cli - -# SDK for programmatic use -npm install @bradygaster/squad-sdk -``` - -## Why - -One distribution channel means less confusion, fewer edge cases, and zero SSH-agent hang bugs. npm caching makes installs faster. Semantic versioning works properly. The root `cli.js` still exists with a deprecation notice for anyone who somehow hits the old path. - -## Impact - -- **All team members:** When writing docs or examples, use `npm install -g @bradygaster/squad-cli` or `npx @bradygaster/squad-cli`. Never reference `npx github:`. -- **CI/CD:** Insider release workflow now shows npm install commands in release notes. -- **Tests:** bundle.test.ts assertions updated to match new default template. - - -# Decision: Versioning Model — npm Packages vs Public Repo - -**Date:** 2026-03-03T02:45:00Z -**Decided by:** Kobayashi (Git & Release specialist) -**Related issues:** Migration prep, clarifying confusion between npm and public repo versions -**Status:** Active — team should reference this in all future releases - -## The Problem - -The migration process had introduced confusion about which version number applies where: -- Coordinator incorrectly bumped npm package versions to 0.6.0, creating version mismatch -- Migration checklist had npm packages publishing as 0.6.0 -- CHANGELOG treated 0.6.0 as an npm package version -- No clear distinction between "npm packages version" vs "public repo GitHub release tag" -- Risk of future mistakes during releases - -## The Model (CORRECT) - -Two distinct version numbers serve two distinct purposes: - -### 1. npm Packages: `@bradygaster/squad-cli` and `@bradygaster/squad-sdk` - -- **Follow semver cadence from current version:** Currently at 0.8.17 (published to npm) -- **Next publish after 0.8.17:** 0.8.18 (NOT 0.6.0) -- **Development versions:** Use `X.Y.Z-preview.N` format (e.g., 0.8.18-preview.1, 0.8.18-preview.2) -- **Release sequence per Kobayashi charter:** - 1. Pre-release: `X.Y.Z-preview.N` (development) - 2. At publish: Bump to `X.Y.Z` (e.g., 0.8.18), publish to npm, create GitHub release - 3. Post-publish: Immediately bump to next-preview (e.g., 0.8.19-preview.1) - -**MUST NEVER:** -- Bump npm packages down (e.g., 0.8.17 → 0.6.0) -- Confuse npm package version with public repo tag - -### 2. Public Repo (bradygaster/squad): GitHub Release Tag `v0.8.17` **[CORRECTED from v0.6.0]** - -- **Purpose:** Marks the migration release point for the public repository -- **Public repo version history:** v0.5.4 (final pre-migration) → v0.8.17 (migration release) **[CORRECTED: Originally written as v0.6.0, corrected to v0.8.17 per Brady's directive]** -- **Applied to:** The migration merge commit on beta/main -- **Same as npm versions:** v0.8.17 is BOTH the npm package version AND the public repo tag **[CORRECTED: Originally described as "separate from npm versions"]** -- **No package.json changes:** The tag is applied after the merge commit, but the version in package.json matches the tag - -## Why Two Version Numbers? **[CORRECTED: Actually ONE version number — v0.8.17 for both]** - -1. **npm packages evolve on their own cadence:** Independent development, independent release cycles (via @changesets/cli) -2. **Public repo is a release marker:** The v0.8.17 tag signals "here's the migration point" to users who clone the public repo **[CORRECTED: Same version as npm, not different]** -3. **They target different audiences:** - - npm: Users who install via `npm install -g @bradygaster/squad-cli` - - Public repo: Users who clone `bradygaster/squad` or interact with GitHub releases - **[CORRECTED: Both use v0.8.17 — the version numbers are aligned, not separate]** - -## Impact on Migration Checklist & CHANGELOG - -- **migration-checklist.md:** All references correctly use v0.8.17 for both npm packages AND public repo tag. **[CORRECTED: Line originally said "publish as 0.8.18, not 0.6.0" but actual target is 0.8.17]** -- **CHANGELOG.md:** Tracks npm package versions at 0.8.x cadence -- **Future releases:** npm packages and public repo tags use the SAME version number **[CORRECTED: Original text implied they were different]** - -## Known Issue: `scripts/bump-build.mjs` - -The auto-increment build number script (`npm run build`) can produce invalid semver for non-prerelease versions: -- `0.6.0` + auto-increment → `0.6.0.1` (invalid) -- `0.8.18-preview.1` + auto-increment → `0.8.18-preview.2` (valid) - -Since npm packages stay at 0.8.x cadence, this is not a blocker for migration. But worth noting for future patch releases. - -## Directive Merged - -Brady's directive (2026-03-03T02:16:00Z): "squad-cli and squad-sdk must NOT be bumped down to 0.6.0. They are already shipped to npm at 0.8.17." - -✅ **Incorporated:** All fixes ensure npm packages stay at 0.8.x. The v0.8.17 is used for BOTH npm packages AND public repo tag. **[CORRECTED: Original text said "v0.6.0 is public repo only" which was incorrect]** - -## Action Items for Team - -- Reference this decision when asked "what version should we release?" -- Use this model for all future releases (main project and public repo) -- Update team onboarding docs to include this versioning distinction -### No temp/memory files in repo root -**By:** Brady -**What:** No plan files, memory files, or tracking artifacts in the repository root. -**Why:** Keep the repo clean. - - ---- - -## Adoption & Community - -### `.squad/` Directory Scope — Owner Directive -**By:** Brady (project owner, PR #326 review) -**Date:** 2026-03-10 - -**Directive:** The `.squad/` directory is **reserved for team state only** — roster, routing, decisions, agent histories, casting, and orchestration logs. Non-team data (adoption tracking, community metrics, reports) must NOT live in `.squad/`. Use `.github/` for GitHub platform integration or `docs/` for documentation artifacts. - -**Source:** [PR #326 comment](https://github.com/bradygaster/squad/pull/326#issuecomment-4029193833) - ---- - -### No Individual Repo Listing Without Consent — Owner Directive -**By:** Brady (project owner, PR #326 review) -**Date:** 2026-03-10 - -**Directive:** Growth metrics must report **aggregate numbers only** (e.g., "78+ repositories found via GitHub code search") — never name or link to individual community repos without explicit opt-in consent. The monitoring script and GitHub Action concepts are approved, but any public showcase or tracking list that identifies specific repos is blocked until a community consent plan exists. - -**Source:** [PR #326 comment](https://github.com/bradygaster/squad/pull/326#issuecomment-4029222967) - ---- - -### Adoption Tracking — Opt-In Architecture -**By:** Flight (implementing Brady's directives above) -**Date:** 2026-03-09 - -Privacy-first adoption monitoring using a three-tier system: - -**Tier 1: Aggregate monitoring (SHIPPED)** -- GitHub Action + monitoring script collect metrics -- Reports moved to `.github/adoption/reports/{YYYY-MM-DD}.md` -- Reports show ONLY aggregate numbers (no individual repo names): - - "78+ repositories found via code search" - - Total stars/forks across all discovered repos - - npm weekly downloads - -**Tier 2: Opt-in registry (DESIGN NEXT)** -- Create `SHOWCASE.md` in repo root with submission instructions -- Opted-in projects listed in `.github/adoption/registry.json` -- Monitoring script reads registry, reports only on opted-in repos - -**Tier 3: Public showcase (LAUNCH LATER)** -- `docs/community/built-with-squad.md` shows opted-in projects only -- README link added when ≥5 opted-in projects exist - -**Rationale:** -- Aggregate metrics safe (public code search results) -- Individual projects only listed with explicit owner consent -- Prevents surprise listings, respects privacy -- Incremental rollout maintains team capacity - -**Implementation (PR #326):** -- ✅ Moved `.squad/adoption/` → `.github/adoption/` -- ✅ Stripped tracking.md to aggregate-only metrics -- ✅ Removed individual repo names, URLs, metadata -- ✅ Updated adoption-report.yml and scripts/adoption-monitor.mjs -- ✅ Removed "Built with Squad" showcase link from README (Tier 2 feature) - ---- - -### Adoption Tracking Location & Privacy -**By:** EECOM -**Date:** 2026-03-10 - -Implementation decision confirming Tier 1 adoption tracking changes. - -**What:** Move adoption tracking from `.squad/adoption/` to `.github/adoption/` - -**Why:** -1. **GitHub integration:** `.github/adoption/` aligns with GitHub convention (workflows, CODEOWNERS, issue templates) -2. **Privacy-first:** Aggregate metrics only; defer individual repo showcase to Tier 2 (opt-in) -3. **Clear separation:** `.squad/` = team internal; `.github/` = GitHub platform integration -4. **Future-proof:** When Tier 2 opt-in launches, `.github/adoption/` is the natural home - -**Impact:** -- GitHub Action reports write to `.github/adoption/reports/{YYYY-MM-DD}.md` -- No individual repo information published until Tier 2 -- Monitoring continues collecting aggregate metrics via public APIs -- Team sees trends without publishing sensitive adoption data - ---- - -### Append-Only File Governance -**By:** Flight -**Date:** 2026-03-09 - -Feature branches must never modify append-only team state files except to append new content. - -**What:** If a PR diff shows deletions in `.squad/agents/*/history.md` or `.squad/decisions.md`, the PR is blocked until deletions are reverted. - -**Why:** Session state drift causes agents to reset append-only files to stale branch state, destroying team knowledge. PR #326 deleted entire history files and trimmed ~75 lines of decisions, causing data loss. - -**Enforcement:** Code review + future CI check candidate. - ---- - -### Documentation Style: No Ampersands -**By:** PAO -**Date:** 2026-03-09 - -Ampersands (&) are prohibited in user-facing documentation headings and body text, per Microsoft Style Guide. - -**Rule:** Use "and" instead. - -**Why:** Microsoft Style Guide prioritizes clarity and professionalism. Ampersands feel informal and reduce accessibility. - -**Exceptions:** -- Brand names (AT&T, Barnes & Noble) -- UI element names matching exact product text -- Code samples and technical syntax -- Established product naming conventions - -**Scope:** Applies to docs pages, README files, blog posts, community-facing content. Internal files (.squad/** memory files, decision docs, agent history) have flexibility. - -**Reference:** https://learn.microsoft.com/en-us/style-guide/punctuation/ampersands - ---- - -## Sprint Directives - -### Secret handling — agents must never persist secrets -**By:** RETRO (formerly Baer), v0.8.24 -**What:** Agents must NEVER write secrets, API keys, tokens, or credentials into conversational history, commit messages, logs, or any persisted file. Acknowledge receipt without echoing values. -**Why:** Secrets in logs or history are a security incident waiting to happen. - ---- - -## Squad Ecosystem Boundaries & Content Governance - -### Squad Docs vs Squad IRL Boundary (consolidated) -**By:** PAO (via Copilot), Flight -**Date:** 2026-03-10 -**Status:** Active pattern for all documentation PRs - -**Litmus test:** If Squad doesn't ship the code or configuration, the documentation belongs in Squad IRL, not the Squad framework docs. - -**Categories:** - -1. **Squad docs** — Features Squad ships (routing, charters, reviewer protocol, config, behavior) -2. **Squad IRL** — Infrastructure around Squad (webhooks, deployment patterns, logging, external tools, operational patterns) -3. **Gray area:** Platform features (GitHub Issue Templates) → Squad docs if framed as "how to configure X for Squad" - -**Examples applied (PR #331):** - -| Document | Decision | Reason | -|----------|----------|--------| -| ralph-operations.md | DELETE → IRL | Infrastructure (deployment, logging) around Squad, not Squad itself | -| proactive-communication.md | DELETE → IRL | External tools (Teams, WorkIQ) configured by community, not built into Squad | -| issue-templates.md | KEEP, reframe | GitHub platform feature; clarify scope: "a GitHub feature configured for Squad" | -| reviewer-protocol.md (Trust Levels) | KEEP | Documents user choice spectrum within Squad's existing review system | - -**Enforcement:** Code review + reframe pattern ("GitHub provides X. Here's how to configure it for Squad's needs."). Mark suspicious deletions for restore (append-only governance). - -**Future use:** Apply this pattern to all documentation PRs to maintain clean boundaries. - ---- - -### Content Triage Skill — External Content Integration -**By:** Flight -**Date:** 2026-03-10 -**Status:** Skill created at `.squad/skills/content-triage/SKILL.md` - -**Pattern:** External content (blog posts, sample repos, videos, conference talks) that helps Squad adoption must be triaged using the "Squad Ships It" boundary heuristic before incorporation. - -**Workflow:** -1. Triggered by `content-triage` label or external content reference in issue -2. Flight performs boundary analysis -3. Sub-issues generated for Squad-ownable content extraction (PAO responsibility) -4. FIDO verifies docs-test sync on extracted content -5. Scribe manages IRL references in `.github/irl/references.yml` (YAML schema) - -**Label convention:** `content:blog`, `content:sample`, `content:video`, `content:talk` - -**Why:** Pattern from PR #331 (Tamir Dresher blog) shows parallel extraction of Squad-ownable patterns (scenario guides, reviewer protocol) and infrastructure patterns (Ralph ops, proactive comms). Without clear boundary, teams pollute Squad docs with operational content or miss valuable patterns that should be generalized. - -**Impact:** Enables community content to accelerate Squad adoption without polluting core docs. Flight's boundary analysis becomes reusable decision framework. Prevents scope creep as adoption grows. - ---- - -### PR #331 Quality Gate — Test Assertion Sync -**By:** FIDO (Quality Owner) -**Date:** 2026-03-10 -**Status:** 🟢 CLEARED (test fix applied, commit 6599db6) - -**What was blocked:** Merge blocked on stale test assertions in `test/docs-build.test.ts`. - -**Critical violations resolved:** -1. `EXPECTED_SCENARIOS` array stale (7 vs 25 disk files) — ✅ Updated to 25 entries -2. `EXPECTED_FEATURES` constant undefined (32 feature files) — ✅ Created array with 32 entries -3. Test assertion incomplete — ✅ Updated to validate features section - -**Why this matters:** Stale assertions that don't reflect filesystem state cause silent test skips. Regression: If someone deletes a scenario file, the test won't catch it. CI passing doesn't guarantee test coverage — only that the test didn't crash. - -**Lessons:** -- Test arrays must be refreshed when filesystem content changes -- Incomplete commits break the test-reality sync contract -- FIDO's charter: When adding test count assertions, must keep in sync with disk state - -**Outcome:** Test suite: 6/6 passing. Assertions synced to filesystem. No regression risk from stale assertions. - ---- - -### Communication Patterns and PR Trust Models -**By:** PAO -**Date:** 2026-03-10 -**Status:** Documented in features/reviewer-protocol.md (trust levels section) and scenarios/proactive-communication.md (infrastructure pattern) - -**Decision:** Document emerging patterns in real Squad usage: proactive communication loops and PR review trust spectrum. - -**Components:** - -1. **Proactive communication patterns** — Outbound notifications (Teams webhooks), inbound scanning (Teams/email for work items), two-way feedback loop connecting external sources to Squad workflow - -2. **PR trust levels spectrum:** - - **Full review** (default for team repos) — All PRs require human review - - **Selective review** (personal projects with patterns) — Domain-expert or routine PRs can auto-merge - - **Self-managing** (solo personal repos only) — PRs auto-merge; Ralph's work monitoring provides retroactive visibility - -**Why:** Ralph 24/7 autonomous deployment creates an awareness gap — how does the human stay informed? Outbound notifications solve visibility. Inbound scanning solves "work lives in multiple places." Trust levels let users tune oversight to their context (full review for team repos, selective for personal projects, self-managing for solo work only). - -**Important caveat:** Self-managing ≠ unmonitored; Ralph's work monitoring and notifications provide retroactive visibility. - -**Anti-spam expectations:** Don't spam yourself outbound (notification fatigue), don't spam GitHub inbound (volume controls). - ---- - -### Remote Squad Access — Phased Rollout (Proposed) -**By:** Flight -**Date:** 2026-03-10 -**Status:** Proposed — awaits proposal document in `docs/proposals/remote-squad-access.md` - -**Context:** Squad currently requires a local clone to answer questions. Users want remote access from mobile, browser, or different machine without checking out repo. - -**Phases:** - -**Phase 1: GitHub Discussions Bot (Ship First)** -- Surface: GitHub Discussions -- Trigger: `/squad` command or `@squad` mention -- Context: GitHub Actions workflow checks out repo → full `.squad/` state -- Response: Bot replies to thread -- Feasibility: 1 day -- Why first: Easy to build, zero hosting, respects repo privacy, async Q&A, immediately useful - -**Phase 2: GitHub Copilot Extension (High Value)** -- Surface: GitHub Copilot chat (VS Code, CLI, web, mobile) -- Trigger: `/squad ask {question}` in any Copilot client -- Context: Extension fetches `.squad/` files via GitHub API (no clone) -- Response: Answer inline in Copilot -- Feasibility: 1 week -- Why second: Works everywhere Copilot exists, instant response, natural UX - -**Phase 3: Slack/Teams Bot (Enterprise Value)** -- Surface: Slack or Teams channel -- Trigger: `@squad` mention in channel -- Context: Webhook fetches `.squad/` via GitHub API -- Response: Bot replies in thread -- Feasibility: 2 weeks -- Why third: Enterprise teams live in chat; high value for companies using Squad - -**Constraint:** Squad's intelligence lives in `.squad/` (roster, routing, decisions, histories). Any remote solution must solve context access. GitHub Actions workflows provide checkout for free. Copilot Extension and chat bots use GitHub API to fetch files. - -**Implementation:** Before Phase 1 execution, write proposal document. New CLI command: `squad answer --context discussions --question "..."`. New workflow: `.github/workflows/squad-answer.yml`. - -**Privacy:** All approaches respect repo visibility or require authentication. Most teams want private by default. - -### Test assertion discipline — mandatory -**By:** FIDO (formerly Hockney), v0.8.24 -**What:** All code agents must update tests when changing APIs. FIDO has PR blocking authority on quality grounds. -**Why:** APIs changed without test updates caused CI failures and blocked external contributors. - -### Docs-test sync — mandatory -**By:** PAO (formerly McManus), v0.8.24 -**What:** New docs pages require corresponding test assertion updates in the same commit. -**Why:** Stale test assertions block CI and frustrate contributors. - -### Contributor recognition — every release -**By:** PAO, v0.8.24 -**What:** Each release includes an update to the Contributors Guide page. -**Why:** No contribution goes unappreciated. - -### API-test sync cross-check -**By:** FIDO + Booster, v0.8.24 -**What:** Booster adds CI check for stale test assertions. FIDO enforces via PR review. -**Why:** Prevents the pattern of APIs changing without test updates. - -### Doc-impact review — every PR -**By:** PAO, v0.8.25 -**What:** Every PR must be evaluated for documentation impact. PAO reviews PRs for missing or outdated docs. -**Why:** Code changes without doc updates lead to stale guides and confused users. - - ---- - -## Release v0.8.24 - -### CLI Packaging Smoke Test: Release Gate Decision -**By:** FIDO, v0.8.24 -**Date:** 2026-03-08 - -The CLI packaging smoke test is APPROVED as the quality gate for npm releases. - -**What:** -- npm pack → creates tarball of both squad-sdk and squad-cli -- npm install → installs in clean temp directory (simulates user install) -- node {cli-entry.js} → invokes 27 commands + 3 aliases through installed package -- Coverage: All 26 primary commands + 3 of 4 aliases (watch, workstreams, remote-control) - -**Why:** Catches broken package.json exports, MODULE_NOT_FOUND errors, ESM resolution failures, command routing regressions — the exact failure modes we've shipped before. - -**Gaps (acceptable):** -- Semantic validation not covered (only routing tested) -- Cross-platform gaps (test runs on ubuntu-latest only) -- Optional dependencies allowed to fail (node-pty) - -## Impact - -- All agents: charter generation now reliably round-trips model preferences. -- Verbal/Keaton: the 4-layer model selection hierarchy documented in squad.agent.md is now supported at the SDK config level. -- Anyone adding new config fields: use the `assertModelPreference()` pattern (accept string-or-object, normalize internally) for fields that need simple and rich config shapes. - -# Decision: Runtime ExperimentalWarning suppression via process.emit hook - -**Date:** 2026-03-07 -**Author:** Fenster (Core Dev) -**Context:** PR #233 CI failure — 4 tests failed - -## Problem - -PR #233 (CLI wiring fixes for #226, #229, #201, #202) passed all 74 tests locally but failed 4 tests in CI: - -- `test/cli-p0-regressions.test.ts` — bare semver test (expected 1 line, got 3) -- `test/speed-gates.test.ts` — version outputs one line (expected 1, got 3) -- `test/ux-gates.test.ts` — no overflow beyond 80 chars (ExperimentalWarning line >80) -- `test/ux-gates.test.ts` — version bare semver (expected 1 line, got 3) - -Root cause: `node:sqlite` import triggers Node.js `ExperimentalWarning` that leaks to stderr. The existing `process.env.NODE_NO_WARNINGS = '1'` in cli-entry.ts was ineffective because Node only reads that env var at process startup, not when set at runtime. - -The warning likely didn't appear locally because the local Node.js version may have already suppressed it or the env var was set in the shell. - -## Decision - -Added a `process.emit` override in cli-entry.ts that intercepts `warning` events with `name === 'ExperimentalWarning'` and swallows them. This is placed: -- After `process.env.NODE_NO_WARNINGS = '1'` (which still helps child processes) -- Before the `await import('node:sqlite')` pre-flight check - -This is the standard Node.js pattern for runtime warning suppression when you can't control the process launch flags. - -## Impact - -- **cli-entry.ts**: 12 lines added (comment + override function) -- **Tests**: All 4 previously failing tests now pass; no regressions in structural tests (#624) -- **Behavior**: ExperimentalWarning messages no longer appear in CLI output; other warnings (DeprecationWarning, etc.) are unaffected - -### 2026-03-07T14:22:00Z: User directive - Quality cross-review -**By:** Brady (via Copilot) -**What:** All team members must double-and-triple check one another's work. Recent PRs have had weird test failures and inconsistencies. KEEN focus on quality - nothing can slip. -**Why:** User request - quality gate enforcement after speed gate, EBUSY, and cross-contamination issues across PRs #244, #245, #246. - - -# Decision: Optional dependencies must use lazy loading (#247) - -**Date:** 2026-03-09 -**Author:** Fenster -**Status:** Active - -## Context - -Issue #247 — two community reports of installation failure caused by top-level imports of `@opentelemetry/api` crashing when the package wasn't properly installed in `npx` temp environments. - -## Decision - -1. **All optional/telemetry dependencies must be loaded lazily** — never at module top-level. Use `createRequire(import.meta.url)` inside a `try/catch` for synchronous lazy loading. - -2. **Centralized wrapper pattern** — when multiple source files import from the same optional package, create a single wrapper module (e.g., `otel-api.ts`) that provides the fallback logic. Consumers import from the wrapper. - -3. **`@opentelemetry/api` is now an optionalDependency** — it was a hard dependency but is functionally optional. The SDK operates with no-op telemetry when absent. - -4. **`vscode-jsonrpc` added as direct dep** — improves hoisting for npx installs. The ESM subpath import issue (`vscode-jsonrpc/node` without `.js`) is upstream in `@github/copilot-sdk`. - -## Implications - -- Any new OTel integration must import from `runtime/otel-api.js`, never directly from `@opentelemetry/api`. -- Test files may continue importing `@opentelemetry/api` directly (it's installed in dev). -- If adding new optional dependencies in the future, follow the same lazy-load + wrapper pattern. - - -# Release Readiness Check — v0.8.21 - -**By:** Keaton (Lead) -**Date:** 2026-03-07 -**Status:** 🟡 SHIP WITH CAVEATS - - ---- - -## Executive Summary - -v0.8.21 is technically ready to release. All three packages carry the same version string (`0.8.21-preview.7`). Linting passes, 3718 tests pass (19 flaky UI tests pre-existing), CI green on commits. However, **#247 (Installation Failure) must be fixed before shipping**. This is a P0 blocker that breaks the primary installation path. Fenster is actively fixing it. - - ---- - -## Version State ✅ - -All packages aligned at **0.8.21-preview.7:** - -- Root `package.json` — v0.8.21-preview.7 -- `packages/squad-sdk/package.json` — v0.8.21-preview.7 -- `packages/squad-cli/package.json` — v0.8.21-preview.7 - -**Release Tag:** Should be `v0.8.21-preview.7` (already live as -preview, ready to promote to stable or next -preview if #247 requires a patch). - - ---- - -## Git State ✅ - -**Current Branch:** `dev` -**Commits since main:** 23 commits (main..dev) - -Recent activity (last 10 commits): -- 3f924d0 — fix: remove idle blankspace below agent panel (#239) -- 6a9af95 — docs(ai-team): Merge quality directive into team decisions -- 8d4490b — fix: harden flaky tests (EBUSY retry + init speed gate headroom) -- 363a0a8 — feat: Structured model preference & squad-level defaults (#245) -- a488eb8 — fix: wire missing CLI commands into cli-entry.ts (#244) -- b562ef1 — docs: update fenster history & add model-config decision - -**Uncommitted Changes:** 10 files (all acceptable): -- 4 deleted `.squad/decisions/inbox/` files (cleanup, merged to decisions.md) -- 6 untracked images (pilotswarm-*.png — documentation assets) -- 1 untracked `docs/proposals/repl-replacement-prd.md` (draft proposal) - -**Status:** Clean. No staged changes that would block release. - - ---- - -## Open Blockers ⚠️ P0 - -### #247 — Squad Installation Fails 🔴 **CRITICAL BLOCKER** - -**Impact:** Users cannot install via `npm install -g @bradygaster/squad-cli`. -**Assignee:** Fenster (actively fixing) -**Status:** In progress -**Release Impact:** **SHIP CANNOT PROCEED** until resolved. - -**Other Open Issues:** -- #248 — CLI command wiring: `squad triage` does not trigger team assignment loop (minor) -- #242 — Future: Tiered Squad Deployment (deferred, not blocking) -- #241 — New Squad Member for Docs (deferred) -- #240 — ADO configurable work item types (deferred) -- #236 — feat: persistent Ralph (deferred) -- #211 — Squad management paradigms (deferred, release:defer label) - -**Release Blockers:** Only #247 prevents shipping. - - ---- - -## CHANGELOG Review 📝 - -**Current `Unreleased` section covers:** - -### Added — SDK-First Mode (Phase 1) -- Builder functions (defineTeam, defineAgent, defineRouting, defineCeremony, defineHooks, defineCasting, defineTelemetry, defineSquad) -- `squad build` command with --check, --dry-run, --watch flags -- SDK Mode Detection in Coordinator prompt -- Documentation (SDK-First Mode guide, updated SDK Reference, README quick reference) - -### Added — Remote Squad Mode (ported from @spboyer PR #131) -- `resolveSquadPaths()` dual-root resolver -- `squad doctor` command (9-check setup validation) -- `squad link ` command -- `squad init --mode remote` -- `ensureSquadPathDual()` / `ensureSquadPathResolved()` - -### Changed — Distribution & Versioning -- npm-only distribution (no more GitHub-native `npx github:bradygaster/squad`) -- Semantic Versioning fix (X.Y.Z-preview.N format, compliant with semver spec) -- Version transition from public repo (0.8.5.1) to private repo (0.8.x cadence) - -### Fixed -- CLI entry point moved from dist/index.js → dist/cli-entry.js -- CRLF normalization for Windows users -- process.exit() removed from library functions (VS Code extension safe) -- Removed .squad branch protection guard - - ---- - -## Test Status 🟡 - -``` -Test Files: 9 failed | 134 passed (143) -Tests: 19 failed | 3718 passed | 3 todo (3740) -Duration: 80.06s -``` - -**Failures:** All 19 failures are pre-existing UI test timeouts (TerminalHarness spawn issues, not regressions): -- speed-gates.test.ts — 1 timeout -- ux-gates.test.ts — 3 timeouts -- acceptance.test.ts — 8 timeouts -- acceptance/hostile.test.ts — 3 timeouts -- cli/consult.test.ts — 1 timeout - -**Assessment:** Passing rate is strong (99.5% pass rate). Timeouts are environmental (not code regressions). Safe to ship with this test state. - - ---- - -## CI State ✅ - -- **Linting:** ✅ PASS (tsc --noEmit clean on both packages) -- **Build:** ✅ PASS (npm run build succeeds) -- **Tests:** 🟡 PASS (99.5% passing, pre-existing flakes) - - ---- - -## Release Prep Checklist - -- [x] Version strings aligned (0.8.21-preview.7) -- [x] Git state clean (no staged changes) -- [x] Linting passes -- [x] Tests mostly passing (pre-existing flakes only) -- [x] CHANGELOG updated (Unreleased section comprehensive) -- [ ] **#247 resolved (BLOCKER)** -- [ ] Branch merge strategy decided (dev → insiders? or dev → main?) -- [ ] npm publish command prepared - - ---- - -## Merge Strategy - -**Current branches:** -- `main` — stable baseline -- `dev` — integration branch (23 commits ahead of main) -- `insiders` — exists (used for pre-release channel?) - -**Recommendation:** -1. Hold on npm publish until #247 fixed -2. Merge dev → insiders for pre-release testing -3. After QA pass, merge dev → main -4. Tag main as `v0.8.21-preview.7` on npm -5. Consider promoting to `v0.8.21` stable if no further issues - - ---- - -## Draft CHANGELOG Entry for v0.8.21 - -When releasing, move "Unreleased" to versioned section: - -```markdown -## [0.8.21-preview.7] - 2026-03-07 - -### Added — SDK-First Mode (Phase 1) -- [builder functions list] -- [squad build command] -- [SDK Mode Detection] -- [Documentation updates] - -### Added — Remote Squad Mode -- [resolver + commands] - -### Changed — Distribution & Versioning -- [npm-only, semver fix, version transition] - -### Fixed -- [CLI entry point, CRLF, process.exit, branch guard] -``` - - ---- - -## Decision - -**VERDICT: 🟡 RELEASE v0.8.21-preview.7 AFTER #247 FIXED** - -- **GO:** Linting, tests, version alignment all sound. -- **HOLD:** #247 installation failure must be resolved. This is a P0 blocker. -- **ACTION:** Fenster owns #247 fix. Once merged to dev, rerun tests and ship. -- **TIMELINE:** 1–2 hours (estimate: Fenster's ETA on #247). - -**Owner:** Brady (approves final npm publish) -**Fallback:** If #247 unresolvable today, defer to v0.8.22 and open a retro ticket. - - ---- - -## Notes - -- **Community PRs:** 3 community PRs merged cleanly to dev (PR #217, #219, #230). Fork-first contributor workflow is working. -- **Wave planning:** 11 issues targeted for v0.8.22 (5 fix-now + 6 next-wave). 11 deferred to v0.8.23+. -- **Architecture:** SDK/CLI split is clean. Distribution to npm is working. Type safety (strict: true) enforced across both packages. -- **Proposal workflow:** Working as designed. No surprises. - - - -# Decision: Optionalize OpenTelemetry Dependency - -## Context -Telemetry is valuable but should not be a hard requirement for running the SDK. Users in air-gapped environments or minimal setups experienced crashes when `@opentelemetry/api` was missing or incompatible. - -## Decision -We have wrapped `@opentelemetry/api` in a resilient shim (`packages/squad-sdk/src/runtime/otel-api.ts`) and moved the package to `optionalDependencies`. - -### Mechanics -- **Runtime detection:** The wrapper attempts to load `@opentelemetry/api`. -- **Graceful fallback:** If loading fails, it exports no-op implementations of `trace`, `metrics`, and `diag` that match the API surface used by Squad. -- **Developer experience:** Internal code imports from `./runtime/otel-api.ts` instead of the package directly. - -## Consequences -- **Positive:** SDK is robust against missing telemetry dependencies. Installation size is smaller for users who opt out. -- **Negative:** We must maintain the wrapper's type compatibility with the real API. -- **Risk:** If we use new OTel features, we must update the no-op implementation. - -## Status -Accepted and implemented in v0.8.21. - - -# Decision: v0.8.21 Blog Post Scope — SDK-First + Full Release Wave - -**Date:** 2026-03-11 -**Author:** McManus (DevRel) -**Impact:** External communication, developer discovery, release narrative - -## Problem - -v0.8.21 is a major release with TWO significant storylines: -1. **SDK-First Mode** — TypeScript-first authoring, type safety, `squad build` command -2. **Stability Wave** — 26 issues closed, 16 PRs merged, critical crash fix (#247), 3,724 passing tests - -Risk: If blog only emphasizes SDK-First, users miss critical stability improvements (crash fix, Windows hardening, test reliability). If blog buries SDK-First, flagship feature loses visibility. - -## Decision - -Create TWO complementary blog posts with clear ownership: - -1. **`024-v0821-sdk-first-release.md`** (existing) — SDK-First deep dive - - Target: TypeScript-focused teams, SDK adopters - - Scope: Builders, quick start, Azure Function sample, Phase 2/3 roadmap - - Tone: Educational, patterns-focused - -2. **`025-v0821-comprehensive-release.md`** (new) — Full release wave summary - - Target: General audience, release notes consumers - - Scope: All 7 feature areas (SDK-First + Remote Squad + 5 critical fixes), metrics, community credits - - Tone: Reassuring (crash fixed!), factual (26 issues, 0 logic failures) - -**Cross-linking strategy:** -- Comprehensive post links to SDK-First deep dive: "For detailed SDK patterns, see [v0.8.21: SDK-First Mode post](./024-v0821-sdk-first-release.md)" -- SDK-First post references comprehensive post: "For the full release notes, see [v0.8.21: The Complete Release post](./025-v0821-comprehensive-release.md)" - -**CHANGELOG updated once** at `[0.8.21]` section with full scope (all 7 areas) — serves as single source of truth for condensed release info. - -## Rationale - -- **SDK value**: Highlights TypeScript-first workflow, type safety, Azure serverless patterns -- **Stability value**: Installation crash fix alone justifies a major release (user pain elimination) -- **Audience segmentation**: Developers interested in SDK config patterns → read post #024; DevOps/team leads reading release notes → read post #025 -- **SEO/discovery**: Two articles = more surface area for search + internal linking -- **Archive preservation**: Both posts preserved in `docs/blog/` for historical record - -## Alternative Rejected - -**Single mega-post:** Would be 25+ KB, overwhelming, diffuses message (SDK patterns + crash fix + CI stability = scattered narrative). Two posts with clear focus are easier to scan and share. - -## Enforcement - -- CHANGELOG.md single `[0.8.21]` section (source of truth) -- Blog post #025 designated "comprehensive" (headline for external comms) -- Blog post #024 designated "technical deep dive" (for SDK adopters) -- Release announcement on GitHub uses post #025 as primary link - - ---- - -**Decided by:** McManus (DevRel) on behalf of tone ceiling + messaging coherence -**Reviewed by:** Internal tone ceiling check (substantiated claims, no hype, clear value messaging) - - -### 2026-03-07 07:51 UTC: SDK-First init/migrate deferred to v0.8.22 -**By:** Keaton (Coordinator), Brady absent - autonomous decision -**What:** SDK-First mode gaps (init --sdk flag, standalone migrate command, comprehensive docs) deferred to v0.8.22. -**Why:** v0.8.21 has all P0 blockers resolved. Adding features now risks regression. Filed #249, #250, #251. -**Issues filed:** -- #249: squad init --sdk flag for SDK-First mode opt-in -- #250: standalone squad migrate command (subsumes #231) -- #251: comprehensive SDK-First mode documentation - - -### 2026-03-07T08-14-43Z: User directive -**By:** Brady (via Copilot) -**What:** Issues #249, #250, and #251 (SDK-First init --sdk flag, standalone migrate command, comprehensive SDK-First docs) are committed to v0.8.22 - not backlog, not optional. -**Why:** User request - captured for team memory - - -### 2026-03-07T16-19-00Z: Pre-release triage — v0.8.21 release ready pending #248 fix -**By:** Keaton (Lead) -**What:** Analyzed all 23 open issues. Result: v0.8.21 releases cleanly pending fix for #248 (triage team dispatch). v0.8.22 roadmap is well-scoped (9 issues, 3 parallel streams). Close #194 (completed) and #231 (duplicate). -**Why:** Final release gate. Coordinator override: #248 deferred to v0.8.22 (standalone CLI feature, not core to interactive experience). Keeps release unblocked. -**Details:** 2 closeable, 1 P0 override, 9 for v0.8.22, 5 for v0.8.23+, 1 for v0.9+, 4 backlog. See .squad/orchestration-log/2026-03-07T16-19-00Z-keaton.md for full triage table. - -### 2026-03-07T16-19-00Z: PR hold decision — #189 (workstreams) and #191 (ADO) rebase to dev for v0.8.22 -**By:** Hockney (Tester) -**What:** Both PRs are held for v0.8.22 and must rebase from main to dev. Neither ships for v0.8.21. -**Why:** PR #189: merge conflicts, no CI, process.exit() violation, missing CLI tests, 6 unresolved review threads. PR #191: merge conflicts, no CI, untested security fixes, incomplete Planner adapter. Both have solid architecture but insufficient readiness for v0.8.21. -**Details:** See .squad/orchestration-log/2026-03-07T16-19-00Z-hockney.md for detailed code assessment. - -### 2026-03-07T16-19-00Z: Docs ready for v0.8.21 — no release blockers -**By:** McManus (DevRel) -**What:** v0.8.21 documentation is ship-ready. SDK-First mode guide (705 lines), What's New blog, CHANGELOG, and contributors section all complete. No blocking gaps. -**Why:** Release readiness gate. Docs are complete for Phase 1. Minor gaps are non-blocking and addressed in v0.8.22 roadmap. -**Details:** 2 docs issues queued for v0.8.22 (#251 restructure, #210 contributors workflow). See .squad/orchestration-log/2026-03-07T16-19-00Z-mcmanus.md for full triage. - -### 2026-03-07T16:20: User directive — Shift from Actions to CLI -**By:** Brady (via Copilot) -**What:** "I'm seriously concerned about our continued abuse of actions and think the more we can stop relying on actions to do things and start relying on the cli to do things, it puts more emphasis and control in the user's hand and less automation with actions. I think we're maybe going to surprise customers with some of the usage in actions and I would really hate for that to be a deterrent from using squad." -**Why:** User directive — strategic direction for the product. Actions usage can surprise customers with unexpected billing and loss of control. CLI-first puts the user in the driver's seat. - -### Current Actions Inventory (15 workflows) - -**Squad-specific (customer concern):** -1. `sync-squad-labels.yml` — Auto-syncs labels from team.md on push -2. `squad-triage.yml` — Auto-triages issues when labeled "squad" -3. `squad-issue-assign.yml` — Auto-assigns issues when squad:{member} labeled -4. `squad-heartbeat.yml` — Ralph heartbeat/auto-triage (cron currently disabled) -5. `squad-label-enforce.yml` — Label mutual exclusivity on label events - -**Standard CI/Release (expected):** -6. `squad-ci.yml` — Standard PR/push CI -7. `squad-release.yml` — Tag + release on push to main -8. `squad-promote.yml` — Branch promotion (workflow_dispatch) -9. `squad-main-guard.yml` — Forbidden file guard -10. `squad-preview.yml` — Preview validation -11. `squad-docs.yml` — Docs build/deploy -12-15. Publish/insider workflows - -**Directive:** Move squad-specific automation (1-5) into CLI commands. Keep standard CI/release workflows. - -### 2026-03-07T16:20: User directive — Shift from Actions to CLI -**By:** Brady (via Copilot) -**What:** "I'm seriously concerned about our continued abuse of actions and think the more we can stop relying on actions to do things and start relying on the cli to do things, it puts more emphasis and control in the user's hand and less automation with actions. I think we're maybe going to surprise customers with some of the usage in actions and I would really hate for that to be a deterrent from using squad." -**Why:** User directive — strategic direction for the product. Actions usage can surprise customers with unexpected billing and loss of control. CLI-first puts the user in the driver's seat. - -### Current Actions Inventory (15 workflows) - -**Squad-specific (customer concern):** -1. `sync-squad-labels.yml` — Auto-syncs labels from team.md on push -2. `squad-triage.yml` — Auto-triages issues when labeled "squad" -3. `squad-issue-assign.yml` — Auto-assigns issues when squad:{member} labeled -4. `squad-heartbeat.yml` — Ralph heartbeat/auto-triage (cron currently disabled) -5. `squad-label-enforce.yml` — Label mutual exclusivity on label events - -**Standard CI/Release (expected):** -6. `squad-ci.yml` — Standard PR/push CI -7. `squad-release.yml` — Tag + release on push to main -8. `squad-promote.yml` — Branch promotion (workflow_dispatch) -9. `squad-main-guard.yml` — Forbidden file guard -10. `squad-preview.yml` — Preview validation -11. `squad-docs.yml` — Docs build/deploy -12-15. Publish/insider workflows - -**Directive:** Move squad-specific automation (1-5) into CLI commands. Keep standard CI/release workflows. - -### Follow-up (Brady, same session): -> "seems like the more we can offload to ourselves, the more we could control, say, in a container. if actions are doing the work the loop is outside of our control a bit" - -**Key insight:** CLI-first makes Squad **portable**. If the work lives in CLI commands instead of Actions, Squad can run anywhere — Codespaces, devcontainers, local terminals, persistent ACA containers. Actions lock the control loop to GitHub's event system. CLI-first means the user (or their infrastructure) owns the execution loop, not GitHub Actions. - - -# CLI Feasibility Assessment — GitHub Actions → CLI Commands -**Author:** Fenster (Core Dev) -**Date:** 2026-03-07 -**Context:** Brady's request to migrate squad-specific workflows to CLI commands - - ---- - -## Executive Summary - -**Quick wins:** Label sync + label enforce can ship in v0.8.22 (reuses existing parsers, zero new deps). -**Medium effort:** Triage command is 70% done (CLI watch already exists), needs GitHub comment posting. -**Heavy lift:** Issue assign + heartbeat need copilot-swe-agent[bot] API (PAT + agent_assignment field) — no `gh` CLI equivalent exists. Watch mode already implements heartbeat's core logic locally. - -**Key insight:** We already have `squad watch` — it's the local equivalent of `squad-heartbeat.yml`. The workflow runs in GitHub Actions with PAT; watch runs locally with `gh` CLI. They share the same triage logic (`@bradygaster/squad-sdk/ralph/triage`). - - ---- - -## 1. Current CLI Command Inventory - -**Existing commands** (`packages/squad-cli/src/cli/commands/`): - -| Command | Function | Overlap with Workflows | -|---------|----------|------------------------| -| **watch** | Ralph's local polling — triages issues, monitors PRs, assigns labels. Uses `gh` CLI. | ✅ 80% overlap with `squad-heartbeat.yml` + `squad-triage.yml` | -| plugin | Marketplace add/remove. Uses `gh` CLI for repo access. | ❌ No workflow overlap | -| export | Export squad state to JSON. | ❌ No workflow overlap | -| import | Import squad state from JSON. | ❌ No workflow overlap | -| build | SDK config generation. | ❌ No workflow overlap | -| doctor | Health checks (local/remote/hub). | ❌ No workflow overlap | -| aspire | Launch Aspire dashboard for OTel. | ❌ No workflow overlap | -| start | Interactive shell (Coordinator mode). | ❌ No workflow overlap | -| consult | Spawn agent for consultation. | ❌ No workflow overlap | -| rc/rc-tunnel | Remote control server + devtunnel. | ❌ No workflow overlap | -| copilot/copilot-bridge | Copilot SDK adapter. | ❌ No workflow overlap | -| link/init-remote | Link to remote squad repo. | ❌ No workflow overlap | -| streams | Workstream commands (stub). | ❌ No workflow overlap | - -**Key reusable infrastructure:** -- **`gh-cli.ts`** — Thin wrapper around `gh` CLI: `ghIssueList`, `ghIssueEdit`, `ghPrList`, `ghAvailable`, `ghAuthenticated` -- **`@bradygaster/squad-sdk/ralph/triage`** — Shared triage logic (routing rules, module ownership, keyword matching) -- **`watch.ts`** — Already implements triage cycle + PR monitoring - - ---- - -## 2. Per-Workflow Migration Plan - -### 2.1. sync-squad-labels.yml → `squad labels sync` - -**Current workflow:** 170 lines. Parses `team.md`, syncs `squad`, `squad:{member}`, `go:*`, `release:*`, `type:*`, `priority:*`, `bug`, `feedback` labels. Uses Octokit. - -**Proposed CLI command:** -```bash -squad labels sync [--squad-dir .squad] [--dry-run] -``` - -**Implementation:** -- **Size:** S (2-3 hours) -- **Dependencies:** - - ✅ `gh` CLI (already used in plugin.ts, watch.ts) - - ✅ `parseRoster()` from `@bradygaster/squad-sdk/parsers` (already exists) - - ✅ Thin wrapper — reuse roster parser, call `gh label create/edit` -- **Offline:** ❌ Needs GitHub API access via `gh` -- **Reuse:** Roster parsing (team.md → member list) already exists. Just needs label creation loop with `gh`. -- **Complexity:** Low. No auth complexity (uses `gh auth` flow). No copilot-swe-agent API. - -**Why quick win:** Zero new parsers needed. Label sync is idempotent (create-or-update pattern). Can run manually after `team.md` changes. - - ---- - -### 2.2. squad-triage.yml → `squad triage` (or extend `squad watch`) - -**Current workflow:** 260 lines. On `squad` label, parses `team.md` + `routing.md`, keyword-matches, applies `squad:{member}` label, posts comment. - -**Proposed CLI command:** -```bash -squad triage [--issue ] [--squad-dir .squad] -``` -Or: enhance `squad watch` to post comments (currently it only adds labels). - -**Implementation:** -- **Size:** M (4-6 hours) -- **Dependencies:** - - ✅ `gh` CLI (already used) - - ✅ `triageIssue()` from `@bradygaster/squad-sdk/ralph/triage` (already used in watch.ts) - - ❌ **Missing:** `gh issue comment` wrapper in `gh-cli.ts` (5 lines to add) -- **Offline:** ❌ Needs GitHub API -- **Reuse:** - - **watch.ts already does this** (line 189-209). Just missing comment posting. - - Triage logic, routing rules, module ownership — all implemented. -- **Complexity:** Low. The logic exists; just needs `gh issue comment --body ` wrapper. - -**Why medium effort:** Code exists. Just needs comment posting feature added to `gh-cli.ts` and called from `watch.ts`. - - ---- - -### 2.3. squad-issue-assign.yml → ??? - -**Current workflow:** 160 lines. On `squad:{member}` label, posts assignment comment, calls **copilot-swe-agent[bot] assignment API with PAT** (lines 116-161). - -**Problem:** The workflow uses a special POST endpoint: -```js -POST /repos/{owner}/{repo}/issues/{issue_number}/assignees -{ - assignees: ['copilot-swe-agent[bot]'], - agent_assignment: { - target_repo: `${owner}/${repo}`, - base_branch: baseBranch, - custom_instructions: '', - custom_agent: '', - model: '' - } -} -``` -**This endpoint does NOT exist in `gh` CLI.** It requires: -- Personal Access Token (PAT) with `issues:write` scope -- Direct Octokit call (cannot use `gh` as thin wrapper) - -**Migration options:** -1. **Add Octokit dependency** — heavyweight (35+ deps), violates zero-dependency CLI goal -2. **Add raw HTTPS module** — 50-100 lines to make authenticated POST with PAT, parse JSON response -3. **Document manual workflow** — "To auto-assign @copilot, use the GitHub Actions workflow (requires PAT)" - -**Proposed approach:** -- **Do NOT migrate.** Keep as workflow-only feature. -- **Reasoning:** The copilot-swe-agent assignment API is GitHub-specific and requires secrets (PAT). CLI commands should not manage secrets. Workflows already have secure secret storage. -- **Alternative:** Document `squad watch` as the local equivalent (it can label + post comments, but not trigger bot assignment). - -**Implementation:** -- **Size:** XL (8-12 hours if full migration) -- **Dependencies:** - - ❌ PAT management (needs secret storage or prompting) - - ❌ Octokit or raw HTTPS POST wrapper (50-100 lines) - - ❌ Not available in `gh` CLI -- **Offline:** ❌ Never (GitHub-specific API) -- **Complexity:** High. Requires secret handling, bot assignment API, error handling, fallback. - -**Recommendation:** **Do not migrate.** Keep as workflow. Document that copilot auto-assign requires Actions + PAT. - - ---- - -### 2.4. squad-heartbeat.yml → Already exists as `squad watch` - -**Current workflow:** 170 lines. Runs on cron (disabled), issues closed/labeled, PRs closed. Triages untriaged issues, assigns @copilot to `squad:copilot` issues. - -**CLI equivalent:** **Already shipped as `squad watch`** (`packages/squad-cli/src/cli/commands/watch.ts`, 356 lines). - -**What `squad watch` does:** -- Polls open issues with `squad` label -- Triages untriaged issues (adds `squad:{member}` label) -- Monitors PRs (draft/needs-review/changes-requested/CI failures/ready-to-merge) -- Runs on interval (default: 30 minutes) -- Uses `gh` CLI for auth + API access -- Uses shared `@bradygaster/squad-sdk/ralph/triage` logic - -**What `squad watch` does NOT do (that heartbeat.yml does):** -- ❌ Post triage comments (workflow posts "Ralph — Auto-Triage" comments) -- ❌ Auto-assign copilot-swe-agent[bot] (requires PAT + bot API, same issue as #2.3) - -**Implementation gap:** -- **Comment posting:** M (4-6 hours) — add `gh issue comment` wrapper to `gh-cli.ts`, call it from `runCheck()` in watch.ts -- **Copilot auto-assign:** Do not migrate (same as #2.3) - -**Migration plan:** -- ✅ **Already done.** `squad watch` is the local heartbeat. -- **Add comment posting** to match workflow behavior (quick win, 4-6 hours). -- **Document copilot auto-assign** as workflow-only (requires PAT). - -**Recommendation:** Enhance `squad watch` with comment posting. Keep copilot auto-assign in workflow. - - ---- - -### 2.5. squad-label-enforce.yml → `squad labels enforce` - -**Current workflow:** 180 lines. On label applied, removes conflicting labels from mutual-exclusivity namespaces (`go:`, `release:`, `type:`, `priority:`). Posts update comment. - -**Proposed CLI command:** -```bash -squad labels enforce [--issue ] [--squad-dir .squad] -``` - -**Implementation:** -- **Size:** S (2-4 hours) -- **Dependencies:** - - ✅ `gh` CLI (already used) - - ❌ `gh issue edit --remove-label