Skip to content

Squad product: samples/ directory has zero CI coverage -- SDK changes silently break samples #103

@diberry

Description

@diberry

Scope: This is a Squad product issue. The samples/ directory ships 11 sample projects with zero CI coverage. SDK changes silently break samples without detection.
Status: Implementation-ready PRD (v2 -- updated per FIDO review)
Priority: P1
Owner: Flight (Lead)
Related: #98 (minutes optimization), #104 (PR completeness gates), #100 (review completeness)


1. Problem Statement

Squad ships 11 sample projects in samples/ that demonstrate SDK usage patterns to customers. These samples form a critical part of the developer experience -- customers copy them, reference them, and build on them.

Current state:

  • 10 of 11 samples import from @bradygaster/squad-sdk
  • All 11 have package.json with dependencies
  • 7 of 11 have test scripts
  • 6 of 11 have actual test files (.test.*)
  • squad-ci.yml has zero references to samples/
  • No samples-specific CI workflow exists

Impact:

Sample Inventory

Sample Has package.json Has build script Has test script Has test files Imports squad-sdk CI coverage
autonomous-pipeline Yes check package.json Yes 2 Yes None
azure-function-squad Yes check package.json Yes 0 Yes (via builders) None
cost-aware-router Yes check package.json Yes 1 Yes None
hello-squad Yes check package.json Yes 1 Yes None
hook-governance Yes check package.json Yes 1 Yes None
knock-knock Yes check package.json No 0 Yes None
rock-paper-scissors Yes check package.json No 0 Yes None
skill-discovery Yes check package.json Yes 1 Yes None
storage-provider-azure Yes check package.json No (demo only) 161* Yes None
storage-provider-sqlite Yes check package.json No (demo only) 0 Yes None
streaming-chat Yes check package.json Yes 1 Yes None

*storage-provider-azure has 161 test-like files from azurite/azure SDK dependencies in node_modules, not actual sample tests.

v2 correction (per FIDO review): azure-function-squad DOES import squad-sdk via src/squad/config.ts (imports from @bradygaster/squad-sdk/builders). Previous version incorrectly stated it did not. All 11 samples now confirmed as SDK-importing.


2. Goals

Goal Target
All samples build in CI 11 of 11 pass build or type-check in CI
Tests run for samples with test scripts 7 of 11 run npm test in CI
SDK changes trigger sample validation packages/squad-sdk/src/** path filter on workflow
Feature-flagged vars.SQUAD_SAMPLES_CI controls whether samples CI runs
New samples auto-included Matrix reads directory listing, not hardcoded list
No minutes wasted on unrelated changes Path filter: only triggers on samples/** or SDK source changes
Failing sample isolated Separate workflow, not a required status check initially
Samples validate against in-PR SDK All samples use local file: ref, not published npm version

3. Current State Audit

What squad-ci.yml checks today

  • npm run build (TypeScript compilation)
  • npm test (vitest suite, packages only)
  • Source tree canary (hardcoded file list)
  • Deletion guard (protects critical files)
  • Publish policy check

What squad-ci.yml does NOT check

  • Whether any sample builds
  • Whether any sample tests pass
  • Whether SDK type changes broke sample imports
  • Whether new samples added to PR are functional

Result: samples/ is a CI blind spot

Every PR that touches packages/squad-sdk/src/ is a silent risk to all 11 sdk-importing samples.


4. CI Validation Tiers

Tier 1 -- Build check (minimum, catches type errors and import path changes)

cd samples/{name} && npm install && npm run build --if-present

Catches: import path changes, type signature changes, missing exports, structural breakage.

Applies to: all samples that have a build script.

Tier 2 -- Test execution (for samples with test scripts)

cd samples/{name} && npm install && npm test --if-present

Catches: runtime behavior regressions, API contract violations, broken sample logic.

Applies to: autonomous-pipeline, azure-function-squad, cost-aware-router, hello-squad, hook-governance, skill-discovery, streaming-chat (7 of 11).

Note: Tier 2 is only valid for samples that can run tests without external services or env vars. See Section 5 (Environment-Dependent Samples) for samples that must be downgraded to Tier 1 build-only in CI.

Tier 3 -- Type-check only (fallback for samples without build or test)

cd samples/{name} && npm install && npx tsc --noEmit

Catches: TypeScript compilation errors from SDK changes. Used when no build script exists.

Applies to: knock-knock, rock-paper-scissors (2 of 11).

Note: storage-provider-azure and storage-provider-sqlite are missing tsconfig.json, so Tier 3 will fail on them until tsconfig.json is added (see Phase 2 and Section 6).

Each sample runs the highest tier applicable to it. A sample with a build script runs Tier 1. A sample with a test script runs Tier 2 (which subsumes Tier 1). A sample with neither runs Tier 3.


5. Environment-Dependent Samples

v2 addition (per FIDO review B4): Samples that require environment variables or external services will produce false-negative CI failures if run at Tier 2 (test execution) without proper configuration. This section categorizes samples by their environment dependencies.

Tier A -- No env vars needed (can run in any CI)

These samples can run all tiers (build, test, type-check) in a clean CI environment with zero configuration.

Sample Notes
autonomous-pipeline Pure SDK usage, no external deps
hello-squad Pure SDK usage, no external deps
hook-governance Pure SDK usage, no external deps
skill-discovery Pure SDK usage, no external deps
streaming-chat Pure SDK usage, no external deps
rock-paper-scissors Pure SDK usage, no external deps
cost-aware-router Pure SDK usage, no external deps

Tier B -- Needs .env but can use dummy values for build/type-check

These samples reference environment variables but can still pass Tier 1 (build) and Tier 3 (type-check) without real values. Tier 2 (test execution) may fail without valid env vars.

Sample Env requirement CI strategy
knock-knock .env.example with GITHUB_TOKEN Run Tier 1 build only; skip Tier 2 test

Tier C -- Needs external services (skip Tier 2 tests, run Tier 1 build only)

These samples depend on external services (Azure, etc.) that are not available in standard CI. They should run Tier 1 (build check) only. Tier 2 tests will always fail without the service runtime.

Sample External dependency CI strategy
azure-function-squad Azure Functions runtime Tier 1 build only; test script is runtime execution, not unit test
storage-provider-azure Azure Storage / Azurite Tier 1 build only; also missing tsconfig.json (see Phase 2)
storage-provider-sqlite SQLite runtime Tier 1 build only; also missing tsconfig.json (see Phase 2)

Matrix integration

The workflow must determine each sample's environment tier and skip Tier 2 for Tier B and Tier C samples. Implementation options:

Option A (recommended): Maintain a ci-skip-tests.json file in samples/ that lists samples to skip Tier 2 for:

["knock-knock", "azure-function-squad", "storage-provider-azure", "storage-provider-sqlite"]

Option B: Check for a .ci-skip-tests marker file in each sample directory.

The validate step checks this list and runs npm test --if-present only for samples NOT in the skip list. All samples still run Tier 1 (build).


6. Architecture

Recommended approach: Separate workflow (Option B)

Create .github/workflows/squad-samples.yml. Do NOT add samples validation as a job in squad-ci.yml.

Rationale:

Trigger conditions

on:
  pull_request:
    branches: [dev, main]
    paths:
      - 'samples/**'
      - 'packages/squad-sdk/src/**'
  push:
    branches: [dev, main]
    paths:
      - 'samples/**'
      - 'packages/squad-sdk/src/**'

This means: a PR that only modifies docs or .squad/ files never triggers samples CI.

Feature flag integration

jobs:
  samples:
    if: vars.SQUAD_SAMPLES_CI != 'false'

Default: runs. Customers who want to opt out set vars.SQUAD_SAMPLES_CI = 'false' in their repo settings. Aligns with the vars.SQUAD_* feature flag pattern established in #98.

Skip label

If a PR is labeled skip-samples, the workflow skips. Useful for WIP or known-broken samples being fixed in a later PR.

if: vars.SQUAD_SAMPLES_CI != 'false' && !contains(github.event.pull_request.labels.*.name, 'skip-samples')

Note: The skip-samples label must be created in the repository as part of Phase 1 (see implementation phases). Without the label existing, the workflow condition will silently never match.

Matrix strategy -- dynamic (not hardcoded)

Do NOT hardcode the sample list. Use directory listing so new samples added to samples/ are automatically included.

Build sequence (critical -- npm ci does NOT build the SDK)

v2 fix (per FIDO review B1): The previous version incorrectly stated "npm ci at root builds the SDK from source." This is wrong. npm ci only installs dependencies; it does NOT compile TypeScript. An explicit build step is required.

The correct build sequence is:

1. npm ci                                    # Install root dependencies (does NOT compile)
2. npm run build -w packages/squad-sdk       # Build SDK from source (produces dist/)
3. Patch sample package.json refs            # Ensure all samples use file: ref (see B2 fix)
4. Per-sample: npm install                   # Install sample-specific dependencies
5. Per-sample: npm run build --if-present    # Tier 1 build check
6. Per-sample: npm test --if-present         # Tier 2 test (only for Tier A samples)
7. Per-sample: npx tsc --noEmit              # Tier 3 type-check fallback

Without step 2, samples referencing file:../../packages/squad-sdk will find a stale or empty dist/ directory and fail for the wrong reason. This corrupts the test signal.

SDK reference patching (critical -- 3 samples use published npm ref)

v2 fix (per FIDO review B2): Three samples reference the published npm registry instead of the local workspace path. They must be patched in CI to validate against the in-PR SDK.

The following samples currently use published npm references:

  • cost-aware-router: "@bradygaster/squad-sdk": "^0.8.0"
  • autonomous-pipeline: "@bradygaster/squad-sdk": "^0.8.0"
  • storage-provider-azure: "@bradygaster/squad-sdk": "latest"

The workflow must patch these to use file:../../packages/squad-sdk before running npm install. This ensures all 11 samples validate against the in-PR SDK version, not the last published version.

Implementation (added as a workflow step before per-sample install):

- name: Patch sample SDK references to use local workspace
  run: |
    SDK_REF="file:../../packages/squad-sdk"
    for pkg in samples/*/package.json; do
      if grep -q '"@bradygaster/squad-sdk"' "$pkg"; then
        sed -i "s|\"@bradygaster/squad-sdk\": \"[^\"]*\"|\"@bradygaster/squad-sdk\": \"$SDK_REF\"|" "$pkg"
      fi
    done
    echo "Patched SDK references:"
    grep -r '"@bradygaster/squad-sdk"' samples/*/package.json || true

This approach:

  • Patches ALL samples uniformly to use the local SDK
  • Runs before per-sample npm install so the local version is resolved
  • Prints patched refs for debugging
  • Does not modify committed files (CI working copy only)

Long-term fix: The 3 samples should update their committed package.json files to use file:../../packages/squad-sdk (Phase 2 cleanup). The CI patching step is a safety net.

Workflow YAML (complete)

name: Squad Samples CI

on:
  pull_request:
    branches: [dev, main]
    paths:
      - 'samples/**'
      - 'packages/squad-sdk/src/**'
  push:
    branches: [dev, main]
    paths:
      - 'samples/**'
      - 'packages/squad-sdk/src/**'

jobs:
  discover:
    if: vars.SQUAD_SAMPLES_CI != 'false' && !contains(github.event.pull_request.labels.*.name, 'skip-samples')
    runs-on: ubuntu-latest
    outputs:
      samples: ${{ steps.list.outputs.samples }}
    steps:
      - uses: actions/checkout@v4
      - id: list
        run: |
          samples=$(find samples -maxdepth 1 -mindepth 1 -type d -not -name '.*' -printf '%f\n' | sort | jq -R . | jq -sc .)
          if [ -z "$samples" ] || [ "$samples" = "[]" ]; then
            echo "::error::No samples found in samples/ directory"
            exit 1
          fi
          echo "samples=$samples" >> $GITHUB_OUTPUT
          echo "Discovered samples: $samples"

  validate:
    needs: discover
    runs-on: ubuntu-latest
    strategy:
      matrix:
        sample: ${{ fromJson(needs.discover.outputs.samples) }}
      fail-fast: false
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install root dependencies
        run: npm ci

      - name: Build SDK from source
        run: npm run build -w packages/squad-sdk

      - name: Patch sample SDK references to use local workspace
        run: |
          SDK_REF="file:../../packages/squad-sdk"
          pkg="samples/${{ matrix.sample }}/package.json"
          if [ -f "$pkg" ] && grep -q '"@bradygaster/squad-sdk"' "$pkg"; then
            sed -i "s|\"@bradygaster/squad-sdk\": \"[^\"]*\"|\"@bradygaster/squad-sdk\": \"$SDK_REF\"|" "$pkg"
            echo "Patched $pkg to use local SDK"
          fi

      - name: Verify sample has package.json
        run: |
          if [ ! -f "samples/${{ matrix.sample }}/package.json" ]; then
            echo "::error::No package.json found in samples/${{ matrix.sample }}"
            exit 1
          fi

      - name: Install sample dependencies
        run: cd samples/${{ matrix.sample }} && npm install

      - name: Build (Tier 1)
        run: cd samples/${{ matrix.sample }} && npm run build --if-present

      - name: Test (Tier 2 -- skip for env-dependent samples)
        run: |
          cd samples/${{ matrix.sample }}
          SKIP_TEST_SAMPLES="knock-knock azure-function-squad storage-provider-azure storage-provider-sqlite"
          if echo "$SKIP_TEST_SAMPLES" | grep -qw "${{ matrix.sample }}"; then
            echo "Skipping Tier 2 test for ${{ matrix.sample }} (env-dependent sample)"
          else
            npm test --if-present
          fi

      - name: Type-check fallback (Tier 3)
        if: always()
        run: |
          cd samples/${{ matrix.sample }}
          if [ ! -f package.json ] || ! grep -q '"build"' package.json; then
            if [ -f tsconfig.json ]; then
              npx tsc --noEmit
            else
              echo "::warning::No tsconfig.json in ${{ matrix.sample }} -- Tier 3 type-check skipped"
            fi
          fi

fail-fast: false is required so one failing sample does not cancel validation of all other samples.

Dependency installation

v2 fix (per FIDO review B1): Corrected build sequence. npm ci installs only; an explicit build step is required.

  1. npm ci at root -- installs all workspace dependencies (does NOT compile TypeScript)
  2. npm run build -w packages/squad-sdk -- compiles the SDK from source so dist/ is populated with the in-PR version
  3. Patch sample package.json SDK refs -- ensures all samples resolve to local SDK (safety net for samples using published npm refs)
  4. npm install per-sample -- resolves sample-specific dependencies against the freshly built local SDK
  5. Per-sample validation (Tier 1/2/3 as applicable)

This is critical: samples must build against the in-PR SDK, not the last published npm version. Steps 2 and 3 together guarantee this for all 11 samples.

Relationship to squad-ci.yml

squad-samples.yml is a separate workflow file. It does not call or depend on squad-ci.yml. The two workflows run in parallel when both are triggered. squad-ci.yml remains the required status check for merge. squad-samples.yml starts as informational (not required) until samples are confirmed green.


7. Samples That Need Fixes Before CI

The following samples have issues that need attention before CI produces reliable signal. These do not block Phase 1 (CI can run as informational), but must be fixed before Phase 3 (promoting to required status check).

Sample Issue Recommended fix
knock-knock No test script; needs GITHUB_TOKEN (.env.example) Add build script or accept Tier 3 type-check; env-dependent, Tier B (build-only in CI)
rock-paper-scissors No test script Add build script or accept Tier 3 type-check
storage-provider-azure Demo scripts only, no test; missing tsconfig.json Add tsconfig.json first, then add build script; env-dependent, Tier C (build-only in CI)
storage-provider-sqlite Demo scripts only, no test; missing tsconfig.json Add tsconfig.json first, then add build script; env-dependent, Tier C (build-only in CI)
azure-function-squad Test script is runtime execution, not unit test; needs Azure Functions env Tier C: run Tier 1 build only in CI; skip Tier 2 test execution
cost-aware-router Uses published npm SDK ref (^0.8.0), not local file: path Update package.json to use file:../../packages/squad-sdk (CI patches as safety net)
autonomous-pipeline Uses published npm SDK ref (^0.8.0), not local file: path Update package.json to use file:../../packages/squad-sdk (CI patches as safety net)

v2 correction (per FIDO review): azure-function-squad DOES import squad-sdk (via src/squad/config.ts). Previous version incorrectly stated it might not need SDK-change triggers. It does need them. However, its test script is a runtime execution that requires Azure Functions environment -- it should run Tier 1 build only, not Tier 2 test.

v2 addition (per FIDO review): storage-provider-azure and storage-provider-sqlite are missing tsconfig.json. Tier 3 type-check will fail on them until tsconfig.json is created. Added to Phase 2 checklist.


8. Minutes Impact

Per-PR estimate (when triggered)

Matrix runs samples in parallel. Wall-clock time is bounded by the slowest sample.

Step Estimated time
Root npm ci ~45-90 sec
SDK build (npm run build -w packages/squad-sdk) ~15-30 sec
SDK ref patching ~5 sec
Per-sample npm install (parallel) ~30-60 sec
Per-sample build (parallel) ~15-30 sec
Per-sample test (parallel) ~30-90 sec
Total wall-clock per triggered PR ~2.5-5.5 min

Minutes consumed (11 samples x ~2 min each): ~22 min per triggered PR.

Frequency filter (path filter saves most minutes)

Most PRs do not touch samples/ or packages/squad-sdk/src/. Estimated trigger rate: 20-30% of PRs. At 30 PRs/month: ~6-9 triggers. Monthly cost: ~130-200 min/month.

This fits within the Tier 1 budget defined in #98 (~400-800 min/month) with room to spare.

Feature flag opt-out

Customers who cannot afford the minutes set vars.SQUAD_SAMPLES_CI = 'false'. Samples CI never runs. No impact on their budget.

See #98 for the full tiered minutes architecture.


9. Implementation Phases

Phase 1 -- Wire CI (unblocks everything)

  • Create .github/workflows/squad-samples.yml
  • Implement discover job (dynamic matrix using find, with empty-array guard)
  • Implement validate job (Tier 1/2/3 per sample)
  • Add explicit npm run build -w packages/squad-sdk step after npm ci (B1 fix)
  • Add SDK reference patching step to ensure all samples use local file: ref (B2 fix)
  • Add environment-dependent sample handling: skip Tier 2 tests for Tier B/C samples (B4 fix)
  • Create the skip-samples label in the repository (B3 fix)
  • Add feature flag check (vars.SQUAD_SAMPLES_CI)
  • Add skip-samples label support in workflow condition
  • Set as informational (not required status check)
  • Verify all 11 samples run through the workflow

Owner: Procedures or FIDO
Dependency: None -- can start immediately

Phase 2 -- Fix samples missing build/test scripts and config

  • Add build script to knock-knock
  • Add build script to rock-paper-scissors
  • Add tsconfig.json to storage-provider-azure
  • Add tsconfig.json to storage-provider-sqlite
  • Add build script to storage-provider-azure (verify azurite files are excluded)
  • Add build script to storage-provider-sqlite
  • Update cost-aware-router package.json to use file:../../packages/squad-sdk
  • Update autonomous-pipeline package.json to use file:../../packages/squad-sdk
  • Update storage-provider-azure package.json to use file:../../packages/squad-sdk
  • Evaluate azure-function-squad test script (runtime exec vs unit test)

Owner: FIDO (verification) + EECOM (SDK alignment)
Dependency: Phase 1 (CI must be running to verify fixes work)

Phase 3 -- Promote to required status check

Owner: Flight (sign-off)
Dependency: Phase 2 complete, all samples green

Phase 4 -- Runtime config integration

Owner: EECOM
Dependency: Phase 3 (config integration is polish, not blocking)


10. Interaction with Other Issues

#98 -- Minutes optimization PRD

samples CI adds ~130-200 min/month at estimated trigger rate. This fits within Tier 1 budget. Feature flag vars.SQUAD_SAMPLES_CI matches the vars.SQUAD_* pattern from #98. Phase 4 of this issue (runtime config) should extend the ci.features map in #98's config schema.

#104 -- PR completeness gates

This issue is Gate 3 in #104. Gate 3 reads: "if PR modifies packages/squad-sdk/src/, build all samples to catch breakage." Phase 3 of this issue (promoting to required status check) directly satisfies Gate 3 in #104.

bradygaster#640 -- StorageProvider PR

PR bradygaster#640 ships storage-provider-azure and storage-provider-sqlite with no CI. Phase 1 of this issue immediately gives those samples CI coverage. Phase 2 should include adding tsconfig.json and build scripts to both storage-provider samples (currently demo-only, missing tsconfig.json). bradygaster#640 should not merge until Phase 1 is in place or the samples are explicitly accepted as Tier 1 (build-only).

#100 -- Review completeness

#100 asks what reviewers should catch. Samples CI coverage is exactly the kind of systematic gap that CI should catch rather than reviewers. Phase 3 (required status check) satisfies the review completeness requirement for samples.


11. Acceptance Criteria

  • CI workflow .github/workflows/squad-samples.yml exists and triggers on samples/** and packages/squad-sdk/src/** path changes
  • CI builds all 11 samples (Tier 1 or Tier 3 minimum for each)
  • CI runs tests for Tier A samples that have test scripts (Tier 2) and skips Tier 2 for env-dependent Tier B/C samples
  • Matrix is dynamic (reads directory listing via find, not hardcoded sample names)
  • Discover step has empty-array guard (fails if no samples found)
  • fail-fast: false so one failing sample does not cancel others
  • Root npm ci AND npm run build -w packages/squad-sdk run before per-sample install; SDK dist/ is freshly compiled from in-PR source
  • All 11 samples that import squad-sdk reference file:../../packages/squad-sdk (patched in CI if not already committed)
  • Feature flag vars.SQUAD_SAMPLES_CI = 'false' disables the workflow
  • skip-samples label exists in the repository and workflow respects it
  • New sample added to samples/ is automatically included in next CI run without workflow edits
  • Failing sample produces clear error output identifying which sample and which step failed
  • squad-ci.yml is not modified (samples CI is separate workflow)
  • Env-dependent samples (Tier B/C) run Tier 1 build only; do not produce false-negative test failures
  • Tier 3 type-check skips gracefully for samples missing tsconfig.json (warning, not error)
  • Phase 3: squad-samples / validate promoted to required status check after 2+ weeks green

Changelog

v2 (per FIDO quality review):

  • B1: Fixed incorrect claim that npm ci builds the SDK. Added explicit npm run build -w packages/squad-sdk step. Updated build sequence, workflow YAML, dependency installation section, and acceptance criteria.
  • B2: Added SDK reference patching step for 3 samples (cost-aware-router, autonomous-pipeline, storage-provider-azure) that use published npm refs instead of local file: path. Added to Architecture, Phase 1, Phase 2, and acceptance criteria.
  • B3: Added "Create skip-samples label" to Phase 1 checklist. Added acceptance criterion for label existence.
  • B4: Added new Section 5 (Environment-Dependent Samples) categorizing samples into Tier A/B/C by env requirements. Updated workflow YAML to skip Tier 2 tests for env-dependent samples. Updated acceptance criteria.
  • Bonus: Corrected azure-function-squad inventory to show it imports squad-sdk (via builders). Added tsconfig.json gap for storage-provider-azure and storage-provider-sqlite to Phase 2 and Samples Needing Fixes. Updated discover step to use find (not ls) with empty-array guard. Added Tier 3 graceful skip for missing tsconfig.json.

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions