fix(coordinator): persist agent results immediately after read_agent #684

Open
tamirdresher wants to merge 1 commit into dev from squad/652-persist-agent-results

@tamirdresher
Collaborator

Summary

read_agent results expire within ~2-3 minutes. When the coordinator takes too long between collecting and processing results (e.g., during fan-out with many agents), data is silently lost. This adds a mandatory persistence step to the "After Agent Work" coordinator flow.

Changes

New step 2 — Persist results immediately:

  • After read_agent returns, write each agent's result to .squad/orchestration-log/{timestamp}-{agent-name}.md BEFORE any other processing
  • Defined format: Task / Result / Files Modified
  • Explicitly marked as the ONE exception to the "no file I/O" LEAN rule
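
The persistence step above could be sketched roughly as follows. This is only an illustration: the PR changes coordinator instructions, not runtime code, so `persist_result` and its parameters are hypothetical names, and the timestamp layout is assumed from the `20260401T1423Z` example in the diff.

```python
from datetime import datetime, timezone
from pathlib import Path

DASH = "\u2014"  # separator used in the header line of the PR's entry format


def persist_result(log_dir, agent_name, task, result, files_modified, now=None):
    """Write one agent's result to {log_dir}/{timestamp}-{agent_name}.md and return the path.

    Hypothetical helper: nothing with this name exists in the repository.
    """
    now = now or datetime.now(timezone.utc)
    # Compact ISO 8601 UTC timestamp, e.g. 20260401T1423Z.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%MZ")
    directory = Path(log_dir)
    directory.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    files = "\n".join(f"- {name}" for name in files_modified) or "None detected"
    body = (
        f"# {agent_name} {DASH} {stamp}\n"
        f"## Task\n{task}\n"
        f"## Result\n{result}\n"
        f"## Files Modified\n{files}\n"
    )
    path = directory / f"{stamp}-{agent_name}.md"
    path.write_text(body, encoding="utf-8")
    return path
```

A call such as `persist_result(".squad/orchestration-log", "agent-a", "what was asked", "summary", [])` would write the entry before any presentation or Scribe work begins, which is the ordering the new step requires.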

Updated step 3 — Silent success detection:

  • Now also checks .squad/orchestration-log/ for files agents may have written directly during their run

Updated Scribe task 3:

  • Changed from "Write" to "Enrich" — Scribe now enriches coordinator-written logs rather than creating from scratch

Housekeeping:

  • Steps renumbered 3→4, 4→5, 5→6, 6→7
  • LEAN instruction updated: (1) persist results, (2) present compact results, (3) spawn Scribe

Files Changed

  • .github/agents/squad.agent.md
  • templates/squad.agent.md.template
  • packages/squad-cli/templates/squad.agent.md.template
  • packages/squad-sdk/templates/squad.agent.md.template

Testing

Coordinator instruction change only — no runtime code. Verified all 4 files have identical changes.

Refs #652

…652)

read_agent data expires within ~2-3 minutes, causing silent data loss when
the coordinator takes too long between collecting results and processing them.

Changes to After Agent Work flow:
- Add mandatory step 2: write results to orchestration-log BEFORE any other
  processing (presenting results, spawning Scribe, etc.)
- Update silent-success detection (now step 3) to check orchestration-log for
  files agents may have written directly
- Update Scribe task 3 to enrich coordinator-written logs rather than create
  from scratch
- Renumber steps 3-6 to 4-7
- Update LEAN instruction to acknowledge persistence write as the one
  permitted file I/O operation

Refs #652

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings March 29, 2026 14:58
Contributor

Copilot AI left a comment

Pull request overview

Updates the Squad coordinator “After Agent Work” instructions to mitigate read_agent result expiration by persisting agent outputs immediately and adjusting downstream Scribe/orchestration-log behavior.

Changes:

  • Add a new mandatory step to persist read_agent results to .squad/orchestration-log/ before any further processing.
  • Extend “silent success detection” to also look for agent-written orchestration-log files.
  • Update Scribe’s orchestration-log task from “write” to “enrich”, and renumber subsequent coordinator steps.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| .github/agents/squad.agent.md | Updates authoritative coordinator + Scribe post-agent-work procedure (new persistence step, renumbering). |
| templates/squad.agent.md.template | Mirrors the updated coordinator/Scribe instructions for template distribution. |
| packages/squad-cli/templates/squad.agent.md.template | Mirrors the updated template in the CLI package. |
| packages/squad-sdk/templates/squad.agent.md.template | Mirrors the updated template in the SDK package. |

Comment on lines +848 to +856
**⚡ Keep the post-work turn LEAN.** Coordinator's job: (1) persist results, (2) present compact results, (3) spawn Scribe. That's ALL. No decision consolidation, no heavy file I/O beyond the mandatory persistence write in step 2.

**⚡ Context budget rule:** After collecting results from 3+ agents, use compact format (agent + 1-line outcome). Full details go in orchestration log via Scribe.

After each batch of agent work:

1. **Collect results** via `read_agent` (wait: true, timeout: 300).

2. **Silent success detection** — when `read_agent` returns empty/no response:
2. **Persist results immediately** — `read_agent` data expires within ~2-3 minutes. Write each agent's result to `.squad/orchestration-log/{timestamp}-{agent-name}.md` **BEFORE** any other processing (presenting results, spawning Scribe, etc.). Create the directory if it doesn't exist. Use ISO 8601 UTC timestamp (e.g. `20260401T1423Z`). Format:

Copilot AI Mar 29, 2026

Template parity: these edits were applied to mirror copies, but the canonical template .squad-templates/squad.agent.md still has the old “After Agent Work” steps. The repo’s template-sync tests enforce byte-for-byte equality from .squad-templates/ to templates/, packages/*/templates/, and .github/agents/ (see test/template-sync.test.ts:4-6,118-141), so this PR will fail CI and/or get reverted the next time node scripts/sync-templates.mjs runs. Please apply the change to .squad-templates/squad.agent.md and re-run the sync script so all mirrors match.
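
The byte-for-byte parity requirement described in this comment can be illustrated with a small check. This is a sketch, not the repository's `test/template-sync.test.ts`; `find_drifted_mirrors` is a hypothetical name, and the canonical and mirror paths are supplied by the caller rather than assumed.

```python
from pathlib import Path


def find_drifted_mirrors(canonical, mirrors):
    """Return the mirror paths that are missing or whose bytes differ from the canonical file."""
    expected = Path(canonical).read_bytes()
    drifted = []
    for mirror in map(Path, mirrors):
        # A mirror passes only if it exists and matches the canonical bytes exactly.
        if not mirror.is_file() or mirror.read_bytes() != expected:
            drifted.append(mirror)
    return drifted
```

An empty return value means every mirror matches; any path in the list is a copy that the sync script would overwrite on its next run.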

Copilot uses AI. Check for mistakes.
Comment on lines +856 to +866
2. **Persist results immediately** — `read_agent` data expires within ~2-3 minutes. Write each agent's result to `.squad/orchestration-log/{timestamp}-{agent-name}.md` **BEFORE** any other processing (presenting results, spawning Scribe, etc.). Create the directory if it doesn't exist. Use ISO 8601 UTC timestamp (e.g. `20260401T1423Z`). Format:
```markdown
# {Agent Name} — {ISO 8601 UTC timestamp}
## Task
{what was asked}
## Result
{agent output summary}
## Files Modified
{list of files the agent created or changed, or "None detected"}
```
This is the ONE exception to "no file I/O" — it is mandatory because results are unrecoverable once expired.

Copilot AI Mar 29, 2026

The new persistence step writes coordinator output into .squad/orchestration-log/{timestamp}-{agent-name}.md using a new header-based format, but this conflicts with existing orchestration-log conventions in this repo: templates/orchestration-log.md defines a table-based entry and says entries are created BEFORE spawning (and append-only), and this same squad.agent prompt elsewhere says “Scribe writes one entry per agent … format matches the existing orchestration log entry template.” Either adjust the persistence location/filename (e.g., a separate “raw results” log) or update the orchestration-log template + the earlier sections so the coordinator-written file format and timing are consistent.

Comment on lines +856 to +869
2. **Persist results immediately** — `read_agent` data expires within ~2-3 minutes. Write each agent's result to `.squad/orchestration-log/{timestamp}-{agent-name}.md` **BEFORE** any other processing (presenting results, spawning Scribe, etc.). Create the directory if it doesn't exist. Use ISO 8601 UTC timestamp (e.g. `20260401T1423Z`). Format:
```markdown
# {Agent Name} — {ISO 8601 UTC timestamp}
## Task
{what was asked}
## Result
{agent output summary}
## Files Modified
{list of files the agent created or changed, or "None detected"}
```
This is the ONE exception to "no file I/O" — it is mandatory because results are unrecoverable once expired.

3. **Silent success detection** — when `read_agent` returns empty/no response:
- Check `.squad/orchestration-log/` for a file the agent may have written directly during its run.

Copilot AI Mar 29, 2026

Flow ambiguity: step 2 is written as mandatory “persist results immediately” for each agent, but step 3 is specifically for the case where read_agent returns empty/no response (i.e., there is no result content to persist). Please clarify the conditional behavior (e.g., only persist when read_agent returns content, or persist a stub that explicitly records the loss) so the numbered procedure is executable as written.
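
One way to make the procedure executable is the second option mentioned above: always persist, but write an explicit stub when there is nothing to record. The sketch below assumes that choice; `result_section` and the stub wording are hypothetical and are not taken from the PR.

```python
# Hypothetical stub text; the PR does not define one.
LOSS_STUB = (
    "No response returned by read_agent. "
    "Result may be lost; run silent success detection."
)


def result_section(raw):
    """Return (text, is_stub) for the Result section of a log entry.

    `raw` is whatever read_agent returned: a string, an empty string, or None.
    """
    if raw is None or not raw.strip():
        return LOSS_STUB, True
    return raw.strip(), False
```

With this shape, step 2 runs unconditionally for every agent, and the `is_stub` flag tells the coordinator which agents still need the step 3 check.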

1. DECISIONS ARCHIVE [HARD GATE]: If decisions.md >= 20480 bytes, archive entries older than 30 days NOW. If >= 51200 bytes, archive entries older than 7 days. Do not skip this step.
2. DECISION INBOX: Merge .squad/decisions/inbox/ → decisions.md, delete inbox files. Deduplicate.
3. ORCHESTRATION LOG: Write .squad/orchestration-log/{timestamp}-{agent}.md per agent. Use ISO 8601 UTC timestamp.
3. ORCHESTRATION LOG: Enrich .squad/orchestration-log/{timestamp}-{agent}.md files written by coordinator in step 2. Add detail from agent history if available. Write new entries for any agents not yet logged. Use ISO 8601 UTC timestamp.

Copilot AI Mar 29, 2026

Placeholder/filename inconsistency: coordinator step 2 uses {timestamp}-{agent-name}.md, while the Scribe task refers to {timestamp}-{agent}.md. To avoid Scribe missing files or generating duplicates, please standardize on one placeholder and explicitly state it should use the spawned agent name (lowercase cast id) as the filename-safe identifier.

Suggested change
3. ORCHESTRATION LOG: Enrich .squad/orchestration-log/{timestamp}-{agent}.md files written by coordinator in step 2. Add detail from agent history if available. Write new entries for any agents not yet logged. Use ISO 8601 UTC timestamp.
3. ORCHESTRATION LOG: Enrich .squad/orchestration-log/{timestamp}-{agent-name}.md files written by coordinator in step 2, where `agent-name` is the spawned agent `name` (cast id), lowercased and made filename-safe. Add detail from agent history if available. Write new entries for any agents not yet logged. Use ISO 8601 UTC timestamp.
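
"Lowercased and made filename-safe" is not defined further in the suggestion. A minimal sketch of one possible rule follows, assuming that any run of characters outside `a-z0-9` collapses to a single hyphen; `agent_slug` and `log_filename` are hypothetical names.

```python
import re


def agent_slug(name):
    """Lowercase an agent name and replace runs of unsafe characters with '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")
    if not slug:
        raise ValueError(f"agent name {name!r} has no filename-safe characters")
    return slug


def log_filename(stamp, name):
    """Build the {timestamp}-{agent-name}.md filename used by coordinator and Scribe."""
    return f"{stamp}-{agent_slug(name)}.md"
```

If both the coordinator and Scribe derive the filename through the same rule, Scribe finds the coordinator-written file instead of creating a near-duplicate.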
