
feat(workflows): postAction hook workflow for #500#527

Open
khaliqgant wants to merge 5 commits into main from feature/500-post-command-workflow

Conversation


@khaliqgant khaliqgant commented Mar 10, 2026

Summary

  • Adds relay.post-action-hook.yaml workflow to coordinate agents implementing the postAction hook feature from issue #500 (WorkflowRunner: add postCommand hook to workflow steps)
  • 11-phase pipeline: read context, add types, implement runner logic, update validator, write tests, type-check, run tests, fix failures, final tests, capture diff, code review
  • 6 agents (types-author, implementer, validator-author, test-writer, fixer, reviewer) with appropriate presets

What the workflow implements

The postAction hook adds an optional postAction field to WorkflowStep supporting:

  • Shell commands via child_process.execSync
  • HTTP webhooks via fetch with POST/PUT/PATCH methods
  • Template interpolation for {{step.output}}, {{step.name}}, {{run.id}}
  • Configurable failAction: 'fail' (abort) or 'warn' (log and continue)
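In sketch form, the field described above might look like the following. This is a hypothetical rendering — the actual definitions live in the SDK's types.ts and the exact names may differ:

```typescript
// Hypothetical shape of the postAction field; names mirror the PR
// summary above, not the merged types.ts.
type FailAction = 'fail' | 'warn';

type PostAction =
  | { type: 'shell'; command: string; failAction?: FailAction }
  | {
      type: 'webhook';
      url: string;
      method?: 'POST' | 'PUT' | 'PATCH';
      body?: string; // may contain {{step.output}}-style templates
      failAction?: FailAction;
    };

// Minimal interpolation for the {{...}} placeholders listed above;
// unknown placeholders are left untouched.
function interpolate(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match, key) =>
    key in vars ? vars[key] : match
  );
}
```

With this sketch, `interpolate('run {{run.id}}', { 'run.id': 'r42' })` yields `run r42`, and a placeholder with no matching variable passes through unchanged.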

Files

  • relay.post-action-hook.yaml — workflow definition

Test plan

  • Validate YAML with agent-relay workflow validate relay.post-action-hook.yaml
  • Dry-run with agent-relay workflow dry-run relay.post-action-hook.yaml
  • Execute the workflow to implement the feature end-to-end

Closes #500

🤖 Generated with Claude Code



Add agent-relay workflow YAML that coordinates agents to implement
the postAction hook feature on WorkflowStep, supporting shell
commands and HTTP webhooks after step completion.

Closes #500

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@khaliqgant khaliqgant left a comment


Workflow Review: relay.post-action-hook.yaml

Overall this is a well-structured workflow with solid DAG dependencies and good phasing. A few issues to address:

Issues (should fix)

1. onError: continue is not a valid value
The WorkflowDefinition type (types.ts:176) only accepts 'fail' | 'skip' | 'retry'. The runner internally maps 'skip' to a continue strategy, so this should be:

onError: skip

2. Swarm pattern pipeline is semantically incorrect
Steps implement-runner and update-validator run in parallel (both depend on add-types but not on each other). This is a DAG, not a pipeline. The SwarmPattern type includes 'dag' which is the correct choice:

pattern: dag

3. awk-based code extraction is fragile
The read-runner-spawn, read-runner-interpolate, and read-runner-deterministic steps use awk regex patterns to extract method bodies from runner.ts (a 3000+ line file). These patterns assume specific indentation and method signatures that may not match. If they fail, the implementer agent will operate with incomplete context.

Suggestion: Use sed -n 'START,ENDp' with known line ranges, or simply cat the full file and let the agent parse it.

4. Security: execSync with interpolated values
The implementer task provides code that passes interpolated {{step.output}} values directly into execSync. This is a command injection vector — a step's output could contain shell metacharacters. The task description should instruct the agent to sanitize/escape interpolated values before shell execution, or document this as an accepted risk.

Suggestions (consider)

5. fix-if-broken runs unconditionally
The fixer agent spawns even when type-check and tests pass. This wastes an agent invocation and risks unwanted changes. The task description does say "If both show PASSED, output: FIX_DONE:none" which mitigates this, but a conditional skip mechanism would be cleaner.

6. Missing per-step timeouts
Individual agent steps don't specify timeoutMs. Only the global 40-min swarm timeout exists. If one agent hangs, it could consume most of the time budget. Consider adding per-step timeouts (e.g., 5 min for types-author, 10 min for implementer).

7. Phase numbering gap
Comments jump from "Phase 3" to "Phase 5" — update-validator is labeled "Phase 3 parallel" instead of Phase 4. Minor but confusing.

What's done well

  • DAG structure is correct: Read steps have no dependencies (parallel), types before implementation, implementation before tests, tests before fix, fix before final validation. The dependency graph is sound.
  • Preflight checks verify target files exist before running — good safety net.
  • failOnError: false on read steps prevents blocking if file paths shift.
  • maxIterations: 2 on fixer prevents infinite retry loops.
  • Reviewer agent uses a different model (sonnet vs o4-mini) — good diversity of perspective.
  • capture-diff before review ensures the reviewer sees the complete change.
  • Task descriptions are detailed with exact code snippets and file paths — agents have clear instructions.
  • Test coverage plan is comprehensive: shell, webhook, interpolation, failAction warn/fail, skip when undefined, both combined.

khaliqgant and others added 2 commits March 10, 2026 11:55
- Change onError from 'continue' to 'skip' (valid enum value)
- Change swarm pattern from 'pipeline' to 'dag' (steps run in parallel)
- Replace fragile awk-based code extraction with simple cat commands
- Add command injection security note to implementer task
- Add per-step timeoutMs values (5 min default, 10 min for implementer)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
… workflow

The type-check step piped tsc output through tail, which swallowed tsc's
exit code (tail always exits 0), causing TYPE_CHECK_PASSED to always be
printed regardless of actual type errors. Applied the same fix to run-tests
and final-tests steps for consistency: capture command output and exit code
in variables before piping through tail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 7 additional findings in Devin Review.


Comment on lines +178 to +179
verification:
  type: exit_code

@devin-ai-integration devin-ai-integration bot Mar 10, 2026


🟡 All verification blocks missing required value field for exit_code type

All 6 agent steps with verification specify only type: exit_code and omit the value field, which is required by both the VerificationCheck TypeScript interface (packages/sdk/src/workflows/types.ts:307-311) and the JSON schema (packages/sdk/src/workflows/schema.json:690-692). Every other workflow YAML in the repo (e.g., relay.workflow-hardening.yaml, relay.bug-fixes.yaml) provides both type and value. The runner doesn't crash at runtime, because exit_code is a no-op in runVerification (packages/sdk/src/workflows/runner.ts:3694-3696). However, the missing field causes buildSupervisorVerificationGuide (runner.ts:2515) to produce a degraded supervisor prompt containing undefined instead of a meaningful value, and it violates the defined schema contract.


@khaliqgant (Member, Author) replied:

Good catch — exit_code verification type requires a value field (the expected exit code). These should be value: 0 for success checks. Fixed in f807d5b along with similar issues in type-check and final-tests steps.
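For reference, a corrected verification block would look like this (a sketch — surrounding step fields omitted):

```yaml
verification:
  type: exit_code
  value: 0   # expected exit code for a successful step
```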

- All agents now use preset: worker/reviewer (non-interactive)
- Replaced all output_contains verification with exit_code
  (tokens appeared in task text → double-occurrence problem)
- Removed duplicate deterministic reads (read-runner-spawn,
  read-runner-interpolate, read-runner-deterministic all catted the
  same file) — replaced with targeted grep/sed extracts
- Shortened task prompts from 60+ to ~15 lines
- Reduced from 11 phases to 3 (context → impl → verify/fix)
- Added file verification gates between implementation waves
- Removed coordination/barriers/state config (unnecessary overhead)
- Channel renamed to wf-post-action (skill says use wf- prefix)
- Removed per-step timeoutMs (skill says use global timeout only)
- Fixed read-runner steps to extract relevant sections instead of
  catting the entire 6000-line runner.ts 3 times

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View 6 additional findings in Devin Review.


Comment on lines +290 to +294
command: |
  cd packages/sdk && npx vitest run 2>&1 | tail -40
  exit_code=$?
  [ $exit_code -eq 0 ] && echo "ALL_TESTS_PASSED" || echo "TESTS_FAILED"
  exit $exit_code

🔴 Pipeline captures tail exit code instead of vitest exit code, making final-tests always pass

In the final-tests step, the command npx vitest run 2>&1 | tail -40 is a pipeline, and exit_code=$? on the following line captures the exit code of the last command in the pipeline (tail -40), not npx vitest run. Since tail almost always exits 0, exit_code will be 0 regardless of whether tests pass or fail. This means the step always prints "ALL_TESTS_PASSED" and exits 0, defeating the failOnError: true gate.

This is critical because final-tests is the gate step before capture-diff and review — both depend on it. The review step receives {{steps.final-tests.output}} as context, so the reviewer would see "ALL_TESTS_PASSED" even when tests are failing, potentially approving broken code.

Confirmed with shell reproduction
$ echo '
exit_with_1() { echo "FAIL"; exit 1; }
exit_with_1 2>&1 | tail -40
exit_code=$?
echo "exit_code=$exit_code"
[ $exit_code -eq 0 ] && echo "ALL_TESTS_PASSED" || echo "TESTS_FAILED"
' | sh
FAIL
exit_code=0
ALL_TESTS_PASSED

The runner spawns commands via sh -c (packages/sdk/src/workflows/runner.ts:1948), which does not support pipefail.

Suggested change
command: |
  cd packages/sdk && npx vitest run 2>&1 | tail -40
  exit_code=$?
  [ $exit_code -eq 0 ] && echo "ALL_TESTS_PASSED" || echo "TESTS_FAILED"
  exit $exit_code
command: |
  cd packages/sdk
  output=$(npx vitest run 2>&1); exit_code=$?
  echo "$output" | tail -40
  [ $exit_code -eq 0 ] && echo "ALL_TESTS_PASSED" || echo "TESTS_FAILED"
  exit $exit_code

