From 4a027abbc90c99df97a9e4678c1ef03781987598 Mon Sep 17 00:00:00 2001 From: Pawel Zmarzly Date: Wed, 25 Mar 2026 20:00:54 +0000 Subject: [PATCH] tools: claude: use new review prompt to catch more issues --- .github/workflows/claude-pr-review.yaml | 49 ++++++++++++++++--------- 1 file changed, 32 insertions(+), 17 deletions(-) diff --git a/.github/workflows/claude-pr-review.yaml b/.github/workflows/claude-pr-review.yaml index 9c8b233e5..c14e5eaae 100644 --- a/.github/workflows/claude-pr-review.yaml +++ b/.github/workflows/claude-pr-review.yaml @@ -187,41 +187,55 @@ jobs: in the issue comments. If one exists, you will use it as the basis for your `summary` in Phase 3. - ## Phase 2: Parallel review subagents + ## Phase 2: Independent parallel reviewers Read `.claude/commands/review-pr.md` for the review checklist. All build, test, coverage, and cleanup instructions do not apply — other CI workflows handle those. Also skip the "Output format" and "Report" sections - they are not applicable here. - For the remaining sections (Code review, Style, Documentation, Commit), - launch subagents to review them in parallel. Group related sections - as needed — use 2-8 subagents based on PR size and scope. - Give each subagent the PR title, description, full patch, and the list of changed files. + Launch a team of 5 independent review agents. Each agent reviews the ENTIRE PR + independently across ALL remaining sections (Code review, Style, Documentation, Commit). + + CRITICAL: You MUST launch all 5 agents in a SINGLE assistant message containing + 5 parallel Agent tool calls. Do NOT use `run_in_background`. Do NOT launch them + one at a time across multiple turns. All 5 must be foreground calls in one message + so they run concurrently and you automatically wait for all of them to finish + before proceeding. + + Give each agent a unique reviewer ID (reviewer-1 through reviewer-5) and pass them + the PR title, description, full patch, and the list of changed files. Tell them to look at the pre-patch repo for context, and to read `.claude/commands/review-pr.md` and `doc/developers/style.rst` for review guidelines. Tell them to skip the "Output format" and "Report" subsections in review-pr.md — those are for the standalone slash-command, not for subagent output. - Each subagent must return a JSON array of issues: + Each agent works independently and must NOT communicate with other agents. + Each agent must return a JSON array of issues: `[{"file": "path", "line": (optional), "severity": "must-fix|suggestion|nit", "body": "..."}]` - `line` should be a line number from the NEW side of the diff **that appears inside - a diff hunk** (i.e. a line that is shown in the patch output). GitHub's review - comment API rejects lines outside the diff context, so never reference lines - that are not visible in the patch. If you cannot determine a valid diff line, - omit `line` — the comment will still appear in the tracking comment summary. + Do NOT escape characters in `body`. Write plain markdown — no backslash + escaping of `!` or other characters. + + `line` should be a line number from the NEW side of the diff **that appears + inside a diff hunk**. GitHub rejects lines outside the diff context. If you + cannot determine a valid diff line, omit `line`. - Each subagent MUST verify its findings before returning them: + Each agent MUST verify its findings before returning them and only return + issues it is confident about: - For style/convention claims, check at least 3 existing examples in the codebase to confirm the pattern actually exists before flagging a violation. - For "use X instead of Y" suggestions, confirm X actually exists and works for this case. - - If unsure, don't include the issue. + - If unsure about an issue, do NOT include it. Only return findings you are confident are real. ## Phase 3: Collect, deduplicate, and summarize - After ALL subagents complete: - 1. Collect all issues. Merge duplicates (same file, lines within 3 of each other, same problem). - 2. Drop low-confidence findings. + After ALL 5 reviewers complete (do NOT proceed until every agent has returned): + 1. Collect all issues from all reviewers. Merge duplicates (same file, lines within 3 of each + other, same problem). Keep ALL unique findings — each reviewer has already filtered out + low-confidence issues, so everything that made it through is worth reporting. + Issues found by multiple reviewers independently are higher confidence. + 2. Do NOT drop findings just because only one reviewer flagged them. A single reviewer + catching a real issue is the whole point of having 5 independent reviewers. 3. For CLAUDE.md violations that appear in 3+ existing places in the codebase, do NOT include them in `comments`. Instead, add them to the 'CLAUDE.md improvements' section of the summary. 4. Check the existing inline review comments fetched in Phase 1. Do NOT include a @@ -294,9 +308,10 @@ jobs: claude_args: | --max-turns 100 --model us.anthropic.claude-opus-4-6-v1 + --effort max --json-schema '${{ env.REVIEW_SCHEMA }}' --allowedTools " - Read,LS,Grep,Glob,Task, + Read,LS,Grep,Glob,Task,Agent, Bash(cat:*),Bash(test:*),Bash(printf:*),Bash(jq:*),Bash(head:*),Bash(tail:*), Bash(git:*),Bash(gh:*),Bash(grep:*),Bash(find:*),Bash(ls:*),Bash(wc:*), Bash(diff:*),Bash(sed:*),Bash(awk:*),Bash(sort:*),Bash(uniq:*),