123 changes: 123 additions & 0 deletions .claude/commands/finalize.md
@@ -0,0 +1,123 @@
Get the current feature branch ready for PR merge. Follow these phases exactly.

## Phase 1: Merge Latest Main

1. Capture current branch: `git branch --show-current`
2. Check for dirty working tree: `git status --porcelain`
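   - If dirty: `git stash --include-untracked` (`--porcelain` also lists untracked files, which a plain `git stash` leaves behind)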
3. `git checkout main && git pull origin main`
4. `git checkout <branch> && git merge main`
5. If the merge stops with conflicts (`git status` lists the files under "Unmerged paths"):
   - Report the conflicting files
   - **STOP.** Do not proceed; the user must resolve conflicts manually.
6. If stashed: `git stash pop`
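
The conflict check in step 5 can be automated by parsing the output of `git status --porcelain`. The sketch below is a minimal illustration, not part of the command itself: the function name `unmerged_paths` is hypothetical, and it assumes the default v1 porcelain format, in which unmerged entries carry one of the two-letter codes `DD`, `AU`, `UD`, `UA`, `DU`, `AA`, or `UU`.

```python
# Two-letter codes that `git status --porcelain` (v1 format) uses for unmerged paths.
UNMERGED_CODES = {"DD", "AU", "UD", "UA", "DU", "AA", "UU"}


def unmerged_paths(porcelain_output: str) -> list[str]:
    """Return the paths that git reports as unmerged (hypothetical helper)."""
    conflicted = []
    for line in porcelain_output.splitlines():
        # Each v1 porcelain line is "XY <path>": two status characters, a space, the path.
        if len(line) > 3 and line[:2] in UNMERGED_CODES:
            conflicted.append(line[3:])
    return conflicted


sample = "UU src/app.py\nM  README.md\nAA docs/guide.md\n?? notes.txt\n"
print(unmerged_paths(sample))  # ['src/app.py', 'docs/guide.md']
```

A non-empty result corresponds to the **STOP** condition in step 5.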

## Phase 2: Local PR Review (Replicating claude.yml)

### Generate diff artifacts
```bash
git diff origin/main...HEAD > pr-diff.txt
git diff --name-status origin/main...HEAD > pr-files.txt
```

### Self-review using the claude.yml checklist

Read `.github/workflows/claude.yml` lines 78–246 for the full review criteria. Apply that same checklist to the current branch changes.

**File reading strategy:**
- Read `pr-diff.txt` first — it shows ALL changes
- Read `pr-files.txt` to see which files changed
- For large files (>1000 lines), use Grep with context or Read with offset/limit — do NOT read the entire file
- Focus on reviewing CHANGED code, not entire files
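
To decide which files deserve a closer look, the contents of `pr-files.txt` can be turned into a list of paths. The following is a rough sketch under stated assumptions: the helper names are made up for illustration, and the parser relies on the tab-separated layout that `git diff --name-status` emits (renames and copies carry a similarity score and two paths).

```python
def parse_name_status(text: str) -> list[tuple[str, str]]:
    """Parse `git diff --name-status` output into (status, path) pairs."""
    changes = []
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        # Renames/copies look like "R100<TAB>old<TAB>new": keep the status
        # letter without the score, and the new path.
        status, path = fields[0][0], fields[-1]
        changes.append((status, path))
    return changes


def files_to_review(text: str) -> list[str]:
    """Paths that still exist on the branch; deleted files are left out."""
    return [path for status, path in parse_name_status(text) if status != "D"]


sample = "M\tsrc/a.py\nA\ttests/test_a.py\nD\told.py\nR100\tx.py\ty.py\n"
print(files_to_review(sample))  # ['src/a.py', 'tests/test_a.py', 'y.py']
```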

**Apply all 7 review sections:**

1. **Code Quality & Patterns** — architecture consistency, pattern reuse, error handling, code style, CLAUDE.md compliance
2. **Security** — SQL injection, command injection, XSS, secrets exposure, path traversal, unsafe deserialization, resource cleanup
3. **Testing** — tests exist for new functionality, edge cases covered, test quality
4. **Documentation** — docs updated for new features/CLI commands/SDK changes
5. **Breaking Changes & Compatibility** — public API changes, backward compatibility
6. **Performance & Architecture** — N+1 queries, inefficient algorithms, unnecessary dependencies
7. **Commit Quality** — commit messages are clear and logical

**Classify all findings:**
- 🔴 **Critical** — Security issues, breaking changes, data loss risks
- 🟡 **Important** — Bugs, architectural concerns, missing tests
- 🟢 **Minor** — Style issues, optimizations, suggestions

**DO NOT review or flag:**
- Copyright headers (presence, absence, or year inconsistencies)
- SPDX license identifiers
- License-related boilerplate

### Fix findings
- Fix all 🔴 Critical and 🟡 Important issues immediately using Edit/Write tools
- Skip 🟢 Minor issues unless the fix is trivial (one-liner)
- For security issues: fix them directly
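
The triage rule above (fix Critical and Important, skip Minor unless the fix is a one-liner) can be expressed as a small filter. This is only an illustrative sketch: the `Finding` class, its `trivial` flag, and the severity strings are assumptions introduced here, not something defined by `claude.yml`.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str          # assumed values: "critical", "important", "minor"
    summary: str
    trivial: bool = False  # hypothetical flag: True when the fix is a one-liner


def findings_to_fix(findings: list[Finding]) -> list[Finding]:
    """Keep every critical/important finding, plus minor ones that are trivial."""
    return [
        f for f in findings
        if f.severity in ("critical", "important")
        or (f.severity == "minor" and f.trivial)
    ]


found = [
    Finding("critical", "SQL built by string concatenation"),
    Finding("minor", "rename variable for clarity"),
    Finding("minor", "remove unused import", trivial=True),
]
print([f.summary for f in findings_to_fix(found)])
```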

### Clean up diff artifacts
```bash
rm -f pr-diff.txt pr-files.txt
```

## Phase 3: Ralph Wiggum Loop

Loop until everything is green, **maximum 5 iterations**.

On each iteration:

### Step 1: Re-run local PR review
Repeat Phase 2 (generate diffs, review, fix, clean up). If no new 🔴/🟡 issues, proceed to Step 2.

### Step 2: Lint
```bash
python util/lint.py --all --fix
```
Check exit code. If lint still reports failures after `--fix`:
- Read the lint output carefully
- Manually fix the remaining issues using Edit (common cases: import ordering, line length, and f-string issues that Black cannot auto-fix)
- Re-run `python util/lint.py --all` to verify clean

### Step 3: Run tests
```bash
python -m pytest tests/ -x --tb=short
```
- `-x` stops on first failure — analyze and fix before continuing
- Tests requiring external services (Lemonade server) skip automatically via pytest markers
- If tests fail: read the traceback, identify root cause, fix with Edit, then re-run

### Step 4: Evaluate
- If lint is clean AND tests pass AND no 🔴/🟡 issues in review → **exit loop, report success**
- If max iterations (5) reached → report remaining issues and stop
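
The control flow of this loop can be sketched as follows. It is one possible reading of Steps 1 to 4, not a definitive implementation: `finalize_loop` and its three callables are hypothetical stand-ins for the review, lint, and test steps, and the review callable is assumed to return the number of 🔴/🟡 issues it found in that pass.

```python
from typing import Callable

MAX_ITERATIONS = 5  # the cap stated in this phase


def finalize_loop(
    review: Callable[[], int],   # returns the count of 🔴/🟡 issues found this pass
    lint: Callable[[], bool],    # returns True when lint is clean
    tests: Callable[[], bool],   # returns True when all tests pass
    max_iterations: int = MAX_ITERATIONS,
) -> tuple[bool, int]:
    """Return (success, iterations_used)."""
    for iteration in range(1, max_iterations + 1):
        blocking = review()
        lint_ok = lint()
        tests_ok = tests()
        # Step 4: exit only when all three checks are green in the same pass.
        if blocking == 0 and lint_ok and tests_ok:
            return True, iteration
    # Cap reached: the caller reports the remaining issues and stops.
    return False, max_iterations
```

Passing the steps in as callables keeps the loop itself free of any assumption about how review, lint, and tests are actually run.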

## Exit Report

Always end with:

```
## Finalize Implementation Report

**Branch:** <branch-name>
**Iterations:** <n>/5

### Lint
✅ Clean / ❌ <remaining issues>

### Tests
✅ All passed (<N> tests) / ❌ <failure summary>

### PR Review Verdict
✅ Approve / ✅ Approve with suggestions (minor only) / ❌ Request changes — <summary>

### Ready for PR
✅ Yes — branch is ready to open/update PR
❌ No — <list remaining blocking issues for user to resolve>
```

## Key Behaviors

- **Never commit** — only fix files; the user decides when to commit
- **Never skip the lint step** — lint failures will be caught by CI
- **Prefer Edit over Write** — surgical fixes only
- **Preserve existing tests** — if your fixes break tests, undo and rethink
- **If uncertain about a fix** — describe the issue and ask rather than guessing
169 changes: 169 additions & 0 deletions .claude/skills/finalize-implementation/SKILL.md
@@ -0,0 +1,169 @@
---
name: finalize-implementation
description: >
  Run tests, lint, CI review simulation, fix issues in a loop, then commit and
  create a draft PR. Invoke when you believe the implementation is complete and
  ready for review. Runs inline to preserve full conversation context.
model: sonnet
disable-model-invocation: true
---

# Finalize Implementation

Validate the implementation against tests, lint, and the CI review checklist,
fix any issues found, then commit and open a draft PR.

## Prerequisites

Before starting, confirm:
- Working directory is clean or changes are ready to finalize
- You are on the correct feature branch (not `main`)

If on `main`, stop and ask the user which branch to use.

## Phase 1 — Baseline Verification

Run in order:

```bash
python -m pytest tests/unit/ -v --tb=short --cache-clear
```

```bash
python util/lint.py --all --fix
```

```bash
python util/lint.py --all
```

Record:
- Number of test failures (and which tests)
- Any lint violations remaining after `--fix`

If lint still fails after `--fix`, fix manually before proceeding.
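
Recording the outcome of each baseline command is easier when the commands are run through one helper. The sketch below is a minimal illustration under assumptions: `run_step` and `run_baseline` are hypothetical names, and `sys.executable` stands in for the `python` used in the commands above.

```python
import subprocess
import sys


def run_step(command: list[str]) -> tuple[bool, str]:
    """Run one command; return (passed, combined stdout and stderr)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr


# The three Phase 1 commands, in the order listed above.
BASELINE_COMMANDS = [
    [sys.executable, "-m", "pytest", "tests/unit/", "-v", "--tb=short", "--cache-clear"],
    [sys.executable, "util/lint.py", "--all", "--fix"],
    [sys.executable, "util/lint.py", "--all"],
]


def run_baseline(commands: list[list[str]] = BASELINE_COMMANDS) -> list[tuple[str, bool, str]]:
    """Run every command in order; return (command line, passed, output) triples."""
    return [(" ".join(cmd), *run_step(cmd)) for cmd in commands]
```

Calling `run_baseline()` from the repository root would yield one record per command; the failing test names still have to be read from the captured pytest output.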

## Phase 2 — CI Simulation

**Every iteration: read `.github/workflows/claude.yml` fresh — never use a
cached version.**

Steps:
1. Read `.github/workflows/claude.yml` and extract the `custom_instructions`
   from the `pr-review` job.
2. Run:
   ```bash
   git diff origin/main...HEAD
   git diff --name-status origin/main...HEAD
   ```
3. Review all changed files against the extracted checklist.
4. Produce a structured report:

```
## CI Review Report — Iteration N

### 🔴 Critical
- [issue] (file:line)

### 🟡 Important
- [issue] (file:line)

### 🟢 Minor
- [issue] (file:line)
```

Severity definitions (from the CI checklist):
- 🔴 Critical — security vulnerabilities, breaking changes, data loss risks
- 🟡 Important — bugs, architectural concerns, missing tests, missing docs
- 🟢 Minor — style, optimizations, non-blocking suggestions

## Phase 3 — Remediation Loop

**Hard cap: 5 iterations total** (Phase 1 + Phase 2 = one iteration).

Each iteration:
1. Fix 🔴 issues first, then 🟡 issues. Skip 🟢 unless trivial.
2. Re-run Phase 1 (tests with `--cache-clear`, then lint).
3. Re-run Phase 2 (fresh `.github/workflows/claude.yml` read every time).
4. Evaluate exit conditions.

### Exit Conditions

**Exit normally (proceed to Phase 4) when:**
- Zero 🔴 and zero 🟡 issues remain AND all tests pass AND lint is clean

**Exit with escalation (stop and report to user) when:**
- 5 iterations reached and issues remain — report what's left and ask for guidance
- The same 🔴 or 🟡 issue appears unchanged in 2 consecutive iterations — you
are stuck; report it immediately rather than continuing

**Never silently skip a 🔴 issue to reach the exit condition.**
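
The two escalation conditions can be checked mechanically if each blocking issue is reduced to a stable key. The following sketch assumes such a key exists (for example a `file:line` string plus a short description, which is a convention invented here) and treats "unchanged" as "the same key appears in both iterations". The function name is hypothetical.

```python
def should_escalate(history: list[set[str]], max_iterations: int = 5) -> bool:
    """history[i] holds the keys of the 🔴/🟡 issues reported in iteration i + 1."""
    if not history or not history[-1]:
        return False  # nothing blocking: this is the normal exit path
    if len(history) >= max_iterations:
        return True   # cap reached with issues remaining
    # Stuck: at least one blocking issue survived two consecutive iterations.
    return len(history) >= 2 and bool(history[-1] & history[-2])


print(should_escalate([{"api.py:10 missing test"}, {"api.py:10 missing test"}]))  # True
print(should_escalate([{"api.py:10 missing test"}, {"cli.py:4 unhandled error"}]))  # False
```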

## Phase 4 — Final Validation

### 4a. Intent Check

Using the full conversation context (this skill runs inline), verify:
- The implementation matches what the user originally asked for
- No scope creep was introduced during remediation
- Nothing from the original request was accidentally dropped

If there is a mismatch, fix it and re-run Phase 1 before continuing.

### 4b. Sub-agent Reviews

Launch both agents in parallel:

```
Agent: code-reviewer
Prompt: Review all files changed in this branch (git diff origin/main...HEAD)
for bugs, logic errors, security issues, and GAIA/AMD compliance.
Report 🔴 Critical, 🟡 Important, 🟢 Minor issues only.
```

```
Agent: architecture-reviewer
Prompt: Review all files changed in this branch (git diff origin/main...HEAD)
for SOLID principles, proper layering, dependency hygiene, and architectural
consistency with the existing GAIA codebase.
Report 🔴 Critical, 🟡 Important, 🟢 Minor issues only.
```

If either reviewer finds 🔴 or 🟡 issues, return to Phase 3 (counts against
the 5-iteration cap).

### 4c. Commit and PR

Once clean, invoke:

```
Skill: commit-commands:commit-push-pr
```

The PR must:
- Be created as a **draft**
- Have a title derived from the branch name or the original issue title
- Include in its body a link to the GitHub issue (if one was mentioned in the
conversation) using `Closes #NNN` or `Relates to #NNN`
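
As a rough illustration of the title and body requirements, the helpers below derive a title from a branch name and build the issue reference line. Both function names and the branch naming convention (`prefix/words-separated-by-dashes`) are assumptions made for this sketch; the `commit-commands:commit-push-pr` skill may do this differently.

```python
from typing import Optional


def title_from_branch(branch: str) -> str:
    """Turn e.g. 'feature/add-login-retry' into 'Add login retry' (assumed convention)."""
    words = branch.rsplit("/", 1)[-1].replace("-", " ").replace("_", " ").strip()
    return words[:1].upper() + words[1:]


def issue_reference(issue_number: Optional[int], closes: bool = True) -> str:
    """Return 'Closes #N' or 'Relates to #N'; empty when no issue was mentioned."""
    if issue_number is None:
        return ""
    keyword = "Closes" if closes else "Relates to"
    return f"{keyword} #{issue_number}"


print(title_from_branch("feature/add-login-retry"))  # Add login retry
print(issue_reference(42))                           # Closes #42
```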

## Output

After completion, print a summary table:

```
## Finalize Implementation — Complete

| Step | Result |
|-------------------|-------------------------------|
| Iterations used | N / 5 |
| Tests | ✅ Passing / ❌ N failures |
| Lint | ✅ Clean / ❌ Violations |
| 🔴 Issues | N resolved, 0 remaining |
| 🟡 Issues | N resolved, 0 remaining |
| 🟢 Issues | N noted (not blocking) |
| PR | <URL or "Not created"> |
```

If the loop exited early due to the iteration cap or a stuck issue, replace the
PR row with a clear description of what blocked completion and what the user
should do next.