Add language guards for Python-only endpoints #4606

Workflow file for this run

name: Claude Code
on:
workflow_dispatch:
pull_request:
types: [opened, synchronize, ready_for_review, reopened]
paths-ignore:
- '.github/workflows/**'
- '*.md'
- 'docs/**'
- 'demos/**'
- 'experiments/**'
- 'LICENSE'
- '.tessl/**'
- 'code_to_optimize/**'
- 'codeflash.code-workspace'
- 'uv.lock'
issue_comment:
types: [created]
pull_request_review_comment:
types: [created]
issues:
types: [opened, assigned]
pull_request_review:
types: [submitted]
jobs:
# Automatic PR review (can fix linting issues and push)
# Blocked for fork PRs to prevent malicious code execution
pr-review:
concurrency:
group: pr-review-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
if: |
(
github.event_name == 'pull_request' &&
github.event.sender.login != 'claude[bot]' &&
github.event.pull_request.head.repo.full_name == github.repository
) ||
github.event_name == 'workflow_dispatch'
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
issues: read
id-token: write
actions: read
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ github.event.pull_request.head.ref || github.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v6
- name: Install dependencies
run: |
uv venv --seed
uv sync
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
aws-region: ${{ secrets.AWS_REGION }}
- name: Run Claude Code
id: claude
uses: anthropics/claude-code-action@v1
with:
use_bedrock: "true"
use_sticky_comment: true
track_progress: true
allowed_bots: "claude[bot],codeflash-ai[bot]"
exclude_comments_by_actor: "*[bot]"
prompt: |
<context>
repo: ${{ github.repository }}
pr_number: ${{ github.event.pull_request.number }}
event: ${{ github.event.action }}
is_re_review: ${{ github.event.action == 'synchronize' }}
</context>
<commitment>
Execute these steps in order. If a step has no work, state that and continue to the next step.
Post all review findings in a single summary comment only — never as inline PR review comments.
</commitment>
<step name="triage">
Before doing any work, assess the PR scope:
1. Run `gh pr diff ${{ github.event.pull_request.number }} --name-only` to get changed files.
2. Run `gh pr view ${{ github.event.pull_request.number }} --json additions,deletions,changedFiles` to get the total lines changed (`gh pr diff` has no `--stat` flag).
3. Classify the PR into one of these categories:
- TRIVIAL: ALL changed files are config/CI (.github/, .tessl/, *.toml, *.lock, *.json, *.yml, *.yaml), documentation (*.md, docs/), non-production code (demos/, experiments/, code_to_optimize/), or only whitespace/formatting/comment changes.
- SMALL: ≤ 50 lines of production code changed (excluding tests, config, lock files)
- LARGE: > 50 lines of production code changed
If TRIVIAL: post a single comment "No substantive code changes to review." and stop — do not execute any further steps.
Otherwise: record the size (SMALL or LARGE) and continue. Later steps will adapt their depth based on this.
</step>
<step name="lint_and_typecheck">
Run checks on files changed in this PR and auto-fix what you can.
1. Run `uv run prek run --from-ref origin/main` to check linting/formatting.
If there are auto-fixable issues, run it again to fix them.
Report any issues prek cannot auto-fix in your summary.
2. Run `uv run mypy <changed_files>` to check types.
Fix type annotation issues (missing return types, Optional unions, import errors).
Always fix the root cause instead of adding `type: ignore` comments.
Leave alone: type errors requiring logic changes, complex generics, anything changing runtime behavior.
3. After fixes: stage with `git add`, commit ("style: auto-fix linting issues" or "fix: resolve mypy type errors"), push.
4. Verify by running `uv run prek run --from-ref origin/main` one more time. Report honestly if issues remain.
</step>
<step name="resolve_stale_threads">
Before reviewing, resolve any stale review threads from previous runs.
1. Fetch unresolved threads you created:
`gh api graphql -f query='{ repository(owner: "${{ github.repository_owner }}", name: "${{ github.event.repository.name }}") { pullRequest(number: ${{ github.event.pull_request.number }}) { reviewThreads(first: 100) { nodes { id isResolved path comments(first: 1) { nodes { body author { login } } } } } } } }' --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.isResolved == false) | select(.comments.nodes[0].author.login == "claude") | {id: .id, path: .path, body: .comments.nodes[0].body}'`
2. For each unresolved thread:
a. Read the file at that path to check if the issue still exists
b. If fixed → resolve the thread silently (do NOT post a reply comment like "✅ Fixed"):
`gh api graphql -f query='mutation { resolveReviewThread(input: {threadId: "<THREAD_ID>"}) { thread { isResolved } } }'`
c. If still present → leave it
Read the actual code before deciding. If there are no unresolved threads, skip to the next step.
</step>
<step name="review">
Review the diff (`gh pr diff ${{ github.event.pull_request.number }}`).
SCOPE RULES:
- Only read files that appear in the diff. Do NOT explore the broader codebase unless a specific function call in the diff requires understanding its definition.
- Match review depth to PR size: SMALL PRs get a focused correctness check; LARGE PRs get the full review below.
- For codeflash-ai[bot] optimization PRs: focus on whether the optimization is correct and the speedup claim is credible. Keep the review concise — a short correctness verdict, not a multi-paragraph essay.
Check for:
1. Bugs that will crash at runtime
2. Security vulnerabilities
3. Breaking API changes
4. Design issues (LARGE PRs only):
- Is logic placed in the right module? (e.g., language-specific logic belongs in `languages/<lang>/`, not in the base optimizer)
- Are fixes addressing the root cause or just papering over symptoms? (e.g., hardcoding a list of imports vs having the AI service handle it)
- Is language-agnostic config being stored in a language-specific file? (e.g., storing general config in `pyproject.toml` which is Python-exclusive)
5. Accidental file inclusions — flag if the diff contains:
- Binary files (`.jar`, `.whl`, `.so`, `.dll`, `.exe`)
- Version file changes (`version.py`) that look auto-generated (e.g., `.post<N>.dev0+<hash>`)
- Internal planning/design docs committed to `docs/` (those belong in Linear or a separate location)
- Changes to `codeflash-benchmark/` version files
Ignore style issues, type hints, and log message wording.
Record findings for the summary comment. Refer to CLAUDE.md for project conventions.
</step>
<step name="duplicate_detection">
Check whether this PR introduces code that duplicates logic already present elsewhere in the repository.
Depth depends on PR size:
- SMALL PRs: Quick check — search for any new function names defined elsewhere using Grep. Only flag exact or near-exact duplicates. Skip the cross-module deep dive.
- LARGE PRs: Full analysis as described below.
Full analysis (LARGE PRs only):
1. Get changed source files (excluding tests and config):
`git diff --name-only origin/main...HEAD -- '*.py' '*.js' '*.ts' '*.java' | grep -v -E '(test_|_test\.(py|js|ts)|\.test\.(js|ts)|\.spec\.(js|ts)|conftest\.py|/tests/|/test/|/__tests__/)' | grep -v -E '^(\.github/|code_to_optimize/|\.tessl/|node_modules/)'`
2. For each changed file, read it and identify functions/methods added or substantially modified (longer than 5 lines).
3. Search for duplicates using Grep:
- Same function name defined elsewhere
- 2-3 distinctive operations from the body (specific API calls, algorithm patterns, string literals)
4. Cross-module check: this codebase has parallel modules under `languages/python/`, `languages/javascript/`, and `languages/java/` plus runtimes under `packages/codeflash/runtime/` and `codeflash-java-runtime/`. When a changed file is under one of these areas, search the others for equivalent logic. Only flag cases where the logic is genuinely shared or one module could import from the other.
5. When a Grep hit looks promising, read the full function and compare semantics. Flag only:
- Same function with same/very similar body in another module
- Same helper logic repeated in sibling files
- Same logic implemented inline across multiple classes
- Same algorithm reimplemented across language modules (Python code, not target-language differences)
Report at most 5 findings with confidence (HIGH/MEDIUM), locations, what's duplicated, and suggestion.
DO NOT report: boilerplate, functions under 5 lines, config/setup, intentional polymorphism, test files, imports, code that must differ due to target-language semantics.
If no duplicates found, include "No duplicates detected" in the summary.
</step>
<step name="coverage">
Skip this step for SMALL PRs — only run for LARGE PRs.
Analyze test coverage for changed files:
1. Get changed Python files (excluding tests): `git diff --name-only origin/main...HEAD -- '*.py' | grep -v test`
2. Run coverage on PR branch: `uv run coverage run -m pytest tests/ -q --tb=no` then `uv run coverage json -o coverage-pr.json`
3. Get per-file coverage: `uv run coverage report --include="<changed_files>"`
4. Compare with main: checkout main, run coverage, checkout back
5. Flag: new files below 75%, decreased coverage, untested changed lines
6. If the PR adds new public functions or methods without any corresponding tests, explicitly call this out and request tests be added. New production logic should have tests.
</step>
<step name="summary_comment">
Post exactly one summary comment containing all results from previous steps using this format:
## PR Review Summary
### Prek Checks
### Code Review
(Include bugs, security, breaking changes, and design issues. Omit subsections with no findings.)
### Duplicate Detection
### Test Coverage
---
*Last updated: <timestamp>*
</step>
<step name="merge_optimization_prs">
Check for open PRs from codeflash-ai[bot]:
`gh pr list --author "codeflash-ai[bot]" --state open --json number,title,headRefName,createdAt,mergeable`
For each PR:
- If CI passes and the PR is mergeable → merge with `--squash --delete-branch`
- If CI is failing, first determine whether the failures are CAUSED BY the PR or PRE-EXISTING on the base branch:
1. Run `gh pr checks <number>` to identify failing checks
2. Check out the PR's BASE branch and check if the same tests/checks fail there too
3. If failures are PRE-EXISTING on the base branch (not caused by the PR):
- Do NOT close the PR
- Leave a comment: "CI failures are pre-existing on the base branch (not caused by this PR): <list failing checks>. Leaving open for merge once base branch CI is fixed."
4. If failures are CAUSED BY the PR's changes:
a. Check out the PR branch and attempt to fix (lint issues, duplicate definitions, type errors, etc.)
b. If fixed: commit, push, and leave a comment explaining what was fixed
c. If unfixable: close with `gh pr close <number> --comment "Closing: this PR introduces CI failures that cannot be auto-fixed — <describe the specific failures>." --delete-branch`
- Close the PR (without attempting fixes) ONLY if ANY of these apply:
- The optimized function no longer exists in the target file (check the diff)
- Has merge conflicts (mergeable state is "CONFLICTING") AND the PR is older than 3 days (to give time for the base branch to stabilize before giving up)
Close with: `gh pr close <number> --comment "<reason>" --delete-branch`
where <reason> explains WHY the PR is being closed. Examples:
- "Closing: merge conflicts with the target branch and PR is older than 3 days."
- "Closing: the optimized function no longer exists in the target file."
- NEVER close a PR solely because it is old. Age alone is not a valid reason to close.
- NEVER mass-close multiple PRs without individually evaluating each one.
</step>
<verification>
Before finishing, confirm:
- All steps were attempted (even if some had no work)
- Stale review threads were checked and resolved where appropriate
- All findings are in a single summary comment (no inline review comments were created)
- If fixes were made, they were verified with prek
</verification>
claude_args: '--model us.anthropic.claude-sonnet-4-6 --allowedTools "Bash(gh pr comment:*),Bash(gh pr diff:*),Bash(gh pr view:*),Bash(gh pr list:*),Bash(gh pr checks:*),Bash(gh pr merge:*),Bash(gh pr close:*),Bash(gh issue view:*),Bash(gh issue list:*),Bash(gh api:*),Bash(uv run prek *),Bash(uv run mypy *),Bash(uv run coverage *),Bash(uv run pytest *),Bash(git status*),Bash(git add *),Bash(git commit *),Bash(git push*),Bash(git diff *),Bash(git checkout *),Read,Glob,Grep,Edit"'
additional_permissions: |
actions: read
# @claude mentions (can edit and push) - restricted to maintainers only
claude-mention:
concurrency:
group: claude-mention-${{ github.event.issue.number || github.event.pull_request.number || github.run_id }}
cancel-in-progress: false
if: |
(
github.event_name == 'issue_comment' &&
contains(github.event.comment.body, '@claude') &&
(github.event.comment.author_association == 'OWNER' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'COLLABORATOR')
) ||
(
github.event_name == 'pull_request_review_comment' &&
contains(github.event.comment.body, '@claude') &&
(github.event.comment.author_association == 'OWNER' || github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'COLLABORATOR') &&
github.event.pull_request.head.repo.full_name == github.repository
) ||
(
github.event_name == 'pull_request_review' &&
contains(github.event.review.body, '@claude') &&
(github.event.review.author_association == 'OWNER' || github.event.review.author_association == 'MEMBER' || github.event.review.author_association == 'COLLABORATOR') &&
github.event.pull_request.head.repo.full_name == github.repository
) ||
(
github.event_name == 'issues' &&
(contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')) &&
(github.event.issue.author_association == 'OWNER' || github.event.issue.author_association == 'MEMBER' || github.event.issue.author_association == 'COLLABORATOR')
)
runs-on: ubuntu-latest
permissions:
contents: write
pull-requests: write
issues: read
id-token: write
actions: read
steps:
- name: Get PR head ref
id: pr-ref
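env:
GH_TOKEN: ${{ github.token }}
EVENT_NAME: ${{ github.event_name }}
REPO: ${{ github.repository }}
ISSUE_NUMBER: ${{ github.event.issue.number }}
IS_PR_COMMENT: ${{ github.event.issue.pull_request != null }}
HEAD_REF: ${{ github.event.pull_request.head.ref || github.head_ref }}
run: |
# Values are passed through env rather than interpolated into the script,
# so a crafted branch name cannot inject shell commands.
# issue_comment payloads carry no head ref, so fetch it from the API.
# Comments on plain issues have no PR to look up and fall through to the else branch.
if [ "$EVENT_NAME" = "issue_comment" ] && [ "$IS_PR_COMMENT" = "true" ]; then
PR_REF=$(gh api "repos/$REPO/pulls/$ISSUE_NUMBER" --jq '.head.ref')
echo "ref=$PR_REF" >> "$GITHUB_OUTPUT"
else
echo "ref=$HEAD_REF" >> "$GITHUB_OUTPUT"
fi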
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
ref: ${{ steps.pr-ref.outputs.ref }}
- name: Install uv
uses: astral-sh/setup-uv@v6
- name: Install dependencies
run: |
uv venv --seed
uv sync
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
aws-region: ${{ secrets.AWS_REGION }}
- name: Run Claude Code
id: claude
uses: anthropics/claude-code-action@v1
with:
use_bedrock: "true"
claude_args: '--model us.anthropic.claude-sonnet-4-6 --allowedTools "Read,Edit,Write,Glob,Grep,Bash(git status*),Bash(git diff*),Bash(git add *),Bash(git commit *),Bash(git push*),Bash(git log*),Bash(git merge*),Bash(git fetch*),Bash(git checkout*),Bash(git branch*),Bash(uv run prek *),Bash(prek *),Bash(uv run ruff *),Bash(uv run pytest *),Bash(uv run mypy *),Bash(uv run coverage *),Bash(gh pr comment*),Bash(gh pr view*),Bash(gh pr diff*),Bash(gh pr merge*),Bash(gh pr close*)"'
additional_permissions: |
actions: read
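
The triage step in the pr-review prompt describes its TRIVIAL rule only in prose. As a rough illustration, the path-based part of that rule could be sketched in shell as below. The function name `classify_changed_files` and the `NEEDS_REVIEW` label are hypothetical and do not appear in the workflow. The sketch covers only the file-path patterns, not the "only whitespace/formatting/comment changes" case, and it does not distinguish SMALL from LARGE.

```shell
# Hypothetical helper (not part of the workflow): reads changed paths, one per
# line, from stdin. Prints TRIVIAL when every path matches the config/CI,
# documentation, or non-production patterns listed in the triage step;
# otherwise prints NEEDS_REVIEW. Empty input is reported as TRIVIAL.
classify_changed_files() {
  while IFS= read -r file; do
    # Skip blank lines in the input.
    if [ -z "$file" ]; then
      continue
    fi
    case "$file" in
      # Directories the triage step treats as CI, docs, or non-production code.
      .github/*|.tessl/*|docs/*|demos/*|experiments/*|code_to_optimize/*) ;;
      # Config, lock, and documentation extensions (shell `*` also matches `/`).
      *.toml|*.lock|*.json|*.yml|*.yaml|*.md) ;;
      # Anything else counts as a substantive change.
      *) echo "NEEDS_REVIEW"; return 0 ;;
    esac
  done
  echo "TRIVIAL"
}
```

Assuming `gh` is authenticated and `PR_NUMBER` is set, it could be fed from the same command the prompt uses: `gh pr diff "$PR_NUMBER" --name-only | classify_changed_files`.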