Skip to content

feat: add security scanners workflow#161

Open
scottschreckengaust wants to merge 25 commits intomainfrom
feat/security-scanners
Open

feat: add security scanners workflow#161
scottschreckengaust wants to merge 25 commits intomainfrom
feat/security-scanners

Conversation

@scottschreckengaust
Copy link
Copy Markdown
Member

@scottschreckengaust scottschreckengaust commented Mar 31, 2026

Summary

Adds a comprehensive security scanning pipeline (security-scanners.yml) with six scanners running on every push to main, every PR, and on a daily schedule. Also hardens all existing workflow permissions and adds developer/admin documentation for reviewing, remediating, and suppressing findings.

Security Scanners

Job Scanner What it detects Fails on
gitleaks Gitleaks Secrets in git history Any secret not in baseline
semgrep Semgrep Multi-language SAST (all registry rules) Any finding (PRs: new only)
grype Grype Dependency CVEs (SCA) High or critical CVEs
bandit Bandit Python SAST Any high-confidence finding
checkov Checkov IaC misconfigurations (GHA, Dockerfiles) Any check failure
clamav ClamAV Malware (service container) Any detection

All scanner tool versions and GitHub Actions are pinned to specific versions/SHAs. All scanners use a deferred-failure pattern: scan runs to completion, SARIF is always uploaded to Code Scanning and as an artifact, then the job fails if findings exceed the threshold.

Configuration Files

File Purpose
.bandit Targets, excludes, confidence level
.semgrepignore Path exclusions
.gitleaks.toml Ruleset extension, path allowlist
.gitleaks-baseline.json 12 pre-existing test credential findings
.grype.yaml Severity threshold, CVE ignore list with suppression docs
.checkov.yaml Frameworks, skipped checks with inline suppression docs

Workflow Permissions Hardening

All six workflows now use deny-all-then-grant permissions:

  • release.yml, release-pr.yml, tag-on-merge.yml — moved from workflow-level grants to explicit deny-all + job-level grants (matching codebuild.yml and pull-request-lint.yml)
  • security-scanners.yml — deny-all at workflow level, each job grants only actions: read, contents: read, security-events: write

Documentation

  • DEVELOPERS_GUIDE.md — New "Security Scanners" section with per-scanner review/remediation/suppression instructions and summary tables
  • ADMINISTRATIVE_GUIDE.md — New workflow reference, Pipeline 3 architecture diagram, updated permissions model, Security Finding Requirements (all HIGH/CRITICAL must be remediated or have documented risk acceptance), and Updating Pinned Versions TODO

Changes

  • feat: add security scanners workflow — base workflow with gitleaks, semgrep, grype, checkov, clamav
  • feat: add bandit job for Python SAST scanning
  • feat: add security scanner configuration and baseline files
  • fix: raise bandit confidence to high, add suppression docs, fix clamav deferred failure
  • fix: apply deny-all permissions to release workflows
  • chore: update security scanner tools and actions to latest versions
  • docs: add security scanner remediation guide to DEVELOPERS_GUIDE
  • docs: add security scanners to ADMINISTRATIVE_GUIDE

Test plan

  • Verify all six scanner jobs pass on the PR
  • Confirm SARIF artifacts are uploaded for each scanner (gitleaks, semgrep, grype, bandit, checkov)
  • Confirm ClamAV text log artifact is uploaded
  • Check semgrep baseline comparison works (PR diff only, not full repo)
  • Verify release workflows still function with moved permissions (release.yml, release-pr.yml, tag-on-merge.yml)
  • Review Code Scanning alerts in Security tab after SARIF upload

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

Adds an auto-label job to the Pull Request Validation workflow using
actions/labeler v6.0.1. Labels are applied based on changed file paths
and removed when those files are no longer changed (sync-labels: true).

Works for fork PRs via pull_request_target — no checkout of fork code,
the action only reads file paths from the API.

Initial label rules:
- codebuild: aidlc-rules/**
- documentation: **/*.md, docs/**
- workflows: .github/**
- Rename 'codebuild' label to 'rules' in codebuild.yml (conditions,
  reminder text, and marker)
- Rename 'workflows' label to 'github' matching .github/**
- Scope 'documentation' label to *.md files NOT under aidlc-rules/
  using all-globs-to-any-file with negation
Allows actions/labeler to create labels that don't yet exist in the
repository, preventing failures on first use of a new label rule.
- Rename all 'codebuild' label references to 'rules' (preserving
  CodeBuild service/environment references)
- Add auto-label job to Pipeline 3 diagram and workflow reference
- Document label rules table (rules, documentation, github)
- Add actions/labeler to external actions table
- Add auto-label job to permissions table
- Add labeler.yml to repository tree diagram
Adds five security scanning jobs as a new workflow:
- gitleaks: secret detection across full git history
- semgrep: SAST with SARIF output and GitHub compatibility fixes
- grype: dependency vulnerability scanning
- checkov: IaC scanning (GitHub Actions workflows, configs)
- clamav: malware scanning via service container

All jobs run on push to main, PRs to main, daily schedule, and
manual dispatch. SARIF results are uploaded as artifacts and to
GitHub Code Scanning (when available). Follows the deny-all
permissions pattern with per-job grants.
Base automatically changed from feat/auto-label-codebuild to main March 31, 2026 22:18
@github-actions github-actions bot added documentation Improvements or additions to documentation github labels Mar 31, 2026
Scans Python code under scripts/aidlc-evaluator/ for security issues.
Uses bandit v1.9.3 with SARIF output, matching the pattern from
awslabs/agent-plugins.
@github-advanced-security
Copy link
Copy Markdown

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

- .gitleaks.toml: extends default rules, allowlists lock files
- .gitleaks-baseline.json: baselines 12 known findings (all fake
  credentials in test_credential_scrubber.py test fixtures)
- .semgrepignore: skips lock files, test fixtures, build artifacts
- .checkov.yaml: scopes to github_actions + dockerfile frameworks,
  skips CKV_GHA_7 (conflicts with inline buildspec pattern)
- .bandit: targets scripts/aidlc-evaluator, excludes tests,
  medium+ confidence only
- .grype.yaml: fail-on-severity high, with placeholder ignore list
Copy link
Copy Markdown

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semgrep OSS found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

Add .gitleaks-baseline.json to .semgrepignore and expand all ignore
comments with specific reasoning for why each entry is excluded from
Semgrep scanning.
…v deferred failure

- Raise bandit confidence-level from medium to high to reduce noise
- Add inline suppression documentation to .grype.yaml and .checkov.yaml
- Fix clamav job to use deferred-failure pattern (always upload artifact
  before failing) consistent with all other scanner jobs
Document each scanner's failure thresholds, how to review findings,
and how to remediate or suppress them (inline comments, config-level
ignores, baselines). Includes summary tables for quick reference.
Add security-scanners.yml workflow reference, Pipeline 3 architecture
diagram, updated permissions model and security posture tables, and
Security Finding Requirements section requiring all HIGH and CRITICAL
findings to be remediated or have documented risk acceptance.
Move release.yml, release-pr.yml, and tag-on-merge.yml to the same
deny-all-then-grant pattern used by all other workflows. All 16
permission scopes are now set to none at the workflow level with only
the required scopes granted at the job level.
Scanner tools:
- Gitleaks 8.30.0 → 8.30.1
- Semgrep 1.151.0 → 1.157.0
- Grype 0.104.3 → 0.110.0
- Bandit 1.9.3 → 1.9.4
- Checkov 3.2.500 → 3.2.513
- ClamAV image digest updated to latest stable

GitHub Actions:
- github/codeql-action v4.32.2 → v4.35.1

Remove specific version numbers from ADMINISTRATIVE_GUIDE docs (they
go stale), note that versions are pinned and should be updated
periodically, and add TODO for update procedure documentation.
@scottschreckengaust
Copy link
Copy Markdown
Member Author

Follow-up: Security Findings Burndown (subsequent PR)

A full local scan with all six scanners produced 939 total findings (Bandit: 86, Semgrep: 853). Gitleaks, Grype, Checkov, and ClamAV are all clean. ~90% of findings are low-severity noise (assert-in-test, AI mention detection, import detection).

The findings CSV is in the branch at security-findings.csv for reference (not intended to be merged — delete before merge).

Burndown by Priority

🔴 P0 — HIGH severity, would fail CI (7 findings)

Count Issue Files Effort
2 Path traversal — hook input flows into file path without validation quantitative/analyzers.py:110,323 Small — validate path input
2 Dangerous subprocess without static string contracttest/server.py:120, trend_reports/fetcher.py:141 Small — verify safety, suppress with # nosemgrep
2 Secrets detected (JWT, API key) in test credential scrubber shared/tests/test_credential_scrubber.py:29,63 Trivial — synthetic test data, suppress
1 eqeq-is-bad in test assertion trend_reports/tests/test_models.py:51 Trivial — false positive in test, suppress

🟠 P1 — MEDIUM severity (46 findings)

Count Issue Effort
14 Missing encoding="utf-8" in open() calls Quick win — trivial mechanical fix
21 subprocess import detection (gitlab.bandit.B404) Trivial — suppress (import-only, not actionable)
4 changed-semgrepignore audit entries None — app-sec sign-off only
4 Subprocess call audit (gitlab.bandit.B603) Small — verify + suppress
3 return-not-in-function false positives None — false positives in code snippets

🟡 P2 — LOW severity / noise (866 findings)

Count Issue Effort
667 assert in test code (gitlab.bandit.B101) Noise — tests use assert by design
118 AI mention detection (informational) Noise — not actionable
43 subprocess calls (B603/B607) already with # nosec Small — already suppressed for bandit, need # nosemgrep too
15 Secret-like patterns (semgrep gitleaks rules) in test fixtures Medium — verify all are synthetic
23 Other import/informational findings Trivial

Recommended Approach for Follow-up PR

  1. Fix the 14 open() calls missing encoding= parameter (quick win, P1)
  2. Verify + suppress the 2 path traversal findings in analyzers.py with justification
  3. Suppress the 5 remaining P0 findings (subprocess, test secrets, test false positive) with inline # nosemgrep comments
  4. Add */tests/* to .semgrepignore to eliminate 667+ B101 assert-in-test noise and other test-only findings
  5. Bulk suppress P1 noise (B404 imports, false positives) via targeted .semgrepignore entries or inline comments

This would resolve all P0 and P1 findings. The remaining P2 findings are informational/noise and can be addressed incrementally.

@scottschreckengaust scottschreckengaust marked this pull request as ready for review April 1, 2026 21:27
@scottschreckengaust scottschreckengaust requested a review from a team as a code owner April 1, 2026 21:27
Copy link
Copy Markdown
Contributor

@Kalindi-Dev Kalindi-Dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scottschreckengaust

  1. SECURITY: ${{ }} expression interpolation in run: blocks contradicts documented security posture
    File: security-scanners.yml, lines with exit ${{ steps.*.outputs.exit_code }}
    Also: docs/ADMINISTRATIVE_GUIDE.md, line 544
    Six instances of:
    exit ${{ steps.gitleaks.outputs.exit_code }} # nosemgrep: ...

The ADMINISTRATIVE_GUIDE's Security Posture table (which this PR does not update) claims:
Zero ${{ }} expression interpolation in run: blocks"

This claim might now be false. The risk here is low (the values are integer exit codes set by the workflow itself, not attacker-controlled), but it breaks an explicitly documented invariant.

  1. Potential Blocker: cancel-in-progress: true can silently cancel scheduled scans
    File: security-scanners.yml, concurrency block
    concurrency:
    group: ${{ github.workflow }}-${{ github.ref }}
    cancel-in-progress: true
    Scheduled runs use refs/heads/main as the ref. A push to main during the daily 03:47 UTC scan will cancel it. This creates silent gaps in daily coverage.

  2. Nitpicky:
    DOCS: Duplicate/contradictory paragraphs about deny-all-then-grant
    File: docs/ADMINISTRATIVE_GUIDE.md, around lines 526-528

Two consecutive paragraphs:
"All six workflows follow a deny-all-then-grant pattern..."

"codebuild.yml, pull-request-lint.yml, and security-scanners.yml all follow a deny-all-then-grant pattern..."

Rest changes look okay and are needed for setting security guardrails

Replace six instances of ${{ steps.*.outputs.exit_code }} in run:
blocks with step-level env: variables, eliminating all expression
interpolation in run: blocks. This restores the "zero ${{ }}
interpolation in run: blocks" invariant documented in the Security
Posture table.
Add github.event_name to the concurrency group key so that scheduled
runs (group: ...-schedule-refs/heads/main) and push runs (group:
...-push-refs/heads/main) use separate groups. This prevents a push
to main from silently cancelling the daily scheduled scan.
Remove the contradictory paragraph that listed only three workflows
and merge its "strictest possible configuration" clause into the
correct paragraph that covers all six workflows.
@scottschreckengaust
Copy link
Copy Markdown
Member Author

@Kalindi-Dev Thanks for the review! All three items have been addressed in separate commits:

1. ${{ }} expression interpolation in run: blocks96c6e6c
Moved all six exit ${{ steps.*.outputs.exit_code }} expressions from run: blocks to step-level env: variables (SCANNER_EXIT), then referenced as exit "$SCANNER_EXIT". This eliminates all ${{ }} interpolation in run: blocks, restoring the "zero expression interpolation" invariant in the Security Posture table. The # nosemgrep suppression comments are also no longer needed and were removed.

2. cancel-in-progress: true silently cancelling scheduled scans0a2da23
Added github.event_name to the concurrency group key: ${{ github.workflow }}-${{ github.event_name }}-${{ github.ref }}. Scheduled runs now use group Security Scanners-schedule-refs/heads/main while push runs use Security Scanners-push-refs/heads/main, so they no longer interfere with each other.

3. Duplicate deny-all-then-grant paragraphse1e7e33
Merged the two contradictory paragraphs into one. The surviving paragraph correctly states all six workflows follow the pattern and retains the "strictest possible configuration" language from the removed paragraph.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new GitHub Actions security scanning suite and updates repository documentation and workflow permissions to support a least-privilege, reproducible security posture across CI/CD.

Changes:

  • Adds a new .github/workflows/security-scanners.yml workflow running six security scanners (SAST/SCA/secrets/IaC/malware) on push, pull_request, and a daily schedule.
  • Adds scanner configuration/baseline files (.bandit, .semgrepignore, .grype.yaml, .checkov.yaml, .gitleaks.toml, .gitleaks-baseline.json) plus remediation/suppression guidance in docs.
  • Hardens existing release workflows to “deny-all-then-grant” workflow permissions with job-level overrides.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
docs/DEVELOPERS_GUIDE.md Adds developer-facing guidance for reviewing/remediating/suppressing findings for each scanner.
docs/ADMINISTRATIVE_GUIDE.md Documents the new security scanning workflow, updated CI/CD architecture, and permissions model.
.semgrepignore Defines Semgrep path exclusions (lockfiles, test fixtures, CodeBuild artifacts, gitleaks baseline).
.grype.yaml Establishes Grype severity threshold and suppression mechanism (ignore).
.gitleaks.toml Extends default gitleaks rules and allowlists baseline/lockfile paths.
.gitleaks-baseline.json Records known pre-existing gitleaks findings to enable “new findings only” behavior.
.github/workflows/tag-on-merge.yml Switches to workflow-level deny-all permissions with job-level grants.
.github/workflows/security-scanners.yml New workflow implementing six scanner jobs, SARIF uploads, artifacts, and deferred-failure behavior.
.github/workflows/release.yml Switches to workflow-level deny-all permissions with job-level grants.
.github/workflows/release-pr.yml Switches to workflow-level deny-all permissions with job-level grants.
.checkov.yaml Configures Checkov frameworks and repo-wide skipped checks.
.bandit Configures Bandit targets/excludes and confidence filtering.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Kalindi-Dev
Kalindi-Dev previously approved these changes Apr 2, 2026
Copy link
Copy Markdown
Contributor

@Kalindi-Dev Kalindi-Dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@scottschreckengaust Approving from my end, as all my previous comments were addressed, but I see good suggestions from Michael. Feel free to merge once they are addressed as well.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
scottschreckengaust and others added 6 commits April 7, 2026 20:16
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Bandit 1.9.4 expects YAML config but .bandit used legacy INI format,
causing a parse error (exit code 2) that failed the CI job. Convert to
valid YAML and add -ll flag for high-confidence filtering.
- Remove -ll severity filter so LOW/MEDIUM/HIGH all appear in SARIF
- Check SARIF for HIGH severity (level=error) to decide pass/fail
- Move scan targets into .bandit config so new Python directories
  can be added without editing the workflow
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation github

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants