feat(CC-0010): D003: Implement infrastructure deployment automation E2E#71

Open
berendt wants to merge 13 commits into main from feature/CC-0010

@berendt (Contributor) commented Mar 17, 2026

Size: 🏗️ large
Category: infrastructure
Priority: high
Feature ID: S010 (CC-0014)


Summary

Implement make deploy-infra and make teardown-infra Makefile targets that deploy the full infrastructure stack (cert-manager, OpenBao, ESO, MariaDB Operator, Memcached Operator, infrastructure CRs, ExternalSecrets) to a kind cluster using FluxCD with kustomize overlays for kind-specific resource sizing. Bootstrap OpenBao end-to-end (init-unseal, secret engines, auth, policies, bootstrap secrets) to validate the full secret chain through ESO. Create a Chainsaw E2E test at tests/e2e/infrastructure/infra-stack-health/chainsaw-test.yaml validating all components reach healthy state. Extend .github/workflows/ci.yaml with an e2e-infra job running the full stack test on every PR and push to main.


Scope

Included:

  • make deploy-infra target replacing stub S008 — installs FluxCD in kind, applies existing manifests via kustomize overlays in dependency order
  • make teardown-infra target (new) — deletes the kind cluster
  • make install-test-deps target replacing stub S002 — installs chainsaw, flux CLI, kind, helm prerequisites
  • make e2e target replacing stub S002 — runs Chainsaw tests
  • Orchestration script hack/deploy-infra.sh — creates kind cluster, runs flux install, applies kustomize overlays in two phases (base → infrastructure), runs OpenBao bootstrap, waits for health
  • Cleanup script hack/teardown-infra.sh: runs kind delete cluster
  • Kind cluster config hack/kind-config.yaml (single control-plane node)
  • Kustomize overlays deploy/kind/base/ and deploy/kind/infrastructure/ — patches HelmReleases and CRs for reduced replicas and standard storage class
  • Full OpenBao bootstrap invocation (init-unseal → setup-secret-engines → setup-auth → setup-policies → write-bootstrap-secrets)
  • Chainsaw E2E test tests/e2e/infrastructure/infra-stack-health/chainsaw-test.yaml — asserts readiness of all operators, CRs, ClusterIssuer, and ExternalSecret sync status
  • GitHub Actions e2e-infra job in .github/workflows/ci.yaml using helm/kind-action, fluxcd/flux2/action, SHA-pinned actions per existing CI conventions

Excluded:

  • Operator deployment (keystone-operator, c5c3-operator) — separate features S017/S022, YAGNI
  • Modification of existing deploy/flux-system/ manifests — read-only reference, kind-specific differences handled via kustomize overlays in deploy/kind/
  • Production HA validation (3-replica Galera, multi-node Raft) — belongs to S024 stress tests, YAGNI
  • Image build or push — no operator images needed for infrastructure-only stack
  • Separate workflow file — single ci.yaml with new job, path-filtered optimization deferred

Visualization

flowchart TD
    subgraph MakeTargets["Makefile Targets"]
        ITD["make install-test-deps"]
        DI["make deploy-infra"]
        TI["make teardown-infra"]
        E2E["make e2e"]
    end

    subgraph DeployScript["hack/deploy-infra.sh"]
        K["1. kind create cluster"]
        FI["2. flux install"]
        NS["3. kubectl apply -k deploy/kind/base"]
        WAIT1["4. Wait: HelmReleases Ready"]
        INF["5. kubectl apply -k deploy/kind/infrastructure"]
        WAIT2["6. Wait: OpenBao pods Ready"]
        BOOT["7. OpenBao bootstrap scripts"]
        WAIT3["8. Wait: ExternalSecrets Synced"]
    end

    subgraph FluxReconciliation["FluxCD Reconciles HelmReleases"]
        CM["cert-manager"]
        OB["OpenBao"]
        MO["MariaDB Operator"]
        ESO["ESO"]
        MCO["Memcached Operator"]
    end

    subgraph InfraCRs["Infrastructure CRs via Kustomize"]
        CI2["ClusterIssuer"]
        TLS["OpenBao TLS Certificate"]
        MDB["MariaDB CR 1 replica"]
        MC["Memcached CR 1 replica"]
        CSS["ClusterSecretStore"]
        ES["ExternalSecrets x3"]
    end

    subgraph KindOverlays["deploy/kind/ Kustomize Overlays"]
        OVB["base/ — patches HelmRelease replicas and storage"]
        OVI["infrastructure/ — patches CR replicas and storage class"]
    end

    DI --> K --> FI --> NS
    NS --> OVB
    OVB -->|"FluxCD"| FluxReconciliation
    CM --> OB & MO & ESO & MCO
    FluxReconciliation --> WAIT1
    WAIT1 --> INF
    INF --> OVI
    OVI --> InfraCRs
    InfraCRs --> WAIT2 --> BOOT --> WAIT3
sequenceDiagram
    participant CI as GitHub Actions
    participant Kind as kind cluster
    participant Flux as FluxCD
    participant K8s as Kubernetes API
    participant OB as OpenBao
    participant CS as Chainsaw

    CI->>Kind: kind create cluster
    CI->>Flux: flux install
    CI->>K8s: kubectl apply -k deploy/kind/base
    Flux->>K8s: Reconcile cert-manager HelmRelease
    Flux->>K8s: Reconcile openbao, mariadb-op, eso, memcached-op
    CI->>K8s: Wait HelmReleases Ready
    CI->>K8s: kubectl apply -k deploy/kind/infrastructure
    K8s-->>K8s: cert-manager issues OpenBao TLS cert
    CI->>K8s: Wait OpenBao pods Ready
    CI->>OB: init-unseal.sh
    CI->>OB: setup-secret-engines, auth, policies, secrets
    CI->>K8s: Wait ExternalSecrets Synced
    CI->>CS: chainsaw test infra-stack-health
    CS->>K8s: Assert all components healthy
    CS-->>CI: JUnit XML report

Key Components

  • Makefile targets: Replace stubs deploy-infra (S008), install-test-deps (S002), e2e (S002); add new teardown-infra; each delegates to scripts in hack/
  • hack/deploy-infra.sh: Orchestration script — creates kind cluster, runs flux install, applies deploy/kind/base/ kustomization (FluxCD reconciles HelmReleases), applies deploy/kind/infrastructure/ kustomization (CRs + ESO resources), runs OpenBao bootstrap scripts from deploy/openbao/bootstrap/, includes wait_for_ready() helpers with configurable timeouts; follows existing shell conventions (set -euo pipefail, SPDX header)
  • hack/teardown-infra.sh: Cleanup script — kind delete cluster --name <cluster-name>
  • hack/kind-config.yaml: Single control-plane node, sufficient for CI runners (~7GB RAM, 2 vCPUs)
  • deploy/kind/base/kustomization.yaml: Kustomize overlay referencing ../../flux-system/, patches OpenBao HelmRelease to standalone mode (1 replica, HA disabled); other operators unchanged (single-replica by default or stateless)
  • deploy/kind/infrastructure/kustomization.yaml: Kustomize overlay referencing ../../flux-system/infrastructure/, patches MariaDB CR (1 replica, Galera disabled, MaxScale disabled, standard storage class), patches Memcached CR (1 replica)
  • tests/e2e/infrastructure/infra-stack-health/chainsaw-test.yaml: Chainsaw v1alpha2 test asserting: cert-manager Deployment Ready, OpenBao StatefulSet Ready, ESO Deployment Ready, MariaDB Operator Deployment Ready, Memcached Operator Deployment Ready, ClusterIssuer Ready condition, MariaDB CR Ready condition, Memcached CR Ready condition, ClusterSecretStore Valid condition, ExternalSecrets SecretSynced condition; uses extended assert timeout (~5min) for operator startup
  • .github/workflows/ci.yaml e2e-infra job: New job alongside existing lint and test; uses SHA-pinned helm/kind-action to create cluster, installs FluxCD, runs make deploy-infra, executes chainsaw test --config tests/e2e/chainsaw-config.yaml tests/e2e/infrastructure/, uploads JUnit report as artifact; timeout-minutes: 20; follows existing CI conventions (SHA-pinned actions, permissions: contents: read, concurrency with cancel-in-progress on PRs)
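
The deployment sequence listed above can be sketched as a dry-run script. This is a minimal sketch under stated assumptions, not the contents of hack/deploy-infra.sh: the `run` helper, the `DRY_RUN` switch, the `infra-e2e` cluster name, and the convention that `SKIP_KIND_CREATE=1` skips cluster creation are hypothetical, and the wait and bootstrap steps are left as comments.

```shell
#!/usr/bin/env bash
set -euo pipefail

CLUSTER_NAME="${CLUSTER_NAME:-infra-e2e}"  # hypothetical cluster name
DRY_RUN="${DRY_RUN:-1}"                    # hypothetical switch: 1 prints commands instead of running them

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [[ "${DRY_RUN}" == "1" ]]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

deploy_infra() {
  # 1. Create the kind cluster unless CI already created it (assumed convention).
  if [[ "${SKIP_KIND_CREATE:-0}" != "1" ]]; then
    run kind create cluster --name "${CLUSTER_NAME}" --config hack/kind-config.yaml
  fi
  # 2. Install the Flux controllers.
  run flux install
  # 3. Phase one: HelmReleases via the kind base overlay.
  run kubectl apply -k deploy/kind/base
  # 4. (wait for HelmReleases to become Ready)
  # 5. Phase two: infrastructure CRs via the kind infrastructure overlay.
  run kubectl apply -k deploy/kind/infrastructure
  # 6.-8. (wait for OpenBao pods, run the bootstrap scripts, wait for ExternalSecrets)
}

deploy_infra
```

With the default `DRY_RUN=1` the script only prints the commands in order, which makes the two-phase apply easy to review before pointing it at a real cluster.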

Note

Add end-to-end infrastructure deployment automation for a kind cluster

  • Adds hack/deploy-infra.sh, an 8-step orchestration script that creates a kind cluster, installs Flux, applies kustomizations in two phases, initializes and unseals OpenBao, and waits for MariaDB and ExternalSecret readiness.
  • Adds hack/install-test-deps.sh to download and integrity-verify pinned versions of Chainsaw, Flux, kind, and kubectl; adds hack/teardown-infra.sh to delete the kind cluster.
  • Adds a Chainsaw E2E test in tests/e2e/infrastructure/infra-stack-health/ asserting readiness of all infra components (OpenBao, MariaDB, Memcached, cert-manager, ExternalSecrets).
  • Adds e2e-infra and shellcheck CI jobs in .github/workflows/ci.yaml that run the deployment script and Chainsaw tests, uploading JUnit reports and diagnostics on failure.
  • Updates Flux HelmReleases to install CRDs inline (cert-manager, external-secrets, mariadb-operator-crds), promotes ESO manifests from v1beta1 to v1, and corrects the Memcached CRD API group from cache.c5c3.io to memcached.c5c3.io across all manifests, simulators, and fake CRDs.
  • Behavioral Change: mariadb-operator now depends on a new mariadb-operator-crds HelmRelease; secret generation in OpenBao bootstrap uses bao write sys/tools/random instead of openssl rand.

Macroscope summarized d6110cf.

berendt added 12 commits March 17, 2026 19:55
AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Level 1 tasks for infrastructure deployment automation:

- 1.1: Create hack/kind-config.yaml with single control-plane node
  for CI runners (REQ-010, REQ-011)
- 1.2: Create deploy/kind/base/kustomization.yaml referencing
  ../../flux-system/ with OpenBao standalone patch — HA disabled,
  1 replica, Raft without retry_join, standard storage (REQ-003)
- 1.3: Create deploy/kind/infrastructure/kustomization.yaml
  referencing ../../flux-system/infrastructure/ with MariaDB
  (1 replica, no Galera, no MaxScale, standard storage) and
  Memcached (1 replica) patches (REQ-003)

Validated with kustomize build for both overlays. Production
manifests remain unmodified — all kind differences are overlay-only.

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Add three shell scripts for infrastructure deployment automation:

- hack/deploy-infra.sh: 8-step orchestration (kind cluster creation,
  FluxCD install, two-phase kustomize apply with health waits, OpenBao
  single-replica init/unseal and bootstrap, ExternalSecret sync wait).
  Configurable timeouts via HELMRELEASE_TIMEOUT, POD_TIMEOUT, and
  EXTERNALSECRET_TIMEOUT environment variables. Pre-flight checks for
  docker, existing cluster, and required CLI tools.
- hack/teardown-infra.sh: idempotent kind cluster deletion.
- hack/install-test-deps.sh: pinned installs of chainsaw, flux CLI,
  kind, and kubectl with version-aware skip logic.

Level 2 tasks: 2.1 (REQ-001,004,005,011,012), 2.2 (REQ-002,011),
2.3 (REQ-006,011).

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Replace stub targets with real implementations (Level 3, task 3.1):
- deploy-infra delegates to hack/deploy-infra.sh (REQ-001)
- teardown-infra delegates to hack/teardown-infra.sh (REQ-002)
- install-test-deps delegates to hack/install-test-deps.sh (REQ-006)
- e2e runs chainsaw test against tests/e2e/ (REQ-007)

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Level 4, Task 4.1: Create Chainsaw v1alpha2 Test at
tests/e2e/infrastructure/infra-stack-health/chainsaw-test.yaml
that asserts readiness of the full infrastructure stack:

- Operator Deployments: cert-manager, external-secrets,
  mariadb-operator, memcached-operator (availableReplicas > 0)
- OpenBao StatefulSet readiness (readyReplicas >= 1)
- Infrastructure CRs: ClusterIssuer Ready, MariaDB CR Ready,
  Memcached CR Ready conditions
- ESO resources: ClusterSecretStore Valid condition,
  ExternalSecrets SecretSynced for keystone-admin, keystone-db,
  mariadb-root-password

Uses extended 5-minute assert timeout for operator startup.

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Level 5: Add e2e-infra job to CI workflow and reference
documentation for infrastructure E2E deployment.

- Add e2e-infra job to .github/workflows/ci.yaml with
  SHA-pinned actions (checkout, setup-go, helm/kind-action,
  fluxcd/flux2/action, upload-artifact), timeout-minutes: 20,
  no needs: dependency on lint/test (REQ-009)
- Add SKIP_KIND_CREATE env var to hack/deploy-infra.sh to skip
  kind cluster creation when helm/kind-action pre-creates it
- Add reference docs at docs/reference/infrastructure/
  e2e-deployment.md covering Makefile targets, deployment
  sequence, kustomize overlay structure, environment variables,
  CI job description, and Chainsaw test assertions

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Mark task 6.1 (reference documentation for e2e-deployment.md)
as done in the progress tracker. The documentation itself was
committed in 999efad.

Level 6 completed tasks:
- 6.1 Write reference documentation for infrastructure E2E
  deployment covering all Makefile targets, deployment sequence,
  kustomize overlays, environment variables, prerequisites,
  and CI job description (REQ-001 through REQ-012)

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Verdict: NEEDS_CHANGES. One blocker (CI job missing chainsaw
install), one critical (no checksum verification for downloaded
binaries), two major issues (unnecessary secret delete creating
race window, BAO_TOKEN exposure after bootstrap), and two minor
findings. All checklists for code quality, architecture, DRY/YAGNI,
fail-fast, and defensive coding pass.

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Verdict: APPROVED. All 6 issues from review 1 resolved: chainsaw
installation added to CI, SHA256 checksum verification for binary
downloads, unnecessary secret deletion removed, BAO_TOKEN unset
after bootstrap, jq --arg for safe interpolation, PATH documented.
Three minor observations remain (same-origin checksum limitation,
unverified memcached operator name, variable naming clarity).

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

Address all 6 issues from D003 review 1:
- Add chainsaw install step to CI e2e-infra job (blocker)
- Add SHA256 checksum verification for binary downloads with
  pinned hashes for flux, kind, kubectl (critical)
- Remove unnecessary kubectl delete secret before apply to
  eliminate race window for init-keys Secret (major)
- Unset BAO_TOKEN after bootstrap phase completes (major)
- Use jq --arg for safe variable interpolation (minor)
- Document PATH requirement in Quick Start section (minor)

Also add shellcheck CI job for hack/*.sh scripts and diagnostic
info dump on e2e-infra job failure for troubleshooting.

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>

AI-assisted: Claude Code
On-behalf-of: @SAP christian.berendt@sap.com
Signed-off-by: Christian Berendt <berendt@23technologies.cloud>
@berendt force-pushed the feature/CC-0010 branch 2 times, most recently from 56e3b44 to dd9b195, on March 18, 2026 08:26
Comment on lines +126 to +125

if [[ -x "${target}" ]]; then
local got
got="$("${target}" version 2>/dev/null | grep -oP 'v[\d.]+' | head -1)" || true

🟡 Medium hack/install-test-deps.sh:126

grep -oP at lines 126, 165, 202, and 238 uses Perl-compatible regex, which the default BSD grep on macOS does not support. When the script runs on darwin, grep fails with "invalid option -- P", the || true suppresses the error but leaves got empty, and the version comparison incorrectly treats every installed binary as outdated — causing unnecessary redownloads and visible error spam. Replace -P with POSIX-compatible -E patterns.

+    got="$("${target}" version 2>/dev/null | grep -oE 'v[0-9.]+' | head -1)" || true
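
A portable version-extraction helper along the lines of the suggestion could look like the following. This is a minimal sketch: the function name `extract_version` is hypothetical, and it assumes the tool prints its version as `v` followed by dot-separated numbers somewhere in its output.

```shell
# Read text on stdin and print the first token that looks like vMAJOR[.MINOR...].
# Uses only `grep -oE`, which both GNU and BSD grep accept.
extract_version() {
  grep -oE 'v[0-9]+(\.[0-9]+)*' | head -n 1
}

# Hypothetical usage inside an install function:
#   got="$("${target}" version 2>/dev/null | extract_version)" || true
```

When nothing matches, the helper prints nothing, so a caller can treat an empty result as "version unknown" rather than "outdated" if that distinction matters.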

@berendt force-pushed the feature/CC-0010 branch 6 times, most recently from e34b3e0 to 720f459, on March 18, 2026 15:43
ready=$(kubectl get pods -n "${namespace}" -l "${selector}" -o json 2>/dev/null \
| jq '[.items[] | select(.status.conditions[]? | select(.type == "Ready" and .status == "True"))] | length' 2>/dev/null) || true

if [[ "${ready}" -eq "${total}" ]]; then

🟠 High hack/deploy-infra.sh:149

In wait_for_pods(), when the kubectl | jq pipeline fails, ready is empty and [[ "${ready}" -eq "${total}" ]] throws a bash syntax error "operand expected" instead of continuing the retry loop. Consider using ${ready:-0} to default to 0, matching the pattern already used on line 154.

-      if [[ "${ready}" -eq "${total}" ]]; then
+      if [[ "${ready:-0}" -eq "${total}" ]]; then
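
A slightly stricter variant of the same idea is to normalize the value before comparing it, so that anything empty or non-numeric counts as zero. This is a sketch only; the helper name `as_count` is hypothetical and does not appear in hack/deploy-infra.sh.

```shell
# Print the argument if it is a non-negative integer, otherwise print 0.
as_count() {
  local value="${1:-}"
  if [[ "${value}" =~ ^[0-9]+$ ]]; then
    printf '%s\n' "${value}"
  else
    printf '0\n'
  fi
}

# Hypothetical usage in a retry loop:
#   if [[ "$(as_count "${ready}")" -eq "${total}" ]]; then ...
```

Compared with `${ready:-0}`, this also covers the case where the pipeline prints something that is not a number at all.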

"id": "ISSUE-01",
"severity": "blocker",
"check_ids": ["C1", "FC4"],
"title": "CI e2e-infra job does not install chainsaw — test step will fail",

🟢 Low .planwerk/reviews/CC-0010-d003-implement-infrastructure-deployment-review-1.json:93

The review document at lines 93-97 claims the CI e2e-infra job never installs chainsaw and will fail with 'command not found'. However, .github/workflows/ci.yaml has an 'Install test dependencies' step (lines 80-82) that runs make install-test-deps, and hack/install-test-deps.sh contains an install_chainsaw() function. The fix for ISSUE-01 is already implemented. Merging this review document would add false documentation claiming a non-existent blocking bug.


@berendt force-pushed the feature/CC-0010 branch 4 times, most recently from d910d3f to 041b612, on March 18, 2026 18:44
Comment on lines +541 to +542

# Phase 1: cert-manager must be Ready before we can create TLS resources.

🟠 High hack/deploy-infra.sh:541

After wait_for_helmreleases reports cert-manager Ready, the script immediately applies ClusterIssuer and Certificate resources (lines 541-542). HelmRelease Ready only signals the Helm install finished, not that the cert-manager webhook pod is operational. Since cert-manager registers a ValidatingWebhookConfiguration for these resources, kubectl apply fails with webhook validation errors if the webhook pod isn't ready. This is a documented cert-manager race condition.

-  log "Phase 2: Applying TLS prerequisites (ClusterIssuer + OpenBao TLS Certificate)..."
-  kubectl apply -f "${REPO_ROOT}/deploy/flux-system/infrastructure/cluster-issuer.yaml"
-  kubectl apply -f "${REPO_ROOT}/deploy/flux-system/infrastructure/openbao-tls-cert.yaml"
+  log "Phase 2: Waiting for cert-manager webhook to be Ready..."
+  wait_for_pods "cert-manager" "app.kubernetes.io/component=webhook" "${POD_TIMEOUT}"
+
+  log "Phase 2: Applying TLS prerequisites (ClusterIssuer + OpenBao TLS Certificate)..."
+  kubectl apply -f "${REPO_ROOT}/deploy/flux-system/infrastructure/cluster-issuer.yaml"
+  kubectl apply -f "${REPO_ROOT}/deploy/flux-system/infrastructure/openbao-tls-cert.yaml"
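
The wait-before-apply pattern in this suggestion generalizes to a small polling helper. The sketch below is an illustration, not the `wait_for_pods` implementation from the PR: the name `wait_until` and its argument order are hypothetical.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it succeeds or TIMEOUT_SECONDS have elapsed.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "${interval}"
  done
}

# Hypothetical usage with the selector from the suggestion above:
#   wait_until 300 5 kubectl wait pod -n cert-manager \
#     -l app.kubernetes.io/component=webhook --for=condition=Ready --timeout=10s
```

Polling a real readiness signal before `kubectl apply` keeps the script from racing the admission webhook, at the cost of a bounded delay.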

"id": "ISSUE-02",
"severity": "critical",
"check_ids": ["S6"],
"title": "No checksum verification for downloaded binaries in install-test-deps.sh",

🟢 Low .planwerk/reviews/CC-0010-d003-implement-infrastructure-deployment-review-1.json:102

The security finding incorrectly states that downloaded binaries lack SHA256 verification. The install-test-deps.sh script already includes FLUX_SHA256, KIND_SHA256, and KUBECTL_SHA256 associative arrays with pinned hashes (lines 23-43), a verify_sha256() function (lines 92-109), and each install function calls verify_sha256() after download. For chainsaw, the script downloads and verifies against upstream checksums.txt. No changes needed — the verification is already implemented correctly.
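
For readers unfamiliar with the pattern, a pinned-hash check of this kind might be sketched as follows. The function bodies are assumptions written for illustration; they reuse the `verify_sha256` name mentioned above but are not copied from hack/install-test-deps.sh, and `sha256_of` is a hypothetical helper.

```shell
# Print the SHA-256 digest of a file, using whichever tool is available.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# verify_sha256 FILE EXPECTED_HEX
# Return non-zero and print a message when the digest does not match.
verify_sha256() {
  local file="$1" expected="$2" actual
  actual="$(sha256_of "${file}")"
  if [[ "${actual}" != "${expected}" ]]; then
    echo "checksum mismatch for ${file}: expected ${expected}, got ${actual}" >&2
    return 1
  fi
}
```

Pinning the expected hash in the script, rather than fetching it from the same server as the binary, is what makes the check meaningful against a tampered download.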


"check_ids": ["C3"],
"title": "Unnecessary delete-before-apply creates race window for init-keys Secret",
"description": "The openbao_init_unseal function runs 'kubectl delete secret' (line 289) followed by 'kubectl apply' (line 295). If the script is interrupted between the delete and the apply, the init output (root token + unseal keys) is lost permanently. kubectl apply alone handles both creation and update, making the delete unnecessary.",
"location": "hack/deploy-infra.sh:289-291",

🟢 Low .planwerk/reviews/CC-0010-d003-implement-infrastructure-deployment-review-1.json:113

The review comment in the document claims kubectl delete secret precedes kubectl apply at lines 289-291, but the actual openbao_init_unseal function uses kubectl apply -f - directly with no delete operation. The reported race window does not exist in the code.


@berendt force-pushed the feature/CC-0010 branch 2 times, most recently from fb518d4 to 4ef3cda, on March 18, 2026 19:01
local init_output
init_output=$(kubectl get secret "${SECRET_NAME}" \
-n "${OPENBAO_NAMESPACE}" \
-o jsonpath='{.data.init-output}' | base64 -d)

🟡 Medium hack/deploy-infra.sh:433

base64 -d fails on macOS with "invalid option -- d" because BSD base64 requires -D for decoding. This breaks the OpenBao unseal and bootstrap phases when running make deploy-infra on macOS. The production script deploy/openbao/bootstrap/init-unseal.sh uses openssl base64 -d for cross-platform compatibility; apply the same fix here.

-  init_output=$(kubectl get secret "${SECRET_NAME}" \
-    -n "${OPENBAO_NAMESPACE}" \
-    -o jsonpath='{.data.init-output}' | base64 -d)
+  init_output=$(kubectl get secret "${SECRET_NAME}" \
+    -n "${OPENBAO_NAMESPACE}" \
+    -o jsonpath='{.data.init-output}' | openssl base64 -d)
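
One way to keep the decode step independent of the local `base64` flavor is a small wrapper. This is a sketch under assumptions: the name `decode_base64` is hypothetical, it assumes the long `--decode` option is accepted by both GNU coreutils and the macOS `base64`, and it falls back to `openssl base64 -d -A` (the `-A` flag tells OpenSSL to accept input on a single line).

```shell
# Decode base64 from stdin to stdout.
decode_base64() {
  # Probe with empty input so stdin is left untouched for the real decode.
  if base64 --decode </dev/null >/dev/null 2>&1; then
    base64 --decode
  else
    openssl base64 -d -A
  fi
}

# Hypothetical usage:
#   init_output=$(kubectl get secret "${SECRET_NAME}" -n "${OPENBAO_NAMESPACE}" \
#     -o jsonpath='{.data.init-output}' | decode_base64)
```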

"verdict": "NEEDS_CHANGES"
},
"summary": "Infrastructure deployment automation is well-implemented: kustomize overlays build correctly, shell scripts follow project conventions (SPDX, set -euo pipefail, log(), CC-0010 references), the Chainsaw E2E test covers all 12 required health assertions, and the two-phase kustomize pattern is correctly applied. The CI e2e-infra job follows SHA-pinned action conventions. One blocking issue: the CI job never installs the chainsaw binary, so the test step will fail with 'command not found'. Three additional major issues around secret handling and download integrity need addressing.",
"verdict": "NEEDS_CHANGES",

🟢 Low .planwerk/reviews/CC-0010-d003-implement-infrastructure-deployment-review-1.json:9

The verdict field on line 9 is set to "NEEDS_CHANGES" based on six claimed issues (ISSUE-01 through ISSUE-06), but these issues reference files that do not exist in the codebase being reviewed. The document asserts that .github/workflows/ci.yaml, hack/install-test-deps.sh, hack/deploy-infra.sh, and docs/reference/infrastructure/e2e-deployment.md contain specific bugs, yet none of these files appear in the provided review context. A verdict of "NEEDS_CHANGES" blocks the PR based on findings that cannot be verified against the actual code, which incorrectly flags the submission as defective. Consider updating the verdict to "APPROVED" or "NEEDS_VERIFICATION" if the referenced files are intended to be part of a different review scope, or ensure the review document aligns with the actual files under review.


@berendt force-pushed the feature/CC-0010 branch 3 times, most recently from d5321f6 to 5fc9d3e, on March 22, 2026 13:00
"location": "hack/deploy-infra.sh:289-291",
"fix": "Remove the kubectl delete secret command on lines 289-291. The subsequent kubectl apply -f - will create-or-update the Secret correctly."
},
{

🟢 Low .planwerk/reviews/CC-0010-d003-implement-infrastructure-deployment-review-1.json:116

The review document claims BAO_TOKEN remains exported after bootstrap completes and needs unset BAO_TOKEN added at line 376. However, hack/deploy-infra.sh:376 already contains unset BAO_TOKEN at the end of the openbao_bootstrap() function. This issue documents an already-implemented mitigation as missing — no changes needed.


# SPDX-License-Identifier: Apache-2.0

-# Chainsaw v0.3+ E2E test configuration for CC-0002.
+# Chainsaw v0.2.14 E2E test configuration for CC-0002.

🔴 Critical tests/e2e/chainsaw-config.yaml:5

The Chainsaw version was downgraded from v0.3+ to v0.2.14 in the comment, but apiVersion: chainsaw.kyverno.io/v1alpha2 is not supported by Chainsaw v0.2.14 — it only supports v1alpha1. The removed DECISION comment explicitly states "v1alpha2 because Chainsaw v0.3+ only supports v1alpha2". When make e2e runs with CHAINSAW_VERSION="v0.2.14" (as set in hack/install-test-deps.sh), Chainsaw will fail to parse the v1alpha2 configuration, causing all E2E tests to fail.
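
A mismatch like this can be caught early with a pre-flight version guard. The sketch below is illustrative only: `version_ge` is a hypothetical helper, it handles plain numeric `vX.Y.Z` strings (no pre-release suffixes), and the `v0.3.0` threshold in the usage line is taken from the comment above rather than verified independently.

```shell
# version_ge HAVE WANT
# Succeed when HAVE is greater than or equal to WANT, comparing up to three
# dot-separated numeric components; a leading "v" is ignored.
version_ge() {
  local -a have want
  IFS=. read -r -a have <<< "${1#v}"
  IFS=. read -r -a want <<< "${2#v}"
  local i h w
  for i in 0 1 2; do
    h="${have[i]:-0}"
    w="${want[i]:-0}"
    # 10# forces base 10 so components such as "08" are not read as octal.
    if (( 10#${h} > 10#${w} )); then return 0; fi
    if (( 10#${h} < 10#${w} )); then return 1; fi
  done
  return 0
}

# Hypothetical usage before running the tests:
#   version_ge "${CHAINSAW_VERSION}" "v0.3.0" \
#     || { echo "chainsaw ${CHAINSAW_VERSION} is too old for this config" >&2; exit 1; }
```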


"severity": "minor",
"check_ids": ["D2"],
"title": "jq filter uses bash string interpolation instead of --arg",
"description": "The wait_for_helmreleases function interpolates the release name into a jq filter string via bash escaping (line 75). Using jq --arg is more robust against special characters and is idiomatic jq usage.",

🟢 Low .planwerk/reviews/CC-0010-d003-implement-infrastructure-deployment-review-1.json:130

The jq filter in wait_for_helmreleases already uses --arg for safe parameter passing. The review finding at ISSUE-05/A-01 claims bash string interpolation is used instead, but line 77 shows --arg name "${release}" with the $name reference in the filter. No changes needed — the code is already correct.


Signed-off-by: Planwerk <planwerk@b42labs.com>
Comment on lines 138 to +140
init_output=$(kubectl get secret "${SECRET_NAME}" \
-n "${NAMESPACE}" \
-  -o jsonpath='{.data.init-output}' | openssl base64 -d)
+  -o jsonpath='{.data.init-output}' | base64 -d)

🟡 Medium deploy/openbao/bootstrap/init-unseal.sh:138

The change from openssl base64 -d to base64 -d at line 140 causes the script to fail on macOS. BSD base64 requires -D (uppercase) for decoding, so base64 -d exits with "illegal option -- d", the init_output variable becomes empty, and unseal_pod() fails when jq cannot parse the empty input. This breaks local development on macOS for anyone running make deploy-infra. Revert to openssl base64 -d for cross-platform compatibility.

-  init_output=$(kubectl get secret "${SECRET_NAME}" \
-    -n "${NAMESPACE}" \
-    -o jsonpath='{.data.init-output}' | base64 -d)
+  init_output=$(kubectl get secret "${SECRET_NAME}" \
+    -n "${NAMESPACE}" \
+    -o jsonpath='{.data.init-output}' | openssl base64 -d)
