feat(github-workspace): add GitHub Actions workspace kind with git-backed state persistence#11457
…optionally delete the k3d cluster.
- Created `gitstate` package for managing state commits to an orphan branch.
- Added `pgbackup` package for handling PostgreSQL backups and restores.
- Introduced `k3d` package for managing k3d clusters.
- Enhanced workspace connection handling to support GitHub workspaces.
- Updated validation logic to accommodate new workspace types.
- Added tests for new functionality in shutdown, gitstate, k3d, and pgbackup packages.

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
…est files

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
pkg/cli/cmd/radinit/github.go
    Namespace: "default",
    },
    Recipes: recipePackOptions{
    DevRecipes: true,
The functional test workflow skips dev recipes, but this always sets them to true. I might have missed it, but does this get overwritten?
This would be used in the new workflow that is the end-to-end test for repo radius. I imagine we will extend that workflow with other end-to-end tests rather than shoehorn them into the existing functional tests.
Correct — the functional-test-github-workspace.yaml workflow is a separate end-to-end test specific to the GitHub workspace lifecycle, not the existing functional test suite. Dev recipes are intentional here because the workflow tests a real Radius install with actual recipe execution. The flag difference is by design.
- Add LockInfo, ErrDeployLockHeld, NewLockInfoFromEnv to gitstate
- Add TryAcquireDeployLock / ReleaseDeployLock to StateWorktree
- Wire acquireDeployLock into rad deploy Runner (github kind only)
- Retry of the same workflow run (higher GITHUB_RUN_ATTEMPT) takes over a stale lock
- Unrelated workspace kinds and nil-field Runner are no-ops
- Full test coverage for all lock paths in gitstate and deploy

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
The helm unit tests were asserting against wrong data keys:
- ucp/configmaps.yaml: 'ucp.yaml' -> 'ucp-config.yaml'
- rp/configmaps.yaml: 'applications-rp.yaml' -> 'radius-self-host.yaml'
- dynamic-rp/configmaps.yaml: 'dynamic-rp.yaml' -> 'radius-self-host.yaml'

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
…map assertions

helm-unittest's 'contains' is for arrays; configmap data values are multi-line strings. Switch to 'matchRegex', which works on strings.

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
- Remove duplicate copyright/package declarations in k3d_test.go and pgbackup_test.go
- Fix workflow permissions: contents: read -> contents: write (needed to push radius-state branch)
- Add dynamic_rp database to pgbackup, init-db ConfigMap, and dynamic-rp configmap URL (was incorrectly using 'ucp' user/database for dynamic-rp)
- Update pgbackup_test.go to expect 3 databases
- Document all three backed-up databases in github-workspace.md

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
- actions/checkout: 11bd719 (v4.2.2) -> de0fac2 (v6.0.2)
- actions/setup-go: 0aaccfd (v5.4.0, invalid SHA) -> 4b73464 (v6.3.0)
- actions/upload-artifact: ea165f8 (v4.6.2) -> bbbca2d (v7.0.0)

The invalid setup-go SHA caused 'Set up job' to fail immediately because the runner could not download the action at that commit hash.

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
…setup-kubectl SHA

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
…nit call

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
…ctions)

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
Use tea.WithInput(strings.NewReader("")) instead of short-circuiting the
channel so that RunProgram is always called (test mocks and the goroutine
that drains the progress channel continue to function correctly).
Also tidy gitstate_test.go: drop the unused dir variable from
Test_TryAcquireDeployLock_ReadOnlyDir and use t.Cleanup for the
permission-restore.
Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>

# Conflicts:
#	pkg/cli/cmd/radinit/init.go
Codecov Report
@@ Coverage Diff @@
## main #11457 +/- ##
==========================================
- Coverage 51.07% 51.06% -0.01%
==========================================
Files 699 706 +7
Lines 44316 44821 +505
==========================================
+ Hits 22634 22890 +256
- Misses 19517 19743 +226
- Partials 2165 2188 +23
- Remove DefaultRecipePack from enterGitHubInitOptions: Radius.Core/recipePacks is a new API type added in this PR; the published Helm chart images used by the E2E test predate this addition and return 400 BadRequest on that endpoint. The GitHub workspace E2E only verifies cluster setup and PG backup/restore, so recipe packs are not needed.
- Override database.image to docker.io/library/postgres: the chart default 'mirror/postgres' resolves to ghcr.io/radius-project/mirror/postgres (non-existent) when no global.imageRegistry is set in public CI.
- Fix import ordering in github_test.go (gofmt)

Signed-off-by: Sylvain Niles <sylvainniles@microsoft.com>
Description
Adds a new `"github"` workspace kind to Radius for use in GitHub Actions workflows. When this kind is active, Radius manages a k3d cluster for the duration of the workflow, persists database state between runs via a git orphan branch, and guards against concurrent or interrupted deploys with a deploy lock.
How it works
- `rad init --kind github`: creates a k3d cluster, installs Radius with PostgreSQL as the state backend (`database.enabled=true`), then checks the state orphan branch for prior backup state and restores it if the previous run shut down cleanly.
- `rad deploy`: acquires a `.deploy-lock` on the `radius-state` orphan branch before deploying. A later retry of the same workflow run (higher `GITHUB_RUN_ATTEMPT`) automatically takes over a stale lock so the job can continue. A lock held by a completely different run returns an error immediately.
- `rad shutdown [--cleanup]`: backs up PostgreSQL to the state worktree (via `kubectl exec pg_dump`), commits and pushes to `radius-state`, writes `.backup-ok`, and optionally deletes the k3d cluster.
State isolation: git worktree
State files (SQL dumps, `.lock`, `.backup-ok`, `.deploy-lock`) live only in the orphan branch, checked out into an OS temp directory via `git worktree add`. They are never written to the application working tree and never appear in `git status` on `main`.
Semaphore / spot-instance safety
`.lock` | `.backup-ok` | `rad init` behaviour
Deploy lock (idempotent retry)
`.deploy-lock` present | `RunID`, lower `RunAttempt` | `RunID`, same `RunAttempt` | `RunID` | `ErrDeployLockHeld` | `ErrDeployLockHeld`
Helm PostgreSQL fixes (pre-existing gaps)
- `provider: apiserver`: now conditional on `database.enabled`.
- `/docker-entrypoint-initdb.d/`.
- `POSTGRES_DB` secret value was the literal string `"POSTGRES_DB"`: fixed to `"radius"`.
New packages
- `pkg/cli/k3d`: `os/exec`
- `pkg/cli/pgbackup`: `kubectl exec pg_dump`/`psql`
- `pkg/cli/gitstate`: `git worktree`; semaphore + deploy lock
New commands
- `rad init --kind github`
- `rad deploy` (extended: acquires deploy lock for GitHub workspaces)
- `rad shutdown [--cleanup]`
New workflow
`.github/workflows/functional-test-github-workspace.yaml`: end-to-end integration test exercising the full init → deploy → shutdown → restore lifecycle on a GitHub Actions runner.
Outstanding TODOs (tracked in code)
- `rad resource type sync` after restore (command does not yet exist)
Type of change
Contributor checklist
Please verify that the PR meets the following requirements, where applicable: