Fix LRT Azure failures: scope skip-list update to install/upgrade only #11456
…pshot to install/upgrade only Co-authored-by: sk593 <42750942+sk593@users.noreply.github.com>
      run: |
        kubectl get resources.ucp.dev -n radius-system --no-headers -o custom-columns=":metadata.name" > skip-delete-resources-list.txt

    - name: Save list of resources not to be deleted
Is the save needed each time the workflow runs with a new key? Perhaps the script can report whether an install/upgrade was done, and we use that to decide?
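One way to read this suggestion is sketched below: the script writes a step output that later workflow steps can test. The output name `installation_changed`, the helper `report_installation_changed`, and the `action` variable are hypothetical and do not come from the PR. `GITHUB_OUTPUT` is the file GitHub Actions reads step outputs from.

```shell
#!/usr/bin/env bash
set -euo pipefail

# GitHub Actions sets GITHUB_OUTPUT for each step; fall back to a temp file
# so this sketch also runs outside of Actions.
GITHUB_OUTPUT="${GITHUB_OUTPUT:-$(mktemp)}"

# Hypothetical helper: record whether this run installed or upgraded Radius.
report_installation_changed() {
  local changed="$1" # "true" after install/upgrade, "false" on the no-op path
  echo "installation_changed=${changed}" >> "${GITHUB_OUTPUT}"
}

# Stand-in for the script's real decision; "none" models the version-match
# no-op path, "install" and "upgrade" model the other two branches.
action="none"

case "${action}" in
  install|upgrade) report_installation_changed true ;;
  *)               report_installation_changed false ;;
esac
```

A later step could then be gated with a condition such as `if: steps.<step-id>.outputs.installation_changed == 'true'`, where `<step-id>` is whatever id the workflow assigns to the script step.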
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #11456 +/- ##
==========================================
+ Coverage 51.21% 51.23% +0.02%
==========================================
Files 699 699
Lines 44050 44050
==========================================
+ Hits 22560 22571 +11
+ Misses 19330 19326 -4
+ Partials 2160 2153 -7
Pull request overview
Fixes Azure long-running test (LRT) failures by ensuring skip-delete-resources-list.txt is only refreshed when Radius is actually installed/upgraded, instead of being overwritten on every run.
Changes:
- Move UCP resource snapshot generation into save_skip_resources_list() inside manage-radius-installation.sh.
- Call the snapshot helper only on install/upgrade paths (not on the version-match no-op path).
- Remove the unconditional workflow step that regenerated the skip list every run.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| .github/workflows/long-running-azure.yaml | Removes the always-run kubectl snapshot step; relies on the script to generate the skip list only when needed. |
| .github/scripts/manage-radius-installation.sh | Adds save_skip_resources_list() helper and invokes it from install/upgrade branches. |
Comments suppressed due to low confidence (1)
.github/workflows/long-running-azure.yaml:282
- After removing the step that always generated skip-delete-resources-list.txt, this cache save step can run when the file was neither restored (cache miss) nor created (version-match no-op). actions/cache/save typically fails when the specified path doesn't exist, which would break the workflow even though skipping install/upgrade is a valid outcome. Guard this step with an if that checks the file exists (or create a minimal file in the no-op path without overwriting an existing one).
- name: Save list of resources not to be deleted
  uses: actions/cache/save@668228422ae6a00e4ad889ee87cd7109ec5666a7 # v5.0.4
  with:
    path: skip-delete-resources-list.txt
    key: skip-delete-resources-list-${{ steps.gen-id.outputs.UNIQUE_ID || github.run_id }}
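The second option in the comment above (create a minimal file on the no-op path without overwriting an existing one) could look roughly like the sketch below. The function name `ensure_skip_list_exists` and the demo file names are hypothetical. The PR page also does not show how the cleanup step interprets an empty list, so whether an empty file is a safe placeholder is an assumption to verify.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: make sure the skip list exists so a later cache save
# has a path to store, but never touch a list that was already restored.
ensure_skip_list_exists() {
  local skip_list="${1:-skip-delete-resources-list.txt}"
  if [[ ! -e "${skip_list}" ]]; then
    : > "${skip_list}" # create an empty placeholder
    echo "Created empty ${skip_list}."
  fi
}

# Demo in a scratch directory: one list restored from cache, one missing.
demo_dir="$(mktemp -d)"
printf 'resource-a\n' > "${demo_dir}/restored.txt"

ensure_skip_list_exists "${demo_dir}/restored.txt" # leaves the content untouched
ensure_skip_list_exists "${demo_dir}/missing.txt"  # creates an empty file
```

The existence check is what keeps a restored list intact; an unconditional redirect would truncate it.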
if ! kubectl get resources.ucp.dev -n radius-system --no-headers -o custom-columns=":metadata.name" > skip-delete-resources-list.txt; then
  echo "Error: Failed to retrieve UCP resources from cluster." >&2
  exit 1
fi
echo "Skip resources list saved."
save_skip_resources_list redirects output directly to skip-delete-resources-list.txt, which truncates the existing file before kubectl succeeds. If kubectl get fails (transient API error, RBAC, etc.), the workflow can lose the previously-restored skip list and later cleanup may delete resources that should have been preserved. Write to a temporary file and atomically replace the destination only on success (and leave the previous file intact on failure).
- if ! kubectl get resources.ucp.dev -n radius-system --no-headers -o custom-columns=":metadata.name" > skip-delete-resources-list.txt; then
-   echo "Error: Failed to retrieve UCP resources from cluster." >&2
-   exit 1
- fi
- echo "Skip resources list saved."
+ local tmp_file
+ tmp_file="$(mktemp skip-delete-resources-list.txt.XXXXXX)"
+ if kubectl get resources.ucp.dev -n radius-system --no-headers \
+   -o custom-columns=":metadata.name" > "${tmp_file}"; then
+   mv "${tmp_file}" skip-delete-resources-list.txt
+   echo "Skip resources list saved."
+ else
+   echo "Error: Failed to retrieve UCP resources from cluster." >&2
+   rm -f "${tmp_file}"
+   exit 1
+ fi
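The sketch below shows how the temp-file pattern in this suggestion behaves, using stand-in commands in place of kubectl so it runs without a cluster. `save_output_atomically`, `fake_list_ok`, and `fake_list_fail` are hypothetical names for illustration and are not part of the PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Run a command and replace DEST with its output only if the command succeeds.
# The temp file is created next to DEST so the final mv is a rename on the
# same filesystem.
save_output_atomically() {
  local dest="$1"
  shift
  local tmp_file
  tmp_file="$(mktemp "${dest}.XXXXXX")" || return 1
  if "$@" > "${tmp_file}"; then
    mv "${tmp_file}" "${dest}"
  else
    rm -f "${tmp_file}" # discard partial output, keep the previous DEST
    return 1
  fi
}

# Stand-ins for a succeeding and a failing kubectl call.
fake_list_ok() { printf 'resource-a\nresource-b\n'; }
fake_list_fail() { echo "partial"; return 1; }

demo_dir="$(mktemp -d)"
dest="${demo_dir}/skip-delete-resources-list.txt"
printf 'previous\n' > "${dest}" # models a list restored from cache

save_output_atomically "${dest}" fake_list_fail || echo "Fetch failed; previous list kept."
after_failure="$(cat "${dest}")"

save_output_atomically "${dest}" fake_list_ok
after_success="$(cat "${dest}")"
```

With a direct redirect to the destination, the failing call would have left the file holding only the partial output; here the previous contents survive until a fetch succeeds.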
LRT on Azure was failing because skip-delete-resources-list.txt was unconditionally overwritten on every run, regardless of whether an install/upgrade occurred.
Changes
- .github/scripts/manage-radius-installation.sh: adds the save_skip_resources_list() helper (with error handling) and calls it only in the install and upgrade branches, not the version-match no-op path.
- .github/workflows/long-running-azure.yaml: removes the unconditional snapshot step; the snapshot is now generated in manage-radius-installation.sh and is scoped to install/upgrade only.
Type of change
Contributor checklist