fix(release): retry gh pr merge to cover GraphQL propagation lag #54

Merged: JohnnyVicious merged 1 commit into main from fix/release-graphql-race on Apr 12, 2026

@JohnnyVicious
Owner

Summary

Wraps `gh pr merge` in the release workflow with a 5-attempt retry loop (2s backoff) to ride out GitHub's GraphQL propagation lag between `gh pr create` and the subsequent merge.

Why

The v1.0.4 release hit this in anger. Workflow run 24312750234 failed at step 11 with:

    Created release PR: https://github.com/.../pull/53
    GraphQL: Could not resolve to a PullRequest with the number of 53.
    ##[error]Process completed with exit code 1.

`gh pr create` returned synchronously with the new PR URL, but `gh pr merge` fired ~500ms later and the PR node wasn't yet resolvable from the merge endpoint: classic eventual consistency. I confirmed it was a pure race by checking PR #53 manually a few seconds later. It was already MERGEABLE, and `gh pr merge 53 --squash --delete-branch` worked on the first try.

v1.0.2 and v1.0.3 releases happened to land on the fast side of the race; this fix is about hardening the automation, not reacting to a new failure mode.

Test plan

  • YAML syntax parses (the change to `.github/workflows/release.yml` is a single-step edit)
  • Retry loop logic is standard bash: attempts 1..5 with `sleep 2` between them, exiting non-zero with a `::error::` annotation on final failure
  • Real validation: the next release cut will exercise the retry branch if GitHub lags, or the happy path if it doesn't. Either way it ships v1.0.5+ cleanly.
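
The workflow step itself is not reproduced in this conversation. As a rough sketch of the loop described in the test plan, assuming a hypothetical `retry` helper (the function name, argument order, and log wording are illustrative and not taken from `release.yml`), it could look something like this:

```shell
# Hypothetical helper, not the actual workflow step.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, sleeping
# DELAY_SECONDS between failed attempts (no sleep after the last one).
retry() {
  retry_max=$1
  retry_delay=$2
  shift 2
  retry_attempt=1
  while [ "$retry_attempt" -le "$retry_max" ]; do
    # Stop as soon as the wrapped command succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$retry_attempt" -lt "$retry_max" ]; then
      echo "Attempt $retry_attempt/$retry_max failed; retrying in ${retry_delay}s" >&2
      sleep "$retry_delay"
    fi
    retry_attempt=$((retry_attempt + 1))
  done
  # GitHub Actions renders a line starting with ::error:: as an error annotation.
  echo "::error::Command failed after $retry_max attempts: $*"
  return 1
}
```

In the release step this might be invoked as `retry 5 2 gh pr merge "$PR_URL" --squash --delete-branch`, where `PR_URL` is an assumed variable holding the URL printed by `gh pr create`. With five attempts and a 2s delay, the loop sleeps at most four times before giving up.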

The v1.0.4 release (run 24312750234) failed at step 11 with:

    Created release PR: https://github.com/.../pull/53
    GraphQL: Could not resolve to a PullRequest with the number of 53.
    ##[error]Process completed with exit code 1.

`gh pr create` had returned the PR URL synchronously, but
`gh pr merge` fired ~500ms later and hit GraphQL's eventual-
consistency lag — the PR node wasn't yet resolvable from the merge
endpoint even though it had just been created. By the time I
checked manually the PR was fully mergeable, confirming it was a
pure propagation race and not a real auth/permission failure.

Wrap `gh pr merge` in a 5-attempt retry loop with a 2s backoff.
Worst case the workflow spends an extra 8s; best case the first
attempt still wins. The v1.0.2 and v1.0.3 releases didn't hit this
because they happened to land on the fast side of the race, so the
fix is hardening against a flaky GitHub API, not a new failure
mode.
@JohnnyVicious JohnnyVicious merged commit a58b5a6 into main Apr 12, 2026
1 check passed
@JohnnyVicious JohnnyVicious deleted the fix/release-graphql-race branch April 12, 2026 18:02