[ML] Automate version bump in CI pipeline #3018

Open
edsavage wants to merge 3 commits into elastic:main from edsavage:feature/version-bump-automation

edsavage commented Apr 7, 2026

Summary

Replaces the manual block step in the version-bump pipeline with automated version bump logic, supporting both patch and minor release workflows.

Patch workflow (WORKFLOW=patch, default)

  1. Bump version — checks out $BRANCH, updates elasticsearchVersion in gradle.properties to $NEW_VERSION, commits as elasticsearchmachine, pushes
  2. Fetch DRA Artifacts — polls staging + snapshot artifact URLs until the new version is available

Minor workflow (WORKFLOW=minor, feature freeze day)

  1. Create minor branch — derives the branch name from the current version on $BRANCH (e.g., main at 9.4.0 → branch 9.4), pushes it (inherits the current version)
  2. Bump upstream — bumps $BRANCH (e.g., main) to $NEW_VERSION (e.g., 9.5.0), commits, pushes
  3. Fetch DRA Artifacts — two parallel steps:
    • Upstream branch: polls for $NEW_VERSION staging + snapshot
    • Minor branch: polls for the inherited version staging + snapshot
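
For illustration, the branch-name derivation in step 1 of the minor workflow could look roughly like the sketch below. It is written in Python rather than taken from dev-tools/bump_version.sh. The function names `read_version` and `minor_branch_name` are hypothetical, and the assumed line format `elasticsearchVersion=X.Y.Z` (with an optional suffix) is a guess at the layout of gradle.properties, which this PR does not show.

```python
import re

# Assumed line format: elasticsearchVersion=9.4.0. A trailing suffix such as
# -SNAPSHOT is tolerated; the real file layout is not shown in this PR.
VERSION_LINE = re.compile(
    r"^elasticsearchVersion\s*=\s*(\d+)\.(\d+)\.(\d+)", re.MULTILINE
)


def read_version(properties_text: str) -> tuple[int, int, int]:
    """Return (major, minor, patch) parsed from gradle.properties content."""
    match = VERSION_LINE.search(properties_text)
    if match is None:
        raise ValueError("elasticsearchVersion not found")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def minor_branch_name(properties_text: str) -> str:
    """Derive the minor branch name: version 9.4.0 gives branch '9.4'."""
    major, minor, _patch = read_version(properties_text)
    return f"{major}.{minor}"
```

Because the new branch is cut before the upstream bump, it inherits the version it was named after, which is why no separate commit is needed on the minor branch.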

Pattern

Follows the established Elasticsearch repo pattern for automated commits from CI:

  • elasticsearchmachine / infra-root+elasticsearchmachine@elastic.co as committer
  • Local git config (not --global)
  • git diff-index --quiet HEAD for idempotency
  • git pull --ff-only before push to handle concurrent commits
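
As a rough sketch of how these four points fit together, the sequence can be modelled in Python with an injectable command runner. This is not the script from this PR. The names `run_git` and `commit_and_push` are hypothetical, the ordering of commit and pull is an assumption, and the `dry_run` flag mirrors the DRY_RUN option described under Script features.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]  # runs a git subcommand, returns exit code


def run_git(args: Sequence[str]) -> int:
    """Default runner: execute git with the given arguments in the current repo."""
    return subprocess.run(["git", *args]).returncode


def commit_and_push(message: str, branch: str, dry_run: bool = False,
                    run: Runner = run_git) -> bool:
    """Commit tracked changes as elasticsearchmachine and push them.

    Returns False without committing when the working tree matches HEAD.
    """
    # Local config only: the identity applies to this clone, not the machine.
    run(["config", "user.name", "elasticsearchmachine"])
    run(["config", "user.email", "infra-root+elasticsearchmachine@elastic.co"])

    # diff-index --quiet exits 0 when there is nothing to commit.
    if run(["diff-index", "--quiet", "HEAD"]) == 0:
        return False

    if run(["commit", "-am", message]) != 0:
        raise RuntimeError("git commit failed")
    # Only succeeds if the branch can fast-forward; a diverged history
    # surfaces as an error here instead of as a rejected push.
    if run(["pull", "--ff-only"]) != 0:
        raise RuntimeError("git pull --ff-only failed")
    if not dry_run:
        if run(["push", "origin", branch]) != 0:
            raise RuntimeError("git push failed")
    return True
```

Passing the runner as a parameter keeps the sequence testable without touching a real repository.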

Script features

  • DRY_RUN=true — performs all steps except git push
  • Portable sed -i (macOS/Linux)
  • Idempotent: skips if version already matches, skips branch creation if it already exists
  • Reusable helpers: git_push, sed_inplace, configure_git, bump_version_on_branch
  • Unknown WORKFLOW values rejected with clear error
  • Exports MINOR_BRANCH/MINOR_VERSION via Buildkite meta-data for downstream steps
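
The idempotent version rewrite and the WORKFLOW check can be illustrated with a short Python sketch. The actual script uses sed, so this is only an equivalent under stated assumptions: `set_version` and `validate_workflow` are hypothetical names, the `elasticsearchVersion=<value>` line format is assumed, and replacing the whole value (including any suffix) is a simplification.

```python
import re

VALID_WORKFLOWS = ("patch", "minor")

# Assumed layout: one line of the form elasticsearchVersion=<value>.
VERSION_LINE = re.compile(r"^(elasticsearchVersion\s*=\s*)(\S+)$", re.MULTILINE)


def validate_workflow(workflow: str) -> str:
    """Reject unknown WORKFLOW values with a clear error."""
    if workflow not in VALID_WORKFLOWS:
        expected = ", ".join(VALID_WORKFLOWS)
        raise ValueError(f"Unknown WORKFLOW '{workflow}': expected one of {expected}")
    return workflow


def set_version(properties_text: str, new_version: str) -> tuple[str, bool]:
    """Return (updated_text, changed).

    changed is False when the file already holds new_version, which is the
    "nothing to do" case.
    """
    match = VERSION_LINE.search(properties_text)
    if match is None:
        raise ValueError("elasticsearchVersion not found")
    if match.group(2) == new_version:
        return properties_text, False
    # A callable replacement avoids backreference surprises with digit-leading values.
    updated = VERSION_LINE.sub(
        lambda m: m.group(1) + new_version, properties_text, count=1
    )
    return updated, True
```

Returning the `changed` flag lets the caller skip the commit and push entirely on a re-run with the same version.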

Pipeline features

  • WORKFLOW env var selects pipeline shape at generation time
  • Minor branch name/version derived from NEW_VERSION (e.g., 9.5.0 → 9.4 branch, 9.4.0 version)
  • DRA step generation refactored into helper functions
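
A minimal sketch of workflow-aware step generation, assuming the step shapes below, which are illustrative and not copied from .buildkite/job-version-bump.json.py. The helper names `minor_from_new_version`, `dra_step` and `build_steps`, the `bump-version` key, and the `VERSION` env entry are all hypothetical. The real DRA steps poll staging and snapshot artifact URLs, which are not reproduced here.

```python
import json


def minor_from_new_version(new_version: str) -> tuple[str, str]:
    """For NEW_VERSION 9.5.0 return ('9.4', '9.4.0')."""
    major, minor, _patch = (int(part) for part in new_version.split("."))
    if minor == 0:
        raise ValueError(f"no previous minor can be derived from {new_version}")
    branch = f"{major}.{minor - 1}"
    return branch, f"{branch}.0"


def dra_step(label: str, version: str) -> dict:
    """Build one artifact-polling step that waits for the bump step."""
    return {
        "label": label,
        "depends_on": "bump-version",
        "env": {"VERSION": version},  # hypothetical variable name
    }


def build_steps(workflow: str, new_version: str) -> list[dict]:
    """Patch yields 2 steps; minor yields 3 (bump plus two parallel DRA steps)."""
    bump = {"label": "Bump version", "key": "bump-version"}
    if workflow == "patch":
        return [bump, dra_step("Fetch DRA Artifacts", new_version)]
    if workflow == "minor":
        minor_branch, minor_version = minor_from_new_version(new_version)
        return [
            bump,
            dra_step("Fetch DRA Artifacts (upstream)", new_version),
            dra_step(f"Fetch DRA Artifacts ({minor_branch})", minor_version),
        ]
    raise ValueError(f"Unknown WORKFLOW '{workflow}'")


if __name__ == "__main__":
    print(json.dumps({"steps": build_steps("minor", "9.5.0")}, indent=2))
```

Both DRA steps depend only on the bump step, so Buildkite is free to run them in parallel.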

Files

  • dev-tools/bump_version.sh — standalone script with patch + minor support
  • .buildkite/job-version-bump.json.py — pipeline generator with workflow-aware step generation

Test plan

  • CI passes
  • Dry-run patch: NEW_VERSION=99.99.99 BRANCH=test/... DRY_RUN=true — commit created with correct author/message, no push
  • Real push patch: commit pushed to throwaway branch successfully
  • Idempotency patch: re-running with same version produces "nothing to do"
  • Dry-run minor: NEW_VERSION=99.99.0 BRANCH=test/... WORKFLOW=minor DRY_RUN=true — creates 9.4 branch, bumps upstream to 99.99.0, no push
  • Unknown workflow: rejected with clear error
  • Edge cases: missing NEW_VERSION/BRANCH fail with clear errors; non-existent branch fails at checkout
  • Pipeline JSON: patch generates 2 steps, minor generates 3 steps with correct artifact URLs
  • Repo-level branch protection: elasticsearchmachine added to bypass list on main, 9.3, 9.2, 9.1, 8.18
  • Blocker: Elastic org-level ruleset [org] Require a PR has no bypass actors. Coordinate with Release Engineering.
  • End-to-end Buildkite pipeline test (requires PR merged to main and org-level bypass resolved)

Replace the manual block step in the version-bump pipeline with an
automated step that:
1. Checks out the target branch
2. Updates elasticsearchVersion in gradle.properties to $NEW_VERSION
3. Commits as elasticsearchmachine
4. Pushes directly to the branch (no PR needed)

Follows the same pattern as Elasticsearch's automated Lucene snapshot
updates (.buildkite/scripts/lucene-snapshot/update-es-snapshot.sh).

The Fetch DRA Artifacts step now depends on the bump step, ensuring
the version is updated before polling for artifacts at the new version.

Made-with: Cursor

prodsecmachine commented Apr 7, 2026

Snyk checks have passed. No issues have been found so far.

Scan Engine           Critical  High  Medium  Low  Total (0)
Open Source Security  0         0     0       0    0 issues
Licenses              0         0     0       0    0 issues


elasticsearchmachine and others added 2 commits April 8, 2026 15:48
Adds a DRY_RUN=true option that performs all steps (checkout, sed,
commit) but skips the final git push. Useful for testing the pipeline
and for local verification.

Also makes sed portable across macOS/Linux and uses local git config
instead of --global.

Made-with: Cursor

Extends the version bump script and pipeline to support the minor
release workflow (feature freeze day) in addition to patch bumps.

Script changes (bump_version.sh):
- Accepts WORKFLOW env var: 'patch' (default) or 'minor'
- Minor workflow: creates a new minor branch from BRANCH (e.g.,
  9.4 from main inheriting current version), then bumps BRANCH
  to NEW_VERSION (the next minor)
- Exports MINOR_BRANCH and MINOR_VERSION via Buildkite meta-data
  for downstream steps
- Idempotent: skips branch creation if it already exists on remote
- Refactored into reusable helpers (git_push, sed_inplace,
  configure_git, bump_version_on_branch)

Pipeline changes (job-version-bump.json.py):
- Reads WORKFLOW env var at pipeline generation time
- Patch: 1 bump step + 1 DRA step (2 artifact checks) — unchanged
- Minor: 1 bump step + 2 parallel DRA steps (upstream + minor
  branch, 4 artifact checks total covering staging + snapshot
  for both branches)
- Minor branch name/version derived from NEW_VERSION at generation
  time (e.g., NEW_VERSION=9.5.0 → minor branch 9.4, version 9.4.0)
- Refactored DRA step generation into helper functions

Tested locally with DRY_RUN=true on a throwaway branch:
- Minor workflow correctly creates branch, bumps upstream
- Patch workflow backward-compatible
- Unknown workflow rejected with clear error

Made-with: Cursor

edsavage commented Apr 10, 2026

Status checklist

Consolidated tracker for this PR against the Version Bump Automation PSI spec.

Core functionality

  • Patch workflow — bump gradle.properties on target branch, commit, push
  • Minor workflow — create minor branch from upstream (e.g. 9.4 from main), bump upstream to next minor, push both
  • Idempotency — skip if version already matches; skip branch creation if branch already exists
  • DRY_RUN mode — performs all steps except git push
  • WORKFLOW parameter: patch (default) or minor; unknown values rejected
  • Portable: sed -i handles macOS/Linux; local git config (not --global)

Pipeline / DRA artifact verification

  • Patch pipeline — 1 bump step + 1 DRA step (staging + snapshot for BRANCH)
  • Minor pipeline — 1 bump step + 2 parallel DRA steps (upstream + minor branch, 4 artifact checks)
  • json-watcher plugin — polls artifact URLs with 30s interval, 240min timeout
  • Slack notifications: #machine-learn-build on pass/fail
  • DRA build triggering: the version bump commit triggers the DRA build automatically, so no explicit API trigger is needed
  • Team failure notifications — current Slack notification doesn't ping @ml-team directly; spec requires immediate team notification
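
To make the polling parameters concrete, here is a small Python sketch of poll-until-available semantics with a 30s interval and a 240min timeout. It is not the json-watcher plugin's implementation. `poll_until` is a hypothetical helper, and `check` stands in for whatever test decides that an artifact URL reports the expected version.

```python
import time
from typing import Callable


def poll_until(check: Callable[[], bool],
               interval_s: float = 30.0,
               timeout_s: float = 240 * 60,
               sleep: Callable[[float], None] = time.sleep,
               clock: Callable[[], float] = time.monotonic) -> bool:
    """Call check() every interval_s seconds until it returns True.

    Returns False once another wait would run past the timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() + interval_s > deadline:
            return False
        sleep(interval_s)
```

With these defaults a step that never sees its artifact gives up after at most 240 * 60 / 30 = 480 waits, plus one final check.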

Branch protection

  • Repo-level bypass: elasticsearchmachine added to bypass list on main, 9.3, 9.2, 9.1, 8.18
  • Org-level ruleset: [org] Require a PR has no bypass actors; only org admins can update. Other teams will hit the same blocker. Coordinate with Release Engineering.
  • New branch protection — when the minor workflow creates a new branch (e.g. 9.5), it won't have elasticsearchmachine in the bypass list until someone adds it. Consider automating this.

Testing

  • Patch dry-run: correct author, message, diff, no push
  • Patch real push: to throwaway branch
  • Patch idempotency: "nothing to do" on re-run
  • Minor dry-run: creates branch, bumps upstream, no push
  • Edge cases: missing env vars, non-existent branch, unknown workflow
  • Pipeline JSON: correct step structure for both workflows
  • End-to-end Buildkite pipeline test (requires merge + org bypass)

Spec items not yet addressed

  • Push retry logic — if a concurrent commit lands between checkout and push, the push fails. The spec says team pipelines should handle retries. Consider a pull-rebase-retry loop.
  • SLSA 0.1 compliance — need to confirm whether ml-cpp requires human approvals for automated commits. If so, a PR-based approach would be needed instead of direct push.
  • Skip ITs/E2E for bump commits — spec recommends skipping heavy tests for version-bump-only commits. No mechanism to flag these currently.
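
The pull-rebase-retry loop suggested in the first item might look something like the following Python sketch. It is a proposal under assumptions, not code from this PR. `push_with_retry` and `run_git` are hypothetical names, and the default of three attempts is arbitrary.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]  # runs a git subcommand, returns exit code


def run_git(args: Sequence[str]) -> int:
    """Default runner: execute git with the given arguments in the current repo."""
    return subprocess.run(["git", *args]).returncode


def push_with_retry(branch: str, max_attempts: int = 3,
                    run: Runner = run_git) -> int:
    """Push branch, rebasing onto the remote after each rejected push.

    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if run(["push", "origin", branch]) == 0:
            return attempt
        if attempt < max_attempts:
            # A concurrent commit landed: replay the bump commit on top of it.
            if run(["pull", "--rebase", "origin", branch]) != 0:
                raise RuntimeError("rebase failed; manual intervention needed")
    raise RuntimeError(f"push failed after {max_attempts} attempts")
```

A version bump touches a single line, so a rebase conflict is unlikely unless two bumps race each other, in which case failing loudly is probably the right outcome.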

Not applicable to ml-cpp

  • Auto-approval / auto-merge — we push directly (no PR), assuming branch protection is resolved
  • GitHub workflow triggering — we use Buildkite natively

@valeriy42

@edsavage Just to get context: is this needed for the new automated version-bump orchestration strategy?
