Skip to content

[backport 8.19] add tbs storage limit metrics#20682

Open
ericywl wants to merge 5 commits into8.19from
backport-tbs-monitoring
Open

[backport 8.19] add tbs storage limit metrics#20682
ericywl wants to merge 5 commits into8.19from
backport-tbs-monitoring

Conversation

@ericywl
Copy link
Contributor

@ericywl ericywl commented Mar 12, 2026

Closes #20531.

This is a manual backport of #20464. Due to differences in TBS implementation between 8.19 and 9.x, the disk-related metrics are not backported.

Related PRs:

@ericywl ericywl self-assigned this Mar 12, 2026
@ericywl ericywl requested a review from a team as a code owner March 12, 2026 12:12
@github-actions
Copy link
Contributor

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

@ericywl ericywl requested a review from carsonip March 12, 2026 12:13
@carsonip carsonip requested review from Copilot and removed request for carsonip March 12, 2026 12:13
@ericywl ericywl requested review from carsonip and removed request for Copilot March 12, 2026 12:13
@ericywl ericywl changed the title [backport] add tbs storage limit metrics [backport 8.19] add tbs storage limit metrics Mar 12, 2026
Copy link
Member

@carsonip carsonip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, 1 nit

Comment on lines +121 to +122
storageLimit := getGauge(t, reader, "apm-server.sampling.tail.storage.storage_limit")
assert.Zero(t, storageLimit)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the test might give better confidence if storage limit is non-zero. Also non-zero tests the *0.9 quirk

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added 98ef2ad.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Backports tail-based sampling (TBS) storage-limit observability to the 8.19 APM Server by adding a new storage_limit metric and consolidating TBS storage metric registration within the Badger storage manager.

Changes:

  • Add an opt-in OpenTelemetry MeterProvider to eventstorage.StorageManager and register TBS storage metrics there (including new apm-server.sampling.tail.storage.storage_limit).
  • Move metric callback registration/unregistration responsibility from main.go into StorageManager lifecycle (NewStorageManager/Close), and record the effective storage limit in Run.
  • Update monitoring tests to assert the new metric and loosen metric assertions to “contains” to avoid brittleness.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
x-pack/apm-server/sampling/eventstorage/storage_manager.go Adds MeterProvider-based metric instruments/callback registration, records effective storage limit, and unregisters callback on close.
x-pack/apm-server/main.go Passes MeterProvider into NewStorageManager via an option and removes the previous global metric callback registration/unregistration.
x-pack/apm-server/main_test.go Updates monitoring assertions to “contain” and adds coverage for the new storage_limit metric.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +38 to +44
type StorageManagerOptions func(*StorageManager)

func WithMeterProvider(mp metric.MeterProvider) StorageManagerOptions {
return func(sm *StorageManager) {
sm.meterProvider = mp
}
}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StorageManagerOptions is a functional-option type but is named in plural form. The prevailing pattern in this repo uses singular ...Option for these (e.g. LoadConfigOption in internal/beatcmd/config.go and ConfigOption in internal/telemetry/metric_exporter_config.go). Consider renaming this to StorageManagerOption and updating NewStorageManager to accept ...StorageManagerOption for consistency.

Copilot uses AI. Check for mistakes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's keep it as StorageManagerOptions to be consistent with main. It was a typo in retrospect. Alternatively, fix it in both 9.x and 8.19

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @ericywl

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants