caching: export vlog mmap telemetry via expvar #830

Open
snissn wants to merge 6 commits into pr/vlog-mmap-sealed-byte-budget from pr/vlog-mmap-expvar-telemetry
Owner

snissn commented Mar 15, 2026

Summary

  • publish selected TreeDB runtime stats to expvar under treedb so forensics can read mmap telemetry live from /debug/vars
  • expose only targeted keys to keep payload focused: treedb.cache.vlog_mmap.* and treedb.process.memory.*
  • track the active open caching DB for export and clear it on close
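The publishing step described above can be sketched with the standard library alone. This is a minimal illustration, not the PR's code: `selectStats`, `fakeStats`, and the assumption that stats arrive as a `map[string]string` are all hypothetical stand-ins; only the two key prefixes and the `treedb` variable name come from the summary.

```go
package main

import (
	"expvar"
	"fmt"
	"strings"
)

// allowedPrefixes mirrors the two key families named in the summary.
var allowedPrefixes = []string{
	"treedb.cache.vlog_mmap.",
	"treedb.process.memory.",
}

// selectStats (hypothetical helper) keeps only entries whose key starts
// with one of the allowed prefixes.
func selectStats(all map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range all {
		for _, p := range allowedPrefixes {
			if strings.HasPrefix(k, p) {
				out[k] = v
				break
			}
		}
	}
	return out
}

// fakeStats stands in for the DB's real stats call; the keys and values
// here are illustrative only.
func fakeStats() map[string]string {
	return map[string]string{
		"treedb.cache.vlog_mmap.active_bytes":    "0",
		"treedb.process.memory.heap_inuse_bytes": "7876313088",
		"treedb.cache.hits":                      "12",
	}
}

func main() {
	// expvar.Func is re-evaluated on every read, so /debug/vars always
	// reflects the stats at scrape time.
	expvar.Publish("treedb", expvar.Func(func() any {
		return selectStats(fakeStats())
	}))
	fmt.Println(expvar.Get("treedb").String())
}
```

Note that `expvar.Publish` panics if the same name is published twice, which is one reason to publish once and swap the underlying DB rather than re-publishing on every open.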

Why

PR #829 added sealed mmap byte budgeting and deny-reason counters, but celestia forensics could not directly collect those counters during a run. This adds a low-risk telemetry bridge, without changing storage semantics, so we can correlate:

  • treedb.cache.vlog_mmap.active_bytes/dead_bytes/...
  • smaps_rollup Pss_File

Validation

  • go test ./TreeDB/caching -run 'Expvar|SelectTreeDBExpvarStats' -count=1
  • go test ./TreeDB/internal/valuelog -run 'SealedLazyMmap|MmapResidencyStatsAggregatesCounters' -count=1
  • go test ./TreeDB/db -run TestDoesNotExist -count=1
  • go vet ./TreeDB/caching ./TreeDB/internal/valuelog ./TreeDB/db
  • smoke run (fast, interrupted after startup):
    • curl http://localhost:6062/debug/vars now includes top-level treedb
    • jq '.treedb|keys[]' includes:
      • treedb.cache.vlog_mmap.active_bytes
      • treedb.cache.vlog_mmap.dead_bytes
      • treedb.cache.vlog_mmap.sealed_map_denied.count_cap
      • treedb.cache.vlog_mmap.sealed_map_denied.bytes_cap

Copilot AI review requested due to automatic review settings March 15, 2026 09:26
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6339068dba

Contributor

Copilot AI left a comment

Pull request overview

This PR adds an expvar telemetry bridge for TreeDB’s caching DB so live mmap residency/budget counters (and selected process memory stats) can be inspected via /debug/vars for runtime forensics.

Changes:

  • Publish a top-level expvar variable named treedb that exports a filtered subset of db.Stats() keys (treedb.cache.vlog_mmap.* and treedb.process.memory.*).
  • Track the “current” open caching DB for expvar export and clear it on DB.Close().
  • Add unit tests for key filtering and string→typed-value coercion.
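The string→typed-value coercion mentioned above might look roughly like the following. This is a sketch under assumptions: `coerceStat` is a hypothetical name, and the order of attempts (bool, then integer, then float, then the raw string) is one reasonable choice rather than the PR's confirmed behavior.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// coerceStat (hypothetical helper) converts a stat string into the most
// specific JSON-friendly type it can: bool, int64, float64, else string.
func coerceStat(s string) any {
	switch s {
	case "true":
		return true
	case "false":
		return false
	}
	if i, err := strconv.ParseInt(s, 10, 64); err == nil {
		return i
	}
	// ParseFloat accepts "NaN" and "Inf", which encoding/json cannot
	// encode, so those values stay as strings.
	if f, err := strconv.ParseFloat(s, 64); err == nil && !math.IsNaN(f) && !math.IsInf(f, 0) {
		return f
	}
	return s
}

func main() {
	for _, in := range []string{"42", "1.5", "true", "n/a", "NaN"} {
		fmt.Printf("%q -> %T\n", in, coerceStat(in))
	}
}
```

Typed values let scrapers compare and aggregate counters numerically instead of parsing quoted strings on their side.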

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| TreeDB/caching/expvar_stats.go | Implements expvar publishing, current-DB tracking, filtering, and type coercion. |
| TreeDB/caching/expvar_stats_test.go | Adds tests for filtering and coercion behavior. |
| TreeDB/caching/db.go | Hooks registration into Open() and unregistration into Close(). |

Owner Author

snissn commented Mar 15, 2026

Follow-up full validation on pr/vlog-mmap-expvar-telemetry (6339068d) with live /debug/vars sampling.

Runs

fast

  • run home: /home/mikers/.celestia-app-mainnet-treedb-20260314232649
  • duration_seconds: 324
  • max_rss_kb: 10412212
  • end_app_bytes: 4770955995
  • smaps at max-rss capture (pprof-heap-max-rss-10412212k-20260314233226.smaps_rollup.txt):
    • Pss_File=432843 kB
    • Anonymous=10183924 kB
  • expvar sample trace: /tmp/celestia_fast_pr830_samples_20260314232648.tsv (rows=32)
    • max treedb.cache.vlog_mmap.active_bytes=0
    • max treedb.cache.vlog_mmap.dead_bytes=0
    • max treedb.cache.vlog_mmap.sealed_map_denied=0
    • max treedb.process.memory.heap_inuse_bytes=7876313088

wal_on_fast

  • run home: /home/mikers/.celestia-app-mainnet-treedb-20260314233243
  • duration_seconds: 357
  • max_rss_kb: 12318516
  • end_app_bytes: 4788246272
  • smaps at max-rss capture (pprof-heap-max-rss-10636268k-20260314233810.smaps_rollup.txt):
    • Pss_File=288019 kB
    • Anonymous=10479764 kB
  • expvar sample trace: /tmp/celestia_wal_on_fast_pr830_samples_20260314233243.tsv (rows=35)
    • max treedb.cache.vlog_mmap.active_bytes=0
    • max treedb.cache.vlog_mmap.dead_bytes=0
    • max treedb.cache.vlog_mmap.sealed_map_denied=0
    • max treedb.process.memory.heap_inuse_bytes=11139866624

Implication

The expvar bridge works and is now directly observable during runs. In these two runs, file-backed RSS is not being driven by live valuelog mmaps (vlog_mmap counters stayed at zero); peak RSS remains dominated by anonymous memory.
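The file-backed versus anonymous comparison can be reproduced from a captured smaps_rollup file. The sketch below assumes the usual `Name: value kB` line layout; `parseSmapsRollup` is a hypothetical helper, and the sample text reuses the two figures reported for the fast run. The resulting ratio is only a rough indicator, since Pss_File is proportional while Anonymous is not.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSmapsRollup (hypothetical helper) collects "Name: value kB" lines
// into a map of kB values, skipping anything that does not match.
func parseSmapsRollup(text string) map[string]int64 {
	out := make(map[string]int64)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 || fields[2] != "kB" || !strings.HasSuffix(fields[0], ":") {
			continue
		}
		kb, err := strconv.ParseInt(fields[1], 10, 64)
		if err != nil {
			continue
		}
		out[strings.TrimSuffix(fields[0], ":")] = kb
	}
	return out
}

func main() {
	// Values taken from the fast run's max-RSS capture.
	sample := "Pss_File:          432843 kB\nAnonymous:       10183924 kB\n"
	m := parseSmapsRollup(sample)

	total := m["Pss_File"] + m["Anonymous"]
	share := float64(m["Pss_File"]) / float64(total)
	fmt.Printf("Pss_File share of (Pss_File + Anonymous): %.1f%%\n", 100*share)
}
```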

Contributor

Copilot AI left a comment

Pull request overview

Adds an expvar bridge in TreeDB/caching to publish a selected subset of caching.DB runtime stats (focused on value-log mmap and process memory) under the top-level treedb variable, enabling live scraping via /debug/vars.

Changes:

  • Introduces TreeDB/caching/expvar_stats.go to publish a filtered/coerced stats payload via expvar.
  • Registers/unregisters the “current DB” for expvar export from caching.Open / (*DB).Close.
  • Adds unit tests for the stat filtering/coercion helper.
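Tracking a single "current DB" across Open and Close can be done with an atomic pointer. The following is a minimal sketch, not the PR's implementation: the `DB` type, `register`, `unregister`, and `currentName` are hypothetical, and the compare-and-swap on close is a design assumption so that closing an older handle does not clear a newer registration.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// DB is a stand-in for the caching DB type (hypothetical).
type DB struct{ name string }

// current holds the DB whose stats would be exported, or nil if none.
var current atomic.Pointer[DB]

// register makes db the exported DB; the most recent Open wins.
func register(db *DB) { current.Store(db) }

// unregister clears the slot only if db is still the registered one.
func unregister(db *DB) { current.CompareAndSwap(db, nil) }

// currentName reports which DB is registered, or "" when the slot is empty.
func currentName() string {
	if db := current.Load(); db != nil {
		return db.name
	}
	return ""
}

func main() {
	a, b := &DB{name: "a"}, &DB{name: "b"}
	register(a)
	register(b)
	unregister(a) // no effect: b replaced a
	fmt.Println(currentName())
	unregister(b)
	fmt.Println(currentName() == "")
}
```

With a plain store of nil on close, a process that opens a second DB before closing the first would lose its export when the first one closes.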

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| TreeDB/caching/expvar_stats.go | Publishes treedb expvar var and filters/coerces selected db.Stats() keys. |
| TreeDB/caching/expvar_stats_test.go | Adds tests for expvar stat selection/coercion behavior. |
| TreeDB/caching/db.go | Hooks expvar registration into Open and cleanup into Close. |

Copilot AI review requested due to automatic review settings March 18, 2026 00:00
Contributor

Copilot AI left a comment

Pull request overview

Adds an expvar telemetry bridge for TreeDB caching so selected runtime stats (vlog mmap + process memory) are available live under /debug/vars as treedb, tracking the currently-open caching DB and clearing on close.

Changes:

  • Introduce expvar publisher that exports a filtered/coerced subset of DB.Stats() under treedb.
  • Wire DB lifecycle hooks to register/unregister the currently active DB for export.
  • Add unit tests for filtering and type coercion behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| TreeDB/caching/expvar_stats.go | Implements expvar publishing, DB registration, filtering, and value coercion. |
| TreeDB/caching/db.go | Registers the opened DB for expvar export and unregisters on close. |
| TreeDB/caching/expvar_stats_test.go | Adds tests validating filtering and coercion for selected stats. |

Copilot AI review requested due to automatic review settings March 18, 2026 20:33
Contributor

Copilot AI left a comment

Pull request overview

This PR adds an expvar bridge in the caching layer to expose a focused subset of TreeDB runtime stats (notably value-log mmap telemetry and process memory stats) under the top-level treedb key, enabling live forensics via /debug/vars.

Changes:

  • Add TreeDB/caching/expvar_stats.go to publish expvar var treedb, select only treedb.cache.vlog_mmap.* and treedb.process.memory.*, and coerce stat values into numeric/bool types where possible.
  • Register the active open caching *DB for expvar export on Open, and clear it on Close.
  • Add unit tests for stat selection/filtering and basic type coercion.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| TreeDB/caching/expvar_stats.go | Implements expvar publishing, current-DB tracking, stat key filtering, and string→typed coercion. |
| TreeDB/caching/expvar_stats_test.go | Tests filtering and coercion behavior for the expvar stat selection helper. |
| TreeDB/caching/db.go | Hooks expvar DB registration into Open() and unregister into Close(). |
