Skip to content

maintenance: merge fanout clamp + vlog remap accounting fixes#840

Open
snissn wants to merge 11 commits intomainfrom
exp/merge-clamp-origin-main
Open

maintenance: merge fanout clamp + vlog remap accounting fixes#840
snissn wants to merge 11 commits intomainfrom
exp/merge-clamp-origin-main

Conversation

@snissn
Copy link
Copy Markdown
Owner

@snissn snissn commented Mar 17, 2026

Summary

  • cap zipper.mergeInternal worker fan-out to active children with pending ops
  • fix pooled child-slice defer capture so putChildWorkSlice receives the fully populated slice
  • remove extra active-children pass by counting during op assignment
  • fix valuelog remap accounting so dead-mapping bytes/count are charged only after a successful new mmap
  • route vlog generation rewrite/GC/close summary logs through existing TREEDB_DEBUG_VLOG_MAINT debug gate

Why

  • address correctness/perf findings from Copilot/Codex review on this branch head
  • keep hot-path behavior tighter (no extra pass, no unconditional maintenance logging)
  • avoid dead-map budget drift when remap attempts fail

Validation

  • go test ./TreeDB/zipper ./TreeDB/internal/valuelog ./TreeDB/caching
  • go vet ./...

Notes

  • This PR now intentionally includes both zipper fan-out and closely related value-log/maintenance fixes that were reviewed in this branch head.

Copilot AI review requested due to automatic review settings March 17, 2026 04:11
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts Zipper.mergeInternal parallel merge scheduling to avoid spawning more worker goroutines than there are active child subtrees with pending ops, reducing oversubscription while keeping existing parallelism thresholds and scheduling behavior.

Changes:

  • Count “active” children (those with len(child.ops) > 0) during internal merge setup.
  • Cap maxParallel worker fan-out to min(GOMAXPROCS, activeChildren) in the parallel merge path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1f933080e4

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copilot AI review requested due to automatic review settings March 17, 2026 11:07
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adjusts internal concurrency and telemetry around two areas: (1) limiting Zipper.mergeInternal’s parallel worker fan-out based on how many children actually have pending ops, and (2) extending value-log mmap “dead mapping” budgeting to include a byte-based cap, with corresponding tests and Stats keys.

Changes:

  • Clamp mergeInternal’s worker count to min(GOMAXPROCS, activeChildrenWithOps) in the parallel path.
  • Add MaxDeadMappedBytes (env-configurable) as an additional dead-mmap budget, track dead-mapped bytes, and expose the new metrics via Stats.
  • Update/extend valuelog unit tests to cover the new byte-budget behavior and new aggregation API.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
TreeDB/zipper/zipper.go Caps parallel worker fan-out based on number of active children with ops.
TreeDB/internal/valuelog/reader_mmap.go Adds byte-budget cap for dead mmaps + env var + cap check updates.
TreeDB/internal/valuelog/manager.go Tracks dead-mapped bytes per file; adds RemapStatsDetailed.
TreeDB/internal/valuelog/reader_mmap_cap_test.go Updates exhaustion checks for new signature; adds byte-budget test.
TreeDB/internal/valuelog/manager_test.go Adds aggregation test for RemapStatsDetailed (and consistency with RemapStats).
TreeDB/db/api.go Exposes dead-mapped bytes + byte-cap in DB Stats output.
TreeDB/caching/db.go Exposes dead-mapped bytes + byte-cap in caching DB Stats output.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot AI review requested due to automatic review settings March 17, 2026 13:23
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces unnecessary parallel worker fan-out in zipper.mergeInternal and adds additional observability/tuning around value-log mmap remapping and generational value-log maintenance.

Changes:

  • Clamp mergeInternal’s parallel worker count to the number of children that actually have pending ops.
  • Add a byte-budget cap for retained “dead” mmaps (MaxDeadMappedBytes) with env configuration, and track dead-mapped bytes per file.
  • Expand stats/telemetry and tests for remap byte accounting and vlog generation rewrite/GC execution.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
TreeDB/zipper/zipper.go Caps parallel worker fan-out by counting active children with pending ops.
TreeDB/internal/valuelog/reader_mmap.go Adds MaxDeadMappedBytes budget + env var; updates cap logic and dead-mapping byte accounting.
TreeDB/internal/valuelog/manager.go Tracks deadMappingsBytes, resets it on close, and exposes detailed remap stats.
TreeDB/internal/valuelog/reader_mmap_cap_test.go Updates cap tests for new deadMappingsCapExhausted signature and adds byte-budget coverage.
TreeDB/internal/valuelog/manager_test.go Adds test coverage for RemapStatsDetailed aggregation.
TreeDB/db/api.go Exposes new mmap dead-mapping byte/cap stats in DB.Stats().
TreeDB/caching/db.go Adds many generational rewrite/GC counters; emits additional stats keys; adds new log.Printf events.
TreeDB/caching/vlog_generation_scheduler_test.go Asserts presence/values of newly added generational rewrite/GC stats keys.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copilot AI review requested due to automatic review settings March 17, 2026 14:01
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR reduces oversubscription in zipper internal-node merges by bounding worker fan-out to the number of children that actually have pending ops, and also extends value-log mmap remap limiting/telemetry plus cached-mode vlog-generation maintenance instrumentation.

Changes:

  • Cap mergeInternal parallelism (maxParallel) by the number of active children with non-empty op slices.
  • Add a byte-based cap for retained dead mmaps (MaxDeadMappedBytes), track dead-mapping bytes, and expose detailed remap stats.
  • Add cached-mode vlog-generation rewrite/GC counters, adjust post-rewrite GC protection set, and expand related tests/stats output.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
TreeDB/zipper/zipper.go Clamp mergeInternal parallel worker count to active children to avoid unnecessary goroutine fan-out.
TreeDB/internal/valuelog/reader_mmap.go Introduce dead-mmap byte budget, env override, and cap checks using both count+bytes.
TreeDB/internal/valuelog/manager.go Track dead-mapping bytes, reset on close, and add RemapStatsDetailed().
TreeDB/db/api.go Export new mmap dead-mapping byte/cap stats via DB.Stats().
TreeDB/internal/valuelog/reader_mmap_cap_test.go Update cap-exhaustion tests for new signature and add byte-budget coverage.
TreeDB/internal/valuelog/manager_test.go Add test validating RemapStatsDetailed() aggregation.
TreeDB/caching/db.go Add vlog-generation rewrite/GC counters + stats keys, change post-rewrite GC protection to in-use paths, and add logging/close summary.
TreeDB/caching/vlog_generation_scheduler_test.go Extend backend recorder to capture GC calls and assert new stats/GC behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 17, 2026

Added commit 4dbddcb on this branch: one-shot post-rewrite GC follow-up scheduling + tests.

What changed:

  • Arm follow-up when post-rewrite GC deletes nothing.
  • Allow one follow-up GC opportunity to bypass queue/min-interval gating once armed delay elapses.
  • Keep checkpoint-kick path enabled when follow-up is pending.
  • Expose follow-up state in stats.

Tests added:

  • TestVlogGenerationGC_PostRewriteFollowupBypassesMinInterval
  • TestCheckpoint_KicksVlogGenerationGCFollowupWhenRewriteIneligible

Validation run notes are in issue #762 comments:

Caveat: this is a scheduling/correctness step; runtime debt and RSS impact are still noisy and not yet a conclusive win.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4dbddcb9e8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Copilot AI review requested due to automatic review settings March 17, 2026 15:20
@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 17, 2026

Added follow-on commit 979eb9e:

  • exposes follow-up GC telemetry counters (, )
  • extends scheduler tests to assert follow-up run accounting

This is to reduce ambiguity in run forensics and separate 'path never armed' vs 'armed but not executed'.

@snissn
Copy link
Copy Markdown
Owner Author

snissn commented Mar 17, 2026

Correction/clarification: follow-on commit 979eb9e adds follow-up GC telemetry counters (treedb.cache.vlog_generation.post_rewrite_gc.followup_armed_total, treedb.cache.vlog_generation.post_rewrite_gc.followup_runs) and test assertions for follow-up run accounting.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to reduce unnecessary parallelism in zipper.mergeInternal by clamping worker fan-out to the number of children that actually have pending ops, and it also expands value-log mmap budgeting/telemetry plus caching DB vlog-generation maintenance observability (stats + post-rewrite GC behavior + logging) with accompanying tests.

Changes:

  • Clamp mergeInternal parallel worker count to the number of “active” children (children with non-empty op slices).
  • Add a byte-budget cap for retained dead mmaps (MaxDeadMappedBytes), track dead-mmap bytes, and expose detailed remap stats.
  • Extend caching DB vlog-generation maintenance with more counters/stats, post-rewrite GC protected-path tightening + follow-up GC, and additional tests.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
TreeDB/zipper/zipper.go Caps parallel worker fan-out based on children that have pending ops.
TreeDB/internal/valuelog/reader_mmap.go Adds dead-mmap byte budget + env knob and enforces it in remap eligibility.
TreeDB/internal/valuelog/manager.go Tracks dead-mmap bytes and exposes RemapStatsDetailed().
TreeDB/db/api.go Exposes dead-mmap byte stats and cap in DB stats output.
TreeDB/internal/valuelog/reader_mmap_cap_test.go Updates tests for new dead-mmap budget logic and adds byte-budget test.
TreeDB/internal/valuelog/manager_test.go Adds coverage for RemapStatsDetailed() aggregation.
TreeDB/caching/db.go Adds vlog-generation maintenance counters, post-rewrite GC changes, follow-up GC, and new logging/summary-on-close.
TreeDB/caching/vlog_generation_scheduler_test.go Extends test backend to record GC calls and adds tests for post-rewrite GC behavior and follow-up GC.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

@snissn snissn changed the title zipper: cap mergeInternal worker fanout by active children maintenance: merge fanout clamp + vlog remap accounting fixes Mar 17, 2026
Copilot AI review requested due to automatic review settings March 17, 2026 17:25
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens TreeDB’s internal maintenance behavior across the zipper merge path and value-log mmap/remap + generational rewrite/GC maintenance, aiming to improve correctness and reduce unnecessary work/logging in hot paths.

Changes:

  • Clamp zipper.mergeInternal parallel worker fan-out to the number of children that actually have pending ops, and fix pooled child-slice defer capture so pooled entries are properly cleared before reuse.
  • Fix value-log remap accounting so dead-mapping count/bytes are only charged after a successful new mmap; add dead-mapping byte budget support + stats plumbing.
  • Expand vlog-generation maintenance instrumentation (rewrite plan/exec counters, post-rewrite GC behavior and follow-up), and route related logs through the existing debug-gated maintenance logger.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
TreeDB/zipper/zipper.go Fix pooled child slice defer capture; count active children during op assignment and clamp parallelism accordingly.
TreeDB/internal/valuelog/reader_mmap.go Add dead-mapping byte budget and env config; update exhaustion logic; fix remap accounting to charge dead mappings only on successful remap.
TreeDB/internal/valuelog/reader_mmap_cap_test.go Update cap-exhaustion tests for new byte-budget parameter; add byte-budget coverage; ensure globals restored.
TreeDB/internal/valuelog/manager.go Track dead-mapping bytes per file; reset on Close; add RemapStatsDetailed() and update remap-cap checks to include bytes.
TreeDB/internal/valuelog/manager_test.go Add coverage for RemapStatsDetailed() aggregation; ensure test files are closed via File.Close().
TreeDB/db/api.go Export additional value-log mmap stats (dead-mapping bytes + cap_bytes) via backend DB stats.
TreeDB/caching/db.go Add vlog-generation rewrite/GC counters, post-rewrite GC follow-up behavior, debug-gated summary on Close, and additional stats keys.
TreeDB/caching/vlog_generation_scheduler_test.go Extend scheduler tests to validate new stats keys and post-rewrite GC protection behavior + follow-up GC behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens merge-time parallelism in zipper.mergeInternal and improves value-log mmap remap budgeting/observability, aiming to reduce hot-path overhead and avoid remap-budget drift while keeping maintenance logs opt-in.

Changes:

  • Clamp mergeInternal worker fan-out to the number of children that actually have pending ops, and fix pooled child-slice defer capture.
  • Add dead-mapping byte-budgeting for value-log mmaps (env-configurable) and correct remap accounting so dead bytes/count are only charged after a successful new mmap.
  • Expand vlog-generation rewrite/GC instrumentation and route maintenance logs through the existing TREEDB_DEBUG_VLOG_MAINT debug gate.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
TreeDB/zipper/zipper.go Fix pooled-slice defer capture and cap parallel worker fan-out to active children.
TreeDB/internal/valuelog/reader_mmap.go Add MaxDeadMappedBytes + env parsing; include byte budget in dead-mapping exhaustion checks; only charge dead mappings after successful remap.
TreeDB/internal/valuelog/reader_mmap_cap_test.go Update helpers/tests for new byte-budget parameterization and add byte-budget coverage.
TreeDB/internal/valuelog/manager.go Track dead-mapping bytes per file; expose RemapStatsDetailed() and reset byte counters on close.
TreeDB/internal/valuelog/manager_test.go Add coverage for RemapStatsDetailed() aggregation and switch tests to close via File.Close().
TreeDB/db/api.go Export new value-log mmap dead-bytes and cap-bytes stats in backend DB stats.
TreeDB/caching/db.go Add vlog-generation rewrite/GC counters, post-rewrite GC in-use protection + follow-up GC arming, and debug-gated close summary logging; export new stats keys.
TreeDB/caching/vlog_generation_scheduler_test.go Extend test backend to record GC calls and add tests for post-rewrite GC protection and follow-up behavior/stats.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants