Skip to content

caching: staged debt-aware runtime vlog rewrite maintenance#842

Merged
snissn merged 23 commits intomainfrom
pr/vlog-maint-v2-runtime-debt
Mar 24, 2026
Merged

caching: staged debt-aware runtime vlog rewrite maintenance#842
snissn merged 23 commits intomainfrom
pr/vlog-maint-v2-runtime-debt

Conversation

@snissn
Copy link
Copy Markdown
Owner

@snissn snissn commented Mar 19, 2026

Summary

This PR takes a staged Maintenance V2 step for runtime value-log debt:

  1. Drain persisted rewrite debt more effectively in checkpoint-kick passes.
  2. Keep generic rewrite passes efficacy-aware so we avoid low-signal rewrites.

Changes

  • Added bounded multi-segment debt drain path for checkpoint-kick rewrite resume queues.
  • Added persistent queue/ledger debt-drain scheduler coverage.
  • Added generic rewrite efficacy floor for planner/executor:
    • MinSegmentStaleRatio for generic passes (with size-aware fallback).
    • MinSegmentStaleBytes floor.
  • Ensured fresh same-pass plan materialization does not over-drain in one shot; debt-drain is applied to explicit resume debt.

Tests

  • go test ./TreeDB/caching -count=1
  • go test ./TreeDB/db ./TreeDB/internal/valuelog ./TreeDB/tree -count=1
  • go vet ./...

run_celestia evidence (branch head)

Code: ab72ba68

fast

  • Home: /home/mikers/.celestia-app-mainnet-treedb-20260318154110
  • duration_seconds=268
  • max_rss_kb=7764772
  • end_app_bytes=4749147116
  • du -sb application.db=4749147116
  • post vlog-gc -rw: 4749150764
  • post vlog-rewrite -rw: 2494985619

wal_on_fast

  • Home: /home/mikers/.celestia-app-mainnet-treedb-20260318154549
  • duration_seconds=316
  • max_rss_kb=8770156
  • end_app_bytes=5021754982
  • du -sb application.db=5021754982
  • post vlog-gc -rw: 5021754574
  • post vlog-rewrite -rw: 2698181953

Notes on rejected follow-up (not in this PR)

A subsequent local experiment that used non-foreground-cancel contexts for full debt-drain planning+execution caused major wall/end-size regressions in fast (duration_seconds up to 440-781s, end_app_bytes ~6.0-6.2GB) and was reverted locally.

Copilot AI review requested due to automatic review settings March 19, 2026 02:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR advances “Maintenance V2” for caching DB runtime value-log (vlog) rewrite maintenance by (a) draining persisted rewrite debt more effectively during checkpoint-kick passes and (b) preventing low-signal generic rewrites via an efficacy floor.

Changes:

  • Add generic rewrite efficacy floors via MinSegmentStaleRatio (with size-aware fallback) and MinSegmentStaleBytes for planning/execution.
  • Add bounded multi-segment debt-drain behavior for checkpoint-kick rewrite resume queues (capped by vlogGenerationRewriteDebtDrainMaxSegments and budget).
  • Extend scheduler tests to cover multi-segment debt drain and new planning option propagation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
TreeDB/caching/db.go Adds rewrite efficacy floors and a debt-drain multi-segment scheduler path, including checkpoint-kick wiring.
TreeDB/caching/vlog_generation_scheduler_test.go Adds assertions for new plan options and a new test covering multi-segment debt drain behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +11346 to +11351
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
Comment on lines +811 to +813
{FileID: 11, BytesLive: 64, BytesTotal: 128, BytesStale: 64, StaleRatio: 0.2},
{FileID: 22, BytesLive: 64, BytesTotal: 128, BytesStale: 64, StaleRatio: 0.7},
{FileID: 33, BytesLive: 64, BytesTotal: 128, BytesStale: 64, StaleRatio: 0.9},
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ab72ba687e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11687 to +11691
minStaleRatio := db.vlogGenerationRewriteMinStaleRatioForGenericPass(totalBytes)
plan, err := planner.ValueLogRewritePlan(ctx, backenddb.ValueLogRewriteOnlineOptions{
MaxSourceSegments: 0,
MaxSourceBytes: maxSourceBytes,
MaxSourceSegments: 0,
MaxSourceBytes: maxSourceBytes,
MinSegmentStaleRatio: minStaleRatio,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid gating total-bytes/churn rewrites on per-segment stale ratio

This new MinSegmentStaleRatio is applied to the generic total-bytes/churn planner, not just the stale-ratio trigger. Once totalBytes >= 512 MiB, the helper returns valueLogRewriteTriggerRatioPPM (20% by default), and ValueLogRewritePlan then drops every segment whose individual stale ratio is below that floor (TreeDB/db/vlog_rewrite.go:847-848). On workloads where stale bytes are spread across many segments (for example, lots of updates leaving each segment ~5-15% stale), shouldRunVlogGenerationRewrite still fires for total-bytes/churn, but the plan comes back empty and lines 11728-11729 disable rewrite entirely, so the total-bytes trigger no longer reclaims space and the vlog can keep growing.

Useful? React with 👍 / 👎.

Comment on lines 11811 to +11815
ledger, _ := db.currentVlogGenerationRewriteLedger()
if len(ledger) > 0 {
processedRewriteIDs = vlogGenerationRewriteLedgerChunk(ledger, vlogGenerationRewriteResumeMaxSegments, budgetTokens)
processedRewriteIDs = vlogGenerationRewriteLedgerChunk(ledger, rewriteMaxSegments, budgetTokens)
} else {
processedRewriteIDs = vlogGenerationRewriteQueueChunk(rewriteQueue, vlogGenerationRewriteResumeMaxSegments)
processedRewriteIDs = vlogGenerationRewriteQueueChunk(rewriteQueue, rewriteMaxSegments)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep queue-only rewrite debt budgeted during checkpoint drain

Checkpoint-kick debt drain can now raise rewriteMaxSegments above 1, but this fallback still slices queue-only debt by count only. We still persist queue-only state at lines 11822-11823 when a planner returns SourceFileIDs without SelectedSegments, and vlogGenerationRewriteQueueChunk (TreeDB/caching/vlog_generation_state.go:257-264) ignores budgetTokens entirely. In that supported resume path, a single checkpoint kick can rewrite up to 8 full segments even when the token bucket only has budget for one segment, which defeats the runtime budget cap and can reintroduce the large maintenance bursts this change is trying to avoid.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d84a1656e1

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11884 to +11887
if stats.BytesBefore > 0 && stats.BytesAfter >= stats.BytesBefore {
db.vlogGenerationRewriteIneffectiveRuns.Add(1)
db.vlogGenerationRewriteIneffectiveBytesIn.Add(uint64(stats.BytesBefore))
db.vlogGenerationRewriteIneffectiveBytesOut.Add(uint64(stats.BytesAfter))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Defer ineffective-run accounting until after zombie GC

In normal cached maintenance runs that actually rewrite data, this branch will still classify the pass as “ineffective”. ValueLogRewriteOnline only marks old source segments zombie (TreeDB/internal/valuelog/manager.go:1191) and then sets BytesAfter from valueLogSegmentStats, which counts every on-disk value-log file before the follow-up GC deletes those zombies (TreeDB/db/vlog_rewrite.go:1200-1205, 2301-2314). As a result, any successful rewrite that creates replacement segments will usually satisfy BytesAfter >= BytesBefore, so these new counters will report ineffective runs and whole-DB byte totals even when the subsequent ValueLogGC in this function reclaims the space immediately.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 19, 2026 07:08
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR advances the caching DB’s “Maintenance V2” runtime value-log rewrite strategy by improving how persisted rewrite debt is drained (especially on checkpoint-kick passes) and by adding efficacy floors to avoid low-signal rewrites that can amplify bytes and hurt wall time.

Changes:

  • Add debt-drain support for checkpoint-kick rewrite resumes, including bounded multi-segment selection.
  • Add “generic pass” rewrite efficacy floors (MinSegmentStaleRatio + MinSegmentStaleBytes) and track ineffective rewrites via new Stats counters.
  • Expand scheduler test coverage for debt-drain behavior, efficacy floors, and bypass-quiet planner behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
TreeDB/caching/db.go Adds debt-drain scheduling, new planner/exec knobs, new ineffective-run metrics, and introduces a bypass-quiet maintenance context.
TreeDB/caching/vlog_generation_scheduler_test.go Adds tests validating multi-segment debt drain, ineffective-run telemetry, and quiet-window bypass behavior.
Comments suppressed due to low confidence (1)

TreeDB/caching/db.go:11631

  • In the stale-ratio trigger planning pass, MinSegmentStaleBytes is still hard-coded as 1 even though vlogGenerationRewriteMinSegmentStaleBytes was introduced alongside the new generic-pass knobs. Using the constant here would keep the planner behavior consistent and avoid future drift if the floor changes.
		plan, err := planner.ValueLogRewritePlan(ctx, backenddb.ValueLogRewriteOnlineOptions{
			MaxSourceSegments:    0,
			MaxSourceBytes:       maxSourceBytes,
			MinSegmentStaleRatio: minStaleRatio,
			MinSegmentStaleBytes: 1,
		})

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +7151 to +7154
if timeout > 0 {
return context.WithTimeout(context.Background(), timeout)
}
return context.WithCancel(context.Background())
Comment on lines +11362 to +11367
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
if err != nil {
t.Fatalf("open backend: %v", err)
}
t.Cleanup(func() { _ = backend.Close() })
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 801a18f660

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +7146 to +7150
func (db *DB) vlogGenerationMaintenanceContext(timeout time.Duration, opts vlogGenerationMaintenanceOptions) (context.Context, context.CancelFunc) {
// Checkpoint-kick maintenance is an explicit caller opt-in to run outside the
// quiet-window gate. Keep this context timeout-bounded, but do not
// self-cancel on immediate foreground activity resumes.
if opts.bypassQuiet {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve close-triggered cancellation in bypassQuiet planners

When bypassQuiet is set this helper switches to context.WithTimeout(context.Background(), ...), so the planner no longer observes db.closeCh. The checkpoint-kick path runs maybeRunVlogGenerationMaintenanceWithOptions(... bypassQuiet: true ...) in a goroutine that is counted in db.wg, and Close() waits on that waitgroup before teardown (TreeDB/caching/db.go:12097-12120, 12916-12919). On a large tree, closing during a checkpoint-kick rewrite plan will now block until the full 30s planner timeout instead of canceling promptly.

Useful? React with 👍 / 👎.

Comment on lines +11380 to +11384
perSegmentBudget := db.valueLogGenerationWarmTarget
if perSegmentBudget <= 0 {
perSegmentBudget = defaultVlogGenerationWarmTargetBytes
}
if perSegmentBudget <= 0 {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Budget queue-only debt drains by actual segment size

Fresh evidence relative to the earlier queue-only budget comment: this helper now budgets legacy/persisted queue-only debt using valueLogGenerationWarmTarget, not the queued segments' real live bytes. That still underestimates cold or oversized source segments: with the default 256 MiB warm / 512 MiB cold targets, a 512 MiB token bucket authorizes two queued cold segments because byBudget = budgetTokens / warmTarget, and the queue fallback at TreeDB/caching/db.go:11837-11842 then slices only by count. Because tokens are only debited after the rewrite finishes (TreeDB/caching/db.go:11959-11976), one checkpoint-kick pass can still overspend the runtime budget by 2x or more on supported queue-only resume state loaded from vlog_generation_state.json.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8463c62153

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11611 to +11613
if !quiet && !opts.bypassQuiet {
shouldRewrite = false
reason = vlogGenerationReasonNone
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve rewrite progress on hot periodic passes without checkpoints

In TreeDB/caching/db.go I checked that checkpoint-kicks only happen from Checkpoint() (maybeKickVlogGenerationMaintenanceAfterCheckpoint) and that StartAutoCheckpoint() is opt-in, so many deployments can run long stretches with no checkpoints at all. This new guard clears shouldRewrite on every hot maybeRunVlogGenerationMaintenance(true) pass, which means sustained-write WAL-on workloads now lose their only rewrite path under load: total-bytes/churn triggers and any persisted rewrite-resume queue will stop making progress until the foreground finally goes quiet. That can let rewrite debt and retained bytes grow indefinitely in otherwise supported configurations.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 19, 2026 07:48
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR advances the caching DB’s generational value-log maintenance (“Maintenance V2”) to better drain persisted rewrite debt during checkpoint-kick passes and to avoid low-signal/low-efficacy rewrites during generic maintenance runs.

Changes:

  • Add rewrite-plan outcome tracking + cancel backoff to reduce immediate re-planning after cancellations.
  • Add bounded multi-segment “debt drain” execution for persisted rewrite resume queues (checkpoint-kick).
  • Add generic rewrite efficacy floors via MinSegmentStaleRatio / MinSegmentStaleBytes, plus tests covering the new scheduling behavior and stats.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

File Description
TreeDB/caching/db.go Adds planning outcome stats/backoff, debt-drain segment selection, and generic rewrite efficacy floors; threads new options through checkpoint-kick maintenance.
TreeDB/caching/vlog_generation_scheduler_test.go Extends scheduler tests to cover multi-segment debt drain, plan outcome stats, cancel backoff, and ineffective rewrite tracking.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +7153 to +7164
func (db *DB) vlogGenerationMaintenanceContext(timeout time.Duration, opts vlogGenerationMaintenanceOptions) (context.Context, context.CancelFunc) {
// Checkpoint-kick maintenance is an explicit caller opt-in to run outside the
// quiet-window gate. Keep this context timeout-bounded, but do not
// self-cancel on immediate foreground activity resumes.
if opts.bypassQuiet {
if timeout > 0 {
return context.WithTimeout(context.Background(), timeout)
}
return context.WithCancel(context.Background())
}
return db.foregroundMaintenanceContext(timeout)
}
Comment on lines +11369 to +11380
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
if maxSegments > vlogGenerationRewriteDebtDrainMaxSegments {
maxSegments = vlogGenerationRewriteDebtDrainMaxSegments
}
if maxSegments < 1 {
maxSegments = 1
}
Comment on lines +114 to +119
func (b *blockingRewritePlannerBackend) recordedRewriteCalls() int {
b.mu.Lock()
defer b.mu.Unlock()
return b.rewriteCalls
}

Comment on lines +837 to +839
{FileID: 11, BytesLive: 64, BytesTotal: 128, BytesStale: 64, StaleRatio: 0.2},
{FileID: 22, BytesLive: 64, BytesTotal: 128, BytesStale: 64, StaleRatio: 0.7},
{FileID: 33, BytesLive: 64, BytesTotal: 128, BytesStale: 64, StaleRatio: 0.9},
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f398e0967d

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11622 to +11625
planBackoff := len(rewriteQueue) == 0 && db.vlogGenerationRewritePlanBackoffActive(now)
if planBackoff {
shouldRewrite = false
reason = vlogGenerationReasonNone
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Exempt checkpoint-kick rewrites from plan-cancel backoff

On write-heavy workloads where a periodic planner run is canceled by renewed foreground activity, observeVlogGenerationRewritePlanOutcome records the cancel time and this branch suppresses all new rewrite planning for 5 seconds even when maybeKickVlogGenerationMaintenanceAfterCheckpoint later calls maintenance with bypassQuiet: true. Because checkpoint kicks are also rate-limited, one canceled background plan can consume the next checkpoint-triggered maintenance window and leave rewrite debt untouched until a later checkpoint, which undermines the new debt-drain path under sustained ingest.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a29ba223e3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11644 to +11648
rewriteCancelBackoff := len(rewriteQueue) > 0 && db.vlogGenerationRewriteCancelBackoffActive(now)
if rewriteCancelBackoff {
shouldRewrite = false
reason = vlogGenerationReasonNone
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip rewrite-cancel backoff for checkpoint debt drains

On write-heavy WAL-on workloads, ValueLogRewriteOnline can still be canceled by resumed foreground activity because the executor later in maybeRunVlogGenerationMaintenanceWithOptions still uses foregroundMaintenanceContext. Once that happens, this new branch suppresses every queued rewrite for 20 seconds, including the checkpoint-kick maybeKickVlogGenerationMaintenanceAfterCheckpoint(... rewriteDebtDrain: true ...) path. A single foreground-canceled chunk can therefore burn several checkpoint windows and leave persisted rewrite debt idle, which undermines the new debt-drain behavior this change is adding.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 19, 2026 09:18
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR advances the caching DB’s “Maintenance V2” generational value-log rewrite scheduler to better drain persisted rewrite debt (especially on checkpoint-kick passes) while avoiding low-efficacy rewrites during generic maintenance passes.

Changes:

  • Added pruning of persisted rewrite ledger/queue entries with non-positive live bytes, and ensured resume selection skips zero-live segments.
  • Added bounded multi-segment “debt drain” execution mode for checkpoint-kick rewrite resumes.
  • Added additional scheduler observability (plan outcomes, cancel backoffs, ineffective rewrites, queue prune stats) and expanded test coverage around these behaviors.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
TreeDB/caching/vlog_generation_state.go Adds persisted-ledger pruning for non-positive live bytes and filters zero-live segments from ledger chunk selection.
TreeDB/caching/vlog_generation_scheduler_test.go Adds tests covering pruning, debt-drain multi-segment resumes, cancel backoff behavior, and new stats tracking.
TreeDB/caching/db.go Implements debt-drain segment selection limits, efficacy floors for generic plans/exec, prune integration, cancel backoff tracking, and new Stats() counters.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment on lines +11374 to +11379
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
Comment on lines 307 to +312
if seg.FileID == 0 {
continue
}
if seg.BytesLive <= 0 {
continue
}
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: eaae4f61e9

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

if db == nil || db.valueLogGenerationPolicy != uint8(backenddb.ValueLogGenerationHotWarmCold) {
return
}
quiet := db.foregroundActivityQuietFor(time.Now(), vlogGenerationMaintenanceQuietWindow, vlogForegroundReadQuietWindow)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Recompute quiet-window state after retained-prune waits

quiet is sampled here before waitForRetainedValueLogPrune()/checkpoint serialization, but the later guards at TreeDB/caching/db.go:11616-11617 and 11682-11694 reuse that stale value. On a periodic runGC pass that entered while quiet and then blocks behind an active retained prune, foreground reads/writes that resume during the wait are invisible, so the pass can still reach rewrite planning/execution. Because the rewrite contexts are created only after the wait, they also will not cancel for activity that already happened, which reintroduces hot-path rewrites on the very passes this change is trying to keep quiet-window-bound.

Useful? React with 👍 / 👎.

Comment on lines +12395 to +12398
func (db *DB) vlogGenerationRewriteMinStaleRatioForGenericPass(totalBytes int64) float64 {
if totalBytes < vlogGenerationRewriteEfficacyMinTotalBytes {
return 0
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep a nonzero efficacy floor for small churn-triggered rewrites

This helper returns 0 until the value log reaches 512 MiB, and maybeRunVlogGenerationMaintenance then feeds that into the generic rewrite planner/executor. On deployments that use ValueLogRewriteTriggerChurnPerSec with a sub-512 MiB DB, a churn-triggered pass therefore runs with no stale-ratio floor and only MinSegmentStaleBytes=1, so a segment with a single overwritten value can be selected for rewrite. That defeats the new “generic passes should be efficacy-aware” goal and can still amplify bytes/wall time on small but write-heavy datasets.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

snissn commented Mar 19, 2026

Progress update on pr/vlog-maint-v2-runtime-debt (new head 37f1422c):

What changed

  • Added a maintenance pass gate to serialize maybeRunVlogGenerationMaintenanceWithOptions(...) calls (vlogGenerationMaintenanceActive CAS guard).
  • Added harmful rewrite debt handling:
    • mark materially growth-ineffective rewrites (growth >= 4MiB) and clear remaining queued debt after consuming processed chunk.
    • also clear remaining queued debt on no-progress resume rewrites (BytesAfter==BytesBefore && RecordsCopied==0).
  • Added scheduler tests:
    • TestVlogGenerationMaintenance_SerializesConcurrentRuns
    • TestVlogGenerationRewrite_DropsRemainingDebtAfterMaterialIneffectiveGrowth
    • TestVlogGenerationRewrite_KeepsRemainingDebtForSmallIneffectiveGrowth
    • TestVlogGenerationRewrite_DropsRemainingDebtAfterNoProgressResume

Validation

  • go test ./TreeDB/caching -count=1
  • go test ./TreeDB/db ./TreeDB/internal/valuelog ./TreeDB/tree -count=1
  • go vet ./...
    (all passing)

run_celestia (debug maint)

  1. wal_on_fast (home /home/mikers/.celestia-app-mainnet-treedb-20260319013246)
  • duration_seconds=801
  • max_rss_kb=10197360
  • end_app_bytes=7132616105
  • debug logs show rewrite_ineffective_drop_queue firing with both drop_reason=material_growth and drop_reason=no_progress.
  • also revealed overlapping rewrite starts (reason=total_bytes + reason=rewrite_resume) before first completion, which motivated the new maintenance-active gate.
  1. wal_on_fast after gate (home /home/mikers/.celestia-app-mainnet-treedb-20260319015301)
  • duration_seconds=280
  • max_rss_kb=9268340
  • end_app_bytes=4731374108
  • this shorter snapshot window had no rewrite-maintenance events (rewrite_start count 0), so it confirms no immediate regression but is not a stress proof for queue behavior by itself.

Next step

  • Run another longer stress pass where rewrite actually triggers (or fixed-height comparable replay) to quantify whether the overlap gate + debt-drop reduces rewrite_resume churn and runtime debt under heavy maintenance pressure.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 37f1422ce3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +12064 to +12067
if len(processedRewriteIDs) > 0 && stats.BytesAfter >= stats.BytesBefore {
growth := int64(stats.BytesAfter) - int64(stats.BytesBefore)
if growth >= vlogGenerationRewriteIneffectiveGrowthMinBytes {
dropRewriteDebt = true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid dropping queued rewrite debt on pre-GC byte growth

ValueLogRewriteStats.BytesAfter is computed before the follow-up ValueLogGC removes the zombie source segments (TreeDB/db/vlog_rewrite.go:1189-1205), so a normal successful queued rewrite often reports BytesAfter > BytesBefore by roughly the copied live bytes. With the new material_growth check here, any resumed chunk larger than 4 MiB will clear the rest of vlog_generation_state.json at lines 12082-12089 even though the later GC in this same function reclaims the old segments. On multi-segment debt-drain/rewrite-resume runs, that means one successful chunk can abandon all remaining planned segments until a full replan happens.

Useful? React with 👍 / 👎.

Comment on lines +12069 to +12073
} else if growth == 0 && stats.RecordsCopied == 0 {
// A queued rewrite that copies nothing and does not reduce bytes is
// effectively a no-op resume loop; drop remaining debt so the
// scheduler can re-plan from current state instead of retrying.
dropRewriteDebt = true
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve later queued segments when a resumed source ID vanished

In the supported queue-only resume path (no SelectedSegments ledger), vlogGenerationRewriteQueueChunk can hand the rewriter a source ID that has already disappeared. selectRewriteSourceSegments silently ignores missing IDs, and ValueLogRewriteOnline returns a no-op when none remain (TreeDB/db/vlog_rewrite.go:793-801, 945-963). This new no_progress branch then clears the entire remaining queue at lines 12082-12089, so one stale head entry can discard later queued segments that still exist and should be resumed.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 19, 2026 12:44
Copy link
Copy Markdown
Owner Author

snissn commented Mar 20, 2026

Update on head 905e6874:

Implemented two scheduler hardening fixes from review:

  1. Checkpoint-kick bypass now ignores ineffective-backoff for fresh replans.
  • ineffectiveBackoff now includes && !allowCheckpointKickBypass.
  • This prevents a prior ineffective queued rewrite from suppressing checkpoint-kick replans for the full cooldown window.
  1. Planner deadlines are treated as canceled planner passes.
  • Added helper isVlogGenerationPlannerCanceled(err) (context.Canceled OR context.DeadlineExceeded).
  • Used in planner outcome accounting and both planner call sites’ error branches.

Tests added:

  • TestVlogGenerationRewrite_IneffectiveBackoffCheckpointKickBypasses
  • TestVlogGenerationRewritePlan_DeadlineBackoffSkipsImmediateRetry

Validation run:

  • go test ./TreeDB/caching -run 'Test(VlogGenerationRewritePlan_DeadlineBackoffSkipsImmediateRetry|VlogGenerationRewritePlan_CancelBackoffCheckpointKickBypasses|VlogGenerationRewrite_IneffectiveBackoffCheckpointKickBypasses|VlogGenerationRewrite_IneffectiveBackoffSkipsImmediateGenericRetry|VlogGenerationRewrite_IneffectiveBackoffExpires)' -count=1
  • go test ./TreeDB/caching ./TreeDB/internal/memtable ./TreeDB/tree -count=1
  • go vet ./...

Runtime note:

  • I attempted a run_celestia wal validation on this head with maint debug enabled, but the run aborted very early due node-error log detection from transient p2p reconnect/auth failures (no useful runtime maintenance signal captured), so I did not treat it as perf evidence.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +7180 to +7203
func (db *DB) vlogGenerationMaintenanceContext(timeout time.Duration, opts vlogGenerationMaintenanceOptions) (context.Context, context.CancelFunc) {
// Checkpoint-kick maintenance is an explicit caller opt-in to run outside the
// quiet-window gate. Keep this context timeout-bounded, but do not
// self-cancel on immediate foreground activity resumes.
if opts.bypassQuiet {
if timeout > 0 {
return context.WithTimeout(context.Background(), timeout)
}
return context.WithCancel(context.Background())
}
return db.foregroundMaintenanceContext(timeout)
}

func (db *DB) vlogGenerationRewritePlanContext(timeout time.Duration, opts vlogGenerationMaintenanceOptions) (context.Context, context.CancelFunc) {
// Keep planner calls quiet-window-gated, but tolerate short foreground
// activity resumes so sub-second plan scans can complete.
if opts.bypassQuiet {
if timeout > 0 {
return context.WithTimeout(context.Background(), timeout)
}
return context.WithCancel(context.Background())
}
return db.foregroundMaintenanceContextWithResumeGrace(timeout, vlogGenerationRewritePlanResumeGrace)
}
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In bypass-quiet mode (checkpoint-kick), the returned contexts are based on context.Background() and don’t get canceled when the DB is closing (closeCh). Since Close() waits on db.wg after closing closeCh, a long-running plan/rewrite/GC kicked in bypass mode can delay shutdown until the timeout elapses. Consider deriving these contexts from a cancelable base that is canceled on db.closeCh (while still bypassing foreground-activity cancel semantics).

Copilot uses AI. Check for mistakes.
var ctx context.Context
var cancel context.CancelFunc
if hadRewriteQueue && len(processedRewriteIDs) > 0 {
ctx, cancel = context.WithTimeout(context.Background(), vlogGenerationRewriteBoundedExecTimeout)
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Queued/debt-drain rewrite execution uses context.WithTimeout(context.Background(), vlogGenerationRewriteBoundedExecTimeout), which won’t be canceled on db.closeCh. Because Close() waits on db.wg, a bounded queued rewrite can still hold shutdown for up to the full timeout. Suggest using a context that ignores foreground-activity resumes but still cancels promptly on DB close (e.g., tie cancellation to closeCh).

Suggested change
ctx, cancel = context.WithTimeout(context.Background(), vlogGenerationRewriteBoundedExecTimeout)
ctx, cancel = db.foregroundMaintenanceContext(vlogGenerationRewriteBoundedExecTimeout)

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 905e687414

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11635 to +11639
func (db *DB) schedulePendingVlogGenerationCheckpointKick() {
if db == nil || db.closing.Load() {
return
}
if !db.vlogGenerationCheckpointKickPending.CompareAndSwap(true, false) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Respect disabled checkpoint-kick policy in deferred retries

This deferred retry helper bypasses the same admission checks that maybeKickVlogGenerationMaintenanceAfterCheckpoint() uses for checkpoint kicks. In particular, it never checks TREEDB_DISABLE_VLOG_GENERATION_CHECKPOINT_KICK (or testSkipVlogCheckpointKick), so a queued rewrite that returns context.Canceled in maybeRunVlogGenerationMaintenanceWithOptions will still set vlogGenerationCheckpointKickPending and eventually re-enter maintenance with skipCheckpoint:false. In deployments that disable checkpoint kicks to avoid extra checkpoint stalls, a canceled resume chunk will now force the very checkpoint-kick maintenance path the operator asked to suppress.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 95f4d278fa

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +12135 to +12137
ctx, cancel = context.WithTimeout(context.Background(), vlogGenerationRewriteBoundedExecTimeout)
} else {
ctx, cancel = db.foregroundMaintenanceContext(2 * time.Minute)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Wire bypass-quiet context into checkpoint-kick execution

The new vlogGenerationMaintenanceContext added above is never used, so checkpoint-kick maintenance still runs the first freshly planned rewrite chunk here—and GC still does the same at db.go:12350—under foregroundMaintenanceContext. Any read/write that resumes after the checkpoint cancels those operations immediately even when opts.bypassQuiet was set specifically to let checkpoint-triggered maintenance run outside the quiet window. On sustained-ingest workloads this leaves fresh checkpoint kicks unable to execute their first rewrite chunk, and GC-only kicks have no retry path at all.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 20, 2026 01:21
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +11413 to +11418
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In vlogGenerationRewriteMaxSegmentsForRun, the queueLen comparison logic is redundant and effectively forces maxSegments to become queueLen for any queueLen != vlogGenerationRewriteResumeMaxSegments. That makes the initial resume cap (vlogGenerationRewriteResumeMaxSegments) irrelevant in debt-drain mode and is confusing to read/maintain. Consider replacing the two comparisons with a single, clearer assignment (e.g., cap directly to min(queueLen, vlogGenerationRewriteDebtDrainMaxSegments)) or otherwise expressing the intended policy explicitly.

Suggested change
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
// In debt-drain mode, cap by the current queue length and the per-run maximum.
maxSegments = queueLen

Copilot uses AI. Check for mistakes.
Comment on lines +4908 to +4909
// During checkpoint-kick debt drain, allow a bounded multi-segment rewrite
// selection so debt can converge faster than one-segment-per-pass.
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment on vlogGenerationRewriteDebtDrainMaxSegments says checkpoint-kick debt drain allows a bounded multi-segment rewrite selection, but vlogGenerationRewriteMaxSegmentsForRun currently returns 1 segment for checkpoint-kick runs (opts.bypassQuiet && !opts.skipCheckpoint). Either the comment should be adjusted to match the single-segment checkpoint-kick behavior, or the selection policy should be updated if multi-segment checkpoint-kick drain is actually intended.

Suggested change
// During checkpoint-kick debt drain, allow a bounded multi-segment rewrite
// selection so debt can converge faster than one-segment-per-pass.
// Upper bound on the number of segments that may be selected in a single
// checkpoint-kick debt-drain rewrite pass. The current selection policy
// still runs with a single segment per pass; this limit exists to cap any
// future multi-segment checkpoint-kick behavior.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 41976c452a

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11653 to +11657
db.maybeRunVlogGenerationMaintenanceWithOptions(true, vlogGenerationMaintenanceOptions{
bypassQuiet: true,
skipRetainedPruneWait: true,
// This path retries a checkpoint-triggered kick that collided with an
// already-active maintenance pass.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Wait for active checkpoints before retrying a pending kick

Unlike maybeKickVlogGenerationMaintenanceAfterCheckpoint() in the same file, this deferred retry path does not wait on checkpointCond before re-entering maintenance. If an auto/user checkpoint is already running, maybeRunVlogGenerationMaintenanceWithOptions immediately bails out at db.go:11736-11738, and because line 11644 has already cleared vlogGenerationCheckpointKickPending, the queued retry is lost instead of waiting for the checkpoint to finish. In that interleaving, a foreground-canceled rewrite resume can miss its only checkpoint-kick follow-up until some later checkpoint or quiet periodic pass happens.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

snissn commented Mar 20, 2026

Update on pr/vlog-maint-v2-runtime-debt (new commits):

  • 95f4d278 — prioritize queued checkpoint-kick retries over generic periodic passes.
  • 41976c45 — cap checkpoint-kick debt-drain to 1 segment/pass (generic debt-drain unchanged).

Validation:

  • go test ./TreeDB/caching ./TreeDB/internal/memtable ./TreeDB/tree -count=1
  • go vet ./...
  • targeted scheduler tests for pending-kick prioritization + debt-drain caps all pass.

run_celestia (serial, frozen-remote-height, local gomap head):

  1. fast (/home/mikers/.celestia-app-mainnet-treedb-20260319152147)
  • duration_seconds=280
  • max_rss_kb=8224092
  • end_app_bytes=5073563545
  • post-maint:
    • before: 5073563545
    • after vlog-gc: 4805131497
    • after vlog-rewrite: 2424115113
  1. wal_on_fast (/home/mikers/.celestia-app-mainnet-treedb-20260319152815)
  • duration_seconds=307
  • max_rss_kb=8788684
  • end_app_bytes=4846345035
  • post-maint:
    • before: 4846345035
    • after vlog-gc: 4846348274
    • after vlog-rewrite: 2493470802

Maint behavior note:

  • wal_on_fast had repeated checkpoint-kicks (pending=true) but no online rewrite execution in this run.
  • fast executed one rewrite, but it was still ineffective (bytes_after > bytes_before) and dropped queue (material_growth).

I’ll next target the pending=true churn path in wal mode and planner selection under high write pressure (why checkpoint kicks repeatedly don’t transition into rewrite execution).

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 975af29196

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11673 to +11677
// A new collision re-queued pending=true. Consume that token and retry.
if !db.vlogGenerationCheckpointKickPending.CompareAndSwap(true, false) {
return
}
time.Sleep(time.Duration(attempt+1) * vlogGenerationPendingKickRetryStep)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep pending checkpoint kicks armed after retry budget expires

Fresh evidence relative to the earlier active-collision handling: this retry loop still drops a checkpoint-triggered pass whenever the conflicting maintenance run lasts longer than the 4 immediate retries here (about 50ms total). maybeRunVlogGenerationMaintenanceWithOptions re-sets vlogGenerationCheckpointKickPending on each collision, but these lines immediately consume that bit again before sleeping; once the loop exits, no state remains to retry after the active planner/rewrite/GC pass finishes. Since those passes can run for seconds (30s planner/GC timeouts and 2m rewrite timeout in the same file), a normal long maintenance run can still silently discard the queued checkpoint kick until some later checkpoint or quiet periodic pass happens.

Useful? React with 👍 / 👎.

Copilot AI review requested due to automatic review settings March 20, 2026 01:53
Copy link
Copy Markdown
Owner Author

snissn commented Mar 20, 2026

Follow-up update:

I tested and then reverted 975af291 (Revert "caching: retry pending checkpoint-kick after active collision") because it regressed wal profile outcomes.

What happened with 975af291:

  • It did force wal_on_fast to execute one online rewrite (instead of zero).
  • But that rewrite was ineffective (bytes_after > bytes_before) and still dropped queue on material_growth.
  • Across two serial wal runs it regressed:
    • run1: duration=320s, max_rss_kb=10958896, end_app_bytes=5305938152
    • run2: duration=317s, max_rss_kb=11872692, end_app_bytes=5445418832
    • post vlog-rewrite floor in run2: 2775105850 (worse than prior baseline run on this branch).

Branch head is now 1715486d (revert applied), preserving only:

  • 95f4d278 pending-kick prioritization over periodic pass
  • 41976c45 checkpoint-kick debt-drain cap to 1 segment/pass

Local validation after revert is clean (go test caching/memtable/tree + go vet).

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 12056 to 12060
// If a queued rewrite is pending, do not run it while the bucket is
// empty; that defeats the whole point of a bounded executor.
if budgetTokens <= 0 && len(rewriteQueue) > 0 {
db.vlogGenerationSchedulerState.Store(vlogGenerationSchedulerIdle)
return nil
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the queued-rewrite path, this early return when budgetTokens <= 0 skips executing a rewrite, but vlogGenerationLastRewriteUnixNano has already been updated earlier in the same closure. That can throttle the next eligible queued rewrite attempt and makes treedb.cache.vlog_generation.rewrite.last_unix_nano look like a rewrite ran when it didn’t. Consider moving the vlogGenerationLastRewriteUnixNano.Store(...) update to just before the actual ValueLogRewriteOnline call (or only when a rewrite is about to run).

Copilot uses AI. Check for mistakes.
Comment on lines 12159 to +12167
db.debugVlogMaintf("rewrite_err reason=%s err=%v dur_ms=%.3f", vlogGenerationReasonString(reason), err, float64(time.Since(rewriteStart).Microseconds())/1000)
if errors.Is(err, context.Canceled) {
db.observeVlogGenerationRewriteCanceled()
if len(processedRewriteIDs) > 0 {
// A canceled rewrite that already selected a queued chunk should
// immediately queue a checkpoint-kick retry. The retry executes
// as resumable debt with bounded non-cancel semantics.
db.vlogGenerationCheckpointKickPending.Store(true)
}
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rewrite cancellation/backoff handling only checks errors.Is(err, context.Canceled). Since some rewrite executions use context.WithTimeout (e.g., the bounded queued exec timeout), timeouts will surface as context.DeadlineExceeded and currently won’t be counted as canceled/backed-off (and will instead be treated as a hard error). If the intent is to treat timeouts similarly to foreground-cancel, consider checking context.DeadlineExceeded here as well (or reusing isVlogGenerationPlannerCanceled).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1715486dbc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +12427 to +12429
if trigger := db.valueLogRewriteTriggerBytes; trigger > 0 {
retained, bytes := db.valueLogRetainedStats()
if bytes < trigger && retained < 2 {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Let checkpoint kicks start stale-ratio/churn rewrites below byte floor

Fresh evidence from this change: maybeRunVlogGenerationMaintenanceWithOptions now clears shouldRewrite on non-quiet periodic passes (TreeDB/caching/db.go:11789-11791), so checkpoint kicks are the only rewrite path under sustained foreground activity. This fast-path still skips every kick whenever bytes < valueLogRewriteTriggerBytes && retained < 2, even though shouldRunVlogGenerationRewrite in the same file can request rewrite from stale-ratio or churn alone (db.go:12612-12619). On write-heavy WAL-on workloads whose vlog is still below the total-bytes threshold, checkpoint-triggered maintenance will therefore never start rewrite debt until the DB goes quiet or grows past the byte floor.

Useful? React with 👍 / 👎.

Comment on lines +12200 to +12202
if gcStats.BytesDeleted > 0 {
gcBytesDeleted = int64(gcStats.BytesDeleted)
effectiveBytesAfter -= gcBytesDeleted
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Don't offset harmful rewrites with unrelated GC deletions

This subtracts the entire post-rewrite ValueLogGC deletion total from stats.BytesAfter, but ValueLogGC is a whole-DB sweep (TreeDB/db/vlog_gc.go:40-46) and BytesDeleted includes any unrelated unreachable segments it happens to reclaim. If a queued rewrite grows its own source chunk while the same GC pass also deletes older garbage from previous work, effectiveBytesAfter can look smaller than effectiveBytesBefore, so the new material_growth/ineffective-run logic never fires and the remaining debt queue is kept even though this rewrite cycle was harmful.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f78133892c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +12509 to +12512
if len(rewriteQueue) == 0 {
if trigger := db.valueLogRewriteTriggerBytes; trigger > 0 {
retained, bytes := db.valueLogRetainedStats()
if bytes < trigger && retained < 2 {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Skip checkpoint-kick retries when rewrite budget is empty

When rewrite budgeting is enabled and the token bucket is depleted, maybeRunVlogGenerationMaintenanceWithOptions exits at TreeDB/caching/db.go:12134-12140 before rewriting any queued or freshly planned work. This checkpoint-kick fast-path never checks that budget first, so a DB with pending rewrite debt (or one merely above the byte floor) will still launch another checkpoint-triggered maintenance pass and re-enter Checkpoint() with zero chance of making progress. On deployments that checkpoint frequently with a tight rewrite budget, that turns every checkpoint into an extra checkpoint barrier for no benefit until tokens refill.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

snissn commented Mar 20, 2026

Update on f7813389 (caching: keep rewrite debt after ineffective online rewrite): local correctness gates passed (go test ./TreeDB/caching ./TreeDB/internal/memtable ./TreeDB/tree -count=1, go vet ./...), but the serial run_celestia integration gate is a regression and not a keep.

Results on this head:

  • fast: home /home/mikers/.celestia-app-mainnet-treedb-20260319163617, duration_seconds=305, max_rss_kb=11297948, end_app_bytes=5479728777, post vlog-gc 5479732844, post vlog-rewrite 2586303159.
  • wal_on_fast: home /home/mikers/.celestia-app-mainnet-treedb-20260319164207, duration_seconds=360, max_rss_kb=12902204, end_app_bytes=5857267247, post vlog-gc 5857270939, post vlog-rewrite 2648143685.

Compared with the kept baseline on this branch (41976c45), both profiles regressed on wall / RSS / end size, and the debt gap remains large.

Important forensic note: on both runs, application-db maintenance did not execute an online rewrite at all (rewrite_runs_after=0 on checkpoint-kick completions, no application-db rewrite_start / rewrite_done entries). The new per-segment penalty persistence and queue-retention semantics are locally correct, but they did not create a runtime win in this form.

Copilot AI review requested due to automatic review settings March 20, 2026 10:59
Copy link
Copy Markdown
Owner Author

snissn commented Mar 20, 2026

Pushed 67237888 (caching: prune queued rewrite debt by trigger quality).

Summary:

  • Generic pre_rewrite already used min_ratio=0.85.
  • This follow-on applies the stronger floor when queued debt is resumed under total_bytes / churn, instead of pruning queued debt at 0.50.

Validation on this head:

wal_on_fast

  • duration_seconds=514
  • max_rss_kb=10396928
  • end_app_bytes=4952082742
  • post vlog-gc: 3342556835
  • post vlog-rewrite: 2492662155
  • runtime proof of the fix:
    • rewrite_plan pre_rewrite ... min_ratio=0.850000 selected=0/...
    • rewrite_queue_quality_prune dropped=2 remaining=3 min_ratio=0.850000
    • late reason=total_bytes executed only a high-quality queued segment (planned_live=3411529 planned_stale=265025660)

fast

  • duration_seconds=302
  • max_rss_kb=8663524
  • end_app_bytes=4859998272
  • post vlog-gc: 4591564637
  • post vlog-rewrite: 2459761667

This is a keep from my side: it fixes the specific late queued-debt mismatch that remained after the earlier generic planner floor change.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

}
}

func TestVlogGenerationRewriteMinStaleRatioForGenericPass_UsesHigherConfiguredRatio(t *testing.T) {
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test name is misleading: it says it "UsesHigherConfiguredRatio", but the configured ratio here (0.80) is lower than the expected floor (vlogGenerationRewriteGenericMinSegmentStaleRatio == 0.85). Consider renaming the test to reflect that the generic quality floor overrides a lower configured ratio.

Suggested change
func TestVlogGenerationRewriteMinStaleRatioForGenericPass_UsesHigherConfiguredRatio(t *testing.T) {
func TestVlogGenerationRewriteMinStaleRatioForGenericPass_UsesQualityFloorOverLowerConfiguredRatio(t *testing.T) {

Copilot uses AI. Check for mistakes.
Comment on lines +11426 to +11431
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In vlogGenerationRewriteMaxSegmentsForRun(), the if queueLen > maxSegments { maxSegments = queueLen } block is redundant because maxSegments is either unchanged (vlogGenerationRewriteResumeMaxSegments) or already set to queueLen by the preceding if queueLen < maxSegments branch. Consider simplifying this to a single clamp (e.g., maxSegments = min(queueLen, vlogGenerationRewriteResumeMaxSegments) before the debt-drain cap) to make the intent clearer and avoid confusing future edits.

Suggested change
if queueLen < maxSegments {
maxSegments = queueLen
}
if queueLen > maxSegments {
maxSegments = queueLen
}
maxSegments = queueLen

Copilot uses AI. Check for mistakes.
Comment on lines +7193 to +7205
func (db *DB) vlogGenerationMaintenanceContext(timeout time.Duration, opts vlogGenerationMaintenanceOptions) (context.Context, context.CancelFunc) {
// Checkpoint-kick maintenance is an explicit caller opt-in to run outside the
// quiet-window gate. Keep this context timeout-bounded, but do not
// self-cancel on immediate foreground activity resumes.
if opts.bypassQuiet {
if timeout > 0 {
return context.WithTimeout(context.Background(), timeout)
}
return context.WithCancel(context.Background())
}
return db.foregroundMaintenanceContext(timeout)
}

Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vlogGenerationMaintenanceContext() is introduced here but appears unused (no call sites in TreeDB/caching). If it’s not intended for near-term use, consider removing it to keep the maintenance context behavior centralized in the actually-used helpers (foregroundMaintenanceContext / vlogGenerationRewritePlanContext).

Suggested change
func (db *DB) vlogGenerationMaintenanceContext(timeout time.Duration, opts vlogGenerationMaintenanceOptions) (context.Context, context.CancelFunc) {
// Checkpoint-kick maintenance is an explicit caller opt-in to run outside the
// quiet-window gate. Keep this context timeout-bounded, but do not
// self-cancel on immediate foreground activity resumes.
if opts.bypassQuiet {
if timeout > 0 {
return context.WithTimeout(context.Background(), timeout)
}
return context.WithCancel(context.Background())
}
return db.foregroundMaintenanceContext(timeout)
}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 672378881e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +11421 to +11424
// Checkpoint-kick retries should keep each debt-drain run small to reduce
// write amplification when foreground ingest is still active.
if opts.bypassQuiet && !opts.skipCheckpoint {
return 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Let checkpoint debt-drain runs select multiple segments

This guard forces vlogGenerationRewriteMaxSegmentsForRun to return 1 for every rewriteDebtDrain caller in this commit: the only call sites that set rewriteDebtDrain=true (TreeDB/caching/db.go:11823, 11889, 11975, 12017, 12033, 13147) also pass bypassQuiet=true and skipCheckpoint=false. That means the new checkpoint-kick “multi-segment debt drain” path never actually rewrites more than one queued segment per pass, so large persisted queues still drain at the old one-segment rate.

Useful? React with 👍 / 👎.

Comment on lines +12311 to 12313
if hasExecutableRewriteQueue {
shouldRewrite = true
reason = vlogGenerationReasonRewriteResume
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve generic efficacy floor on queued rewrite debt

Overwriting reason with vlogGenerationReasonRewriteResume here loses whether the persisted queue originally came from the total-bytes or churn triggers. The later quality filter at TreeDB/caching/db.go:12693 calls vlogGenerationRewriteMinStaleRatioForQueuedDebt, and that helper only applies the stricter generic floor for vlogGenerationReasonTotalBytes/vlogGenerationReasonChurn (TreeDB/caching/db.go:13320-13326). As a result, only the first chunk of a generic rewrite is gated by the new 0.85 efficacy floor; resumed chunks fall back to the looser 0.50 floor and can keep rewriting much less stale segments.

Useful? React with 👍 / 👎.

Comment on lines +12466 to +12467
} else if shouldDeferVlogGenerationRewritePlanForAge(plan, planOpts.MinSegmentAge) {
db.setVlogGenerationRewriteAgeBlockedUntil(now.Add(plan.AgeBlockedMinRemainingAge))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Schedule age-blocked retries from plan completion time

This retry deadline is anchored to the pass's earlier now value instead of to when ValueLogRewritePlan finishes; the same pattern appears again at TreeDB/caching/db.go:12578. On large trees or slower disks, planning can take seconds, so a result like AgeBlockedMinRemainingAge=10s can be converted into an immediate retry even though the youngest candidate still has about 10 seconds left before it reaches MinSegmentAge, causing repeated premature full-tree replans.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Owner Author

snissn commented Mar 20, 2026

Follow-on decomposition result is now tracked in stacked PR #843.

Summary:

  • A (retained-source cleanup + explicit SourceFileIDs live-byte estimation) = keep now
  • B (planner queued-ledger refresh) = redesign before merge
  • C (checkpoint-tail restage) = keep now

Most important new evidence is deterministic, not blind runtime sampling:

  • real-backend test proves checkpoint-kick queued rewrite consumes one source, restages the tail, and drops the consumed retained/live source
  • benchmark BenchmarkVlogGenerationRewriteQueueDrainCheckpointKick shows:
    • A + C: rewrite_calls=1, max_kick_ms=68.14, remaining_queue=3, stage_pending=1, total_window_ms=68.22
    • A only: rewrite_calls=4, max_kick_ms=84.29, remaining_queue=0, stage_pending=0, total_window_ms=302.1

So #843 is the promoted candidate path. Further iteration should happen there rather than on the old mixed head.

@snissn snissn merged commit 6723788 into main Mar 24, 2026
6 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants