treedb: improve live vlog maintenance observability#856
Conversation
Pull request overview
This PR improves TreeDB live value-log (vlog) maintenance observability by exposing vlog-generation metrics via expvar and adding additional byte-level counters to make rewrite planning and rewrite+GC progress visible.
Changes:
- Export `treedb.cache.vlog_generation.*` metrics through the expvar stat-selection filter.
- Add rewrite-plan selected-byte counters (total/live/stale) and a cumulative rewrite “reclaimed bytes” counter.
- Add unit tests covering new expvar exposure and rewrite-plan byte accounting paths.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| TreeDB/caching/db.go | Adds new vlog-generation stats counters (plan-selected bytes + reclaimed bytes) and emits them via Stats(). |
| TreeDB/caching/expvar_stats.go | Extends expvar filtering to include treedb.cache.vlog_generation.*. |
| TreeDB/caching/expvar_stats_test.go | Verifies expvar selection/coercion includes a representative vlog_generation metric. |
| TreeDB/caching/vlog_generation_scheduler_test.go | Adds unit tests for rewrite-plan outcome observation and byte counter tracking (including fallback path). |
```go
strings.HasPrefix(k, "treedb.cache.vlog_dict.") ||
strings.HasPrefix(k, "treedb.cache.vlog_generation.") ||
strings.HasPrefix(k, "treedb.cache.vlog_payload_kind.") ||
```
The selection comment says expvar exports only process-wide metric families plus a few cache/backend families (mmap/decode/batch-arena), but this function now also exports the full treedb.cache.vlog_generation.* family. Please update the comment to reflect the expanded scope so future readers don’t miss that maintenance stats are intentionally exposed via expvar.
Follow-up slice pushed in

What changed

run_celestia validation (fast profile)

Command:
Run home:
Latest diagnostics snapshot sampled:
Key counters:

Takeaway: in this early state-sync window, rewrite/vacuum did not trigger, and the dominant behavior was high active-pass contention. We now have explicit counters to drive the next tuning step.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9ce5339a4e
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
```go
db.scheduleDueVlogGenerationDeferredMaintenance()
db.schedulePendingVlogGenerationCheckpointKick()
rewriteRan := db.vlogGenerationRewriteRuns.Load() > rewriteRunsBefore
gcRan := db.vlogGenerationGCRuns.Load() > gcRunsBefore
```
Count pass outcomes before queuing follow-up maintenance
The new pass-outcome counters are computed after scheduleDueVlogGenerationDeferredMaintenance()/schedulePendingVlogGenerationCheckpointKick() run in the defer path, so a just-scheduled retry goroutine can increment vlogGenerationRewriteRuns/vlogGenerationGCRuns first and make this pass look like it performed rewrite/GC when it did not. This appears under contention (fast retry wakeups on another core) and skews the new observability metrics (maintenance.passes.*) by overcounting rewrite/gc and undercounting noop for the current pass.
Pushed another follow-up in

Code change

Validation

A/B run_celestia comparison

Both runs used:
Baseline run home:
New run home:
Key counter deltas:

This looks like a material reduction in scheduler contention, and maintenance made actual rewrite/vacuum progress in the same early-state-sync lab window.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
```go
db.vlogGenerationMaintenanceActive.Store(false)
// If a deferred confirmation/age wake became due while this pass held the
// scheduler active, requeue it immediately on exit instead of relying on
// the original retry goroutine to still be alive.
db.scheduleDueVlogGenerationDeferredMaintenance()
db.schedulePendingVlogGenerationCheckpointKick()
rewriteRan := db.vlogGenerationRewriteRuns.Load() > rewriteRunsBefore
gcRan := db.vlogGenerationGCRuns.Load() > gcRunsBefore
if rewriteRan {
	db.vlogGenerationMaintenancePassWithRewrite.Add(1)
}
if gcRan {
	db.vlogGenerationMaintenancePassWithGC.Add(1)
}
if !rewriteRan && !gcRan {
	db.vlogGenerationMaintenancePassNoop.Add(1)
}
```
The maintenance pass outcome counters (passes.noop/with_rewrite/with_gc) can be mis-attributed due to the defer order: maintenanceActive is cleared and new maintenance may be scheduled before computing rewriteRan/gcRan from the global counters. If a deferred/checkpoint-kick pass starts quickly after maintenanceActive.Store(false), rewriteRan/gcRan may reflect the next pass instead of the one being released. Consider computing/incrementing the pass outcome counters before clearing vlogGenerationMaintenanceActive and before scheduling any follow-up maintenance (or track rewrite/gc execution via local booleans set during this pass).
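A minimal sketch of the suggested reordering, with stand-in types and counter names (this is illustrative, not the actual TreeDB code): compute the pass outcome from the run-counter deltas before the scheduler is released or any follow-up maintenance is queued, so a fast follow-up pass cannot skew attribution.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// scheduler is a hypothetical stand-in mirroring the counters discussed in
// the review: global rewrite/GC run counters plus per-pass outcome counters.
type scheduler struct {
	rewriteRuns, gcRuns           atomic.Int64
	passRewrite, passGC, passNoop atomic.Int64
	active                        atomic.Bool
}

func (s *scheduler) runPass(doRewrite, doGC bool) {
	rewriteBefore := s.rewriteRuns.Load()
	gcBefore := s.gcRuns.Load()
	defer func() {
		// Outcome accounting first, while this pass still owns the scheduler:
		// the deltas cannot yet include a freshly scheduled follow-up pass.
		rewriteRan := s.rewriteRuns.Load() > rewriteBefore
		gcRan := s.gcRuns.Load() > gcBefore
		if rewriteRan {
			s.passRewrite.Add(1)
		}
		if gcRan {
			s.passGC.Add(1)
		}
		if !rewriteRan && !gcRan {
			s.passNoop.Add(1)
		}
		// Only now release the scheduler; the real code would also queue
		// follow-up maintenance here.
		s.active.Store(false)
	}()
	if doRewrite {
		s.rewriteRuns.Add(1)
	}
	if doGC {
		s.gcRuns.Add(1)
	}
}

func main() {
	var s scheduler
	s.active.Store(true)
	s.runPass(true, false)
	fmt.Println(s.passRewrite.Load(), s.passGC.Load(), s.passNoop.Load()) // 1 0 0
}
```

An equivalent alternative, also mentioned in the review, is to set local booleans at the points where rewrite/GC actually execute and skip the global-counter comparison entirely.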
Pushed

Code change

Validation

Third comparison run (same lab command)

Run home:
Snapshot:
Counters:

Trend across runs:

So the latest change primarily removes maintenance churn/noops while preserving rewrite/vacuum progress.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c83e2d6164
```go
!db.vlogGenerationCheckpointKickPending.Load() &&
	!db.vlogGenerationDeferredMaintenancePending.Load() &&
	!db.vlogGenerationDeferredMaintenanceDue(now) {
	return false
```
Surface deferred-due check errors before preflight skip
This new hot-foreground preflight can return early without entering maybeRunVlogGenerationMaintenance, but the !db.vlogGenerationDeferredMaintenanceDue(now) term internally treats rewrite-stage read errors as a plain false result. In sustained non-quiet workloads with no pending flags, backend errors from the deferred-due probe are now silently ignored, so scheduler error state and notifyError are never triggered until the node eventually becomes quiet. That regresses operational visibility compared with the previous path, which always entered maintenance and reported queue/stage load failures immediately.
Instrumentation-first update pushed to

What this adds

Test coverage

Next step: run
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
```go
strings.HasPrefix(k, "treedb.cache.vlog_payload_split.") ||
strings.HasPrefix(k, "treedb.cache.vlog_auto.") ||
strings.HasPrefix(k, "treedb.cache.vlog_dict.") ||
strings.HasPrefix(k, "treedb.cache.vlog_generation.") ||
strings.HasPrefix(k, "treedb.cache.vlog_payload_kind.") ||
```
The inline comment above this prefix list says only certain cache/backend families are exported via expvar, but treedb.cache.vlog_generation.* is now included. Please update the comment to mention vlog_generation (or reword it so it stays accurate as this allowlist evolves).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 31fbb0a53e
```go
stats, err := rewriter.ValueLogRewriteOnline(ctx, rewriteOpts)
cancel()
rewriteDur := time.Since(rewriteStart)
db.observeVlogGenerationRewriteExecDuration(rewriteDur)
```
Count rewrite exec duration only for successful runs
observeVlogGenerationRewriteExecDuration is called before the rewrite error check, so canceled/failed rewrites contribute to treedb.cache.vlog_generation.rewrite.exec.{total,max}_ms, but rewrite.runs (the denominator for rewrite.exec.avg_ms in Stats()) is incremented only on success. Under repeated failures this yields misleading observability (for example non-zero total/max with avg_ms=0.000, or inflated averages), which can skew maintenance tuning and experiment interpretation.
Follow-up pushed:

Why

In a longer

This indicated retry loops were still attempting while

Change

Validation

A/B
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7cc50d6de3
```go
if db.vlogGenerationMaintenanceActive.Load() {
	if time.Now().After(deadline) {
		return
```
Preserve checkpoint retry intent while pass is active
When runVlogGenerationMaintenanceRetries is used for checkpoint kicks (stopWhenAcquired=false) and another maintenance pass remains active longer than vlogGenerationCheckpointKickRetryWindow (5s), this new branch only sleeps and then returns without ever invoking maybeRunVlogGenerationMaintenanceWithOptions. That means no collision path runs to re-set vlogGenerationCheckpointKickPending, so the active pass’s deferred schedulePendingVlogGenerationCheckpointKick() sees no pending work and the checkpoint-triggered follow-up maintenance can be dropped entirely.
```go
gcStats, err := gcer.ValueLogGC(ctx, gcOpts)
cancel()
db.observeVlogGenerationGCExecDuration(time.Since(gcStart))
if err != nil {
```
Record GC exec duration only for successful runs
observeVlogGenerationGCExecDuration is called before checking err, but vlogGenerationGCRuns (used as the denominator for gc.exec.avg_ms in Stats()) is incremented only on success. Failed or canceled GC calls therefore increase gc.exec.total_ms/max_ms without increasing gc.runs, producing inconsistent and inflated observability during error periods.
```go
db.observeVlogGenerationVacuumExecDuration(time.Since(vacuumStart))
if err != nil {
	db.vlogGenerationVacuumFailures.Add(1)
```
Record vacuum exec duration only for successful runs
The vacuum execution duration is observed before the error check, while vlogGenerationVacuumRuns (the denominator for vacuum.exec.avg_ms) is incremented only after success. If vacuum fails, vacuum.exec.total_ms/max_ms still grow but vacuum.runs does not, so the new average metrics can report misleading values (including non-zero totals with zero average).
Follow-up update pushed:

What was added:

Validation:

Live run (baseline fast profile, no trigger override):

Interpretation:

Offline headroom on same home:

So live maintenance remains significantly below the post-rewrite floor, and these counters now make that bottleneck explicit.
Pull request overview
Copilot reviewed 5 out of 5 changed files in this pull request and generated no new comments.
Pull request overview
Copilot reviewed 15 out of 16 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
TreeDB/db/vlog_gc.go:270
- ValueLogGC() returns early on opts.DryRun before computing SegmentsPending/BytesPending (and the observed-source pending fields). If these fields are intended to represent "eligible - deleted" backlog, then in DryRun (where deleted=0) they likely should be populated as pending=eligible, or the struct/docs should explicitly state that *Pending is only set for non-dry-run GC runs.
```go
if opts.DryRun {
	continue
}
if err := vm.MarkZombie(id); err != nil {
	return stats, err
}
candidates[id] = candidate{path: f.Path, size: size, observed: observed}
}
if opts.DryRun {
	if set != nil {
		_ = vm.Release(set)
	}
	db.persistValueLogRefTrackerBestEffort()
	return stats, nil
```
```go
if v, ok := envInt(envVlogMaxRetainedBytes); ok {
	opts.ValueLog.MaxRetainedBytes = int64(v)
}
if v, ok := envInt(envVlogMaxRetainedBytesHard); ok {
	opts.ValueLog.MaxRetainedBytesHard = int64(v)
}
if v, ok := envInt(envVlogRewriteBudgetBytesPerSec); ok {
	opts.ValueLog.Generational.RewriteBudgetBytesPerSec = int64(v)
}
if v, ok := envInt(envVlogRewriteBudgetRecordsPerSec); ok {
	opts.ValueLog.Generational.RewriteBudgetRecordsPerSec = v
}
if v, ok := envInt(envVlogRewriteTriggerTotalBytes); ok {
	opts.ValueLog.Generational.RewriteTriggerTotalBytes = int64(v)
}
if v, ok := envInt(envVlogRewriteTriggerStaleRatioPPM); ok {
	if v < 0 {
		v = 0
	}
	opts.ValueLog.Generational.RewriteTriggerStaleRatioPPM = uint32(v)
}
if v, ok := envInt(envVlogRewriteTriggerChurnPerSec); ok {
	opts.ValueLog.Generational.RewriteTriggerChurnPerSec = int64(v)
}
```
applyEnvMaintenanceOverrides() parses several env vars documented as int64 (e.g. TREEDB_VLOG_MAX_RETAINED_BYTES*, TREEDB_VLOG_REWRITE_*_BYTES_PER_SEC) via envInt(), which uses strconv.Atoi into an int. This can overflow / reject valid large int64 values (and is inconsistent with the int64 doc comments). Consider adding an envInt64 helper (strconv.ParseInt) and using it for int64-backed options (and keeping envInt for true int fields like *_RECORDS_PER_SEC).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 431d323f84
```go
	return nil
	}
}
now := time.Now()
db.vlogGenerationLastGCUnixNano.Store(now.UnixNano())
ctx, cancel := db.foregroundMaintenanceContext(30 * time.Second)
gcOpts := backenddb.ValueLogGCOptions{ProtectedPaths: db.valueLogProtectedPaths()}
ctx, cancel := db.vlogGenerationMaintenanceContext(30*time.Second, opts)
```
Preserve close cancellation in bypass-quiet GC maintenance
This GC call now uses vlogGenerationMaintenanceContext(30*time.Second, opts), and in checkpoint-kick/deferred paths (opts.bypassQuiet == true) that helper returns a timeout-only context that is not canceled by db.closeCh. If Close() happens while ValueLogGC is blocked, the maintenance goroutine can keep running until timeout/backend return, which can delay shutdown by up to 30s; this is a regression from the previous close-aware foregroundMaintenanceContext behavior on this path.
Pull request overview
Copilot reviewed 15 out of 16 changed files in this pull request and generated 2 comments.
```python
if value is None or math.isnan(value):
    return "n/a"
n = float(value)
```
human_bytes calls math.isnan(value) before coercing to float. Many call sites pass ints (e.g., from counters), and math.isnan(int) raises TypeError, causing the script to crash. Coerce to float first (or guard with isinstance(value, float) / try-except) before calling math.isnan.
Suggested change:

```python
if value is None:
    return "n/a"
try:
    n = float(value)
except (TypeError, ValueError):
    return "n/a"
if math.isnan(n):
    return "n/a"
```
```go
removed = true
if size > 0 {
	out.RemovedSegments++
	out.RemovedBytes += size
}
if candidate.observed {
	out.ObservedSourceRemovedSegments++
	if size > 0 {
		out.ObservedSourceRemovedBytes += size
	}
}
continue
}
if db.cleanupMissingRetainedValueLog(path) {
	if size > 0 {
		out.RemovedSegments++
		out.RemovedBytes += size
	}
```
Same issue as earlier: when cleaning up orphaned/missing retained logs, RemovedSegments/RemovedBytes are only updated when size > 0, which can undercount removals if the size wasn't tracked. Increment the segment counter regardless of size knowledge.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e7dd2a3448
```go
gcOpts := db.valueLogGCOptions(false)
if forceObservedSourceGC {
	gcOpts.ObservedSourceFileIDs = append([]uint32(nil), observedSourceGCIDs...)
	db.vlogGenerationObservedGCRuns.Add(1)
```
Increment observed GC run counter only after successful GC
The new observed-source GC run metric is incremented before ValueLogGC returns, so failed or canceled GC attempts are counted as completed runs. This happens when forceObservedSourceGC is true and gcer.ValueLogGC returns an error; the code re-queues IDs on error, but treedb.cache.vlog_generation.observed_gc.runs has already been increased, which makes run-count telemetry inaccurate during failure periods.
Pull request overview
Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.
```go
if f.Path != "" {
	if info, err := os.Stat(f.Path); err == nil && info.Size() > 0 {
		return uint64(info.Size())
	}
}
```
ZombieStats is called from Stats/expvar paths and currently may call os.Stat(f.Path) via valueLogFileSizeBestEffort while holding m.mu.RLock(). Doing filesystem I/O under the manager lock can introduce latency and block writers/maintenance. Consider avoiding os.Stat here (e.g., rely on cached fileSize/mmap length only) or snapshot zombie file paths/sizes under the lock and perform any best-effort stat calls after unlocking (optionally caching results back into fileSize).
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e7ef33865a
```go
if size > 0 {
	out.RemovedSegments++
	out.RemovedBytes += size
```
Count removed prune segments even when size is unknown
This branch only increments RemovedSegments when size > 0, but valueLogClosedSegmentSize returns 0 when a retained path has no tracked closed-size entry (for example after reopen or metadata loss). In that case the segment is still pruned by cleanupMissingRetainedValueLog, yet treedb.cache.vlog_retained_prune.removed_segments is not incremented, and observed-source removed counts can exceed global removed counts. This introduces inconsistent retained-prune telemetry for operators and automation.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a9e6fc3a9a
```python
passes_total = m["maintenance_noop"] + m["maintenance_with_rewrite"] + m["maintenance_with_gc"]
m["maintenance_passes_total"] = passes_total
```
Derive maintenance pass totals from mutually exclusive counters
maintenance_with_rewrite and maintenance_with_gc are not mutually exclusive (a single pass can increment both in maybeRunVlogGenerationMaintenanceWithOptions), so summing noop + with_rewrite + with_gc overcounts total passes whenever rewrite and GC run together. This inflates maintenance_passes_total and skews the rewrite/gc share percentages used by this analyzer, which can mislead capacity interpretation during mixed rewrite+gc workloads.
```python
m["maintenance_skip"] = skip_map
m["maintenance_skip_total"] = sum(skip_map.values())
```
Exclude umbrella stage_gate from skip-total aggregation
maintenance_skip_total is computed as a raw sum of all skip counters, but stage_gate is an umbrella counter that is incremented together with stage_gate_not_due or stage_gate_due_reserved for the same skip event. This double-counts stage-gate skips and overstates total skip pressure, which can lead operators to overestimate scheduler contention in diagnostics.
Pull request overview
Copilot reviewed 17 out of 18 changed files in this pull request and generated 3 comments.
```go
m.mu.RLock()
for _, f := range m.files {
	if f == nil || !f.IsZombie.Load() {
		continue
	}
	segments++
	size := valueLogFileSizeBestEffort(f)
	bytes += size
	if f.RefCount.Load() > 0 {
		pinnedSegments++
		pinnedBytes += size
		continue
	}
	unpinnedSegments++
	unpinnedBytes += size
}
m.mu.RUnlock()
```
ZombieStats() holds m.mu.RLock() while calling valueLogFileSizeBestEffort(), which may perform os.Stat(f.Path). Doing filesystem I/O under the manager lock can block writers/maintenance and increase tail latency. Consider snapshotting the zombie *File pointers (or their Path + cached size fields) under the lock, then compute sizes (including any os.Stat) after releasing the lock.
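The snapshot-then-stat pattern the review suggests could look like this (the `file`/`manager` types below are simplified stand-ins, not the real Manager): copy the zombie entries and any cached sizes under the RLock, release it, and only then fall back to `os.Stat` for entries whose size is unknown.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// file is a stand-in: path plus a cached size (0 = not cached).
type file struct {
	path string
	size uint64
}

type manager struct {
	mu    sync.RWMutex
	files []*file
}

// zombieBytes computes total zombie bytes without doing filesystem I/O
// while the manager lock is held.
func (m *manager) zombieBytes() uint64 {
	// Phase 1: cheap value snapshot under the lock.
	m.mu.RLock()
	snap := make([]file, 0, len(m.files))
	for _, f := range m.files {
		if f != nil {
			snap = append(snap, *f)
		}
	}
	m.mu.RUnlock()

	// Phase 2: best-effort stat OUTSIDE the lock for uncached sizes.
	var total uint64
	for _, f := range snap {
		size := f.size
		if size == 0 && f.path != "" {
			if info, err := os.Stat(f.path); err == nil && info.Size() > 0 {
				size = uint64(info.Size())
			}
		}
		total += size
	}
	return total
}

func main() {
	m := &manager{files: []*file{
		{path: "/nonexistent/vlog-0001"}, // stat fails, contributes 0
		{size: 8192},                     // cached size, no I/O needed
	}}
	fmt.Println(m.zombieBytes())
}
```

Optionally, the post-stat sizes could be cached back into the manager afterwards (under a brief write lock) to amortize future calls.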
```go
strings.HasPrefix(k, "treedb.cache.vlog_payload_split.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_auto.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_dict.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_generation.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_retained_prune.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_payload_kind.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_outer_leaf_codec.") ||
	strings.HasPrefix(k, "treedb.cache.batch_arena.") {
```
The expvar selector now exports treedb.cache.vlog_generation.* and treedb.cache.vlog_retained_prune.*, but the capacity analyzer/runbook also relies on treedb.cache.vlog_zombie.* keys. Since that prefix is not allowlisted here, *.debug_vars.json snapshots won't include zombie inventory metrics and the report will silently show zeros. Add strings.HasPrefix(k, "treedb.cache.vlog_zombie.") to the allowlist (and consider a small selector test for one zombie key).
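A minimal stand-in for the suggested change and selector test (the real allowlist lives in TreeDB/caching/expvar_stats.go; the function and keys below are illustrative): add the `vlog_zombie` family and assert a representative key passes.

```go
package main

import (
	"fmt"
	"strings"
)

// exported mimics the prefix allowlist shape used by the expvar selector,
// reduced to the families relevant here, with vlog_zombie newly added.
func exported(k string) bool {
	return strings.HasPrefix(k, "treedb.cache.vlog_generation.") ||
		strings.HasPrefix(k, "treedb.cache.vlog_retained_prune.") ||
		strings.HasPrefix(k, "treedb.cache.vlog_zombie.") // newly allowlisted
}

func main() {
	fmt.Println(exported("treedb.cache.vlog_zombie.pinned_bytes"))
	fmt.Println(exported("treedb.cache.internal.debug_only"))
}
```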
```go
// Value-log dictionary compression knobs (cached mode).
//
// Enabling dict compression requires:
//   - ValueLog compression mode that allows dicts (auto/dict), and
//   - Dict training enabled (TrainBytes > 0), and
//   - Side stores enabled (dictdb), and
//   - Split value log enabled (value pointers used).
envVlogDictEnable                  = "TREEDB_VLOG_DICT_ENABLE"                      // bool
envVlogDictTrainBytes              = "TREEDB_VLOG_DICT_TRAIN_BYTES"                 // int
envVlogDictBytes                   = "TREEDB_VLOG_DICT_BYTES"                       // int
envVlogDictMinRecords              = "TREEDB_VLOG_DICT_MIN_RECORDS"                 // int
envVlogDictMaxRecordBytes          = "TREEDB_VLOG_DICT_MAX_RECORD_BYTES"            // int
envVlogDictSampleStride            = "TREEDB_VLOG_DICT_SAMPLE_STRIDE"               // int
envVlogDictDedupWindow             = "TREEDB_VLOG_DICT_DEDUP_WINDOW"                // int
envVlogDictTrainLevel              = "TREEDB_VLOG_DICT_TRAIN_LEVEL"                 // int
envVlogDictMaxK                    = "TREEDB_VLOG_DICT_MAX_K"                       // int
envVlogDictClassMode               = "TREEDB_VLOG_DICT_CLASS_MODE"                  // single|split_outer_leaf
envVlogDictZstdLevel               = "TREEDB_VLOG_DICT_ZSTD_LEVEL"                  // fastest|default|better|best|int
envVlogDictEntropy                 = "TREEDB_VLOG_DICT_ENTROPY"                     // bool
envVlogDictAdaptiveRatio           = "TREEDB_VLOG_DICT_ADAPTIVE_RATIO"              // float64
envVlogDictMinPayloadSavings       = "TREEDB_VLOG_DICT_MIN_PAYLOAD_SAVINGS_RATIO"   // float64
envVlogMaxRetainedBytes            = "TREEDB_VLOG_MAX_RETAINED_BYTES"               // int64
envVlogMaxRetainedBytesHard        = "TREEDB_VLOG_MAX_RETAINED_BYTES_HARD"          // int64
envVlogRewriteBudgetBytesPerSec    = "TREEDB_VLOG_REWRITE_BUDGET_BYTES_PER_SEC"     // int64
envVlogRewriteBudgetRecordsPerSec  = "TREEDB_VLOG_REWRITE_BUDGET_RECORDS_PER_SEC"   // int
envVlogRewriteTriggerTotalBytes    = "TREEDB_VLOG_REWRITE_TRIGGER_TOTAL_BYTES"      // int64
envVlogRewriteTriggerStaleRatioPPM = "TREEDB_VLOG_REWRITE_TRIGGER_STALE_RATIO_PPM"  // uint32
envVlogRewriteTriggerChurnPerSec   = "TREEDB_VLOG_REWRITE_TRIGGER_CHURN_PER_SEC"    // int64
```
The new TREEDB_VLOG_* retained/rewrite env vars are grouped under the “Value-log dictionary compression knobs” header, but they’re unrelated to dict compression. This makes the comment above the block misleading and makes the env list harder to scan. Consider moving these constants to a separate section (e.g., “Value-log maintenance knobs”) or updating the header comment to reflect the expanded scope.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8e9a018a23
```go
strings.HasPrefix(k, "treedb.cache.vlog_generation.") ||
	strings.HasPrefix(k, "treedb.cache.vlog_retained_prune.") ||
```
Include vlog_zombie keys in expvar whitelist
The commit adds treedb.cache.vlog_zombie.* counters in DB.Stats() (used by the new maintenance analyzer), but selectTreeDBExpvarStats still whitelists only vlog_generation and vlog_retained_prune families here. In environments that consume diagnostics via this expvar filter (the default run-celestia path), zombie metrics are dropped and the analyzer’s zombie inventory fields read as zero, masking pinned-zombie buildup and skewing reclaim diagnostics.
Pull request overview
Copilot reviewed 17 out of 18 changed files in this pull request and generated 1 comment.
```go
m.mu.RLock()
for _, f := range m.files {
	if f == nil || !f.IsZombie.Load() {
		continue
	}
	segments++
	size := valueLogFileSizeBestEffort(f)
	bytes += size
	if f.RefCount.Load() > 0 {
		pinnedSegments++
		pinnedBytes += size
		continue
	}
	unpinnedSegments++
	unpinnedBytes += size
}
m.mu.RUnlock()
return segments, bytes, pinnedSegments, pinnedBytes, unpinnedSegments, unpinnedBytes
```
ZombieStats holds Manager.mu.RLock while calling valueLogFileSizeBestEffort, which can fall back to os.Stat on f.Path. Doing filesystem I/O under the manager lock can introduce latency/lock contention if fileSize/mmapData are unset (e.g., newly discovered or partially initialized files). Consider collecting the zombie file pointers/paths under the lock, then computing best-effort sizes outside the lock (or restricting to cached size only) to keep stats reporting non-blocking.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: db46ff7081
```go
protectedAll := mergeUniqueNonEmptyPaths(opts.ProtectedPaths, opts.ProtectedInUsePaths, opts.ProtectedRetainedPaths)
if len(protectedAll) > 0 {
	if recent := recentValueLogIDsForProtectedPaths(set, valueLogKeepRecentSegmentsPerLane, protectedAll); len(recent) > 0 {
		keptIDs = recent
```
Preserve active IDs when narrowing GC lanes
keptIDs starts as currentValueLogIDs(set) but is then replaced with recent when any protected-path set is provided. With the new ProtectedInUsePaths/ProtectedRetainedPaths API, callers can validly pass only one class of paths; in that case recentValueLogIDsForProtectedPaths keeps only those lanes, so current writable segments in other lanes can be treated as eligible and marked zombie/deleted if currently unreferenced. This can prune active segments outside the protected lane subset; merge recent into keptIDs instead of overwriting it.
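The suggested fix, sketched with a hypothetical helper name (`mergeKeptIDs` does not exist in the codebase): union the recent protected-lane IDs into the current-writable set instead of replacing it, so active segments outside the protected lanes stay kept.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeKeptIDs returns the deduplicated union of the current writable
// segment IDs and the recent protected-lane IDs, so narrowing the
// protected-path set can only ADD kept segments, never drop active ones.
func mergeKeptIDs(current, recent []uint32) []uint32 {
	seen := make(map[uint32]struct{}, len(current)+len(recent))
	out := make([]uint32, 0, len(current)+len(recent))
	for _, ids := range [][]uint32{current, recent} {
		for _, id := range ids {
			if _, ok := seen[id]; ok {
				continue
			}
			seen[id] = struct{}{}
			out = append(out, id)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	current := []uint32{10, 11} // active writable segments across lanes
	recent := []uint32{11, 42}  // recent segments in the protected lanes only
	fmt.Println(mergeKeptIDs(current, recent))
}
```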
Summary
- Export `treedb.cache.vlog_generation.*` metrics through expvar selection

Testing

- `go test ./TreeDB/caching -count=1`
- `go test ./TreeDB/caching -run 'TestSelectTreeDBExpvarStatsFiltersAndCoerces|TestObserveVlogGenerationRewritePlanOutcome_SelectedTracksBytes|TestObserveVlogGenerationRewritePlanOutcome_SelectedTracksSegmentFallbackBytes' -count=1`