
batch: retain hottest arena chunk and grow chunk caps #819

Open
snissn wants to merge 2 commits into pr/appendonly-entry-reuse-depth from pr/batch-arena-reserve-tuning


@snissn (Owner) commented Mar 13, 2026

Summary

  • retain the largest reusable batch arena chunk on Reset instead of always retaining the first chunk
  • add bounded geometric growth in ensureArenaChunk (up to batchArenaMaxRetainCap) to reduce repeated chunk allocations in large batches
  • add unit tests for both invariants:
    • reset keeps largest reusable chunk and drops oversized-only sets
    • chunk capacities grow geometrically as chunks are exhausted
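
The reset invariant described above can be sketched roughly as follows. This is a minimal illustration, not the code in `TreeDB/batch/batch.go`: `pickRetained`, `resetChunks`, and the 1 MiB value used for `arenaMaxRetainCap` are hypothetical stand-ins, and the real `batchArenaMaxRetainCap` value is not stated in this PR.

```go
package main

import "fmt"

// arenaMaxRetainCap is a hypothetical stand-in for batchArenaMaxRetainCap;
// the real value is not given in the PR description.
const arenaMaxRetainCap = 1 << 20

// pickRetained returns the index of the largest chunk whose capacity does not
// exceed maxRetain, or -1 if there are no chunks or all of them are oversized.
func pickRetained(chunks [][]byte, maxRetain int) int {
	best := -1
	for i, c := range chunks {
		if cap(c) > maxRetain {
			continue // oversized chunks are never retained
		}
		if best == -1 || cap(c) > cap(chunks[best]) {
			best = i
		}
	}
	return best
}

// resetChunks keeps at most one chunk (truncated to length 0) and drops the
// rest so the garbage collector can reclaim them.
func resetChunks(chunks [][]byte, maxRetain int) [][]byte {
	i := pickRetained(chunks, maxRetain)
	if i < 0 {
		return nil
	}
	kept := chunks[i][:0]
	for j := range chunks {
		chunks[j] = nil // release references to dropped chunks
	}
	return append(chunks[:0], kept)
}

func main() {
	chunks := [][]byte{
		make([]byte, 0, 4<<10),  // small first chunk
		make([]byte, 0, 64<<10), // largest reusable chunk
		make([]byte, 0, 2<<20),  // oversized, must not be retained
	}
	out := resetChunks(chunks, arenaMaxRetainCap)
	fmt.Println(len(out), cap(out[0]))

	// An oversized-only set is dropped entirely.
	fmt.Println(resetChunks([][]byte{make([]byte, 0, 2<<20)}, arenaMaxRetainCap) == nil)
}
```

Under these assumptions, a pooled batch that previously grew to 64 KiB starts its next use with that chunk instead of the small first one.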

Why

batch.(*Batch).ensureArenaChunk remained one of the top allocators in write-heavy profiles. The old reset behavior repeatedly reverted pooled batches to a small first chunk, and chunk growth remained too incremental.
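
To give a feel for why incremental growth shows up in allocation profiles, the sketch below counts how many chunks are allocated to hold a given number of bytes. It uses fixed-size chunks as a simplified stand-in for the old policy (the PR does not spell out the previous growth rule), and the 4 KiB first chunk and 1 MiB cap are assumed values chosen only for illustration.

```go
package main

import "fmt"

// countChunks returns how many chunks must be allocated to hold total bytes
// when the first chunk holds first bytes. If grow is true, each subsequent
// chunk doubles in size, capped at maxCap; otherwise every chunk is the
// same size.
func countChunks(total, first, maxCap int, grow bool) int {
	n := 0
	size := first
	for remaining := total; remaining > 0; {
		n++
		remaining -= size
		if grow && size < maxCap {
			size *= 2
			if size > maxCap {
				size = maxCap
			}
		}
	}
	return n
}

func main() {
	const (
		total  = 1 << 20 // 1 MiB of batch payload (assumed)
		first  = 4 << 10 // 4 KiB first chunk (assumed)
		maxCap = 1 << 20 // assumed cap, standing in for batchArenaMaxRetainCap
	)
	fmt.Println("fixed-size chunks:", countChunks(total, first, maxCap, false))
	fmt.Println("doubling chunks:  ", countChunks(total, first, maxCap, true))
}
```

With these assumed sizes the fixed policy needs 256 allocations and the doubling policy needs 9.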

Validation

  • go test ./TreeDB/batch ./TreeDB/caching
  • time ./bin/unified-bench -dbs treedb -profile fast -keys 500000 -progress=false -treedb-index-outer-leaves-in-vlog=true -valsize=100 -treedb-force-value-pointers=false -checkpoint-between-tests -profile-dir=/home/mikers/tmp/perf-batch-geogrow-b-1773435448

Comparison baseline (PR #818 head): /home/mikers/tmp/perf-entry-lease32b-1773434944

alloc_space (batch_write_steady)

  • total: 138.13MB -> 122.66MB
  • batch.(*Batch).ensureArenaChunk: 48.93MB -> 31.27MB

ops/sec highlights

  • batch_write_steady: 2,107,601 -> 2,231,000 (+5.9%)
  • batch_write: 12,362,290 -> 12,610,767 (+2.0%)
  • random_write: 5,039,465 -> 5,218,219 (+3.5%)
  • batch_random: 8,377,934 -> 8,278,773 (-1.2%)
  • dataset_write_sorted: 3,656,106 -> 3,634,457 (-0.6%)
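
For readers who want to re-derive the deltas, the percentages above are plain relative changes, `(after - before) / before * 100`. The helper below is illustrative only and is not part of the benchmark tooling; applied to the alloc_space figures it gives roughly -11.2% for the total and -36.1% for `ensureArenaChunk`.

```go
package main

import "fmt"

// pctChange returns the relative change from before to after, in percent.
func pctChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// Figures copied from the alloc_space and ops/sec lists above.
	rows := []struct {
		name          string
		before, after float64
	}{
		{"alloc_space total (MB)", 138.13, 122.66},
		{"ensureArenaChunk (MB)", 48.93, 31.27},
		{"batch_write_steady (ops/sec)", 2107601, 2231000},
		{"batch_random (ops/sec)", 8377934, 8278773},
	}
	for _, r := range rows {
		fmt.Printf("%-30s %+.1f%%\n", r.name, pctChange(r.before, r.after))
	}
}
```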

Copilot AI review requested due to automatic review settings March 13, 2026 20:58

Copilot AI (Contributor) left a comment

Pull request overview

This PR optimizes batch.Batch’s internal copy arena reuse/growth to reduce allocations on large or write-heavy workloads, and adds tests to lock in the new reuse/growth invariants.

Changes:

  • Update resetArenaLocked to retain the largest reusable arena chunk (bounded by batchArenaMaxRetainCap) instead of always keeping the first.
  • Add bounded geometric growth logic in ensureArenaChunk to reduce repeated chunk allocations for large batches.
  • Add unit tests covering reset retention/drop behavior and geometric chunk growth.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| TreeDB/batch/batch.go | Adjust arena reset retention policy and implement geometric growth for subsequent arena chunks. |
| TreeDB/batch/batch_arena_test.go | Add tests validating retained chunk selection, oversized-chunk dropping, and geometric growth behavior. |


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 688f9a2bbc

Copilot AI (Contributor) left a comment

Pull request overview

This PR reduces allocations in TreeDB/batch by improving batch arena chunk reuse across Batch.Reset() and by introducing bounded geometric growth when allocating additional arena chunks during large batch writes.

Changes:

  • Update Batch.resetArenaLocked() to retain the largest reusable arena chunk (bounded by batchArenaMaxRetainCap) instead of always retaining the first chunk.
  • Update Batch.ensureArenaChunk() to grow subsequent chunk capacities geometrically (doubling, capped at batchArenaMaxRetainCap) to reduce repeated allocations in large batches.
  • Add unit tests covering the new reset-retention invariant and geometric growth behavior.
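
The capacity rule described for `Batch.ensureArenaChunk()` (double the previous chunk, cap at `batchArenaMaxRetainCap`) might look something like the sketch below. `nextCap`, `initialChunkCap`, and the constant values are hypothetical, and the handling of a single request larger than the cap is an assumption rather than something the review states.

```go
package main

import "fmt"

const (
	initialChunkCap = 4 << 10 // hypothetical first-chunk size
	maxRetainCap    = 1 << 20 // hypothetical stand-in for batchArenaMaxRetainCap
)

// nextCap picks the capacity for a new arena chunk. prev is the capacity of
// the most recent chunk (0 if there is none) and need is the number of bytes
// the caller must be able to fit.
func nextCap(prev, need int) int {
	c := initialChunkCap
	if prev > 0 {
		c = prev * 2 // geometric growth
	}
	if c > maxRetainCap {
		c = maxRetainCap // bounded by the retain cap
	}
	if c < need {
		c = need // assumption: an oversized request still gets one chunk that fits it
	}
	return c
}

func main() {
	// Print the sequence of capacities as chunks are exhausted one by one.
	prev := 0
	for i := 0; i < 10; i++ {
		prev = nextCap(prev, 1)
		fmt.Println(prev)
	}
}
```

Capping growth at the retain limit keeps every geometrically grown chunk eligible for reuse after `Reset`, which appears to be the motivation for tying the two changes to the same constant.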

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| TreeDB/batch/batch.go | Adjusts arena reset retention policy and implements bounded geometric growth for chunk allocation. |
| TreeDB/batch/batch_arena_test.go | Adds tests ensuring the largest reusable chunk is retained (or oversized-only sets dropped) and that chunk caps grow geometrically. |

3 participants