batch: retain hottest arena chunk and grow chunk caps #819
snissn wants to merge 2 commits into pr/appendonly-entry-reuse-depth
Pull request overview
This PR optimizes batch.Batch’s internal copy arena reuse/growth to reduce allocations on large or write-heavy workloads, and adds tests to lock in the new reuse/growth invariants.
Changes:
- Update `resetArenaLocked` to retain the largest reusable arena chunk (bounded by `batchArenaMaxRetainCap`) instead of always keeping the first.
- Add bounded geometric growth logic in `ensureArenaChunk` to reduce repeated chunk allocations for large batches.
- Add unit tests covering reset retention/drop behavior and geometric chunk growth.
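The retention rule described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `TreeDB/batch/batch.go`: `pickRetained`, `resetArena`, and the 1 MiB value of `maxRetainCap` (standing in for `batchArenaMaxRetainCap`) are hypothetical, and the real arena may track chunks differently.

```go
package main

import "fmt"

// maxRetainCap stands in for batchArenaMaxRetainCap. The real value is not
// stated in this PR, so 1 MiB is an assumption made for the example.
const maxRetainCap = 1 << 20

// pickRetained returns the index of the largest chunk whose capacity does not
// exceed maxRetainCap, or -1 when every chunk is oversized.
func pickRetained(chunks [][]byte) int {
	best := -1
	for i, c := range chunks {
		if cap(c) > maxRetainCap {
			continue // too large to keep around in a pooled batch
		}
		if best == -1 || cap(c) > cap(chunks[best]) {
			best = i
		}
	}
	return best
}

// resetArena keeps at most one chunk, truncated to length 0, and drops the rest.
func resetArena(chunks [][]byte) [][]byte {
	i := pickRetained(chunks)
	if i < 0 {
		return nil // only oversized chunks: retain nothing
	}
	return [][]byte{chunks[i][:0]}
}

func main() {
	chunks := [][]byte{
		make([]byte, 0, 4<<10),
		make([]byte, 0, 64<<10),
		make([]byte, 0, 2<<20), // above maxRetainCap, so never retained
	}
	kept := resetArena(chunks)
	fmt.Println(len(kept), cap(kept[0])) // prints "1 65536"
}
```

Under these assumptions the 64 KiB chunk survives the reset, where a "keep the first chunk" policy would have kept only the 4 KiB one.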
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| TreeDB/batch/batch.go | Adjust arena reset retention policy and implement geometric growth for subsequent arena chunks. |
| TreeDB/batch/batch_arena_test.go | Add tests validating retained chunk selection, oversized-chunk dropping, and geometric growth behavior. |
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 688f9a2bbc
Pull request overview
This PR reduces allocations in TreeDB/batch by improving batch arena chunk reuse across Batch.Reset() and by introducing bounded geometric growth when allocating additional arena chunks during large batch writes.
Changes:
- Update `Batch.resetArenaLocked()` to retain the largest reusable arena chunk (bounded by `batchArenaMaxRetainCap`) instead of always retaining the first chunk.
- Update `Batch.ensureArenaChunk()` to grow subsequent chunk capacities geometrically (doubling, capped at `batchArenaMaxRetainCap`) to reduce repeated allocations in large batches.
- Add unit tests covering the new reset-retention invariant and geometric growth behavior.
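As a rough illustration of the doubling-with-a-cap growth described in the list above, the sketch below computes successive chunk capacities. It is not the PR's implementation: `nextChunkCap`, the 4 KiB `baseChunkCap`, and the 1 MiB `maxChunkCap` (standing in for `batchArenaMaxRetainCap`) are hypothetical, and the rule that a single oversized request still gets a chunk large enough to hold it is an assumption.

```go
package main

import "fmt"

const (
	baseChunkCap = 4 << 10 // hypothetical capacity of the first chunk
	maxChunkCap  = 1 << 20 // stands in for batchArenaMaxRetainCap (value assumed)
)

// nextChunkCap returns the capacity for the next arena chunk: double the
// previous capacity, capped at maxChunkCap, but never smaller than need.
func nextChunkCap(prevCap, need int) int {
	c := baseChunkCap
	if prevCap > 0 {
		c = prevCap * 2
	}
	if c > maxChunkCap {
		c = maxChunkCap
	}
	if c < need {
		c = need // assumed: one oversized value still gets a chunk that fits it
	}
	return c
}

func main() {
	prev := 0
	for i := 0; i < 10; i++ {
		prev = nextChunkCap(prev, 1)
		fmt.Println(prev) // 4096, 8192, ... then stays at 1048576
	}
}
```

With these assumed constants, filling 1 MiB of arena takes nine chunk allocations instead of the 256 that fixed 4 KiB chunks would need, which is the kind of saving bounded geometric growth is meant to provide.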
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| TreeDB/batch/batch.go | Adjusts arena reset retention policy and implements bounded geometric growth for chunk allocation. |
| TreeDB/batch/batch_arena_test.go | Adds tests ensuring the largest reusable chunk is retained (or oversized-only sets dropped) and that chunk caps grow geometrically. |
Summary
- retain the largest reusable arena chunk on `Reset` instead of always retaining the first chunk
- grow chunk capacities geometrically in `ensureArenaChunk` (up to `batchArenaMaxRetainCap`) to reduce repeated chunk allocations in large batches

Why
`batch.(*Batch).ensureArenaChunk` remained one of the top allocators in write-heavy profiles. The old reset behavior repeatedly reverted pooled batches to a small first chunk, and chunk growth remained too incremental.

Validation
- `go test ./TreeDB/batch ./TreeDB/caching`
- `time ./bin/unified-bench -dbs treedb -profile fast -keys 500000 -progress=false -treedb-index-outer-leaves-in-vlog=true -valsize=100 -treedb-force-value-pointers=false -checkpoint-between-tests -profile-dir=/home/mikers/tmp/perf-batch-geogrow-b-1773435448`

Comparison baseline (PR #818 head):
- `/home/mikers/tmp/perf-entry-lease32b-1773434944`

alloc_space (`batch_write_steady`): `138.13MB -> 122.66MB`
- `batch.(*Batch).ensureArenaChunk`: `48.93MB -> 31.27MB`

ops/sec highlights
- `batch_write_steady`: `2,107,601 -> 2,231,000` (+5.9%)
- `batch_write`: `12,362,290 -> 12,610,767` (+2.0%)
- `random_write`: `5,039,465 -> 5,218,219` (+3.5%)
- `batch_random`: `8,377,934 -> 8,278,773` (-1.2%)
- `dataset_write_sorted`: `3,656,106 -> 3,634,457` (-0.6%)