perf: pre-cached indent arrays for bulk newline+spaces #676
Open
He-Pin wants to merge 2 commits into databricks:master from
For integer values (the common case in Jsonnet), write digits directly to CharBuilder using a scratch digit buffer instead of allocating a String via RenderUtils.renderDouble/Long.toString and copying chars.
Upstream: jit branch commit d63ce90
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
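The integer fast path in the commit message above can be illustrated with a minimal sketch. This is not the PR's code: `DigitWriter`, `appendLong`, and `appendNumber` are hypothetical names, and `java.lang.StringBuilder` stands in for sjsonnet's CharBuilder, whose API is not shown here. The sketch fills a reusable scratch buffer from the right and then copies the digits to the output in one call.

```scala
// Hypothetical sketch: java.lang.StringBuilder stands in for CharBuilder.
final class DigitWriter(out: java.lang.StringBuilder) {
  // 19 digits plus a sign is enough for Long.MinValue.
  private val scratch = new Array[Char](20)

  /** Appends the decimal form of `value` without allocating a String. */
  def appendLong(value: Long): Unit = {
    var pos = scratch.length
    // Work on the non-positive side so Long.MinValue never overflows on negation.
    var n = if (value > 0) -value else value
    var more = true
    while (more) {
      pos -= 1
      scratch(pos) = ('0' - (n % 10)).toChar // n % 10 is in -9..0
      n /= 10
      more = n != 0
    }
    if (value < 0) { pos -= 1; scratch(pos) = '-' }
    out.append(scratch, pos, scratch.length - pos) // one bulk copy
    ()
  }

  /** Takes the integer path only for whole doubles that are exactly representable. */
  def appendNumber(d: Double): Unit = {
    val l = d.toLong
    if (math.abs(d) < 9007199254740992.0 && l.toDouble == d) appendLong(l)
    else { out.append(d); () } // fallback: standard double formatting
  }
}
```

The scratch buffer is shared across calls, so a writer like this is not safe to use from multiple threads at once.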
Pre-compute indent arrays (newline + spaces) for depths 0-15 at Renderer construction time. On flushBuffer, use bulk appendAll for the cached array instead of character-by-character space appending.
Upstream: jit branch commit 4f19bde
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Motivation
The JSON renderer writes newline+indent sequences character by character. For deeply nested output with many array/object elements, this is a significant bottleneck. Pre-caching indent arrays allows bulk writes.
Key Design Decision
Pre-compute indent arrays up to a maximum depth and use System.arraycopy-style bulk writes instead of character-by-character output. This trades a small amount of memory for significant rendering throughput.
Modification
Added pre-cached indent arrays in BaseCharRenderer.scala. The renderer now uses bulk array copies for indent sequences, reducing per-character overhead.
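A minimal sketch of the idea, not the actual change in BaseCharRenderer.scala: `IndentCache` and `appendIndent` are hypothetical names, `java.lang.StringBuilder` stands in for the renderer's CharBuilder, and the default maximum cached depth of 15 is taken from the commit message. Depths beyond the cache fall back to the per-character loop.

```scala
// Hypothetical sketch: java.lang.StringBuilder stands in for CharBuilder.
final class IndentCache(indentWidth: Int, maxCachedDepth: Int = 15) {
  // cache(d) holds '\n' followed by d * indentWidth spaces, built once up front.
  private val cache: Array[Array[Char]] =
    Array.tabulate(maxCachedDepth + 1) { depth =>
      val arr = new Array[Char](1 + depth * indentWidth)
      java.util.Arrays.fill(arr, ' ')
      arr(0) = '\n'
      arr
    }

  /** Appends a newline plus the indentation for `depth`. */
  def appendIndent(out: java.lang.StringBuilder, depth: Int): Unit = {
    if (depth >= 0 && depth < cache.length) {
      val arr = cache(depth)
      out.append(arr, 0, arr.length) // one bulk copy for cached depths
    } else {
      // Uncached depth: fall back to character-by-character appends.
      out.append('\n')
      var i = depth * indentWidth
      while (i > 0) { out.append(' '); i -= 1 }
    }
    ()
  }
}
```

With an indent width of 2 and the default limit, the cache holds 16 arrays totalling 256 chars, which is the small memory cost the design accepts in exchange for bulk copies.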
Benchmark Results
JMH Regression Suite (1 fork, 3 warmup, 1 measurement)
All other benchmarks within noise margin.
Scala Native Hyperfine (-N -w4 -m20)
Analysis
This is particularly effective for rendering-heavy benchmarks like reverse (which generates deeply nested JSON output) and base64DecodeBytes (large output). On native, this narrows the gap with jrsonnet from 1.49x to just 1.08x on reverse, and from 1.61x to 1.14x on base64DecodeBytes.
References
Upstream jit branch exploration at he-pin/sjsonnet@jit
Result
16-29% improvement on rendering-heavy benchmarks. Dramatically narrows the gap with jrsonnet on reverse and base64DecodeBytes.