perf: foldl string concat O(n) StringBuilder optimization #665

Open
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/foldl-stringbuilder

He-Pin commented Apr 4, 2026

Motivation

std.foldl with string concatenation has O(n²) complexity because each + creates a new string and copies the whole accumulated result into it. This PR detects the foldl + string-concat pattern and uses a StringBuilder instead, bringing the cost down to O(n).
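
To make the complexity claim concrete, here is a rough cost count under simplifying assumptions that are not stated in the PR: the array has n elements, each a string of length m, and the initial accumulator is empty. Step k of the naive fold builds a new string of length k·m, so the total number of characters copied is

$$
\sum_{k=1}^{n} k\,m \;=\; m\,\frac{n(n+1)}{2} \;=\; O(n^2) \quad \text{for fixed } m,
$$

whereas appending to a single growable buffer copies each element once, for about n·m characters in total (amortized over buffer growth). For n = 1000 and m = 10 that is 5,005,000 characters copied versus roughly 10,000.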

Key Design Decision

The pattern is detected at runtime in the foldl hot loop rather than by static analysis. When the foldl accumulator is a string and the function body performs string concatenation, we switch to a StringBuilder-based fast path that avoids quadratic string copying.

Modification

Added tryStringBuilderFoldl in ArrayModule.scala that detects the string concat pattern and uses StringBuilder. Falls back to standard foldl for non-string cases.
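
A minimal sketch of the fast-path-with-fallback shape described above. It is not the code in this PR: the `Value`, `Str`, and `Num` types, the `isPlainConcat` flag, and the rule that every element must be a string are all assumptions made for illustration; only the name `tryStringBuilderFoldl` comes from the description.

```scala
object FoldlSketch {
  // Hypothetical, simplified value model; sjsonnet's real Val hierarchy differs.
  sealed trait Value
  final case class Str(value: String) extends Value
  final case class Num(value: Double) extends Value

  // Standard left fold: calls f once per element.
  def standardFoldl(f: (Value, Value) => Value, arr: Seq[Value], init: Value): Value =
    arr.foldLeft(init)(f)

  // Fast path, only meaningful when the caller already knows f is plain `acc + elem`.
  // Returns None when the inputs are not all strings so the caller can fall back.
  def tryStringBuilderFoldl(arr: Seq[Value], init: Value): Option[Value] =
    init match {
      case Str(start) if arr.forall(_.isInstanceOf[Str]) =>
        val sb = new java.lang.StringBuilder(start)
        arr.foreach {
          case Str(s) => sb.append(s) // one append per element, no intermediate strings
          case _      => ()           // unreachable because of the guard above
        }
        Some(Str(sb.toString))
      case _ => None
    }

  // Entry point: try the fast path first, otherwise run the standard fold.
  def foldl(
      f: (Value, Value) => Value,
      isPlainConcat: Boolean, // assumed to be computed elsewhere
      arr: Seq[Value],
      init: Value
  ): Value = {
    val fast = if (isPlainConcat) tryStringBuilderFoldl(arr, init) else None
    fast.getOrElse(standardFoldl(f, arr, init))
  }
}
```

Returning `Option` keeps the fallback decision in one place: any input the fast path does not recognize goes through the ordinary fold unchanged.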

Benchmark Results

JMH Regression Suite (1 fork, 3 warmup, 1 measurement)

| Benchmark | Master | This PR | Change |
|---|---|---|---|
| foldl | 9.365 | 0.271 | -97.1% |
| bench.04 | 32.887 | 0.493 | -98.5% |

All other benchmarks are within the noise margin.

Scala Native Hyperfine (-N -w4 -m20)

| Benchmark | sjsonnet master | This PR | jrsonnet | vs master | vs jrsonnet |
|---|---|---|---|---|---|
| bench.04 (foldl string concat) | 507.2 ± 34.6 ms | 6.9 ± 1.1 ms | 548.1 ± 10.9 ms | 73.9x faster | 79.9x faster |
| foldl | 143.2 ± 7.9 ms | 6.2 ± 0.7 ms | 147.0 ± 3.9 ms | 23.1x faster | 23.7x faster |

Analysis

This is the single largest performance improvement in the entire optimization suite. The O(n²) → O(n) complexity change produces dramatic speedups that scale with input size. On native, this makes sjsonnet nearly 80x faster than jrsonnet (Rust) for this workload.

References

Upstream jit branch exploration at he-pin/sjsonnet@jit

Result

Massive reduction in foldl string concatenation time. On native, bench.04 drops from 507.2 ms to 6.9 ms (73.9x speedup), which makes sjsonnet 79.9x faster than jrsonnet on this benchmark.

He-Pin force-pushed the perf/foldl-stringbuilder branch from 6644fcb to b32a959 on April 4, 2026 12:05
He-Pin marked this pull request as ready for review on April 4, 2026 12:59
Detect the pattern std.foldl(function(acc, elem) acc + elem, arr, stringInit)
at runtime by inspecting the function's body AST. When the body is a
BinaryOp(OP_+, ValidId(param0), ValidId(param1)) and the initial value is
a string, use a StringBuilder for O(n) instead of O(n²) concatenation.

Changes:
- Val.Func: add bodyExpr hook (default null) for AST inspection
- Evaluator.visitMethod: override bodyExpr to expose function body
- ArrayModule.Foldl: add tryStringBuilderFoldl fast path that detects
  the concatenation pattern and builds the result in a single pass
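
The detection step can be pictured with a toy AST. This is a sketch only: the `Expr` classes below borrow the names `BinaryOp`, `ValidId`, and `bodyExpr` from this commit message but are hypothetical stand-ins, not sjsonnet's real definitions. In particular, the operator is modelled as a plain string and `bodyExpr` as an `Option` rather than a nullable hook.

```scala
object ConcatPatternSketch {
  // Toy expression tree (hypothetical; sjsonnet's Expr types are richer).
  sealed trait Expr
  final case class ValidId(name: String) extends Expr
  final case class BinaryOp(op: String, lhs: Expr, rhs: Expr) extends Expr
  final case class Literal(value: String) extends Expr

  // A function value that may expose its body for inspection.
  // None plays the role of the "default null" hook for functions without an AST.
  final case class Func(params: List[String], bodyExpr: Option[Expr])

  // True only for bodies of the exact shape `param0 + param1`.
  def isPlainConcat(f: Func): Boolean = f match {
    case Func(List(acc, elem), Some(BinaryOp("+", ValidId(l), ValidId(r)))) =>
      l == acc && r == elem
    case _ => false
  }
}
```

Matching the exact shape keeps the check conservative: a body such as `elem + acc` or `acc + elem + "x"` is not recognized in this sketch and would go through the regular fold.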

This addresses the 88x gap with jrsonnet on the foldl string concat
benchmark by converting from quadratic string copying to linear
StringBuilder appending.

Upstream: jit branch commit 2d3e56d
He-Pin force-pushed the perf/foldl-stringbuilder branch from b32a959 to 626d1ea on April 4, 2026 17:24
He-Pin changed the title from "perf: foldl StringBuilder fast path + stdlib allocation optimizations" to "perf: foldl string concat O(n) StringBuilder optimization" on Apr 4, 2026