perf: escape-free string rendering fast path with bulk copy #678

Open
He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/string-escape-fast-path

He-Pin commented Apr 4, 2026

Motivation

Most Jsonnet strings are pure ASCII and contain no characters that need escaping. The current renderer dispatches every character through RenderUtils.escapeChar, which costs a per-character method call plus a match statement that checks for control chars, quotes, and backslashes. For strings that need no escaping (the common case), this per-character dispatch is unnecessary overhead.

Key Design Decision

Add a pre-scan fast path in visitNonNullString that quickly checks whether any character needs escaping. If none does, the string is written with a single String.getChars bulk copy (effectively a memcpy) between the surrounding quotes, instead of being processed character by character. If any character needs escaping, rendering falls back to the original RenderUtils.escapeChar path.

The optimization targets the JVM's polymorphic dispatch overhead. On Scala Native (AOT-compiled) the impact is minimal, since there is no JIT deoptimization from polymorphic call sites.

Modification

BaseCharRenderer.scala, visitNonNullString:

  • Added escape detection scan: c < 32 || c == '"' || c == '\\'
  • Fast path: ensureLength(len + 2) → appendUnsafe('"') → String.getChars bulk copy → appendUnsafe('"')
  • Falls back to RenderUtils.escapeChar for strings with special characters
  • Only applies to String instances when !escapeUnicode (the default)
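
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `SketchBuffer`, `appendAllUnsafe`, `needsEscaping`, and `tryRenderFast` are hypothetical names, and the buffer is a plain `Array[Char]` stand-in rather than upickle's CharBuilder. Only the scan predicate and the `ensureLength` / `appendUnsafe` / `String.getChars` sequence come from the description.

```scala
// Hypothetical stand-in for the renderer's char buffer; not upickle's CharBuilder.
final class SketchBuffer(initial: Int = 16) {
  private var arr = new Array[Char](initial)
  private var len = 0

  // Grow the backing array so that `extra` more chars fit.
  def ensureLength(extra: Int): Unit = {
    val needed = len + extra
    if (needed > arr.length)
      arr = java.util.Arrays.copyOf(arr, math.max(needed, arr.length * 2))
  }

  // Append without a capacity check; the caller must have called ensureLength.
  def appendUnsafe(c: Char): Unit = { arr(len) = c; len += 1 }

  // Copy the whole string into the buffer with one String.getChars call.
  def appendAllUnsafe(s: String): Unit = {
    s.getChars(0, s.length, arr, len)
    len += s.length
  }

  def result: String = new String(arr, 0, len)
}

object FastPathSketch {
  // True when some char would need escaping (escapeUnicode assumed off).
  def needsEscaping(s: String): Boolean = {
    var i = 0
    while (i < s.length) {
      val c = s.charAt(i)
      if (c < 32 || c == '"' || c == '\\') return true
      i += 1
    }
    false
  }

  // Returns true when the fast path wrote the quoted string; false means
  // nothing was written and the caller should use the per-character path.
  def tryRenderFast(s: String, out: SketchBuffer): Boolean =
    if (needsEscaping(s)) false
    else {
      out.ensureLength(s.length + 2) // string plus two quotes
      out.appendUnsafe('"')
      out.appendAllUnsafe(s)
      out.appendUnsafe('"')
      true
    }
}
```

Reserving `len + 2` up front is what allows the unchecked appends: one capacity check covers both quotes and the bulk copy.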

Benchmark Results

JMH (JVM, Scala 3.3.7)

| Benchmark | Master (ms/op) | Optimized (ms/op) | Change |
|---|---|---|---|
| base64 | 0.760 | 0.492 | -35.3% |
| base64Decode | 0.576 | 0.380 | -34.0% |
| lstripChars | 0.577 | 0.390 | -32.4% |
| rstripChars | 0.592 | 0.387 | -34.6% |
| stripChars | 0.573 | 0.370 | -35.3% |
| substr | 0.153 | 0.110 | -28.1% |
| realistic1 | 2.707 | 2.382 | -12.0% |
| realistic2 | 67.037 | 58.301 | -13.0% |
| bench.02 | 48.735 | 43.667 | -10.4% |
| base64_byte_array | 1.397 | 1.128 | -19.3% |
| comparison | 23.928 | 22.239 | -7.1% |
| large_string_join | 2.062 | 2.010 | -2.5% |
| large_string_template | 2.251 | 2.218 | -1.5% |

All 35 benchmarks checked, zero regressions.
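
For readers who want to recompute the Change column, the usual relative-change formula appears to reproduce it. The helper below is a hypothetical sketch, not part of the benchmark tooling, and it assumes the column is the optimized time relative to master.

```scala
object ChangeColumn {
  // Relative change of the optimized time against master, in percent.
  // Negative values mean the optimized build is faster.
  def pctChange(masterMs: Double, optimizedMs: Double): Double =
    (optimizedMs - masterMs) / masterMs * 100.0
}
```

For base64, `ChangeColumn.pctChange(0.760, 0.492)` comes out at about -35.26, in line with the -35.3% reported. Because the table shows rounded timings, recomputing from them can differ from the reported figure in the last digit.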

Native (Scala Native, hyperfine --warmup 5 --runs 15 -N)

| Benchmark | Master (ms) | Optimized (ms) | vs jrsonnet |
|---|---|---|---|
| realistic2 | 297.6 ± 1.7 | 285.3 ± 2.3 (-4.1%) | 2.98x slower |
| bench.02 | 73.2 ± 1.7 | 72.0 ± 1.6 (-1.6%) | 1.59x faster |
| large_string_template | 15.7 ± 0.3 | 15.3 ± 0.4 (-2.5%) | 3.86x slower |

Native impact is minimal because Scala Native compiles ahead of time, so the JVM JIT's polymorphic dispatch overhead for escapeChar does not apply.

Analysis

The escape detection scan (c < 32 || c == '"' || c == '\\') matches exactly the characters that RenderUtils.escapeChar handles when escapeUnicode is off (verified via bytecode inspection of upickle-core 4.4.2). For strings without special characters, a single String.getChars bulk copy replaces N individual character reads plus method dispatch, giving roughly 28-35% improvement on the string-heavy JVM benchmarks.

The optimization is particularly impactful for:

  • Benchmarks producing many small strings (base64, stripChars) — saves per-string dispatch overhead
  • Benchmarks with large string outputs (realistic1/2, bench.02) — saves per-char processing
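
The claim that the scan covers exactly the escaped characters can also be checked by brute force over every `Char` value. The sketch below compares the predicate against `referenceEscape`, a hypothetical escaper written here from the standard JSON escaping rules with unicode escaping disabled. It is a stand-in for illustration, not upickle's RenderUtils.escapeChar, so it only shows the shape of such a check.

```scala
object ScanEquivalenceCheck {
  // The pre-scan predicate quoted in the PR description.
  def scanFlags(c: Char): Boolean = c < 32 || c == '"' || c == '\\'

  // Hypothetical reference escaper (standard JSON rules, no unicode escaping).
  // Assumption: a stand-in for illustration, not upickle's implementation.
  def referenceEscape(c: Char): String = c match {
    case '"'  => "\\\""
    case '\\' => "\\\\"
    case '\b' => "\\b"
    case '\f' => "\\f"
    case '\n' => "\\n"
    case '\r' => "\\r"
    case '\t' => "\\t"
    case _ if c < ' ' => "\\u" + "%04x".format(c.toInt)
    case _ => c.toString
  }

  // Chars where the predicate and the reference escaper disagree about
  // whether the char is emitted unchanged.
  def mismatches: Seq[Char] =
    (Char.MinValue to Char.MaxValue).filter { c =>
      scanFlags(c) != (referenceEscape(c) != c.toString)
    }
}
```

An empty `mismatches` means the fast path never skips a character that this reference escaper would rewrite, and never falls back without need.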

Result

  • ✅ All 140 JVM tests pass
  • ✅ 28-35% JMH improvement on string-heavy benchmarks
  • ✅ 10-13% JMH improvement on realistic workloads
  • ✅ Zero regressions across all 35 benchmarks

Add a fast path in visitNonNullString that scans the string for chars
needing escaping (control chars, quotes, backslashes). When no escaping
is needed (the common case for Jsonnet output), bulk-copy the entire
string into the CharBuilder using String.getChars instead of going
through upickle's per-character RenderUtils.escapeChar pipeline.

Upstream: jit branch commit 1d72a47
He-Pin force-pushed the perf/string-escape-fast-path branch from ba698a9 to 8ade3a2 on April 4, 2026 17:26
He-Pin changed the title from "perf: fast path for escape-free string rendering" to "perf: escape-free string rendering fast path with bulk copy" on Apr 4, 2026
He-Pin marked this pull request as ready for review on April 4, 2026 18:49