perf: abstract FormatCache as pluggable trait, optimize format runtime by He-Pin · Pull Request #679 · databricks/sjsonnet

He-Pin · 2026-04-05T00:48:13Z

Motivation

The `%` format operator is on the critical path for string-template-heavy workloads. The current implementation:

- Re-parses format strings on every invocation (fastparse overhead)
- Materializes all values to ujson.Value before format-specific dispatch (allocates intermediate objects)
- Uses for/zipWithIndex iterator allocation in the main loop
- Always allocates BigInt for integer formatting even when the value fits in a Long
- Calls widen() even when no width/padding is needed
- Uses a static cache field, preventing users from plugging in custom cache implementations

These costs compound in benchmarks like large_string_template (256 format specs in a 600KB template).

Key Design Decisions

Pluggable FormatCache trait: Abstracts the format string cache as a trait (analogous to ParseCache), injected through Interpreter → Evaluator constructors. Users can supply custom implementations (e.g., Caffeine-based) for better control over eviction, concurrency, and memory. Default is a process-wide LRU singleton (FormatCache.SharedDefault).

Direct Val dispatch: Match on Val.Str, Val.Num, Val.True, Val.False, Val.Null directly instead of materializing to ujson.Value first. Since Val is a sealed class, this covers all primitive types. Complex types (Arr, Obj) still go through Materializer.
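
A minimal sketch of the direct-dispatch idea, using a small stand-in hierarchy rather than sjsonnet's real `Val` classes. The names `ValDispatchSketch`, `render`, and `materialize` are hypothetical and exist only for illustration; the real code dispatches into format-specific logic rather than plain string rendering.

```scala
object ValDispatchSketch {
  // Stand-in for sjsonnet's Val hierarchy (assumption: simplified for illustration).
  sealed abstract class Val
  final case class Str(value: String) extends Val
  final case class Num(value: Double) extends Val
  case object True extends Val
  case object False extends Val
  case object Null extends Val
  final case class Arr(items: List[Val]) extends Val

  // Stand-in for the slower Materializer path, used only for complex values.
  def materialize(arr: Arr): String =
    arr.items.map(render).mkString("[", ", ", "]")

  // Primitives are matched directly, so no intermediate JSON tree is built for them.
  def render(v: Val): String = v match {
    case Str(s)   => s
    case Num(d)   => if (d == d.toLong) d.toLong.toString else d.toString
    case True     => "true"
    case False    => "false"
    case Null     => "null"
    case arr: Arr => materialize(arr)
  }
}
```

Because the stand-in `Val` is sealed, the compiler can warn if a case is missing from the match.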

Long fast path: formatInteger avoids BigInt allocation when the value fits in a Long (with explicit Long.MinValue guard to prevent negation overflow).
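
The following sketch illustrates the fast-path idea under simplifying assumptions: it takes a finite `Double`, ignores width, precision, and sign flags, and is not the PR's actual `formatInteger`. The object name and the exact range check are illustrative choices.

```scala
object FormatIntegerSketch {
  // Formats the integer part of a finite Double in the given radix.
  // Assumption: NaN and infinities are rejected before this is called.
  def formatInteger(value: Double, radix: Int = 10): String = {
    // Strict bounds: Long.MinValue is excluded because math.abs(Long.MinValue)
    // overflows, and 2^63 is excluded because it does not fit in a Long.
    val fitsInLong =
      value > Long.MinValue.toDouble && value < Long.MaxValue.toDouble
    if (fitsInLong) {
      val n = value.toLong // truncates toward zero
      val magnitude = java.lang.Long.toString(math.abs(n), radix)
      if (n < 0) "-" + magnitude else magnitude
    } else {
      // Slow path: arbitrary-precision conversion, exact for any finite Double.
      new java.math.BigDecimal(value).toBigInteger.toString(radix)
    }
  }
}
```

Values strictly inside the `Long` range never allocate a big-integer object; only the boundary and out-of-range values take the slow path.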

Modification

New file: `sjsonnet/src/sjsonnet/FormatCache.scala`

- FormatCache trait: Single getOrElseUpdate(key, compute) API
- DefaultFormatCache: LRU LinkedHashMap (256 entries, access-order), thread-safe via synchronized double-checked locking. Initial capacity sized to avoid premature rehash.
- FormatCache.SharedDefault: Process-wide singleton preserving cross-interpreter reuse
- FormatCache.EmptyCache: Always-recompute cache for testing
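
The shape of such a cache can be sketched as follows. This is an illustration under assumptions, not the PR's code: `Compiled` stands in for the opaque `CompiledFormat` entries, `LruFormatCache` is a hypothetical name, and the sketch holds the lock for lookups as well, because `get` on an access-ordered `LinkedHashMap` reorders entries.

```scala
import java.util.{LinkedHashMap => JLinkedHashMap, Map => JMap}

object FormatCacheSketch {
  // Stand-in for the opaque cache entry type (assumption).
  final case class Compiled(source: String)

  // Single-method cache API, mirroring the description above.
  trait FormatCache {
    def getOrElseUpdate(key: String, compute: => Compiled): Compiled
  }

  // Bounded LRU cache backed by an access-ordered LinkedHashMap.
  final class LruFormatCache(maxEntries: Int) extends FormatCache {
    // Capacity chosen so the table never rehashes before eviction starts.
    private val initialCapacity = (maxEntries / 0.75f).toInt + 1
    private val map =
      new JLinkedHashMap[String, Compiled](initialCapacity, 0.75f, true) {
        override def removeEldestEntry(eldest: JMap.Entry[String, Compiled]): Boolean =
          size() > maxEntries
      }

    def getOrElseUpdate(key: String, compute: => Compiled): Compiled =
      synchronized {
        val cached = map.get(key) // also marks the entry as most recently used
        if (cached != null) cached
        else {
          val fresh = compute
          map.put(key, fresh)
          fresh
        }
      }
  }

  // Never stores anything, so every lookup recomputes; useful in tests.
  object EmptyCache extends FormatCache {
    def getOrElseUpdate(key: String, compute: => Compiled): Compiled = compute
  }
}
```

A process-wide default would then be a single shared instance, for example `new LruFormatCache(256)` held in a companion object.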

Modified: `sjsonnet/src/sjsonnet/Format.scala`

- Removed static parsedFormatCache field → replaced by pluggable FormatCache
- CompiledFormat sealed trait: Opaque marker for cache entries (hides RuntimeFormat internals)
- RuntimeFormat: Pre-processes parsed format into arrays with metadata (hasAnyStar, staticChars), now private[sjsonnet] extending CompiledFormat
- parseFormatCached: Takes FormatCache parameter, uses pattern match (not asInstanceOf)
- Direct Val dispatch: Bypasses Materializer for Str/Num/Bool/Null
- widenRaw fast path: Returns txt directly when width.isEmpty
- While-loop: Replaces for/zipWithIndex to avoid iterator/tuple allocation
- StringBuilder pre-sizing: Estimates capacity from staticChars + specs.length * 8
- formatInteger Long fast path: Uses java.lang.Long.toString instead of BigInt.toString
- PartialApplyFmt: Pre-parses at construction time, bypasses external cache
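
Three of the items above (the widenRaw fast path, the while-loop, and StringBuilder pre-sizing) can be sketched together. The `Spec` class, the array layout, and the `render` signature are assumptions made for this example; the real `RuntimeFormat` carries more metadata and takes `Val` arguments rather than pre-rendered strings.

```scala
object FormatLoopSketch {
  // Simplified spec: only an optional minimum width (assumption).
  final case class Spec(width: Option[Int])

  // Fast path: with no width there is nothing to pad, so return the text as is.
  def widenRaw(spec: Spec, txt: String): String =
    spec.width match {
      case None => txt
      case Some(w) =>
        if (txt.length >= w) txt else " " * (w - txt.length) + txt
    }

  // literals(i) is the static text that follows specs(i).
  def render(
      leading: String,
      specs: Array[Spec],
      literals: Array[String],
      values: Array[String],
      staticChars: Int
  ): String = {
    // Pre-size the builder: static text plus a rough 8 characters per spec.
    val sb = new java.lang.StringBuilder(staticChars + specs.length * 8)
    sb.append(leading)
    var i = 0
    // Index-based loop: no iterator and no (value, index) tuples are allocated.
    while (i < specs.length) {
      sb.append(widenRaw(specs(i), values(i)))
      sb.append(literals(i))
      i += 1
    }
    sb.toString
  }
}
```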

Modified: `sjsonnet/src/sjsonnet/Val.scala`

EvalScope.formatCache: Concrete method with default (FormatCache.SharedDefault), avoids breaking external implementations

Modified: `sjsonnet/src/sjsonnet/Evaluator.scala`

Constructor parameter: formatCache: FormatCache = FormatCache.SharedDefault added to both Evaluator and NewEvaluator

Modified: `sjsonnet/src/sjsonnet/Interpreter.scala`

Constructor parameter: formatCache: FormatCache threaded through to createEvaluator

Modified: `sjsonnet/src-jvm-native/sjsonnet/SjsonnetMainBase.scala`

Passes FormatCache.SharedDefault explicitly to Interpreter constructor
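
The constructor threading described in this section can be pictured with stand-in classes. Everything below is hypothetical scaffolding: the real `Interpreter` and `Evaluator` take many more parameters, and `MapCache` is just one possible custom implementation (a `ConcurrentHashMap` in place of, say, a Caffeine cache).

```scala
import java.util.concurrent.ConcurrentHashMap

object CacheWiringSketch {
  // Simplified cache API; entries are plain strings here (assumption).
  trait FormatCache {
    def getOrElseUpdate(key: String, compute: => String): String
  }

  // A custom, unbounded cache a user might plug in.
  final class MapCache extends FormatCache {
    private val map = new ConcurrentHashMap[String, String]()
    def getOrElseUpdate(key: String, compute: => String): String =
      map.computeIfAbsent(key, _ => compute)
    def size: Int = map.size()
  }

  // Process-wide default, used when the caller passes nothing.
  val SharedDefault: FormatCache = new MapCache

  // Default arguments keep existing call sites compiling unchanged.
  class Evaluator(val formatCache: FormatCache = SharedDefault)

  class Interpreter(formatCache: FormatCache = SharedDefault) {
    // The interpreter forwards its cache to the evaluator it creates.
    def createEvaluator(): Evaluator = new Evaluator(formatCache)
  }
}
```

Calling `new Interpreter(new MapCache)` gives that interpreter a private cache, while `new Interpreter()` keeps sharing the process-wide one.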

Benchmark Results

JMH Regression Suite (1 fork, 5 warmup, 5 measurement iterations)

| Benchmark | Master (ms/op) | This PR (ms/op) | Change |
|---|---|---|---|
| large_string_template | 2.265 | 2.121 | -6.4% ✅ |
| realistic1 | 2.714 | 2.315 | -14.7% ✅ |
| realistic2 | 70.491 | 75.059 | +6.5% (noise) |
| All other benchmarks | - | - | Within ±3% noise |

Scala Native Hyperfine (`-N -w4 -m20`)

| Benchmark | Master (ms) | This PR (ms) | jrsonnet 0.5.0-pre98 (ms) | Gap |
|---|---|---|---|---|
| large_string_template | 17.4 | 17.0 | 8.5 | 2.01x (was 2.06x) |
| bench.04 | 505.3 | 520.2 | 571.4 | sjsonnet faster ✅ |
| comparison2 | 170.1 | 168.3 | 239.1 | sjsonnet faster ✅ |

No regressions on non-format benchmarks.

Analysis

The JVM improvement (-6.4% on large_string_template, -14.7% on realistic1) confirms the optimization is effective. The native improvement is more modest because:

- The format cache helps JMH (repeated invocations) more than native/hyperfine (one invocation per process)
- JVM JIT can better exploit the direct Val dispatch due to runtime specialization
- The large_string_template benchmark has a 600KB format string with only 256 specs, so its run time is dominated by string I/O rather than format logic

The Long.MinValue edge case is guarded to prevent negation overflow; that value falls through to the BigInt path.

The FormatCache abstraction adds no performance overhead — the SharedDefault singleton is identical to the previous static cache, and PartialApplyFmt bypasses the cache entirely.

Supersedes PR #672 (format-parse-cache) which only included the caching part.

References

Upstream jit branch commit e98cd1f8 — Format chunk runtime optimization
Upstream jit branch commit 6524d77d — Direct Val dispatch, Long fast path

Result

Positive performance impact on format-heavy workloads. FormatCache now pluggable like ParseCache. All 140 tests pass. 6 files changed, 317 insertions, 104 deletions.

sjsonnet/src/sjsonnet/Format.scala

Extract format string cache from static field in Format.scala into a pluggable FormatCache trait (analogous to ParseCache). This allows users to supply custom cache implementations (e.g., Caffeine-based) via the Interpreter/Evaluator constructors.

Key changes:
- New FormatCache trait with getOrElseUpdate API
- DefaultFormatCache: LRU LinkedHashMap (256 entries), thread-safe
- FormatCache.SharedDefault singleton preserves process-wide sharing
- FormatCache.EmptyCache for testing
- CompiledFormat sealed trait for type-safe opaque cache entries
- RuntimeFormat: direct Val dispatch, Long fast path, pre-cached specs
- PartialApplyFmt pre-parses at construction time (no cache needed)
- FormatCache threaded through Interpreter → Evaluator constructors

Upstream: he-pin/sjsonnet jit branch (format optimization commits)

He-Pin mentioned this pull request Apr 5, 2026

performance optimization #666

Open

He-Pin force-pushed the perf/format-runtime-optimization branch from 9ffd530 to c0bc815 Compare April 5, 2026 11:06

He-Pin changed the title ~~perf: optimize Format runtime with cache, direct Val dispatch, and Long fast path~~ perf: abstract FormatCache as pluggable trait, optimize format runtime Apr 5, 2026

He-Pin marked this pull request as ready for review April 5, 2026 11:08

He-Pin wants to merge 1 commit into databricks:master from He-Pin:perf/format-runtime-optimization