perf: abstract FormatCache as pluggable trait, optimize format runtime #679
Open
He-Pin wants to merge 1 commit into databricks:master
He-Pin
commented
Apr 5, 2026
Extract format string cache from static field in Format.scala into a pluggable FormatCache trait (analogous to ParseCache). This allows users to supply custom cache implementations (e.g., Caffeine-based) via the Interpreter/Evaluator constructors.

Key changes:
- New FormatCache trait with getOrElseUpdate API
- DefaultFormatCache: LRU LinkedHashMap (256 entries), thread-safe
- FormatCache.SharedDefault singleton preserves process-wide sharing
- FormatCache.EmptyCache for testing
- CompiledFormat sealed trait for type-safe opaque cache entries
- RuntimeFormat: direct Val dispatch, Long fast path, pre-cached specs
- PartialApplyFmt pre-parses at construction time (no cache needed)
- FormatCache threaded through Interpreter → Evaluator constructors

Upstream: he-pin/sjsonnet jit branch (format optimization commits)
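To make the cache shape concrete, here is a minimal sketch of a single-method cache trait backed by an access-ordered `LinkedHashMap`, along the lines the commit message describes. It is not the PR's code: `LruFormatCache`, `NoOpFormatCache`, `DemoFormat`, and `currentSize` are hypothetical names, and the locking is simplified to one lock around every call rather than the double-checked scheme the PR mentions.

```scala
import java.util.{LinkedHashMap, Map => JMap}

// Hypothetical stand-in for the PR's opaque cache-entry type.
sealed trait CompiledFormat
final case class DemoFormat(source: String) extends CompiledFormat

// Single-method cache API, mirroring the getOrElseUpdate shape named in the PR.
trait FormatCache {
  def getOrElseUpdate(key: String, compute: => CompiledFormat): CompiledFormat
}

// LRU cache sketch: an access-ordered LinkedHashMap that evicts its eldest
// entry once it grows past maxEntries. In access-order mode even get()
// reorders the map, so every access takes the lock.
final class LruFormatCache(maxEntries: Int = 256) extends FormatCache {
  private val underlying =
    new LinkedHashMap[String, CompiledFormat](16, 0.75f, true) {
      override def removeEldestEntry(
          eldest: JMap.Entry[String, CompiledFormat]): Boolean =
        this.size() > maxEntries
    }

  def getOrElseUpdate(key: String, compute: => CompiledFormat): CompiledFormat =
    underlying.synchronized {
      val cached = underlying.get(key)
      if (cached != null) cached
      else {
        val fresh = compute // evaluated only on a miss
        underlying.put(key, fresh)
        fresh
      }
    }

  def currentSize: Int = underlying.synchronized(underlying.size())
}

// Always recomputes and stores nothing; handy for tests.
object NoOpFormatCache extends FormatCache {
  def getOrElseUpdate(key: String, compute: => CompiledFormat): CompiledFormat =
    compute
}
```

A caller would pass the format string as the key and the parse step as `compute`, for example `cache.getOrElseUpdate(fmt, DemoFormat(fmt))`. Holding the lock while `compute` runs keeps the sketch short at the cost of serializing concurrent misses.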
9ffd530 to c0bc815
Motivation
The `%` format operator is a critical path for string-template-heavy workloads. The current implementation:
- Materializes each value to `ujson.Value` before format-specific dispatch (allocates intermediate objects)
- Uses `for`/`zipWithIndex` iterator allocation in the main loop
- Uses `BigInt` for integer formatting even when the value fits in a `Long`
- Calls `widen()` even when no width/padding is needed

These costs compound in benchmarks like `large_string_template` (256 format specs in a 600KB template).

Key Design Decision
Pluggable FormatCache trait: Abstracts the format string cache as a trait (analogous to `ParseCache`), injected through `Interpreter` → `Evaluator` constructors. Users can supply custom implementations (e.g., Caffeine-based) for better control over eviction, concurrency, and memory. Default is a process-wide LRU singleton (`FormatCache.SharedDefault`).

Direct Val dispatch: Match on `Val.Str`, `Val.Num`, `Val.True`, `Val.False`, `Val.Null` directly instead of materializing to `ujson.Value` first. Since `Val` is a sealed class, this covers all primitive types. Complex types (`Arr`, `Obj`) still go through `Materializer`.

Long fast path: `formatInteger` avoids `BigInt` allocation when the value fits in a `Long` (with explicit `Long.MinValue` guard to prevent negation overflow).

Modification
New file: `sjsonnet/src/sjsonnet/FormatCache.scala`
- `FormatCache` trait: Single `getOrElseUpdate(key, compute)` API
- `DefaultFormatCache`: LRU `LinkedHashMap` (256 entries, access-order), thread-safe via synchronized double-checked locking. Initial capacity sized to avoid premature rehash.
- `FormatCache.SharedDefault`: Process-wide singleton preserving cross-interpreter reuse
- `FormatCache.EmptyCache`: Always-recompute cache for testing

Modified: `sjsonnet/src/sjsonnet/Format.scala`
- Static `parsedFormatCache` field → replaced by pluggable `FormatCache`
- `CompiledFormat` sealed trait: Opaque marker for cache entries (hides `RuntimeFormat` internals)
- `RuntimeFormat`: Pre-processes parsed format into arrays with metadata (`hasAnyStar`, `staticChars`), now `private[sjsonnet]` extending `CompiledFormat`
- `parseFormatCached`: Takes `FormatCache` parameter, uses pattern match (not `asInstanceOf`)
- Direct Val dispatch: skips `Materializer` for `Str`/`Num`/`Bool`/`Null`
- `widenRaw` fast path: Returns `txt` directly when `width.isEmpty`
- Main loop replaces `for`/`zipWithIndex` to avoid iterator/tuple allocation
- Output capacity estimated as `staticChars + specs.length * 8`
- `formatInteger` Long fast path: Uses `java.lang.Long.toString` instead of `BigInt.toString`
- `PartialApplyFmt`: Pre-parses at construction time, bypasses external cache

Modified: `sjsonnet/src/sjsonnet/Val.scala`
- `EvalScope.formatCache`: Concrete method with default (`FormatCache.SharedDefault`), avoids breaking external implementations

Modified: `sjsonnet/src/sjsonnet/Evaluator.scala`
- `formatCache: FormatCache = FormatCache.SharedDefault` added to both `Evaluator` and `NewEvaluator`

Modified: `sjsonnet/src/sjsonnet/Interpreter.scala`
- `formatCache: FormatCache` threaded through to `createEvaluator`

Modified: `sjsonnet/src-jvm-native/sjsonnet/SjsonnetMainBase.scala`
- Passes `FormatCache.SharedDefault` explicitly to `Interpreter` constructor

Benchmark Results
JMH Regression Suite (1 fork, 5 warmup, 5 measurement iterations)
Scala Native Hyperfine (`-N -w4 -m20`)
No regressions on non-format benchmarks.
Analysis
The JVM improvement (-6.4% on `large_string_template`, -14.7% on `realistic1`) confirms the optimization is effective. The native improvement is more modest because the `large_string_template` benchmark has a 600KB format string with only 256 specs, so the bottleneck is dominated by string I/O rather than format logic.

The `Long.MinValue` edge case is guarded to prevent negation overflow; it falls through to the `BigDecimal` path.

The FormatCache abstraction adds no performance overhead: the `SharedDefault` singleton is identical to the previous static cache, and `PartialApplyFmt` bypasses the cache entirely.

Supersedes PR #672 (format-parse-cache), which only included the caching part.
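The `Long` fast path and its `Long.MinValue` guard can be illustrated with a small sketch. It is an assumption-laden illustration rather than the PR's `formatInteger`: the object name `IntegerFormatSketch`, the `Double` input, the saturation check, and the use of `java.math.BigDecimal` on the slow path are all choices made here, and non-finite inputs are not handled.

```scala
object IntegerFormatSketch {
  // Formats the integer part of a finite double as decimal digits.
  def formatInteger(value: Double): String = {
    // toLong truncates toward zero and saturates at Long.MinValue/MaxValue
    // when the double lies outside the Long range.
    val l = value.toLong
    if (l != Long.MinValue && l != Long.MaxValue) {
      // Fast path: taking the magnitude is safe because l is not
      // Long.MinValue, whose negation would overflow back to itself.
      val digits = java.lang.Long.toString(math.abs(l))
      if (l < 0) "-" + digits else digits
    } else {
      // Slow path: exact arbitrary-precision conversion for values at or
      // beyond the Long boundaries (assumes value is finite).
      new java.math.BigDecimal(value).toBigInteger.toString
    }
  }
}
```

Treating both saturation values as slow-path cases is slightly conservative, but it keeps the check to two comparisons and never misreports a double that merely saturated during conversion.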
References
- `e98cd1f8`: Format chunk runtime optimization
- `6524d77d`: Direct Val dispatch, Long fast path

Result
Positive performance impact on format-heavy workloads. FormatCache now pluggable like ParseCache. All 140 tests pass. 6 files changed, 317 insertions, 104 deletions.