perf: inline numeric fast path in array comparison #693
Open
He-Pin wants to merge 1 commit into databricks:master from
Inline Val.Num type check in the array comparison while loop to avoid polymorphic recursive compare() dispatch per element. For numeric array comparisons (e.g. 1M elements), this eliminates 1M recursive method calls with 5-branch pattern matching overhead.

Uses asDouble (not raw extraction) to preserve NaN error behavior: std.log(-1) etc. can produce NaN in Val.Num, and asDouble correctly throws 'not a number', matching the official C++ jsonnet behavior.

Uses java.lang.Double.compare() instead of compareTo() to avoid autoboxing overhead.

Upstream: jit branch commit 62437d8
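The fast path described above can be sketched roughly as follows. This is a minimal illustration, not the actual code in `Evaluator.scala`: the `Num`, `Str`, and `Arr` classes, the `compareArrays` helper, and the exception type are simplified stand-ins assumed for the example, and the real `compare()` handles more cases (`Null`, `Bool`).

```scala
object ArrayCompareSketch {
  // Hypothetical, simplified stand-ins for sjsonnet's Val hierarchy.
  sealed trait Val
  final case class Num(raw: Double) extends Val {
    // Assumed to mirror the described behavior: NaN is rejected, not compared.
    def asDouble: Double =
      if (raw.isNaN) throw new IllegalArgumentException("not a number") else raw
  }
  final case class Str(value: String) extends Val
  final case class Arr(items: Array[Val]) extends Val

  // Generic comparison: one pattern match per call (the slow path).
  def compare(x: Val, y: Val): Int = (x, y) match {
    case (a: Num, b: Num) => java.lang.Double.compare(a.asDouble, b.asDouble)
    case (a: Str, b: Str) => Integer.signum(a.value.compareTo(b.value))
    case (a: Arr, b: Arr) => compareArrays(a.items, b.items)
    case _ =>
      throw new IllegalArgumentException("cannot compare values of different types")
  }

  // Array branch: check for Num/Num inline before falling back to compare().
  def compareArrays(xs: Array[Val], ys: Array[Val]): Int = {
    val len = math.min(xs.length, ys.length)
    var i = 0
    while (i < len) {
      val a = xs(i)
      val b = ys(i)
      val cmp = a match {
        case na: Num if b.isInstanceOf[Num] =>
          // Fast path: primitive doubles, static compare, no recursive call.
          java.lang.Double.compare(na.asDouble, b.asInstanceOf[Num].asDouble)
        case _ =>
          // Strings, nested arrays, and mismatched types take the generic path.
          compare(a, b)
      }
      if (cmp != 0) return cmp
      i += 1
    }
    // All shared positions are equal: the shorter array sorts first.
    Integer.compare(xs.length, ys.length)
  }
}
```

In this sketch, `compareArrays(Array[Val](Num(1.0), Num(2.0)), Array[Val](Num(1.0), Num(3.0)))` returns a negative value without ever calling `compare()`, while an element such as `Num(math.log(-1.0))` still raises the "not a number" error because the fast path goes through `asDouble` rather than reading the raw field.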
Motivation
Array comparison in Jsonnet (e.g., `long_array + [1] < long_array + [2]`) iterates element by element through a while loop, calling `compare()` recursively for each element. For numeric arrays with 1M+ elements, this means 1M recursive method calls, each going through a 5-branch pattern match (`Null`, `Num`, `Str`, `Bool`, `Arr`). This polymorphic dispatch overhead is unnecessary when both elements are the common `Val.Num` type.
Key Design Decision
- Inline `Val.Num` type match inside the array comparison loop before falling back to recursive `compare()`, eliminating dispatch overhead for numeric elements.
- `asDouble` (not raw extraction): preserves NaN error behavior. Calls such as `std.log(-1)` create `Val.Num(NaN)` via MathModule, and `asDouble` correctly throws "not a number", matching the official C++ jsonnet behavior.
- `java.lang.Double.compare()`: used instead of `compareTo()` to avoid autoboxing overhead.
Modification
`sjsonnet/src/sjsonnet/Evaluator.scala`, `compare()` method, `Val.Arr` branch:
- Adds a `Val.Num` type check before the recursive `compare()` call.
- Falls back to `compare()` for non-numeric elements (strings, booleans, nested arrays).
Benchmark Results
JMH (same-session A/B, single-fork)
Hyperfine (Scala Native vs jrsonnet, comparison.jsonnet)
Native shows neutral results — the optimization primarily benefits JVM where JIT dispatch overhead is more significant than in AOT-compiled code.
Analysis
The -11.2% JMH improvement on the `comparison` benchmark (1M-element numeric array comparison) is consistent across multiple runs (-8.6% to -11.2% range). The optimization works by skipping the recursive `compare()` dispatch for numeric elements; in addition, `java.lang.Double.compare()` is a single JVM intrinsic.
The `comparison2` benchmark (scalar `i < j` in comprehensions) is unaffected because it uses the top-level `Val.Num` case in `compare()`, not the array branch.
References
Upstream: jit branch commit 62437d8f
Result
Targeted optimization for numeric array comparison with no regressions. Reduces the gap with jrsonnet on the comparison benchmark by avoiding unnecessary polymorphic dispatch in the hot loop.