
SparseStorage, concurrent_flat_map, and SparseMatrixAtomic #43

Open
bendavid wants to merge 25 commits into main from calibrationdev

Conversation

@bendavid
Owner

@bendavid bendavid commented Apr 9, 2026

Adds a sparse storage backend for HistoBoost and the supporting
data structures (lock-free concurrent map and SparseMatrixAtomic),
plus a generic MapWrapper helper that is then used to extend the
existing HistShiftHelper / QuantileHelper helpers with automatic
broadcasting over container arguments and a new continuous
quantile-transform mode.

Bottom-up commit list:

  • add python script for tests
  • Add SymMatrixAtomic
  • minor improvement for SymMatrixAtomic and add initial version of SparseMatrixAtomic
  • fix deprecated storage_type access
  • fix constness
  • make wrapper more flexible/robust
  • flexible column types for quantile helpers
  • add missing include
  • make range_to more flexible
  • add lock-free insert-only concurrent_flat_map
  • add SparseMatrixAtomic test driver
  • SparseMatrixAtomic: switch to narf::concurrent_flat_map
  • concurrent_flat_map: add move constructor and assignment
  • HistoBoost: add SparseStorage option backed by concurrent_flat_map
  • HistoBoost SparseStorage: convert result to wums.SparseHist
  • concurrent_flat_map: serialize segment growth via sentinel
  • SparseStorage: fix ND linearization mismatch with SparseHist
  • SparseMatrixAtomic: configurable fill_fraction
  • HistShiftHelper: guard against non-finite bin geometry
  • Add MapWrapper helper for element-wise application over container args
  • HistShiftHelper: delegate container broadcasting to MapWrapper
  • QuantileHelper[Static]: delegate container broadcasting to MapWrapper
  • QuantileHelper[Static]: add continuous CDF-style lookup mode
  • define_quantile_ints: support continuous quantile mode
  • build_quantile_hists: return bin centers and volumes

This series is the narf side of the larger sparse-input rework; the
companion rabbit and wums PRs build on it:

WMass/rabbit#129
WMass/wums#25

bendavid and others added 25 commits April 3, 2026 21:15
A segmented open-addressing hash map for integer keys supporting
concurrent lock-free find / insert / emplace / expansion. State bits
are encoded in the two MSBs of each slot's key. Includes tests
covering single-threaded correctness, pointer stability across
expansion, and multi-threaded concurrent insert/find, plus a test
for SparseMatrixAtomic that exercises its public API under
concurrent fetch_add.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
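The "state bits in the two MSBs of each slot's key" scheme above can be sketched as follows. This is a hedged Python model of the bit layout, not the narf C++ code; the state names and the 62-bit key width are illustrative assumptions.

```python
# Model of a 64-bit slot word: two most-significant bits hold the slot
# state, the remaining 62 bits hold the integer key. State values are
# assumed for illustration.
EMPTY, RESERVED, OCCUPIED = 0, 1, 2

STATE_SHIFT = 62
KEY_MASK = (1 << STATE_SHIFT) - 1  # low 62 bits

def pack(state, key):
    """Encode state and key into one 64-bit word."""
    assert 0 <= key <= KEY_MASK, "key must fit in 62 bits"
    return (state << STATE_SHIFT) | key

def state_of(word):
    return word >> STATE_SHIFT

def key_of(word):
    return word & KEY_MASK

w = pack(OCCUPIED, 123456789)
assert state_of(w) == OCCUPIED
assert key_of(w) == 123456789
```

In the real map a single atomic compare-and-swap on the whole word can then transition a slot from EMPTY to RESERVED and claim the key in one step.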
Replaces tbb::concurrent_unordered_map with the new lock-free
insert-only flat map, removing the FIXME about lock contention on
inserts. reserve() becomes a no-op since the new map grows on
demand.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Required so the map can live as a member of other movable types
(e.g. a boost::histogram storage class). The moved-from object is
left in a destroy-only state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds narf::concurrent_sparse_storage, a boost::histogram Storage type
backed by narf::concurrent_flat_map with has_threading_support = true,
plus a make_histogram_sparse factory and python-friendly snapshot
helpers (boost::histogram does not expose its storage_ member to
cppyy directly).

HistoBoost gains a SparseStorage marker class taking an estimated
fill_fraction (default 0.1) used to pre-size the underlying map and
avoid most on-the-fly expansions. Tensor weights are not supported in
this mode and conversion to a python hist.Hist is skipped; the raw
RResultPtr is returned. Includes an end-to-end RDataFrame test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SparseStorage path now lazily converts the underlying C++
sparse histogram to a wums.sparse_hist.SparseHist on first
dereference, snapshotting the concurrent_flat_map into flat
indices/values that match the with-flow row-major layout.
Pass convert_to_hist=False to get the raw RResultPtr instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previously every thread that observed a saturated tail segment
speculatively allocated a doubled-size successor and then either
won the CAS or freed it. Under high thread contention this caused
a transient memory spike of M_threads * segment_size per growth
event, easily inflating peak RSS by an order of magnitude for
multi-GB segments and potentially fragmenting the address space.

ensure_next now CAS-publishes a "growing" sentinel into the
segment's next pointer before allocating; only the winning thread
performs the allocation while losers yield-spin until the real
successor is published. All segment walks use a new observed_next
helper that treats the sentinel as "no successor yet".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
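The sentinel protocol described above can be modeled like this. This is a Python illustration of the publish-before-allocate logic only; the CAS is emulated with a lock and the class/method names are hypothetical, whereas the real code uses lock-free C++ atomics.

```python
import threading
import time

GROWING = object()  # sentinel published before any allocation happens

class Segment:
    """Toy model of one hash-map segment with a next-segment pointer."""

    def __init__(self, size):
        self.size = size
        self._next = None
        self._cas_lock = threading.Lock()  # emulates an atomic CAS

    def cas_next(self, expected, new):
        with self._cas_lock:
            if self._next is expected:
                self._next = new
                return True
            return False

    def observed_next(self):
        """Treat the sentinel as 'no successor yet', as in the commit."""
        n = self._next
        return None if n is GROWING else n

    def ensure_next(self):
        while True:
            n = self._next
            if n is not None and n is not GROWING:
                return n                       # real successor published
            if n is None and self.cas_next(None, GROWING):
                succ = Segment(self.size * 2)  # only the CAS winner allocates
                self._next = succ              # publish the real successor
                return succ
            time.sleep(0)                      # loser: yield until published

seg = Segment(4)
results = []
threads = [threading.Thread(target=lambda: results.append(seg.ensure_next()))
           for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All threads agree on a single doubled-size successor: exactly one
# allocation per growth event instead of one per contending thread.
assert all(r is results[0] for r in results)
assert results[0].size == 8
```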
boost::histogram linearizes axes column-major (leftmost axis has
stride 1), but wums.SparseHist expects numpy row-major flat
indices. For ND histograms this caused entries to land in the
wrong bins (often flow bins) and silently disappear from
toarray(flow=False); 1D was unaffected and so the existing test
did not catch it.

The conversion now un-ravels each boost-linear key under F order
and re-ravels under C order before constructing the SparseHist.
Adds a 3D test that cross-checks against a dense HistoBoost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
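The re-ravelling fix can be reproduced in numpy directly: un-ravel each boost-linear key under Fortran (column-major) order and re-ravel under C (row-major) order. The with-flow shape below is an arbitrary example, not taken from the tests.

```python
import numpy as np

# With-flow sizes per axis (nbins + 2 flow bins each), chosen arbitrarily.
shape = (4, 5, 3)

# Keys as boost::histogram linearizes them: leftmost axis has stride 1.
boost_keys = np.array([0, 7, 59])

# Un-ravel under F order, re-ravel under C order for wums.SparseHist.
multi = np.unravel_index(boost_keys, shape, order="F")
c_keys = np.ravel_multi_index(multi, shape, order="C")

# e.g. boost key 7 = (3, 1, 0) in F order -> 3*15 + 1*3 + 0 = 48 in C order
assert list(c_keys) == [0, 48, 59]
```

Note that key 59 maps to 59 under both orders (it is the multi-index (3, 4, 2), which happens to linearize identically here), which is exactly why a low-dimensional or symmetric test can miss the bug.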
Replace the hard-coded size0*size1/40 initial capacity with a
fill_fraction constructor argument (default 0.025 to match the
previous behaviour) that sizes the underlying concurrent_flat_map
to fill_fraction * size0 * size1 entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
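The capacity arithmetic above is simple to state explicitly. A minimal sketch, with a hypothetical helper name:

```python
# Initial map capacity = fill_fraction * size0 * size1; the default of
# 0.025 reproduces the previous hard-coded size0 * size1 / 40.
def initial_capacity(size0, size1, fill_fraction=0.025):
    return int(fill_fraction * size0 * size1)

assert initial_capacity(2000, 2000) == 2000 * 2000 // 40  # old behaviour
assert initial_capacity(2000, 2000, fill_fraction=0.1) == 400000
```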
Treat continuous-axis bins with infinite width or center as flow bins
and return zero correction, preventing NaN propagation when an axis
uses np.inf as a bin edge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
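The guard amounts to a finiteness check on the bin geometry before computing any correction. A hedged sketch, with hypothetical names (the real helper is C++):

```python
import math

def shift_correction(lo, hi, corr):
    """Return corr(center, width) for a finite bin, else zero.

    Bins with an infinite edge (e.g. np.inf) have non-finite width or
    center; they are treated as flow bins so no NaN can propagate.
    """
    width = hi - lo
    center = 0.5 * (lo + hi)
    if not (math.isfinite(width) and math.isfinite(center)):
        return 0.0  # flow bin: zero correction
    return corr(center, width)

# Finite bin: correction computed from center and width.
assert shift_correction(0.0, 2.0, lambda c, w: c * w) == 2.0
# Infinite upper edge: guarded, returns zero instead of NaN.
assert shift_correction(1.0, math.inf, lambda c, w: c * w) == 0.0
```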
MapWrapper wraps an arbitrary callable so that, when invoked, any
argument satisfying narf::is_container is zipped element-wise (with
scalar arguments broadcast via make_view) and the callable is applied
to each resulting tuple via std::apply. If none of the arguments are
containers, the callable is invoked directly with the arguments as-is.

Also provide a forwarding constructor so the wrapped callable can be
constructed in place from MapWrapper's own constructor arguments, and
add a unit test exercising both the container and scalar-passthrough
code paths.
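The MapWrapper dispatch rule can be illustrated with a Python analogue: container arguments are zipped element-wise, scalars are broadcast, and if no argument is a container the callable is invoked directly. This sketch substitutes list/tuple detection for narf::is_container and plain lists for make_view.

```python
def map_wrapper(func):
    """Python analogue of MapWrapper's call semantics (illustrative only)."""
    def wrapped(*args):
        containers = [a for a in args if isinstance(a, (list, tuple))]
        if not containers:
            return func(*args)  # scalar passthrough: call directly
        n = len(containers[0])
        # Broadcast scalar arguments to the container length, then zip
        # and apply the callable to each element-wise tuple.
        views = [a if isinstance(a, (list, tuple)) else [a] * n
                 for a in args]
        return [func(*row) for row in zip(*views)]
    return wrapped

shift = map_wrapper(lambda x, s: x + s)
assert shift(1.0, 2.0) == 3.0                          # scalar path
assert shift([1.0, 2.0, 3.0], 10.0) == [11.0, 12.0, 13.0]  # broadcast
assert shift([1.0, 2.0], [10.0, 20.0]) == [11.0, 22.0]     # zipped
```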
Rename the core class to HistShiftHelperImpl and drop its is_container_any
branch, collapsing compute and compute_impl into a single scalar-only
implementation. HistShiftHelper is now a template alias for
MapWrapper<HistShiftHelperImpl<Axes...>>, which restores the previous
element-wise behavior for container arguments while keeping the per-event
code path untouched.
Rename the core classes to QuantileHelperImpl and QuantileHelperStaticImpl
and expose QuantileHelper / QuantileHelperStatic as MapWrapper template
aliases over them. This gives both helpers automatic element-wise
broadcasting over container arguments while leaving their existing
scalar call paths and factory/Python entry points source-compatible.

Also add a unit test exercising the scalar and RVec call paths of
QuantileHelperStatic.
Thread a bool Continuous template parameter through QuantileHelperImpl
and QuantileHelperStaticImpl via a shared quantile_lookup helper. In
continuous mode the helpers return a double in [0, 1] obtained by
linearly interpolating between adjacent stored edges (edges[i] maps to
i/(N-1)), with values outside [edges[0], edges[N-1]] clamped to 0 / 1.

Expose QuantileHelperContinuous / QuantileHelperStaticContinuous aliases
and a make_quantile_helper_continuous factory. Extend the unit test to
cover the scalar and RVec continuous code paths.
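The continuous lookup described above reduces to a piecewise-linear CDF over the stored edges. A minimal Python sketch of that mapping (the real quantile_lookup is a C++ helper; this reproduces only the arithmetic):

```python
import bisect

def quantile_lookup(edges, x):
    """Map x to [0, 1]: edges[i] -> i/(N-1), linear between, clamped outside."""
    n = len(edges)
    if x <= edges[0]:
        return 0.0  # below range: clamp to 0
    if x >= edges[-1]:
        return 1.0  # above range: clamp to 1
    i = bisect.bisect_right(edges, x) - 1  # edges[i] <= x < edges[i+1]
    frac = (x - edges[i]) / (edges[i + 1] - edges[i])
    return (i + frac) / (n - 1)

edges = [0.0, 1.0, 4.0]
assert quantile_lookup(edges, 1.0) == 0.5    # exactly edges[1] -> 1/(3-1)
assert quantile_lookup(edges, 0.5) == 0.25   # halfway into first interval
assert quantile_lookup(edges, -1.0) == 0.0   # clamped below
assert quantile_lookup(edges, 9.0) == 1.0    # clamped above
```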
Add a continuous option (default False) to build_quantile_hists which,
when enabled, preserves the original (Regular / Variable) quantile axes
in the returned helper histograms instead of replacing them with Integer
axes.
define_quantile_ints auto-detects the mode from the axis type and
dispatches to the continuous quantile helpers, feeding the resulting
CDF-style columns (named _quant instead of _iquant) to subsequent
helpers in the chain.
Also compute per-bin minima (via ak.min) alongside the existing maxima
so that per-dimension widths and centers of the final transformed
quantile bins can be derived. Return two additional histograms:

- centers_hist: multidimensional bin centers stored along an extra
  StrCategory "coord" axis labelled with the input quantile axis names
  (with a quant_i placeholder for any unnamed or duplicated name).
- volume_hist: product of the per-dimension widths of the same bin.

Both are indexed by the full set of conditional and quantile axes
matching the last helper histogram. Update the existing call site in
test/testquantiles.py to unpack the new return tuple.
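The width/center/volume arithmetic behind the two new histograms can be sketched in numpy. The toy minima/maxima below are invented for illustration; in the real code they come from ak.min / ak.max over the transformed quantile values per bin.

```python
import numpy as np

# Toy per-bin, per-dimension minima and maxima of the transformed
# quantile values (2 bins x 2 coord dimensions), invented for this sketch.
mins = np.array([[0.0, 0.5], [0.25, 0.75]])
maxs = np.array([[0.25, 0.75], [0.5, 1.0]])

widths = maxs - mins                 # per-dimension bin widths
centers = 0.5 * (mins + maxs)        # -> stored along the "coord" axis
volume = widths.prod(axis=-1)        # product of per-dimension widths

assert centers[0].tolist() == [0.125, 0.625]
assert volume.tolist() == [0.0625, 0.0625]
```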