Add concurrent chunk fetching for external array links#123
bendichter wants to merge 3 commits into main from
Conversation
Route remote external array links through zarr + LindiH5ZarrStore instead of h5py + LindiRemfile. This enables concurrent HTTP range requests via LindiH5ZarrStore.getitems(), which zarr calls when reading multiple chunks. The getitems() method separates serial metadata lookup (fast, uses h5py's B-tree cache) from parallel data fetches (N concurrent HTTP requests via ThreadPoolExecutor instead of N serial ones). Local external array links still use h5py directly since there's no concurrency benefit for local I/O. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
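The two-phase structure described above can be sketched as follows. This is a minimal illustration, not lindi's actual code: `locate_chunk` and `fetch_range` are hypothetical stand-ins for the store's internal metadata lookup and HTTP range fetch.

```python
from concurrent.futures import ThreadPoolExecutor

def getitems_sketch(store, keys, max_workers=8):
    # Phase 1 (serial): resolve each chunk key to an (offset, size) byte
    # range. Fast, because it reads h5py's cached B-tree metadata.
    ranges = {key: store.locate_chunk(key) for key in keys}

    # Phase 2 (parallel): issue the data fetches concurrently, so N
    # requests are in flight instead of N serial round-trips.
    def fetch(item):
        key, (offset, size) = item
        return key, store.fetch_range(offset, size)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(fetch, ranges.items()))
```

Keeping phase 1 serial is cheap because the metadata is already cached; only phase 2 touches the network, which is where the concurrency pays off.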
Merge adjacent/nearby chunk byte ranges into single HTTP requests before concurrent fetching. Two configurable parameters control the behavior: - coalesce_merge_gap (default 256KB): max gap between ranges to merge - coalesce_max_size (default 20MB): max size of a coalesced request Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
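A standalone sketch of this kind of range coalescing, using the default values named above. The real logic lives inside LindiH5ZarrStore; this illustrative version merges sorted `(offset, size)` pairs under the same two constraints.

```python
def coalesce_ranges(ranges, merge_gap=256 * 1024, max_size=20 * 1024 * 1024):
    """Merge nearby (offset, size) byte ranges into larger contiguous requests."""
    merged = []
    # Sort by offset so only neighboring ranges need comparing.
    for offset, size in sorted(ranges):
        if merged:
            prev_offset, prev_size = merged[-1]
            gap = offset - (prev_offset + prev_size)
            combined = (offset + size) - prev_offset
            # Merge when the gap is small enough and the combined request
            # stays under the size cap; otherwise start a new request.
            if gap <= merge_gap and combined <= max_size:
                merged[-1] = (prev_offset, max(prev_size, combined))
                continue
        merged.append((offset, size))
    return merged
```

With the defaults, two chunks 5 bytes apart become one request, while chunks separated by more than 256 KB, or whose merged span would exceed 20 MB, stay separate.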
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #123      +/- ##
==========================================
- Coverage   81.63%   81.42%   -0.21%
==========================================
  Files          30       30
  Lines        2793     2918     +125
==========================================
+ Hits         2280     2376      +96
- Misses        513      542      +29
==========================================
Standalone script in devel/ that compares serial (h5py+LindiRemfile) vs concurrent (zarr+LindiH5ZarrStore) reads from DANDI, asserts data equivalence, and produces a bar chart showing timings and speedup. Usage: python devel/benchmark_concurrent_fetch.py [--dandiset 000473] [-o results.png] Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
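A stripped-down version of the comparison that script performs might look like this. Here `read_serial` and `read_concurrent` are stand-ins for the two code paths (h5py + LindiRemfile vs zarr + LindiH5ZarrStore), and the DANDI access and bar-chart plotting are omitted.

```python
import time

def compare_read_paths(read_serial, read_concurrent, selection):
    # Time the serial path.
    t0 = time.perf_counter()
    serial_data = read_serial(selection)
    t_serial = time.perf_counter() - t0

    # Time the concurrent path on the same selection.
    t0 = time.perf_counter()
    concurrent_data = read_concurrent(selection)
    t_concurrent = time.perf_counter() - t0

    # Assert data equivalence before reporting any speedup, as the
    # benchmark script does.
    assert list(serial_data) == list(concurrent_data), "data mismatch"
    speedup = t_serial / t_concurrent if t_concurrent > 0 else float("inf")
    return t_serial, t_concurrent, speedup
```

The equivalence assertion matters: a speedup number is only meaningful if both paths return identical data.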
Summary
- Add `getitems()` to `LindiH5ZarrStore`, enabling concurrent HTTP range requests when zarr reads multiple chunks
- Route remote external array links through `zarr + LindiH5ZarrStore` instead of `h5py + LindiRemfile`, so chunk fetches go through `getitems()` and run in parallel via `ThreadPoolExecutor`

Motivation
When reading large remote NWB datasets via LINDI (e.g., from DANDI), reads are slow because h5py fetches chunks serially through LindiRemfile, one HTTP request per chunk. h5py has no batch-read API, so parallelism can't be added at the LindiRemfile level. Zarr does have such an API: `getitems()`, which receives all needed chunk keys in a single call.

How it works
Before: `h5py.File(LindiRemfile(url))[dataset][selection]` → serial HTTP requests

After: `zarr.open_array(LindiH5ZarrStore(url))[selection]` → `store.getitems(chunk_keys)` → serial metadata lookup (fast, h5py B-tree cache) → coalesce nearby byte ranges → concurrent data fetches (`ThreadPoolExecutor`, up to 8 workers)

Byte range coalescing
Before issuing HTTP requests, nearby byte ranges are merged into larger contiguous fetches. This reduces the number of round-trips when chunks are stored close together in the HDF5 file. Two parameters control the behavior:
- `_coalesce_merge_gap` (default 256 KB): maximum gap between ranges to merge into one request
- `_coalesce_max_size` (default 20 MB): maximum size of a single coalesced request

Test plan
- `test_getitems_local_chunks` — multi-chunk getitems with a local file
- `test_getitems_inline_data` — inline (small) dataset path
- `test_getitems_single_chunk_shortcut` — single chunk skips the thread pool
- `test_external_array_link_via_zarr_store` — local external array links still work
- `test_zarr_store_for_external_array` — zarr store serves all chunks correctly with slicing
- `test_getitems_empty_keys` — empty key list returns an empty dict
- `test_coalesce_byte_ranges` — unit tests for range merging logic (gap, max_size, sorting)
- `test_coalesce_integration` — coalesced fetching returns correct data through zarr

🤖 Generated with Claude Code
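To make the empty-key and single-chunk-shortcut behaviors from the test plan concrete, here is a toy in-memory store (hypothetical; not the PR's implementation) that mirrors them and lets a test observe whether the thread pool was used.

```python
from concurrent.futures import ThreadPoolExecutor

class ToyStore:
    """Toy store mirroring two getitems() behaviors from the test plan:
    an empty key list returns {}, and a single key bypasses the pool."""

    def __init__(self, chunks):
        self.chunks = chunks
        self.used_pool = False  # lets tests observe the shortcut

    def getitems(self, keys):
        if not keys:
            return {}  # cf. test_getitems_empty_keys
        if len(keys) == 1:
            # cf. test_getitems_single_chunk_shortcut: no thread pool
            return {keys[0]: self.chunks[keys[0]]}
        self.used_pool = True
        with ThreadPoolExecutor(max_workers=8) as pool:
            return dict(pool.map(lambda k: (k, self.chunks[k]), keys))
```

Skipping the pool for a single chunk avoids paying thread start-up cost when there is nothing to parallelize.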