chore: consolidate SourceSet variants into PopulationRange #826

Open
cdc-as81 wants to merge 2 commits into main from cdc-as81-add-population-range-source-set

@cdc-as81
Collaborator

Description

This PR refactors our entity source sets by consolidating the Empty, Entity, and Population variants into a single PopulationRange(std::ops::Range<usize>) variant.

Because each of these sources maps to a contiguous range of ids, the iterator arithmetic, bounds checking, and size calculations can be delegated to the standard library's std::ops::Range, which removes the hand-written equivalents and the bugs they invite.
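To illustrate the mapping described above, here is a minimal sketch of how the three former variants could each be expressed as a range. It is not the code in this PR: the constructor names `empty`, `entity`, and `population`, the helper methods, and the use of a bare `usize` in place of an entity id are all assumptions made for the example.

```rust
use std::ops::Range;

/// Hypothetical stand-in for the crate's source-set enum. Only the
/// `PopulationRange` variant name comes from the PR description.
#[derive(Debug, Clone, PartialEq, Eq)]
enum SourceSet {
    PopulationRange(Range<usize>),
}

impl SourceSet {
    /// Former `Empty`: a range that contains no ids.
    fn empty() -> Self {
        SourceSet::PopulationRange(0..0)
    }

    /// Former `Entity`: a one-element range holding `id`.
    /// (Assumes `id < usize::MAX`, otherwise `id + 1` would overflow.)
    fn entity(id: usize) -> Self {
        SourceSet::PopulationRange(id..id + 1)
    }

    /// Former `Population`: every id below `population_size`.
    fn population(population_size: usize) -> Self {
        SourceSet::PopulationRange(0..population_size)
    }

    /// Size comes straight from `ExactSizeIterator::len` on the range.
    fn len(&self) -> usize {
        match self {
            SourceSet::PopulationRange(range) => range.len(),
        }
    }

    /// Membership is a bounds check handled by `Range::contains`.
    fn contains(&self, id: usize) -> bool {
        match self {
            SourceSet::PopulationRange(range) => range.contains(&id),
        }
    }
}
```

With this shape, an empty source, a single entity, and the whole population differ only in their bounds, so one code path can serve all three.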

Key Changes

  • Unified SourceSet Variants: Replaced SourceSet::Empty, SourceSet::Entity, and SourceSet::Population with the unified SourceSet::PopulationRange.
  • Simplified PopulationIterator: Delegated size_hint(), count(), and nth() implementations directly to the underlying standard Range iterator, abstracting away custom saturating math and manual bound tracking.
  • Streamlined EntitySet Logic:
    • Removed obsolete is_universal() algebraic reduction checks.
    • Simplified as_singleton() to use a clean range.len() == 1 check rather than manual checked arithmetic.
  • Updated Tests: Adapted all builder mock implementations and unit tests across entity_set.rs, entity_set_iterator.rs, and source_set.rs to validate against the new PopulationRange structures.
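The delegation and the singleton check listed above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the PR's implementation: `EntityId` here is a simplified non-generic newtype, the `range` field name is invented, and `as_singleton` is written as a free function over a range instead of a method on EntitySet.

```rust
use std::ops::Range;

/// Hypothetical id type; the real crate uses a generic `EntityId<E>`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct EntityId(usize);

/// Sketch of an iterator over a contiguous block of entity ids.
struct PopulationIterator {
    range: Range<usize>,
}

impl Iterator for PopulationIterator {
    type Item = EntityId;

    fn next(&mut self) -> Option<EntityId> {
        self.range.next().map(EntityId)
    }

    // The three overrides below forward to `Range`, which already
    // computes them in constant time without manual bound tracking.
    fn size_hint(&self) -> (usize, Option<usize>) {
        self.range.size_hint()
    }

    fn count(self) -> usize {
        self.range.count()
    }

    fn nth(&mut self, n: usize) -> Option<EntityId> {
        self.range.nth(n).map(EntityId)
    }
}

/// Returns the single id in `range`, or `None` if the range holds
/// zero ids or more than one.
fn as_singleton(range: &Range<usize>) -> Option<EntityId> {
    if range.len() == 1 {
        Some(EntityId(range.start))
    } else {
        None
    }
}
```

Forwarding `nth` and `count` matters for sampling-style callers, because the default `Iterator` implementations would otherwise step through the elements one at a time.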

@cdc-as81 cdc-as81 linked an issue Mar 26, 2026 that may be closed by this pull request
github-actions bot added a commit that referenced this pull request Mar 26, 2026
@github-actions

Benchmark Results

Hyperfine

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| large_sir::baseline | 2.9 ± 0.1 | 2.8 | 3.1 | 1.00 |
| large_sir::entities | 12.2 ± 0.3 | 11.8 | 13.2 | 4.24 ± 0.13 |

Criterion

Regressions (slower)
| Group | Bench | Param | Change | CI Lower | CI Upper |
| --- | --- | --- | --- | --- | --- |
| sampling | sampling_single_known_length_entities | | 10.652% | 9.732% | 11.601% |
| large_dataset | bench_query_population_derived_property_entities | | 6.763% | 5.464% | 8.046% |
| algorithm_benches | algorithm_sampling_multiple_known_length | | 2.959% | 2.149% | 3.773% |
| sample_entity | sample_entity_single_property_unindexed | 1000 | 2.386% | 1.824% | 3.169% |
| indexing | with_query_results_multiple_individually_indexed_properties_enti | | 1.980% | 1.483% | 2.340% |
Improvements (faster)
| Group | Bench | Param | Change | CI Lower | CI Upper |
| --- | --- | --- | --- | --- | --- |
| sample_entity | sample_entity_whole_population | 100000 | -44.748% | -45.347% | -44.115% |
| sample_entity | sample_entity_whole_population | 1000 | -43.516% | -44.106% | -42.955% |
| sample_entity | sample_entity_whole_population | 10000 | -40.330% | -41.150% | -39.597% |
| counts | multi_property_indexed_entities | | -21.941% | -22.696% | -21.111% |
| sample_entity | sample_entity_single_property_unindexed | 100000 | -19.072% | -19.675% | -18.414% |
| indexing | query_people_indexed_multi-property_entities | | -18.250% | -18.501% | -17.967% |
| counts | multi_property_unindexed_entities | | -16.366% | -17.516% | -15.126% |
| indexing | query_people_count_single_indexed_property_entities | | -12.639% | -12.943% | -12.240% |
| indexing | with_query_results_single_indexed_property_entities | | -12.174% | -12.743% | -11.679% |
| sampling | sampling_single_l_reservoir_entities | | -11.128% | -12.303% | -10.119% |
| sampling | sampling_multiple_l_reservoir_entities | | -10.799% | -11.691% | -9.976% |
| indexing | query_people_multiple_individually_indexed_properties_entities | | -8.460% | -8.623% | -8.275% |
| indexing | query_people_count_multiple_individually_indexed_properties_enti | | -7.648% | -7.775% | -7.539% |
| sampling | count_and_sampling_single_unindexed_concrete_plus_derived_entiti | | -5.750% | -5.901% | -5.595% |
| indexing | query_people_count_indexed_multi-property_entities | | -4.700% | -5.182% | -4.203% |
| sample_entity | sample_entity_multi_property_indexed | 100000 | -4.329% | -4.580% | -3.989% |
| sample_entity | sample_entity_multi_property_indexed | 1000 | -4.186% | -4.694% | -3.682% |
| large_dataset | bench_match_entity | | -4.093% | -5.196% | -2.601% |
| sample_entity | sample_entity_multi_property_indexed | 10000 | -3.911% | -4.282% | -3.438% |
| sampling | sampling_single_unindexed_concrete_plus_derived_entities | | -3.546% | -3.914% | -3.224% |
| sample_entity | sample_entity_single_property_indexed | 1000 | -3.537% | -3.945% | -3.055% |
| indexing | with_query_results_indexed_multi-property_entities | | -3.399% | -4.429% | -2.606% |
| counts | single_property_indexed_entities | | -3.065% | -3.897% | -1.829% |
| sampling | sampling_single_unindexed_entities | | -2.295% | -2.687% | -1.906% |
| sample_entity | sample_entity_single_property_indexed | 10000 | -2.212% | -2.674% | -1.670% |
| sampling | sampling_multiple_known_length_entities | | -2.168% | -3.014% | -1.422% |
| sample_entity | sample_entity_single_property_indexed | 100000 | -2.018% | -2.445% | -1.506% |
| algorithm_benches | algorithm_sampling_multiple_l_reservoir | | -1.817% | -2.331% | -1.411% |
| examples | example-basic-infection | | -1.789% | -2.225% | -1.412% |
| counts | concrete_plus_derived_unindexed_entities | | -1.605% | -2.349% | -1.059% |
Unchanged / inconclusive (CI crosses 0%)
| Group | Bench | Param | Change | CI Lower | CI Upper |
| --- | --- | --- | --- | --- | --- |
| counts | single_property_unindexed_entities | | -1.982% | -3.251% | -0.964% |
| large_dataset | bench_query_population_multi_unindexed_entities | | -1.447% | -2.866% | -0.258% |
| large_dataset | bench_filter_indexed_entity | | -1.194% | -13.841% | 12.609% |
| counts | index_after_adding_entities | | -1.190% | -1.374% | -0.996% |
| large_dataset | bench_query_population_indexed_property_entities | | -0.824% | -1.384% | -0.361% |
| examples | example-births-deaths | | -0.684% | -0.973% | -0.417% |
| algorithm_benches | algorithm_sampling_single_l_reservoir | | 0.660% | 0.402% | 1.003% |
| large_dataset | bench_filter_unindexed_entity | | 0.611% | -3.929% | 5.534% |
| large_dataset | bench_query_population_property_entities | | 0.451% | -0.204% | 1.255% |
| large_dataset | bench_query_population_multi_indexed_entities | | 0.375% | -0.032% | 0.887% |
| algorithm_benches | algorithm_sampling_single_known_length | | 0.359% | -0.112% | 1.046% |
| counts | reindex_after_adding_more_entities | | 0.357% | 0.127% | 0.563% |
| sampling | sampling_multiple_unindexed_entities | | 0.228% | 0.123% | 0.376% |
| algorithm_benches | algorithm_sampling_single_rand_reservoir | | 0.198% | -0.055% | 0.525% |
| indexing | query_people_single_indexed_property_entities | | 0.106% | 0.034% | 0.184% |
| sampling | count_and_sampling_single_known_length_entities | | 0.038% | -1.172% | 1.175% |
| sample_entity | sample_entity_single_property_unindexed | 10000 | 0.010% | -0.735% | 0.882% |

)
}
/// Returns the contained entity id if this set is a singleton leaf.
fn as_singleton(&self) -> Option<EntityId<E>> {
Collaborator

I suspect once you start thinking about the simplification cases for intervals, this private helper method will disappear or be replaced with an interval analog.

@github-actions

github-actions bot commented Apr 1, 2026

Benchmark Results

Hyperfine

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| large_sir::baseline | 2.8 ± 0.0 | 2.8 | 2.9 | 1.00 |
| large_sir::entities | 12.8 ± 0.1 | 12.6 | 13.2 | 4.54 ± 0.07 |

Criterion

Regressions (slower)
| Group | Bench | Param | Change | CI Lower | CI Upper |
| --- | --- | --- | --- | --- | --- |
| sample_entity | sample_entity_single_property_unindexed | 10000 | 10.556% | 9.544% | 11.479% |
| sampling | sampling_single_l_reservoir_entities | | 8.454% | 7.965% | 8.871% |
| examples | example-births-deaths | | 5.802% | 4.818% | 7.137% |
| sample_entity | sample_entity_multi_property_indexed | 10000 | 5.232% | 4.623% | 5.813% |
| counts | reindex_after_adding_more_entities | | 4.752% | 4.511% | 4.990% |
| sample_entity | sample_entity_multi_property_indexed | 1000 | 3.906% | 3.317% | 4.459% |
| sampling | sampling_multiple_unindexed_entities | | 2.807% | 2.646% | 3.102% |
| sample_entity | sample_entity_multi_property_indexed | 100000 | 2.655% | 2.149% | 3.135% |
| indexing | with_query_results_indexed_multi-property_entities | | 2.454% | 2.042% | 2.843% |
| algorithm_benches | algorithm_sampling_multiple_l_reservoir | | 2.219% | 1.714% | 2.800% |
| indexing | with_query_results_multiple_individually_indexed_properties_enti | | 1.834% | 1.529% | 2.111% |
Improvements (faster)
| Group | Bench | Param | Change | CI Lower | CI Upper |
| --- | --- | --- | --- | --- | --- |
| large_dataset | bench_filter_indexed_entity | | -10.372% | -20.098% | -1.495% |
| indexing | query_people_count_multiple_individually_indexed_properties_enti | | -8.501% | -9.045% | -7.997% |
| sample_entity | sample_entity_single_property_unindexed | 1000 | -7.080% | -7.773% | -6.359% |
| large_dataset | bench_match_entity | | -7.037% | -7.244% | -6.832% |
| counts | single_property_indexed_entities | | -6.557% | -7.145% | -5.693% |
| indexing | query_people_single_indexed_property_entities | | -6.418% | -8.303% | -4.515% |
| algorithm_benches | algorithm_sampling_single_known_length | | -5.007% | -5.562% | -4.241% |
| sample_entity | sample_entity_single_property_indexed | 10000 | -3.508% | -3.922% | -3.224% |
| sampling | count_and_sampling_single_unindexed_concrete_plus_derived_entiti | | -3.426% | -3.752% | -3.057% |
| sample_entity | sample_entity_single_property_unindexed | 100000 | -3.036% | -3.277% | -2.807% |
| sampling | sampling_multiple_l_reservoir_entities | | -2.872% | -3.099% | -2.642% |
| indexing | query_people_indexed_multi-property_entities | | -2.863% | -3.114% | -2.585% |
| examples | example-basic-infection | | -2.231% | -2.650% | -1.846% |
| sampling | sampling_single_unindexed_concrete_plus_derived_entities | | -2.205% | -2.560% | -1.857% |
| sample_entity | sample_entity_single_property_indexed | 1000 | -1.999% | -2.461% | -1.645% |
| algorithm_benches | algorithm_sampling_multiple_known_length | | -1.992% | -2.580% | -1.623% |
| sampling | sampling_multiple_known_length_entities | | -1.968% | -2.663% | -1.285% |
| counts | multi_property_unindexed_entities | | -1.915% | -2.207% | -1.416% |
| sample_entity | sample_entity_single_property_indexed | 100000 | -1.834% | -2.239% | -1.296% |
| sampling | sampling_single_known_length_entities | | -1.819% | -2.363% | -1.282% |
Unchanged / inconclusive (CI crosses 0%)
| Group | Bench | Param | Change | CI Lower | CI Upper |
| --- | --- | --- | --- | --- | --- |
| large_dataset | bench_filter_unindexed_entity | | -4.547% | -8.200% | -0.658% |
| indexing | query_people_count_indexed_multi-property_entities | | 1.450% | 0.652% | 2.244% |
| indexing | query_people_count_single_indexed_property_entities | | 1.431% | 0.855% | 1.999% |
| large_dataset | bench_query_population_multi_indexed_entities | | -1.094% | -1.389% | -0.839% |
| counts | multi_property_indexed_entities | | -1.084% | -1.742% | -0.595% |
| sample_entity | sample_entity_whole_population | 1000 | -1.058% | -2.248% | -0.127% |
| indexing | query_people_multiple_individually_indexed_properties_entities | | 1.047% | 0.646% | 1.382% |
| large_dataset | bench_query_population_indexed_property_entities | | 0.894% | 0.571% | 1.330% |
| indexing | with_query_results_single_indexed_property_entities | | 0.842% | 0.384% | 1.485% |
| large_dataset | bench_query_population_multi_unindexed_entities | | 0.581% | 0.224% | 1.071% |
| sample_entity | sample_entity_whole_population | 100000 | -0.571% | -0.969% | -0.274% |
| sample_entity | sample_entity_whole_population | 10000 | -0.448% | -0.830% | -0.126% |
| algorithm_benches | algorithm_sampling_single_rand_reservoir | | 0.444% | 0.278% | 0.592% |
| counts | index_after_adding_entities | | -0.416% | -0.544% | -0.299% |
| large_dataset | bench_query_population_derived_property_entities | | 0.389% | -0.066% | 0.898% |
| algorithm_benches | algorithm_sampling_single_l_reservoir | | -0.360% | -0.528% | -0.163% |
| counts | single_property_unindexed_entities | | 0.271% | -0.273% | 0.850% |
| sampling | sampling_single_unindexed_entities | | -0.197% | -0.372% | -0.043% |
| sampling | count_and_sampling_single_known_length_entities | | -0.141% | -0.487% | 0.337% |
| large_dataset | bench_query_population_property_entities | | -0.090% | -0.530% | 0.232% |
| counts | concrete_plus_derived_unindexed_entities | | -0.086% | -0.345% | 0.147% |

Development

Successfully merging this pull request may close these issues.

Add a SourceSet variant for a range of EntityIds

3 participants