fix: resolve pruning pipeline bugs #265

Draft
nh13 wants to merge 1 commit into broadinstitute:main from nh13:nh/fix-pruning-pipeline

Conversation

@nh13
Collaborator

@nh13 nh13 commented Feb 28, 2026

Summary

  • num_epochs was passed where num_workers was expected in the make_data_loader call, potentially launching hundreds of worker processes
  • calculate_pruning_thresholds had a vestigial for fold in range(NUM_FOLDS) loop that always returned on the first iteration; removed the loop
  • Added division-by-zero guards with informative ValueError messages to the error rate and inverse error rate calculations
  • Fixed an operator precedence bug in the fold index: a + b % c evaluated as a + (b % c) instead of the intended (a + b) % c
  • Fixed a return type annotation: List[int] -> Generator[Datum, None, None]
  • Fixed incorrect argparse descriptions in prune_dataset.py and edit_dataset.py (both said "train the Mutect3 artifact model")
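
For reviewers, here is a minimal sketch of the operator precedence issue from the summary. The function names and the NUM_FOLDS value are hypothetical and chosen only for illustration; they are not taken from the diff. In Python, % binds more tightly than +, so without parentheses the result can fall outside the valid fold range.

```python
# Hypothetical illustration of the fold-index precedence bug; names and the
# fold count are assumptions, not the actual code in this PR.
NUM_FOLDS = 3


def fold_index_buggy(a: int, b: int, c: int) -> int:
    # '%' has higher precedence than '+', so this computes a + (b % c).
    return a + b % c


def fold_index_fixed(a: int, b: int, c: int) -> int:
    # Parenthesize the sum so the result always lies in range(c).
    return (a + b) % c


if __name__ == "__main__":
    for a in range(NUM_FOLDS):
        for b in range(NUM_FOLDS):
            buggy = fold_index_buggy(a, b, NUM_FOLDS)
            fixed = fold_index_fixed(a, b, NUM_FOLDS)
            flag = "" if buggy == fixed else "  <- differs"
            print(f"a={a} b={b} buggy={buggy} fixed={fixed}{flag}")
```

With a = 2, b = 2, c = 3 the unparenthesized form gives 2 + (2 % 3) = 4, which is not a valid index into three folds, while the parenthesized form gives (2 + 2) % 3 = 1.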

Test plan

  • test_prune_dataset integration test passes
  • Verify pruning runs to completion on example data

@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 3fc2b19 to 62e3fb4 Compare March 4, 2026 23:44
@nh13 nh13 changed the base branch from main to nh/fix-broken-tests March 4, 2026 23:45
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 04da066 to 21a66ec Compare March 4, 2026 23:51
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 62e3fb4 to 14b83a6 Compare March 4, 2026 23:51
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 21a66ec to 6708c39 Compare March 4, 2026 23:56
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 14b83a6 to 00de1c3 Compare March 4, 2026 23:56
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 6708c39 to 4da02e3 Compare March 5, 2026 01:22
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 00de1c3 to abc7026 Compare March 5, 2026 01:22
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 4da02e3 to e76c0e1 Compare March 5, 2026 16:10
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from abc7026 to 7258bee Compare March 5, 2026 16:11
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from e76c0e1 to 4047e6c Compare March 5, 2026 16:45
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 7258bee to 40eb370 Compare March 5, 2026 16:45
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 4047e6c to a71a6b7 Compare March 10, 2026 17:17
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 40eb370 to 1b28cde Compare March 10, 2026 17:17
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from a71a6b7 to 3c4f160 Compare March 10, 2026 17:20
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 1b28cde to f50802c Compare March 10, 2026 17:20
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 3c4f160 to 1f4bffd Compare March 12, 2026 04:38
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from f50802c to ac1c6dc Compare March 12, 2026 04:38
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 1f4bffd to 78a2b91 Compare March 12, 2026 04:39
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from ac1c6dc to e9a3cc1 Compare March 12, 2026 04:39
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 78a2b91 to f35061b Compare March 14, 2026 03:16
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from e9a3cc1 to 9af8e28 Compare March 14, 2026 03:16
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from f35061b to 04c6545 Compare March 14, 2026 03:19
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 9af8e28 to cf00273 Compare March 14, 2026 03:19
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 04c6545 to 0ee302e Compare March 16, 2026 18:37
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from cf00273 to 6d78ec5 Compare March 16, 2026 18:37
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 0ee302e to 617749c Compare March 16, 2026 20:20
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 6d78ec5 to 706cd36 Compare March 16, 2026 20:20
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from 617749c to a7c5dfa Compare March 16, 2026 20:34
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 706cd36 to c4dd9c4 Compare March 16, 2026 20:34
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from a7c5dfa to bf0ece0 Compare March 16, 2026 20:36
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from c4dd9c4 to a35feda Compare March 16, 2026 20:36
@nh13 nh13 force-pushed the nh/fix-broken-tests branch from bf0ece0 to 1321324 Compare March 17, 2026 17:59
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from a35feda to 2b5ad28 Compare March 17, 2026 17:59
- num_epochs was passed where num_workers was expected in
  make_data_loader call, potentially launching hundreds of workers
- calculate_pruning_thresholds had a vestigial for-loop that always
  returned on the first iteration; removed the loop
- Added division-by-zero guards with informative error messages in
  error rate and inverse error rate calculations
- Fixed operator precedence bug: a + b % c evaluated as a + (b % c)
  instead of (a + b) % c for fold index
- Fixed return type annotation: List[int] -> Generator[Datum, None, None]
- Fixed wrong argparse descriptions in prune_dataset.py and
  edit_dataset.py (both said "train the Mutect3 artifact model")
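A rough sketch of the kind of division-by-zero guard described in the commit message above. The helper name guarded_ratio, its parameters, and the message wording are assumptions made for illustration; the actual guards in the pruning code may be structured differently.

```python
# Hypothetical helper showing a division-by-zero guard with an informative
# ValueError; not the actual function added in this PR.
def guarded_ratio(numerator: float, denominator: float, description: str) -> float:
    if denominator == 0:
        # Fail early with context instead of a bare ZeroDivisionError.
        raise ValueError(
            f"Cannot compute {description}: denominator is zero "
            f"(numerator={numerator}). Check that the dataset is not empty."
        )
    return numerator / denominator


if __name__ == "__main__":
    print(guarded_ratio(5, 100, "error rate"))
    try:
        guarded_ratio(5, 0, "error rate")
    except ValueError as err:
        print(f"ValueError: {err}")
```

Raising ValueError with the quantity named in the message makes an empty or degenerate input easier to diagnose than a ZeroDivisionError raised deep inside the threshold calculation.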
@nh13 nh13 changed the base branch from nh/fix-broken-tests to main March 18, 2026 04:59
@nh13 nh13 force-pushed the nh/fix-pruning-pipeline branch from 2b5ad28 to 0137d52 Compare March 18, 2026 05:00
@nh13 nh13 marked this pull request as draft March 18, 2026 05:01
1 participant