Skip to content

Fix non-deterministic field ordering in AnnotationSourceTSV annotation extraction#403

Merged
holmeso merged 3 commits intonanno_multi_genefrom
copilot/sub-pr-400-another-one
Mar 3, 2026
Merged

Fix non-deterministic field ordering in AnnotationSourceTSV annotation extraction#403
holmeso merged 3 commits intonanno_multi_genefrom
copilot/sub-pr-400-another-one

Conversation

Copy link

Copilot AI commented Feb 27, 2026

AnnotationSourceTSV was iterating over HashMap.keySet() to build the precomputed fieldNames/fieldPositions arrays, producing annotation output in arbitrary order rather than the user-requested field order.

Changes

  • AnnotationSourceTSV.java: Replace keyset iteration with fieldNames.split(",") when building precomputed arrays. The HashMap is retained for O(1) position lookups; only the iteration source changes.
// Before — non-deterministic HashMap keyset order
this.fieldNames = headerNameAndPosition.keySet().toArray(new String[0]);
this.fieldPositions = new int[this.fieldNames.length];
for (int i = 0; i < this.fieldNames.length; i++) {
    this.fieldPositions[i] = headerNameAndPosition.get(this.fieldNames[i]);
}

// After — preserves user-requested field order
String[] requestedFields = fieldNames.split(",");
this.fieldNames = requestedFields;
this.fieldPositions = new int[requestedFields.length];
for (int i = 0; i < requestedFields.length; i++) {
    this.fieldPositions[i] = headerNameAndPosition.get(requestedFields[i]);
}

Since getHeaderNameAndPositions also splits on "," to build the map, the keys are guaranteed to match — no risk of NullPointerException at lookup time.

  • AnnotationSourceTSVTest.java: Added fieldOrderPreservedFromUserRequest test that verifies fields are emitted in user-requested order. The test defines a header with fields in order (alpha, beta, gamma), requests them in reverse ("gamma,beta,alpha"), and asserts the output matches the user-requested order rather than the header/HashMap order.

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • Existing AnnotationSourceTSVTest unit tests cover extractFieldsFromRecord and getHeaderNameAndPositions
  • New fieldOrderPreservedFromUserRequest test verifies that annotation fields are emitted in the exact order the user requested, not in arbitrary HashMap keyset order

Are WDL Updates Required?

No WDL changes required.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

…erating over fieldNames.split() instead of HashMap keyset

Co-authored-by: holmeso <7066552+holmeso@users.noreply.github.com>
Copilot AI changed the title [WIP] WIP Address feedback on handling multiple genes in snpeff variant reporting Fix non-deterministic field ordering in AnnotationSourceTSV annotation extraction Feb 27, 2026
@holmeso
Copy link
Contributor

holmeso commented Mar 3, 2026

@copilot could you write a test for this please

…nnotation output

Co-authored-by: holmeso <7066552+holmeso@users.noreply.github.com>
Copy link
Author

Copilot AI commented Mar 3, 2026

@copilot could you write a test for this please

Added fieldOrderPreservedFromUserRequest test in 623d911. It defines a header with fields alpha, beta, gamma (in that order), requests them in reverse ("gamma,beta,alpha"), and asserts the output is gamma=val_gamma\tbeta=val_beta\talpha=val_alpha — confirming the fix preserves user-requested order.

@holmeso holmeso marked this pull request as ready for review March 3, 2026 03:24
@holmeso holmeso merged commit 993ffb3 into nanno_multi_gene Mar 3, 2026
1 check passed
@holmeso holmeso deleted the copilot/sub-pr-400-another-one branch March 3, 2026 03:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants