Improve memory usage in uvfits reader#1651

Open
bhazelton wants to merge 4 commits into main from mem_fixes
bhazelton (Member) commented Feb 24, 2026

Description

This makes a few memory handling improvements to the uvfits reader (and maybe the MWA correlator FITS reader).

It turns out that when using astropy.io.fits.open with memmap=True, the Python garbage collector does not release the memory promptly upon exiting the context manager. We were always passing memmap=True, but it's really only needed for partial reads. This PR changes the calls to astropy.io.fits.open to use memmap=True only when doing partial reads, which significantly improves memory usage in the uvfits reader when reading the whole file. It's not clear whether the MWA correlator FITS reader sees any memory improvement, but it seemed better to be consistent.
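A minimal sketch of the idea, not the actual reader code: the helper names `is_partial_read` and `open_uvfits` and the selection keywords are hypothetical, and only the `memmap` keyword of astropy.io.fits.open is taken from the description above.

```python
def is_partial_read(**select_kwargs):
    """Return True if any selection keyword was actually supplied."""
    return any(value is not None for value in select_kwargs.values())


def open_uvfits(filename, **select_kwargs):
    """Open a FITS file, memory-mapping it only for partial reads.

    Hypothetical helper for illustration; not part of pyuvdata.
    """
    # Imported lazily so the selection logic above works without astropy.
    from astropy.io import fits

    # memmap=True only when a subset of the data was requested.
    return fits.open(filename, memmap=is_partial_read(**select_kwargs))
```

With this sketch, a call such as open_uvfits(path) reads without memory mapping, while open_uvfits(path, freq_chans=[0, 1]) keeps memmap=True so that only the requested part of the file needs to be loaded.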

I also refactored the uvfits reader to remove a temporary array that was being used to store and manipulate the data stored in the primary uvfits HDU before assigning parts of it to the data_array, nsample_array and flag_array. I now have it just assign the full data from the HDU to the data_array and do all the manipulations in place in that array. It feels a bit yucky but leads to much better memory usage.
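As a generic illustration of the in-place pattern (not the actual refactor; the function names and the scale-then-conjugate operations are assumptions chosen only to show the technique), compare a version that allocates temporaries with one that reuses the input buffer:

```python
import numpy as np


def scale_and_conjugate_with_temporaries(data, scale):
    """Allocate new full-size arrays for the intermediate and final results."""
    return np.conjugate(data * scale)


def scale_and_conjugate_in_place(data, scale):
    """Do the same manipulation while reusing the input array's memory."""
    data *= scale                  # augmented assignment writes into data
    np.conjugate(data, out=data)   # out= avoids allocating a result array
    return data
```

The in-place version returns the same array object it was given, so no second full-size copy of the visibilities exists at any point. The cost is that the caller's array is modified, which is the "a bit yucky" trade-off.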

Below are plots of the memory usage as measured by memray for a small script that just reads in an MWA uvfits file (the blue resident size is the most important). There is some run-to-run variability in these plots that I don't fully understand, so I'm including 2 plots for each codebase, taken on different days. On each day, the runs for all three codebases were done back to back (all on my laptop):

main branch:
uvfits_read_main
uvfits_read_main_2

After using memmap=False:
uvfits_read_fix1
uvfits_read_fix1_2

After refactoring out the temporary array:
uvfits_read_fix2
uvfits_read_fix2_2

Note that the peak memory use doesn't change much (and it seems quite variable in the last set of plots), but there's a consistent big change in the final memory usage at the end of the script. If you are doing anything with the object after reading it, that becomes your baseline memory usage for the next step. I actually stumbled on this when trying to profile something else (frequency averaging) and discovered that the memory usage was much higher when I used a script that started by reading in a uvfits file than when I started by reading in the same data saved as a uvh5 file.
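To make the peak versus final distinction concrete, here is a small sketch that uses the standard-library tracemalloc module instead of memray. This is a swapped-in tool: tracemalloc tracks Python-level allocations rather than resident size, and the function names and the toy workload are assumptions for illustration only.

```python
import tracemalloc


def measure(func):
    """Run func and return (result, final_bytes, peak_bytes) as traced."""
    tracemalloc.start()
    try:
        result = func()
        # current is what is still allocated after func returns;
        # peak is the high-water mark reached while it ran.
        current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, current, peak


def build_with_temporary(n=1_000_000):
    """Toy workload: a large temporary buffer, of which a quarter is kept."""
    temp = bytearray(n)           # freed when the function returns
    return bytes(temp[: n // 4])  # the only allocation that survives
```

Calling measure(build_with_temporary) gives a peak of at least the size of the temporary buffer, while the final figure reflects only what is kept. The final figure is the baseline that any later processing step starts from.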

Similar plots for the MWA Correlator FITS reader seem to be dominated by run-to-run variability -- I don't see a consistent improvement with these changes, but I don't see it getting materially worse either.

We could certainly make similar changes in the calfits and beamfits readers, but I wasn't sure I should cram them into this PR.

Motivation and Context

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation change (documentation changes only)
  • Version change
  • Build or continuous integration change
  • Other

Checklist:

Other:

  • I have updated any docstrings associated with my change using the numpy docstring format.
  • I have updated the readme and/or tutorial to reflect my changes (if appropriate).
  • I have added/updated tests to cover my changes (if appropriate).
  • I have updated the CHANGELOG.

codecov bot commented Feb 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.93%. Comparing base (ad3e728) to head (3de7ed6).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1651   +/-   ##
=======================================
  Coverage   99.93%   99.93%           
=======================================
  Files          67       67           
  Lines       22688    22705   +17     
=======================================
+ Hits        22674    22691   +17     
  Misses         14       14           
