Atrocious memory footprint and performance when reading many small files #486

@pronebird

Describe the bug
Iterating over a 1.5 GB ZIP archive (947,319 small files) causes the app to consume almost 512 MB of memory on my machine. By comparison, listing the files in the same archive with unzip -l uses only 1.6M.

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://www.sec.gov/search-filings/edgar-application-programming-interfaces
  2. Download submissions.zip under "Bulk Data"
  3. Run this program:
use zip::ZipArchive;

fn main() {
    let reader = std::fs::File::open("submissions.zip").unwrap();
    let mut zip = ZipArchive::new(reader).unwrap();

    let num_files = zip.len();
    for i in 0..num_files {
        // Look up each entry by index and print its name
        let file = zip.by_index(i).unwrap();
        println!("Read file: {}", file.name());
    }
}

Expected behavior
I expected the memory footprint to be reasonably small when only listing file names.
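
To illustrate why a small footprint seems achievable: entry names live in the ZIP central directory, which can in principle be streamed one record at a time. The sketch below is a minimal, std-only illustration of that idea, not how the zip crate or unzip is actually implemented. `stream_names` and `fake_record` are hypothetical names introduced here. The sketch assumes the reader is already positioned at the start of the central directory, and it leaves out locating that position through the end-of-central-directory record (including the ZIP64 variant that an archive with more than 65,535 entries needs).

```rust
use std::io::{self, Cursor, Read};

/// Signature of a central directory file header ("PK\x01\x02"), little-endian.
const CENTRAL_HEADER_SIG: u32 = 0x0201_4b50;

/// Streams entry names from a reader positioned at the start of the central
/// directory (an assumption of this sketch). Only one name is buffered at a
/// time. Returns the number of records visited.
fn stream_names<R: Read>(cd: &mut R, mut visit: impl FnMut(&str)) -> io::Result<u64> {
    let mut count = 0;
    let mut name: Vec<u8> = Vec::new(); // reused for every record
    loop {
        // Read the 4-byte signature; stop cleanly at end of input.
        let mut sig = [0u8; 4];
        match cd.read_exact(&mut sig) {
            Ok(()) => {}
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => break,
            Err(e) => return Err(e),
        }
        if u32::from_le_bytes(sig) != CENTRAL_HEADER_SIG {
            break; // anything else marks the end of the file headers
        }

        // Remaining 42 bytes of the fixed-size header. Indices below are
        // relative to `rest`, i.e. the record offset minus 4.
        let mut rest = [0u8; 42];
        cd.read_exact(&mut rest)?;
        let name_len = u16::from_le_bytes([rest[24], rest[25]]) as usize;
        let extra_len = u16::from_le_bytes([rest[26], rest[27]]) as u64;
        let comment_len = u16::from_le_bytes([rest[28], rest[29]]) as u64;

        name.resize(name_len, 0);
        cd.read_exact(&mut name)?;
        // Names that are not valid UTF-8 are converted lossily.
        let text = String::from_utf8_lossy(&name);
        visit(&*text);

        // Skip the extra field and comment without buffering them.
        io::copy(&mut cd.by_ref().take(extra_len + comment_len), &mut io::sink())?;
        count += 1;
    }
    Ok(count)
}

/// Builds a synthetic central directory record for `name` with every other
/// field zeroed. Hypothetical helper, only used to exercise the sketch.
fn fake_record(name: &str) -> Vec<u8> {
    let mut rec = vec![0u8; 46];
    rec[..4].copy_from_slice(&CENTRAL_HEADER_SIG.to_le_bytes());
    rec[28..30].copy_from_slice(&(name.len() as u16).to_le_bytes());
    rec.extend_from_slice(name.as_bytes());
    rec
}

fn main() -> io::Result<()> {
    let mut bytes = fake_record("a.json");
    bytes.extend(fake_record("b.json"));
    let n = stream_names(&mut Cursor::new(bytes), |name| {
        println!("Read file: {}", name);
    })?;
    println!("{} entries", n);
    Ok(())
}
```

Memory use in this sketch is bounded by the longest single name rather than by the number of entries, which is the property the comparison with unzip -l suggests should be possible for a listing-only pass.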

Desktop (please complete the following information):

  • OS: macOS
  • Version: 15.7.2
