Blog post topic: Encodings #18
Something that has recently intrigued me is the choice of data format for files. Whether it be CSV, JSON, pickle, Zarr, netCDF, Parquet, Arrow, COG, or Icechunk, there isn't a right or wrong answer, just trade-offs for what can and can't be done.
I'd like to dive very deep into the encodings/backends that explain why these data formats shine in one way or another. Examples include how Parquet is column-oriented and stores a schema, how Zarr/Icechunk data is compressed and chunked, etc.
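As a rough illustration of the column-oriented idea, here's a plain-Python sketch using JSON and zlib as stand-ins for a real row format and a real columnar encoder like Parquet (which uses its own binary encodings and compression codecs, not shown here). The point is only the layout: storing all values of one field together removes repeated field names and puts similar values next to each other.

```python
import json
import zlib

# A small table of 1000 records with an id, a constant category, and a value.
rows = [{"id": i, "category": "sensor", "value": 0.5} for i in range(1000)]

# Row-oriented layout (like CSV/JSON lines): each record stored contiguously,
# so the field names "id", "category", "value" repeat in every record.
row_bytes = json.dumps(rows).encode()

# Column-oriented layout (the idea behind Parquet/Arrow): all values of one
# field stored together, field names stored once.
columns = {
    "id": [r["id"] for r in rows],
    "category": [r["category"] for r in rows],
    "value": [r["value"] for r in rows],
}
col_bytes = json.dumps(columns).encode()

# The columnar layout is smaller even before compression, and runs of
# similar adjacent values tend to compress well too.
print("row-oriented:", len(row_bytes), "bytes")
print("column-oriented:", len(col_bytes), "bytes")
print("row compressed:", len(zlib.compress(row_bytes)), "bytes")
print("column compressed:", len(zlib.compress(col_bytes)), "bytes")
```

Real columnar formats go further than this sketch: Parquet applies per-column encodings like dictionary and run-length encoding, which is exactly the kind of detail the post could dig into.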
I think this would be useful for understanding when to reach for which format, and for avoiding "using something because everyone else does".