A suite of Dask-powered tools for processing and analyzing terabyte-scale 3D segmentation datasets. Supports both isotropic and anisotropic voxel sizes.

| Tool | CLI Command | Description |
|---|---|---|
| Connected Components | `connected-components` | Threshold predictions, apply masks, and extract connected components. Volume thresholds are in physical units (nm³). |
| Clean Components | `clean-connected-components` | Refine existing segmentations by removing small/large components. |
| Contact Sites | `contact-sites` | Identify regions where two segmentations are within a configurable physical distance. Handles mismatched voxel sizes by resampling to a common resolution. |
| Fill Holes | `fill-holes` | Fill interior gaps in segmented volumes. |
| Filter IDs | `filter-ids` | Exclude unwanted segmentation IDs. |
| Mutex Watershed | `mws` | Mutex watershed agglomeration from affinities. |
| Label With Mask | `label-with-mask` | Label one dataset with IDs from another. |
| Morphological Operations | `morphological-operations` | Erosion and dilation of segmented datasets. Processing order across blocks is not guaranteed. |
| Skeletonize | `skeletonize` | Generate skeletons from segmented objects with optional pruning and simplification. Automatically resamples to isotropic resolution before skeletonization. |

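
As a rough illustration of what the Connected Components row describes, here is a minimal pure-Python sketch on a tiny in-memory volume. It is not the package's Dask implementation: the function name, the `>=` threshold comparison, and the use of face (6-neighbor) connectivity are all assumptions made for the example.

```python
from collections import deque


def connected_components(volume, threshold, voxel_size_nm, min_volume_nm3):
    """Label face-connected voxels at or above `threshold`.

    `volume` is a nested list indexed as volume[z][y][x]. Components whose
    physical volume is below `min_volume_nm3` are left as background (0).
    Hypothetical helper for illustration only.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    voxel_volume_nm3 = voxel_size_nm[0] * voxel_size_nm[1] * voxel_size_nm[2]
    labels = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    visited = [[[False] * nx for _ in range(ny)] for _ in range(nz)]
    # Face neighbors only (assumed connectivity).
    neighbors = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    next_label = 1

    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if visited[z][y][x] or volume[z][y][x] < threshold:
                    continue
                # Breadth-first flood fill collecting one component.
                component = [(z, y, x)]
                visited[z][y][x] = True
                queue = deque(component)
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in neighbors:
                        pz, py, px = cz + dz, cy + dy, cx + dx
                        if (
                            0 <= pz < nz and 0 <= py < ny and 0 <= px < nx
                            and not visited[pz][py][px]
                            and volume[pz][py][px] >= threshold
                        ):
                            visited[pz][py][px] = True
                            component.append((pz, py, px))
                            queue.append((pz, py, px))
                # Keep the component only if its physical volume is large enough.
                if len(component) * voxel_volume_nm3 >= min_volume_nm3:
                    for pz, py, px in component:
                        labels[pz][py][px] = next_label
                    next_label += 1
    return labels


# A 1 x 1 x 4 volume with 8 nm isotropic voxels (512 nm^3 each).
volume = [[[0.9, 0.9, 0.1, 0.8]]]
labels = connected_components(volume, 0.5, (8, 8, 8), 1000)
print(labels)
```

In this toy input the two adjacent bright voxels form one component of 1024 nm³ and are kept, while the isolated bright voxel (512 nm³) falls below the volume threshold and is discarded.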

| Tool | CLI Command | Description |
|---|---|---|
| Measurement | `measure` | Compute metrics (volume, surface area, radius of gyration, bounding box) for objects and contact sites. Supports raw intensity statistics when a raw dataset is provided. |
| Fit Lines | `fit_lines_to_segmentations` | Fit geometric lines to elongated/cylindrical structures. |
| Assign to Cells | `assign_to_cells` | Map segmented objects to cells based on centers of mass. |


All operations handle anisotropic voxel sizes (e.g. (8, 8, 32) nm in ZYX). Physical-unit parameters like `minimum_volume_nm_3`, `contact_distance_nm`, and `gaussian_smoothing_sigma_nm` are automatically converted to the appropriate per-axis voxel units. When two datasets have different voxel sizes, they are resampled to a common resolution using nearest-neighbor interpolation.
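
The unit conversion described above amounts to simple arithmetic, sketched below. The helper names and the 16 nm example length are hypothetical, and the package may round or clamp the results differently; this only shows the basic idea of dividing by the voxel volume or the per-axis voxel size.

```python
import math


def volume_nm3_to_voxels(volume_nm3, voxel_size_nm):
    """Convert a physical volume (nm^3) to an equivalent number of voxels."""
    voxel_volume_nm3 = math.prod(voxel_size_nm)
    return volume_nm3 / voxel_volume_nm3


def length_nm_to_voxels(length_nm, voxel_size_nm):
    """Convert a physical length (nm) to a per-axis length in voxels."""
    return tuple(length_nm / axis_size_nm for axis_size_nm in voxel_size_nm)


voxel_size_nm = (8, 8, 32)  # anisotropic example from the text

# A volume threshold of 1E7 nm^3, expressed as a voxel count.
min_voxels = volume_nm3_to_voxels(1e7, voxel_size_nm)

# A hypothetical 16 nm smoothing sigma, expressed per axis in voxels.
sigma_voxels = length_nm_to_voxels(16, voxel_size_nm)

print(min_voxels)
print(sigma_voxels)
```

With a per-axis result, a single physical length spans more voxels along the finely sampled axes than along the coarse one, which is why the same parameter value can be reused across datasets with different voxel sizes.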

Install with pip:

    pip install cellmap-analyze

All commands share the same basic interface:

    <command> [options] <config_path>

- `<command>`: One of the processing or analysis tools listed above.
- `<config_path>`: Directory containing:
  - `run-config.yaml` (parameters for your chosen command)
  - `dask-config.yaml` (Dask cluster settings)

Options:

- `-n, --num-workers N`: Number of Dask workers to launch.

Output: A new directory named `config_path-<YYYYMMDDHHMMSS>` will be created, containing copies of your configs and an `output.log` for monitoring.

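
Before launching a run, it can help to check that the config directory is complete and to know where the output will land. The sketch below does both; the helper names are hypothetical and this is not the package's own code, only a reading of the layout and naming pattern described above.

```python
from datetime import datetime
from pathlib import Path

# The two files the config directory is expected to contain.
REQUIRED_FILES = ("run-config.yaml", "dask-config.yaml")


def missing_config_files(config_path):
    """Return the required config files that are absent from `config_path`."""
    config_dir = Path(config_path)
    return [name for name in REQUIRED_FILES if not (config_dir / name).is_file()]


def timestamped_output_dir(config_path, now=None):
    """Build a `<config_path>-<YYYYMMDDHHMMSS>` path next to the config directory."""
    now = now or datetime.now()
    config_dir = Path(config_path)
    return config_dir.parent / f"{config_dir.name}-{now:%Y%m%d%H%M%S}"


example_dir = timestamped_output_dir("runs/cc", datetime(2024, 1, 2, 3, 4, 5))
print(example_dir)
print(missing_config_files("runs/cc"))
```

Passing an explicit `now` keeps the helper deterministic, which is convenient when testing scripts that wrap the CLI.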
The following `run-config.yaml` could be used to run `connected-components`:

    input_path: /path/to/predictions.zarr/mito/s0
    output_path: /path/to/segmentations.zarr/mito
    intensity_threshold_minimum: 0.71
    minimum_volume_nm_3: 1E7
    delete_tmp: true
    connectivity: 1
    mask_config:
      cell:
        path: /path/to/masks.zarr/cell/s0
        mask_type: inclusive
    fill_holes: true

The following `dask-config.yaml` files can be used for a variety of tasks.

Local:

    jobqueue:
      local:
        ncpus: 1
        processes: 1
        cores: 1
        log-directory: job-logs
        name: dask-worker
    distributed:
      scheduler:
        work-stealing: true

LSF cluster:

    jobqueue:
      lsf:
        ncpus: 8          # cores per job chunk
        processes: 12     # worker processes per chunk
        cores: 12         # total threads per chunk (12 / 12 processes = 1 thread each)
        memory: 120GB     # 15 GB per slot
        walltime: 08:00
        mem: 12000000000
        use-stdin: true
        log-directory: job-logs
        name: cellmap-analyze
        project: charge_group
    distributed:
      scheduler:
        work-stealing: true
      admin:
        log-format: '[%(asctime)s] %(levelname)s %(message)s'
        tick:
          interval: 20ms
          limit: 3h

To run on 12 Dask workers:

Local run example:

    connected-components -n 12 config_path

Cluster submit example (LSF):

    bsub -n 4 -P chargegroup connected-components -n 12 config_path

The center-finding implementation is taken from `funlib.evaluate`.