
ASTER Time Series Analysis

A Python package for global time series analysis of ASTER public data with support for spatial subsetting (bounding box or polygon), scalable processing, and multiple data access patterns.

Features

  • 🌍 Global scale – tile-based processing supports whole-of-Earth analyses
  • 📦 Spatial subsetting – bounding box or polygon (GeoJSON/Shapefile/GeoDataFrame), with predefined regions (South Australia, Australia, CONUS, etc.)
  • 🕐 Time series extraction – stack scenes along a time dimension using xarray + dask
  • ⚡ Scalable – chunked parallel processing via dask and thread pools
  • 🛰️ Multiple data sources – NASA Earthdata (via earthaccess) and Microsoft Planetary Computer (via pystac-client)
  • 📊 Spectral indices – NDVI, brightness temperature (bands 10–14), SWIR band ratios (mineral mapping)
  • 💾 Export – NetCDF, GeoTIFF, CSV
  • 🖥️ CLI – aster-timeseries command for scripting and automation

Supported ASTER Products

| Product | Description | What you get |
| --- | --- | --- |
| AST_L1T | Level 1T Precision Terrain Corrected Registered Radiance | All three instrument subsystems from the same scene: VNIR bands 1, 2, 3N; SWIR bands 4-9; TIR bands 10-14 |
| AST_07 | Surface Emissivity | TIR emissivity product |
| AST_08 | Surface Kinetic Temperature | TIR temperature product |
| AST_09 | Surface Spectral Radiance – SWIR | SWIR-only surface radiance product |
| AST_09T | Surface Spectral Radiance – TIR | TIR-only surface radiance product |
| AST_09XT | Surface Spectral Radiance – VNIR + TIR | VNIR and TIR together, but no SWIR |
| AST14DEM | Digital Elevation Model | Elevation raster |

Choosing the right product, subsystem, and band

ASTER data in this package is organised in two layers:

  1. Product: the NASA data product you search for, such as AST_L1T or AST_09T
  2. Subsystem: the part of the ASTER instrument you open from a file:
    • VNIR = bands 1, 2, 3N
    • SWIR = bands 4, 5, 6, 7, 8, 9
    • TIR = bands 10, 11, 12, 13, 14

When you call ASTERDataAccess.open_dataset(..., subsystem="..."), the code loads one subsystem at a time from the HDF file.

Common questions:

  • I only want band 2 → search AST_L1T, open subsystem="VNIR", and use ImageData2
  • I only want band 5 → search AST_L1T, open subsystem="SWIR", and use ImageData5
  • I want both band 2 and band 5 from the same scene → use AST_L1T, because that is the product in this package that contains both VNIR and SWIR measurements for the same acquisition
  • I only need VNIR + TIR, not SWIR → AST_09XT may be a better fit than AST_L1T
  • I only need thermal temperature → AST_08 is the dedicated temperature product

Example: extracting band 2 and band 5 from the same AST_L1T file:

from aster_timeseries import ASTERCatalog, ASTERDataAccess, BoundingBox

catalog = ASTERCatalog()
scenes = catalog.search(
    product="AST_L1T",
    bbox=BoundingBox.from_named_region("south_australia"),
    start_date="2020-01-01",
    end_date="2020-12-31",
)

access = ASTERDataAccess()
local_files = access.download(scenes[:1])

vnir = access.open_dataset(local_files[0], subsystem="VNIR")
swir = access.open_dataset(local_files[0], subsystem="SWIR")

band2 = vnir["ImageData2"]
band5 = swir["ImageData5"]

Installation

pip install aster-timeseries

For distributed/parallel processing:

pip install "aster-timeseries[distributed]"

For visualisation:

pip install "aster-timeseries[viz]"

Requirements

  • Python ≥ 3.9
  • NASA Earthdata account (free): https://urs.earthdata.nasa.gov
  • Earthdata credentials configured for earthaccess (see detailed setup below)

Earthdata authentication and local paths

The default catalog backend is NASA Earthdata. In this repository, both ASTERCatalog() and ASTERDataAccess.download() delegate authentication to earthaccess, which is called with:

earthaccess.login(strategy="netrc", persist=False)

That means the package is not reading a random path by itself: it is asking earthaccess to authenticate using the credentials available in your user environment. The most reliable setup is a netrc file in your home directory.

What ~ means

In the README and examples, ~ means your home directory:

  • Linux: usually /home/your-user
  • macOS: usually /Users/your-user
  • Windows: usually C:\Users\your-user

So:

  • ~/.netrc on Linux might be /home/alex/.netrc
  • ~/.netrc on macOS might be /Users/alex/.netrc
  • the rough Windows equivalent is usually C:\Users\alex\_netrc
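In Python, the same expansion is available from the standard library, which is a quick way to see exactly which file this README means on your machine:

```python
import os
from pathlib import Path

# Path.home() is the directory that "~" expands to, on every platform.
netrc_name = "_netrc" if os.name == "nt" else ".netrc"  # Windows convention differs
netrc_path = Path.home() / netrc_name
print(netrc_path)
```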

Recommended Earthdata setup

Create an Earthdata account at https://urs.earthdata.nasa.gov, then create a netrc file containing your credentials:

machine urs.earthdata.nasa.gov
  login YOUR_EARTHDATA_USERNAME
  password YOUR_EARTHDATA_PASSWORD

Typical locations:

  • Linux: /home/your-user/.netrc
  • macOS: /Users/your-user/.netrc
  • Windows: C:\Users\your-user\_netrc

On Linux/macOS, restrict the file permissions so other users cannot read it:

chmod 600 ~/.netrc
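To sanity-check the file format, Python's standard-library netrc module parses the same syntax that earthaccess's netrc strategy reads. A self-contained round trip, using placeholder credentials and a temporary file rather than your real ~/.netrc:

```python
import netrc
import os
import tempfile

# Write a throwaway netrc with placeholder credentials, then parse it back.
content = (
    "machine urs.earthdata.nasa.gov\n"
    "  login YOUR_EARTHDATA_USERNAME\n"
    "  password YOUR_EARTHDATA_PASSWORD\n"
)
fd, path = tempfile.mkstemp()
try:
    with os.fdopen(fd, "w") as f:
        f.write(content)
    # authenticators() returns (login, account, password) for the machine entry.
    login, account, password = netrc.netrc(path).authenticators("urs.earthdata.nasa.gov")
finally:
    os.remove(path)
```

If this parse fails on your real file, earthaccess's netrc strategy will fail in the same way.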

Environment variable alternative

If you prefer environment variables, earthaccess can also use these:

Linux/macOS

export EARTHDATA_USERNAME="your_username"
export EARTHDATA_PASSWORD="your_password"

Windows PowerShell

setx EARTHDATA_USERNAME "your_username"
setx EARTHDATA_PASSWORD "your_password"

If you use setx, open a new terminal before running aster-timeseries.
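A small helper of this shape (hypothetical, not part of the package) can confirm the variables are visible to the process before you start a long run:

```python
import os

def missing_earthdata_vars(env=os.environ):
    """Return the names of Earthdata credential variables not set in env."""
    required = ("EARTHDATA_USERNAME", "EARTHDATA_PASSWORD")
    return [name for name in required if not env.get(name)]
```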

What the package looks for, and when

  • ASTERCatalog(backend="earthaccess") needs Earthdata credentials so it can search NASA CMR
  • ASTERDataAccess.download(...) needs the same credentials so it can download the matching granules
  • ASTERDataAccess.open_dataset(...) reads a local .hdf file and does not search for credentials
  • ASTERCatalog(backend="planetary_computer") uses Microsoft Planetary Computer instead, so Earthdata credentials are not needed for that backend

Cache and output paths

If you construct ASTERDataAccess() without a cache_dir, downloads are stored under:

  • Linux: /home/your-user/.cache/aster_timeseries
  • macOS: /Users/your-user/.cache/aster_timeseries
  • Windows: C:\Users\your-user\.cache\aster_timeseries

By default, downloaded files are organised into product subdirectories such as:

  • /home/your-user/.cache/aster_timeseries/AST_L1T/
  • C:\Users\your-user\.cache\aster_timeseries\AST_09T\
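These defaults can be reconstructed from the home directory. The sketch below mirrors the documented layout; if you need certainty about where files land, pass cache_dir explicitly rather than relying on it:

```python
from pathlib import Path

# Default cache root as described above; the package may compute this differently.
cache_root = Path.home() / ".cache" / "aster_timeseries"
product_dir = cache_root / "AST_L1T"  # per-product subdirectory
```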

Quick Start

Python API

from aster_timeseries import ASTERCatalog, ASTERDataAccess, ASTERTimeSeries, BoundingBox

# 1. Define your region of interest
bbox = BoundingBox.from_named_region("south_australia")
# Or use a custom bounding box:
# bbox = BoundingBox(west=129.0, south=-38.1, east=141.0, north=-26.0)
# Or load from a GeoJSON/Shapefile:
# bbox, gdf = load_region("path/to/south_australia.geojson")

# 2. Search for scenes
catalog = ASTERCatalog()   # uses NASA Earthdata by default
scenes = catalog.search(
    product="AST_L1T",
    bbox=bbox,
    start_date="2015-01-01",
    end_date="2020-12-31",
    cloud_cover_max=20,
)
print(f"Found {len(scenes)} scenes")

# 3. Download and open scenes (lazy, backed by dask)
#    Default cache path is ~/.cache/aster_timeseries on Linux/macOS
#    and C:\Users\<you>\.cache\aster_timeseries on Windows.
access = ASTERDataAccess()
datasets = access.open_scenes(scenes[:50], subsystem="VNIR")

# 4. Build time series
ts = ASTERTimeSeries(scenes[:50], datasets)
print(ts)
# ASTERTimeSeries(n_scenes=50, products=['AST_L1T'], start=2015-01-03, end=2019-12-28)

# 5. Compute NDVI time series
ndvi_ts = ts.compute_index("ndvi")  # shape: (time, y, x)

# 6. Area-averaged time series
ndvi_mean = ts.area_mean("ndvi")    # shape: (time,)

# 7. Resample to monthly means
monthly = ts.resample(freq="ME", variable="ImageData3N")

# 8. Export
ts.to_netcdf("south_australia_vnir.nc")
ts.to_csv("south_australia_ndvi.csv", variable="ndvi")
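For reference, compute_index("ndvi") presumably applies the standard NDVI formula; a minimal scalar sketch (for ASTER VNIR, red is band 2 / ImageData2 and near-infrared is band 3N / ImageData3N):

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Returns NaN where NIR + Red is zero (e.g. fill or fully dark pixels).
    """
    denom = nir + red
    return float("nan") if denom == 0 else (nir - red) / denom
```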

Global Processing

For global or continental analyses, use GlobalASTERProcessor which tiles the region automatically:

from aster_timeseries import ASTERCatalog, ASTERDataAccess
from aster_timeseries.processing import GlobalASTERProcessor

proc = GlobalASTERProcessor(
    catalog=ASTERCatalog(),
    access=ASTERDataAccess(),
    tile_size_deg=10.0,   # process in 10° x 10° tiles
    n_workers=8,
)

output_paths = proc.run(
    product="AST_L1T",
    start_date="2015-01-01",
    end_date="2015-03-31",
    output_dir="./output/global",
    subsystem="TIR",
    cloud_cover_max=30,
)
print(f"Wrote {len(output_paths)} tile files")
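The tiling that tile_size_deg drives can be sketched as follows (an illustrative helper, not the package's internal implementation). A 12° × 12.1° region with 10° tiles yields a 2 × 2 grid, with the edge tiles clipped to the region boundary:

```python
def make_tiles(west, south, east, north, tile_size_deg):
    """Split a bounding box into (west, south, east, north) tiles
    of at most tile_size_deg per side, clipped to the region."""
    tiles = []
    y = south
    while y < north:
        top = min(y + tile_size_deg, north)
        x = west
        while x < east:
            right = min(x + tile_size_deg, east)
            tiles.append((x, y, right, top))
            x = right
        y = top
    return tiles

# South Australia extent split into 10-degree tiles
tiles = make_tiles(129.0, -38.1, 141.0, -26.0, 10.0)
```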

Physical Units

Convert raw digital numbers (DN) to at-sensor radiance:

from aster_timeseries.processing import apply_radiance_conversion

raw_ds = access.open_dataset("AST_L1T_scene.hdf", subsystem="VNIR")
radiance_ds = apply_radiance_conversion(raw_ds)

Compute TIR brightness temperature:

from aster_timeseries.timeseries import brightness_temperature

bt = brightness_temperature(radiance_ds["ImageData10"], band=10)
# bt in Kelvin
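A brightness-temperature routine of this kind is typically a Planck inversion. As an illustration (not the package's implementation, which may use calibrated band coefficients), assuming a band-10 centre wavelength of roughly 8.29 µm:

```python
import math

# CODATA constants
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m s^-1
K = 1.380649e-23     # Boltzmann constant, J K^-1

def planck_radiance(wavelength_um, temp_k):
    """Blackbody spectral radiance in W m^-2 sr^-1 um^-1."""
    lam = wavelength_um * 1e-6  # to metres
    radiance_m = 2.0 * H * C**2 / (lam**5 * (math.exp(H * C / (lam * K * temp_k)) - 1.0))
    return radiance_m * 1e-6    # per metre -> per micrometre

def brightness_temperature_k(wavelength_um, radiance_um):
    """Invert the Planck function: the temperature whose blackbody radiance
    at this wavelength matches the observed spectral radiance."""
    lam = wavelength_um * 1e-6
    radiance_m = radiance_um * 1e6
    return (H * C / (lam * K)) / math.log(1.0 + 2.0 * H * C**2 / (lam**5 * radiance_m))
```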

Mineral Mapping Band Ratios

# Kaolinite/Muscovite indicator
ratio = ts.compute_index("ratio_4_6")

# Alunite indicator
ratio = ts.compute_index("ratio_5_7")

# Carbonate indicator
ratio = ts.compute_index("ratio_6_8")

Planetary Computer Backend

from aster_timeseries import ASTERCatalog

# Use Microsoft Planetary Computer (no authentication needed for ASTER L1T)
catalog = ASTERCatalog(backend="planetary_computer")
scenes = catalog.search(
    product="AST_L1T",
    bbox=BoundingBox.from_named_region("south_australia"),
    start_date="2018-01-01",
    end_date="2018-12-31",
)

Scaling without Planetary Computer

Planetary Computer is optional. For larger runs on your own infrastructure, the main alternatives are:

  • NASA Earthdata + tiled processing: use the default earthaccess backend and run GlobalASTERProcessor or aster-timeseries global-run over a large region.
  • Local or on-prem Dask/distributed: install pip install "aster-timeseries[distributed]" and connect to an existing Dask scheduler before opening scenes. The datasets produced by ASTERDataAccess.open_dataset(..., chunks="auto") are already chunked for xarray/dask workflows.
  • Cloud or HPC Dask clusters: if you already provision clusters with tools such as Dask CloudProvider, Coiled, Kubernetes, or an HPC scheduler, create the cluster externally and then use the Python API against that scheduler. The package does not provision cloud infrastructure by itself.

Example: Earthdata search + a Dask scheduler you manage yourself:

from dask.distributed import Client

from aster_timeseries import ASTERCatalog, ASTERDataAccess
from aster_timeseries.processing import GlobalASTERProcessor

client = Client("tcp://scheduler-host:8786")

proc = GlobalASTERProcessor(
    catalog=ASTERCatalog(backend="earthaccess"),
    access=ASTERDataAccess(n_workers=16),
    tile_size_deg=5.0,
    n_workers=16,
)

CLI example for a large Earthdata-backed tiled run:

aster-timeseries global-run \
    --product AST_L1T \
    --start 2015-01-01 \
    --end 2015-03-31 \
    --tile-size 10 \
    --backend earthaccess \
    --n-workers 16 \
    --output-dir ./output/global

Slurm backend on Azure CycleCloud

If you want to fan out a large tiled run across a Slurm cluster, global-run can generate one Slurm job per tile instead of processing the tiles inline. This works well on an Azure CycleCloud Slurm cluster because CycleCloud already provides the scheduler, login node, and compute nodes; this package only needs to generate and submit standard sbatch jobs.

1. Provision a CycleCloud Slurm cluster

On Azure CycleCloud, create or reuse a Slurm cluster with:

  • a scheduler/login node where you can run aster-timeseries
  • a shared filesystem visible from the login node and compute nodes for:
    • your Python environment or repo checkout
    • the output directory
    • optional cache directories and job scripts
  • a Slurm partition sized for the ASTER jobs you plan to run

The examples below assume a shared path such as /shared, but any shared mount works.

2. Install the package in a shared Python environment

SSH to the CycleCloud scheduler/login node and create a shared environment:

python -m venv /shared/venvs/aster-timeseries
source /shared/venvs/aster-timeseries/bin/activate
pip install --upgrade pip
pip install "aster-timeseries[distributed]"

If you are developing from this repository instead of installing from PyPI:

git clone https://github.com/RichardScottOZ/ASTER-analysis /shared/src/ASTER-analysis
source /shared/venvs/aster-timeseries/bin/activate
pip install -e "/shared/src/ASTER-analysis[dev,distributed]"

Because the Slurm jobs call python -m aster_timeseries.cli, the compute nodes need access to the same Python environment and installed package.

3. Make credentials available to compute nodes

For the default earthaccess data backend, the Slurm job needs the same Earthdata credentials described earlier in this README.

The most reliable CycleCloud setup is:

  • keep ~/.netrc in a home directory that is mounted on compute nodes, or
  • export EARTHDATA_USERNAME / EARTHDATA_PASSWORD from the environment that launches sbatch

If you use a custom cache directory, point it at shared storage so multiple jobs can reuse downloaded HDF files:

mkdir -p /shared/aster-cache /shared/aster-output /shared/aster-jobs

4. Generate Slurm job scripts for each tile

Use the normal data backend selection (--backend) together with the new execution backend (--execution-backend slurm):

source /shared/venvs/aster-timeseries/bin/activate

aster-timeseries global-run \
    --product AST_L1T \
    --start 2019-01-01 \
    --end 2019-03-31 \
    --bbox 129.0 -38.1 141.0 -26.0 \
    --tile-size 2 \
    --subsystem VNIR \
    --backend earthaccess \
    --execution-backend slurm \
    --n-workers 8 \
    --cache-dir /shared/aster-cache \
    --output-dir /shared/aster-output \
    --slurm-script-dir /shared/aster-jobs \
    --slurm-partition hpc \
    --slurm-account my-project \
    --slurm-time 02:00:00 \
    --slurm-memory 16G \
    --slurm-cpus-per-task 8

This writes one .sbatch file per tile under /shared/aster-jobs/ and matching log files under /shared/aster-jobs/logs/.

Each generated job script runs the per-tile equivalent of:

python -m aster_timeseries.cli run \
    --bbox WEST SOUTH EAST NORTH \
    --product AST_L1T \
    --start 2019-01-01 \
    --end 2019-03-31 \
    --subsystem VNIR \
    --backend earthaccess \
    --n-workers 8 \
    --cache-dir /shared/aster-cache \
    --output /shared/aster-output/AST_L1T_<tile>.nc
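A generator for such scripts can be sketched as follows (hypothetical template and helper; the package's generated .sbatch files may differ in detail):

```python
from pathlib import Path

SBATCH_TEMPLATE = """#!/bin/bash
#SBATCH --job-name=aster_{tag}
#SBATCH --partition={partition}
#SBATCH --time={time}
#SBATCH --mem={memory}
#SBATCH --cpus-per-task={cpus}
#SBATCH --output={log_dir}/{tag}.out

python -m aster_timeseries.cli run \\
    --bbox {west} {south} {east} {north} \\
    --product {product} \\
    --start {start} \\
    --end {end} \\
    --output {output_dir}/{product}_{tag}.nc
"""

def write_tile_script(script_dir, tile, product, start, end, output_dir,
                      partition="hpc", time="02:00:00", memory="16G", cpus=8):
    """Render one sbatch file for a (west, south, east, north) tile."""
    west, south, east, north = tile
    tag = f"{west}_{south}_{east}_{north}"
    script_dir = Path(script_dir)
    (script_dir / "logs").mkdir(parents=True, exist_ok=True)
    path = script_dir / f"{product}_{tag}.sbatch"
    path.write_text(SBATCH_TEMPLATE.format(
        tag=tag, partition=partition, time=time, memory=memory, cpus=cpus,
        log_dir=script_dir / "logs", west=west, south=south, east=east,
        north=north, product=product, start=start, end=end,
        output_dir=output_dir))
    return path
```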

5. Submit the jobs

You can submit the generated scripts manually:

sbatch /shared/aster-jobs/AST_L1T_129.0_-38.1_131.0_-36.1.sbatch

Or have global-run submit them immediately:

aster-timeseries global-run \
    --product AST_L1T \
    --start 2019-01-01 \
    --end 2019-03-31 \
    --bbox 129.0 -38.1 141.0 -26.0 \
    --tile-size 2 \
    --backend earthaccess \
    --execution-backend slurm \
    --output-dir /shared/aster-output \
    --slurm-script-dir /shared/aster-jobs \
    --slurm-submit

Use squeue, sacct, and the per-tile log files in /shared/aster-jobs/logs/ to monitor progress. Re-running the command safely rewrites the .sbatch files for the same tiles, which is helpful when tuning time, memory, or partition settings on Azure CycleCloud.

CLI

# List available products
aster-timeseries products

# List predefined regions
aster-timeseries regions

# Search for scenes (outputs JSON)
aster-timeseries search \
    --region south_australia \
    --product AST_L1T \
    --start 2015-01-01 \
    --end 2020-12-31 \
    --output scenes.json

# Download and build a NetCDF time series
aster-timeseries run \
    --region south_australia \
    --product AST_L1T \
    --start 2015-01-01 \
    --end 2020-12-31 \
    --subsystem VNIR \
    --n-workers 8 \
    --output ./output/sa_vnir.nc

# Global tiled processing
aster-timeseries global-run \
    --product AST_L1T \
    --start 2015-01-01 \
    --end 2015-03-31 \
    --tile-size 10 \
    --backend earthaccess \
    --subsystem TIR \
    --output-dir ./output/global \
    --n-workers 8

# Generate one Slurm job per tile instead of running locally
aster-timeseries global-run \
    --product AST_L1T \
    --start 2015-01-01 \
    --end 2015-03-31 \
    --tile-size 5 \
    --backend earthaccess \
    --execution-backend slurm \
    --output-dir /shared/aster-output \
    --slurm-script-dir /shared/aster-jobs \
    --slurm-partition hpc \
    --slurm-time 02:00:00

# Use a custom bounding box
aster-timeseries run \
    --bbox 129.0 -38.1 141.0 -26.0 \
    --product AST_08 \
    --start 2010-01-01 \
    --end 2022-12-31 \
    --output ./output/sa_temperature.nc

Package Structure

aster_timeseries/
├── __init__.py        – Public API
├── catalog.py         – Scene search (NASA CMR / STAC)
├── spatial.py         – BoundingBox, regions, spatial utilities
├── access.py          – Download, open, stream HDF-EOS files
├── timeseries.py      – Stack, index computation, resampling, export
├── processing.py      – DN to radiance, masking, tiling, mosaicking
└── cli.py             – Command-line interface

Access Patterns

| Pattern | When to use |
| --- | --- |
| ASTERCatalog.search() | Small region or short time range |
| ASTERCatalog.search_tiles() | Large region (continental/global) |
| ASTERCatalog.iter_scenes() | Memory-efficient iteration over very large result sets |
| ASTERDataAccess.download() + open_dataset() | Processing locally cached files |
| ASTERDataAccess.open_remote() | Streaming without download (requires fast internet) |
| GlobalASTERProcessor.run() | Fully automated tiled global pipeline |

Development

git clone https://github.com/RichardScottOZ/ASTER-analysis
cd ASTER-analysis
pip install -e ".[dev]"
pytest tests/ -v

Predefined Regions

| Name | Extent (W, S, E, N) |
| --- | --- |
| south_australia | 129, -38.1, 141, -26 |
| australia | 112, -44, 154, -10 |
| global | -180, -90, 180, 90 |
| conus | -125, 24, -66, 50 |
| europe | -25, 34, 45, 72 |
| africa | -20, -35, 55, 38 |
| south_america | -82, -56, -34, 13 |
| north_america | -170, 14, -50, 72 |
| asia | 25, -10, 145, 55 |
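Extents like these can be checked with a one-line containment test (hypothetical helper, not part of the package API):

```python
def in_bbox(lon, lat, west, south, east, north):
    """True if a (lon, lat) point falls inside a (W, S, E, N) extent."""
    return west <= lon <= east and south <= lat <= north

# Adelaide (~138.6E, 34.9S) falls inside the south_australia extent
in_bbox(138.6, -34.9, 129.0, -38.1, 141.0, -26.0)
```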

Custom regions can be loaded from any GeoJSON, Shapefile, or GeoPackage:

from aster_timeseries.spatial import load_region

bbox, gdf = load_region(
    "path/to/states.shp",
    name_column="STATE_NAME",
    region_name="South Australia",
)

License

MIT
