Add Python coda detector module (port of MATLAB Coda-detector) #3

Open
CivicDash wants to merge 1 commit into Project-CETI:main from CivicDash:feature/python-coda-detector
@CivicDash
Summary

  • Adds a wham/detection/ module that ports the MATLAB Coda-detector to pure Python
  • Enables end-to-end coda detection from raw audio recordings: raw audio → detected codas
  • No new heavy dependencies (uses numpy and scipy, already in setup.py)

What it does

The detector implements the full pipeline from the original MATLAB code:

  1. TKEO (Teager-Kaiser Energy Operator) — enhances impulsive transients (whale clicks) while suppressing background noise
  2. SNR-based transient selection — filters candidates by signal-to-noise ratio
  3. IPI estimation — extracts Inter-Pulse Interval from the multipulse structure of sperm whale clicks (characteristic of individual whales)
  4. Waveform cross-correlation — builds a similarity matrix between detected clicks
  5. Graph-based clustering — enumerates valid coda candidates that satisfy the ICI (inter-click interval) constraints, scores them by similarity, and selects among them greedily
  6. Deduplication — removes overlapping codas with similar ICI patterns
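
To make steps 1 and 4 concrete, here is a minimal sketch using the textbook discrete TKEO and a peak-normalised cross-correlation. It is an illustration under assumptions, not the code in this PR: the function names `tkeo` and `similarity_matrix`, the normalisation, and the synthetic signals are all hypothetical, and the module's own implementation may differ.

```python
import numpy as np


def tkeo(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]**2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    psi = np.zeros_like(x)
    # Endpoints have no two-sided neighbours, so they are left at zero.
    psi[1:-1] = x[1:-1] ** 2 - x[:-2] * x[2:]
    return psi


def similarity_matrix(clicks):
    """Peak absolute cross-correlation, normalised by the waveform norms.

    Hypothetical helper: assumes every waveform has non-zero energy.
    """
    n = len(clicks)
    sim = np.eye(n)
    for i in range(n):
        for j in range(i + 1, n):
            xc = np.correlate(clicks[i], clicks[j], mode="full")
            norm = np.linalg.norm(clicks[i]) * np.linalg.norm(clicks[j])
            sim[i, j] = sim[j, i] = np.max(np.abs(xc)) / norm
    return sim


# Synthetic check: one unit impulse buried in low-level noise.
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(1000)
signal[500] += 1.0
energy = tkeo(signal)
peak_index = int(np.argmax(energy))  # index of the strongest transient

# Two copies of the same made-up pulse at different offsets, plus noise.
pulse = np.array([0.0, 1.0, -0.5, 0.25, 0.0])
clicks = [np.pad(pulse, (0, 3)), np.pad(pulse, (3, 0)), rng.standard_normal(8)]
sim = similarity_matrix(clicks)
```

Taking the maximum over all lags makes the score insensitive to where the click sits inside its window, which is why the two shifted copies of the pulse score as near-identical.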

Usage

from wham.detection import detect_codas, codas_to_dict

codas = detect_codas("recording.wav")
for coda in codas:
    print(f"{coda.n_clicks} clicks, ICIs: {coda.icis}")

# Export to JSON-serializable format
results = codas_to_dict(codas)
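
For readers unfamiliar with the `icis` field, the sketch below shows how inter-click intervals are conventionally derived from click times and then serialised. The click times and the record layout are made up for illustration; the actual structure returned by `codas_to_dict` is not shown in this PR description and may differ.

```python
import json

import numpy as np

# Hypothetical click arrival times in seconds (illustrative values only).
click_times = np.array([0.00, 0.25, 0.50, 0.75, 1.25])

# Inter-click intervals: differences between consecutive click times.
icis = np.diff(click_times)

# Hypothetical record layout; the real output of codas_to_dict may differ.
record = {"n_clicks": int(click_times.size), "icis": icis.tolist()}

# NumPy arrays are not JSON-serialisable, hence the .tolist() above.
payload = json.dumps(record)
```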

Or from the command line:

python -m wham.detection.coda_detector recording.wav

Why this is useful

WhAM currently works with pre-segmented codas from datasets like DSWP. This module closes the gap between raw field recordings and WhAM's embedding/generation pipeline, allowing users to go from raw audio → detected codas → WhAM embeddings in a single Python workflow.

Tests

20 unit tests covering all pipeline stages (TKEO, bandpass filter, resampling, click detection, cross-correlation, clustering, deduplication, end-to-end). All passing.

Related

Made with Cursor

Adds wham/detection/ module providing end-to-end coda detection from
raw audio, porting the original MATLAB Coda-detector to pure Python.

Pipeline: TKEO click detection → SNR filtering → IPI estimation →
waveform cross-correlation → graph-based clustering → deduplication.

Dependencies: numpy, scipy (already in setup.py).
Includes 20 unit tests covering all pipeline stages.

Made-with: Cursor