
# Conformer-Based Long-Tail Automatic Chord Recognition


## Overview & Architecture

This library implements a multi-headed Conformer network designed to address the "long-tail" problem in Automatic Chord Recognition (ACR): a handful of common chords (chiefly major and minor triads) dominate annotated datasets, while rarer qualities and inversions appear too infrequently for conventional classifiers to learn them well.

Architecture Diagram
Figure 1: The multi-headed Conformer architecture branching into Root, Bass, and Quality predictions.

To counteract this imbalance, `conformer-acr` makes use of:

* **The Conformer Backbone:** Combines Convolutional Neural Networks (CNNs), which capture local acoustic texture and timbre, with Transformer self-attention, which maintains global harmonic context.
* **Structured Multi-Task Heads:** Instead of predicting a single monolithic chord string, the network branches into three distinct classification heads: **Root**, **Bass**, and **Quality**. This explicitly forces the model to understand inversions without causing a combinatorial explosion in the target vocabulary.
* **Synthetic Pre-Training (Harmonic Prior):** Because Conformers are memory- and data-hungry, the model is pre-trained on perfectly annotated synthetic multitracks (the AAM dataset) using the Bede NVLink GPU cluster. This establishes a mathematically pure "harmonic prior" before the model is fine-tuned on noisy, real-world acoustic audio.
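The multi-task branching can be sketched as three linear heads sharing one encoder output. A minimal sketch in PyTorch — the class counts below (13 roots/basses including a "no bass" class, 15 qualities) are illustrative assumptions, not the library's actual vocabulary:

```python
import torch
import torch.nn as nn

class MultiTaskChordHeads(nn.Module):
    """Three independent classifiers over a shared encoder representation.
    Splitting root/bass/quality keeps each vocabulary small, instead of
    one head over every (root, bass, quality) combination."""

    def __init__(self, d_model=256, n_roots=13, n_basses=13, n_qualities=15):
        super().__init__()
        self.root = nn.Linear(d_model, n_roots)
        self.bass = nn.Linear(d_model, n_basses)
        self.quality = nn.Linear(d_model, n_qualities)

    def forward(self, h):
        # h: (batch, frames, d_model) — per-frame encoder output
        return self.root(h), self.bass(h), self.quality(h)

heads = MultiTaskChordHeads()
h = torch.randn(2, 100, 256)  # dummy encoder output: 2 clips, 100 frames
root_logits, bass_logits, quality_logits = heads(h)
```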

## Install

```bash
# editable install (for development)
pip install -e .

# with dev tools (pytest, etc.)
pip install -e ".[dev]"
```



## Quick Start

```python
import conformer_acr as acr

# feature extraction
cqt = acr.preprocess_audio("song.mp3")

# inference (requires a trained checkpoint)
chords = acr.predict("song.mp3", checkpoint_path="model.pt")

# model
model = acr.ConformerACR(d_model=256, n_heads=4, n_layers=4)

# chord vocabulary
idx   = acr.chord_to_index("C:maj")   # → 0
label = acr.index_to_chord(0)         # → 'C:maj'
```
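For illustration, a chord ↔ index vocabulary of the kind `theory/vocabulary.py` provides might be built like this. The root/quality ordering here is an assumption chosen so that `C:maj` maps to 0; it is not the library's actual table:

```python
# Hypothetical chord vocabulary: every (root, quality) pair gets one index.
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
QUALITIES = ["maj", "min", "7", "maj7", "min7"]  # assumed subset of qualities

CHORDS = [f"{root}:{quality}" for root in ROOTS for quality in QUALITIES]
CHORD_TO_INDEX = {chord: i for i, chord in enumerate(CHORDS)}

def chord_to_index(label: str) -> int:
    return CHORD_TO_INDEX[label]

def index_to_chord(idx: int) -> str:
    return CHORDS[idx]
```

Keeping the mapping as a flat list plus a reverse dictionary makes both directions O(1) and keeps the integer targets stable across training runs.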

## Library Structure

```
conformer_acr/
├── __init__.py          # flat public API
├── config.py            # constants (SR, CQT bins, hop length)
├── core.py              # high-level inference pipeline
├── models/
│   └── conformer.py     # ConformerACR (encoder + 3 heads)
├── data/
│   ├── dataset.py       # AAM & Isophonics Dataset classes
│   └── preprocess.py    # audio loading & CQT extraction
├── theory/
│   └── vocabulary.py    # chord ↔ integer mappings
├── training/
│   ├── trainer.py       # training loop
│   └── losses.py        # focal loss
└── utils/
    └── distributed.py   # Bede/DDP helpers
```
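The focal loss in `training/losses.py` targets the same long-tail imbalance: it down-weights examples the model already classifies confidently, so gradients concentrate on rare chord classes. A minimal sketch of the standard formulation (Lin et al., 2017) — the `gamma` default is the common choice, not necessarily this repo's setting:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Focal loss: (1 - p_t)^gamma * cross-entropy.
    With gamma=0 this reduces exactly to standard cross-entropy."""
    log_probs = F.log_softmax(logits, dim=-1)
    # log-probability of the true class for each example
    log_pt = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()

logits = torch.randn(4, 170)                # e.g. 170 chord classes (assumed)
targets = torch.randint(0, 170, (4,))
loss = focal_loss(logits, targets)
```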

## lit_review/

The `lit_review/` directory contains standalone research scripts and datasets used during the literature-review phase. It is not part of the installable library.

## Acknowledgements

This work is part of the N8 Centre for Computationally Intensive Research project "Deep Learning Models for Automatic Chord Recognition in Polyphonic Audio" for the EPSRC-funded Bede Supercomputer studentship. Supervised by Dr. Karolina Prawda.
