ImmunoMatch is a machine learning framework for deciphering the molecular rules governing the pairing of antibody chains. Fine-tuned from an antibody-specific language model (AntiBERTa2), ImmunoMatch learns from paired heavy (H) and light (L) chain sequences from single human B cells to distinguish cognate H-L pairs from randomly paired sequences.
Three variants of ImmunoMatch, trained on different subsets of the data, are available on Hugging Face:
| Checkpoint name | Trained on |
|---|---|
| ImmunoMatch | A mixture of antibodies with both κ and λ light chains |
| ImmunoMatch-κ | Antibodies with κ light chains |
| ImmunoMatch-λ | Antibodies with λ light chains |
Please note that the ImmunoMatch models are provided under a CC-BY-NC-4.0 license.
Run_ImmunoMatch.ipynb contains example code showing how to apply any ImmunoMatch model to obtain an H-L pairing score for a given VH-VL sequence pair, or to annotate sequences in batch from a CSV file. The notebook can also be run on Google Colaboratory.
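For orientation, the two modes described above (scoring a single pair and annotating a CSV) might look roughly like the sketch below. This is not the notebook's code: the function names, the CSV column names `VH` and `VL`, the space-separated residue formatting, and the assumption that class index 1 means "cognate pair" are all illustrative guesses, so check Run_ImmunoMatch.ipynb and the model cards for the actual conventions. The checkpoint identifier is left as a parameter rather than hard-coded.

```python
import csv
import io
from typing import Callable, Dict, List

# A scorer maps (VH sequence, VL sequence) to a pairing score in [0, 1].
PairScorer = Callable[[str, str], float]


def load_hf_scorer(checkpoint: str) -> PairScorer:
    """Build a scorer from a sequence-classification checkpoint (assumed layout)."""
    # Imported lazily so annotate_csv() works without these libraries installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
    model.eval()

    def score(vh: str, vl: str) -> float:
        # Assumption: residues are space-separated and VH/VL form a text pair.
        inputs = tokenizer(" ".join(vh), " ".join(vl), return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        # Assumption: class index 1 corresponds to "cognate pair".
        return torch.softmax(logits, dim=-1)[0, 1].item()

    return score


def annotate_csv(
    csv_text: str,
    scorer: PairScorer,
    vh_col: str = "VH",  # hypothetical column name
    vl_col: str = "VL",  # hypothetical column name
) -> List[Dict[str, object]]:
    """Return every CSV row with an added 'pairing_score' field."""
    rows: List[Dict[str, object]] = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        row: Dict[str, object] = dict(record)
        row["pairing_score"] = scorer(record[vh_col], record[vl_col])
        rows.append(row)
    return rows
```

To use it, one would call `load_hf_scorer(...)` with the identifier of the chosen checkpoint and pass the result to `annotate_csv`. Keeping the scorer as a plain callable means the batch logic can be exercised with a stand-in function before any model is downloaded.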
ImmunoMatch is also available as a stand-alone Python package on PyPI.
ImmunoMatch has no prerequisites beyond a standard installation of the Hugging Face libraries for Python. In a clean virtual environment on Google Colab, installing these libraries took around 1 minute.
The figure_code folder contains all Python and R code used to generate the figure panels in the manuscript.
If you have used any of the ImmunoMatch models in your research, please cite:
Guo, D., Dunn-Walters, D.K., Fraternali, F. et al. ImmunoMatch learns and predicts cognate pairing of heavy and light immunoglobulin chains. Nat Methods 23, 106–117 (2026). https://doi.org/10.1038/s41592-025-02913-x
