Rustling is a blazingly fast library for computational linguistics. It aims to provide flexible and efficient tools to facilitate further research.
Currently implemented features:
- Sequence modeling:
  - N-grams and related language models
  - Hidden Markov model
  - Word segmentation
  - Averaged perceptron part-of-speech tagging
- Handling richly formatted data, with cross-format conversion and data ingestion from both local and remote sources:
  - CHAT for TalkBank and CHILDES
  - ELAN for annotated multimedia data
  - TextGrid for Praat annotations
  - CoNLL-U for Universal Dependencies
  - SRT for SubRip subtitles
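To give a sense of what one of the formats above looks like, here is a minimal pure-Python sketch that reads CoNLL-U text. It is not Rustling's API: `parse_conllu`, `FIELDS`, and `SAMPLE` are hypothetical names used only for illustration, and the sketch assumes the standard CoNLL-U layout of ten tab-separated columns per token, `#` comment lines, and a blank line between sentences.

```python
# Illustrative only; this is NOT Rustling's API.
# The ten tab-separated CoNLL-U columns, in order.
FIELDS = (
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
)


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Sentence-level comments and metadata are skipped in this sketch.
            continue
        columns = line.split("\t")
        if len(columns) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} columns, got {len(columns)}")
        current.append(dict(zip(FIELDS, columns)))
    if current:
        # Flush the last sentence if the text does not end with a blank line.
        sentences.append(current)
    return sentences


# A made-up three-token sentence, for demonstration.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
)

sentences = parse_conllu(SAMPLE)
print([(token["form"], token["upos"]) for token in sentences[0]])
```

A real reader would also need to handle multiword-token ranges and empty nodes, which this sketch keeps as ordinary rows.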
Rustling gets its performance from a core implemented in Rust.
For benchmarks comparing Rustling against other Python packages with similar functionality,
please see benchmarks.
Using pip:
pip install rustling
Using conda:
conda install -c conda-forge rustling
For Pyodide, pre-built WASM wheels (with multithreading disabled, as Pyodide does not support it)
are available from each GitHub release.
Look for the .whl file with emscripten in the filename.
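After installing the Python package by any of the routes above, a quick check that it landed in the active environment can be sketched with the standard library alone. The helper name `installed_version` is hypothetical, and the only assumption is that the distribution is named `rustling`, as in the pip command.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution="rustling"):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


found = installed_version()
if found is None:
    print("rustling is not installed in this environment")
else:
    print(f"rustling {found} is installed")
```

This reads package metadata only, so it does not import the library itself.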
For Rust, using cargo:
cargo add rustling
MIT License