jacksonllee/rustling

Rustling is a blazingly fast library for computational linguistics. It aims to provide flexible and efficient tools to facilitate further research.

Documentation: Python | Rust

Currently implemented features:

  • Sequence modeling:

    • N-grams and related language models
    • Hidden Markov model
    • Word segmentation
    • Averaged perceptron part-of-speech tagging
  • Handling richly formatted data, with cross-format conversion and data ingestion from both local and remote sources:

    • CHAT for TalkBank and CHILDES
    • ELAN for annotated multimedia data
    • TextGrid for Praat annotations
    • CoNLL-U for Universal Dependencies
    • SRT for SubRip subtitles

Performance

Rustling is fast because its core is implemented in Rust. For benchmarks comparing Rustling against other Python packages with similar functionality, see benchmarks.

Installation

Python

Using pip:

pip install rustling

Using conda:

conda install -c conda-forge rustling
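
After installing with either command, one way to confirm that the package is visible to your Python environment is to query its installed version. The sketch below uses only the standard library (`importlib.metadata`) and makes no calls into Rustling itself. The helper name `installed_version` is made up for this example, and the distribution name `rustling` is assumed to match the name used in the install commands.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of Rustling.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Assumes the distribution is named "rustling", as in the pip/conda commands.
print(installed_version("rustling"))
```

If this prints `None`, the package is not installed in the interpreter that ran the script, which usually means a different environment is active.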

For Pyodide, pre-built WASM wheels are available from each GitHub release; look for the .whl file with emscripten in the filename. Multithreading is disabled in these wheels because Pyodide does not support it.
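
If you are scripting the download, the filename rule above can be expressed as a small filter. This is only a sketch: the function name `pick_pyodide_wheels` is hypothetical, and the asset names in the example are invented placeholders, not real release files.

```python
from typing import Iterable, List


def pick_pyodide_wheels(asset_names: Iterable[str]) -> List[str]:
    """Keep only names that end in .whl and contain 'emscripten'."""
    return [
        name
        for name in asset_names
        if name.endswith(".whl") and "emscripten" in name
    ]


# Placeholder asset names, made up for illustration only.
assets = [
    "rustling-0.0.0-cp312-cp312-manylinux_2_17_x86_64.whl",
    "rustling-0.0.0-cp312-cp312-emscripten_3_1_58_wasm32.whl",
    "rustling-0.0.0.tar.gz",
]
print(pick_pyodide_wheels(assets))
```

The actual version numbers and platform tags will differ from release to release, so check the asset list of the release you want.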

Rust

cargo add rustling

License

MIT License