I wanted to learn Rust and machine learning, so I thought: why not do ML in Rust? The notebooks, while in Python, call into Rust bindings to the core ML/DL algorithms.
I have also written my own CUDA matrix library, which I have painstakingly optimized with batched operations, chunked memory management, asynchronous memory transfers, memory pooling, and more.
Why not use cuBLAS, you ask?
- It was a good learning exercise; I don't pretend my library is faster than existing frameworks.
- cuBLAS is still there, just commented out (I used it for performance comparison, and I am usually within a factor of 2-3x of it).
Python is used only for preprocessing and loading data, visualization, and verification of results.
That means all the ML algorithms are written using good old if/else statements, for loops, etc., with no significant help from libraries.
The following are the only libraries used for the actual ML logic:
- Rust's itertools, for iterating over data more easily
- Rust's statrs and Python's NumPy, for random number generation
- Rust's image crate, to decode thousands of images of varying formats into a raw float array efficiently
PyO3 is used to create bindings from Python to the Rust binaries. The Python code is located under notebooks, the core Rust ML functions under src, and any data under the data folder.
| Algorithm | Status |
|---|---|
| K-Means | β |
| K-Nearest Neighbors | β |
| Naive Bayes | β |
| Decision Trees/Random Forest | β |
| Regression Tree | β |
| Gradient Descent | β |
| ADA Boost | β |
| Gradient Boost | β |
| XGBoost | β |
| Neural Network with backpropagation | β |
| Convolutional Neural Networks | β |
| Recurrent Neural Networks | β |
| Generative Adversarial Networks | β |
| Large Language Models | β |
| CUDA Acceleration with Rust FFI | β |
- Install Python 3
- Install Rust
- Install the NVIDIA CUDA Toolkit
- Create a virtual environment: `python -m venv .venv`
- Activate the virtual environment (platform dependent):
  - Windows: `.\.venv\Scripts\activate.bat`
  - Mac/Linux: `source ./.venv/bin/activate`
- Install dependencies: `pip install -r requirements_{platform}.txt`
- Compile the Rust code: `maturin develop` or `maturin develop --release`
- Open the notebooks in Jupyter Notebook, JupyterLab, VS Code, etc.
Make sure to run tests using `cargo test -- --test-threads=1`.
Running the tests in parallel may fail because the CUDA matrix library is not thread-safe (yet).
The MNIST handwritten digit database is available from https://yann.lecun.com/exdb/mnist/
All other datasets are publicly available from the University of California, Irvine, here: https://archive.ics.uci.edu/ml/index.php