mgorshkov/np

About

⚡ NumPy-style arrays in C++ | CUDA GPU + AVX512 CPU | Tikhonov Regularized EVD, LSQR, MRRR, SVD, eigenvalue solvers

Description

High-performance N-dimensional arrays with CPU/GPU acceleration and built-in ML algorithms (Tikhonov Regularized EVD, LSQR, MRRR, SVD, eigenvalue solvers)

Requirements

Any C++20-compatible compiler:

  • gcc 10 or higher
  • clang 6 or higher
  • Visual Studio 2019 or higher

For GPU acceleration:

  • CUDA development environment (NVIDIA CUDA Toolkit and compatible NVIDIA drivers)

Repo

git clone https://github.com/mgorshkov/np.git
cd np

Build unit tests and sample

Linux/macOS

mkdir build && cd build
cmake ..
cmake --build .

Windows

mkdir build && cd build
cmake ..
cmake --build . --config Release

Build docs

cmake --build . --target doc

Open np/build/doc/html/index.html in your browser.

Install

cmake .. -DCMAKE_INSTALL_PREFIX:PATH=~/np_install
cmake --build . --target install

Usage example (samples/monte-carlo)

#include <iostream>
#include <np/Creators.hpp>

int main(int, char **) {
    // Estimate pi with the Monte Carlo method: draw random points (rx, ry)
    // and count those that fall inside the quarter circle x^2 + y^2 < 1.
    using namespace np;
    Size size = 10000000;
    auto rx = random::rand(size);
    auto ry = random::rand(size);
    auto dist = rx * rx + ry * ry;         // squared distance from the origin
    auto inside = (dist["dist<1"]).size(); // number of points inside the circle
    std::cout << "PI=" << 4 * static_cast<double>(inside) / size << "\n";
    return 0;
}
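For comparison, here is a minimal sketch of the same estimate written with the C++ standard library only. It does not use np; the function name `estimate_pi` and the `seed` parameter are illustrative assumptions, and the loop spells out what the array expressions in the sample compute element by element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper (not part of np): estimate pi by sampling points
// uniformly in the unit square and counting how many fall inside the
// quarter circle x^2 + y^2 < 1.
double estimate_pi(std::size_t samples, std::uint32_t seed) {
    assert(samples > 0);
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);

    std::size_t inside = 0;
    for (std::size_t i = 0; i < samples; ++i) {
        const double x = uniform(gen);
        const double y = uniform(gen);
        if (x * x + y * y < 1.0) {
            ++inside;
        }
    }
    // Area of the quarter circle divided by the area of the unit square is pi / 4.
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

Calling `estimate_pi(10000000, 42u)` mirrors the sample size used above; the exact digits depend on the seed and the generator.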

How to build the sample

  1. Clone the repo
git clone https://github.com/mgorshkov/np.git
  2. Change to the sample directory
cd np/samples/monte-carlo
  3. Make build dir
mkdir -p build-release && cd build-release
  4. Configure cmake
cmake -DCMAKE_BUILD_TYPE=Release ..
  5. Build

Linux/macOS

cmake --build .

Windows

cmake --build . --config Release
  6. Run the app
$ ./monte_carlo
PI=3.14158
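How close the printed value gets to pi depends on the sample count. As a rough guide, each point lands inside the quarter circle with probability p = pi / 4, so the estimate has a standard error of about 4 * sqrt(p * (1 - p) / N). The sketch below computes that figure in standard C++; `pi_estimate_std_error` is an illustrative name, not an np function.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper (not part of np): approximate standard error of the
// Monte Carlo pi estimate for a given number of samples.
double pi_estimate_std_error(std::size_t samples) {
    assert(samples > 0);
    const double pi = std::acos(-1.0);
    const double p = pi / 4.0; // probability that a point lands inside
    // The hit count is binomial, so the hit fraction has variance p(1-p)/N;
    // the estimate is 4 times that fraction.
    return 4.0 * std::sqrt(p * (1.0 - p) / static_cast<double>(samples));
}
```

For 10,000,000 samples this comes to roughly 5e-4, so agreement with pi to about three decimal places is what one should expect from a typical run.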

Usage example (samples/least-squares)

#include <chrono>
#include <cmath>
#include <iostream>
#include <np/Array.hpp>
#include <np/linalg/LstSq.hpp>

int main(int, char **) {
    // Least-squares solve with the MRRR method
    using namespace np;
    using namespace np::linalg;

    static constexpr Size rows = 10000;
    static constexpr Size cols = 1000;

    // Generate random matrix A and true solution x_true
    Shape shapeA({rows, cols});
    auto A = random::rand(shapeA);

    Shape shapeX({cols});
    auto x_true = random::rand(shapeX);

    // Add noise
    auto noise = random::rand(Shape{rows}, -0.01, 0.01); // 1% noise
    // Compute b = A * x_true + noise
    auto b = A * x_true + noise;

    // Solve using MRRR method and time the call
    auto start = std::chrono::high_resolution_clock::now();
    auto x = lstsq_mrrr(A, b);
    auto end = std::chrono::high_resolution_clock::now();

    // Accumulate the squared Euclidean distance between x and x_true
    double error = 0.0;
    for (size_t i = 0; i < cols; ++i) {
        error += (x.get(i) - x_true.get(i)) * (x.get(i) - x_true.get(i));
    }

    auto time = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    std::cout << "Time:  " << time.count() << " ms\n";
    std::cout << "||x - x_true||:  " << std::sqrt(error) << "\n";

    return 0;
}
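To make the problem concrete: a least-squares solver looks for the x that minimizes ||A x - b||. The sketch below is not the library's solver and does not use MRRR. It swaps in the closed-form normal equations for the smallest useful case, a straight-line fit with two unknowns, in standard C++ only. The name `fit_line` is an illustrative assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not part of np): fit y ~ intercept + slope * t by
// minimizing the sum of squared residuals. Returns {intercept, slope}.
std::pair<double, double> fit_line(const std::vector<double>& t,
                                   const std::vector<double>& y) {
    assert(t.size() == y.size() && t.size() >= 2);
    const double n = static_cast<double>(t.size());

    // Sums that make up the 2x2 normal equations (A^T A) x = A^T b,
    // where each row of A is [1, t_i] and b holds the y_i.
    double st = 0.0, sy = 0.0, stt = 0.0, sty = 0.0;
    for (std::size_t i = 0; i < t.size(); ++i) {
        st += t[i];
        sy += y[i];
        stt += t[i] * t[i];
        sty += t[i] * y[i];
    }

    // Determinant of A^T A; zero means all t_i are identical.
    const double det = n * stt - st * st;
    assert(det != 0.0);

    const double slope = (n * sty - st * sy) / det;
    const double intercept = (sy - slope * st) / n;
    return {intercept, slope};
}
```

For t = {0, 1, 2, 3} and y = {1, 3, 5, 7} the fit recovers intercept 1 and slope 2. Normal equations are fine at this size but square the condition number of A, which is one reason larger problems are usually handled with other factorizations.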

How to build the sample

  1. Clone the repo
git clone https://github.com/mgorshkov/np.git
  2. Change to the sample directory
cd np/samples/least-squares
  3. Make build dir
mkdir -p build-release && cd build-release
  4. Configure cmake
cmake -DCMAKE_BUILD_TYPE=Release ..
  5. Build

Linux/macOS

cmake --build .

Windows

cmake --build . --config Release
  6. Run the app
$ ./least-squares
Time:  189 ms
||x - x_true||:  10.1145

Links

Plans

  • Other LSQR algorithm implementations
