⚡ NumPy-style arrays in C++ | CUDA GPU + AVX512 CPU | Tikhonov Regularized EVD, LSQR, MRRR, SVD, eigenvalue solvers
High-performance N-dimensional arrays with CPU/GPU acceleration and built-in numerical algorithms used in ML (Tikhonov-regularized EVD, LSQR, MRRR, SVD, eigenvalue solvers)
Any C++20-compatible compiler:
- gcc 10 or higher
- clang 6 or higher
- Visual Studio 2019 or higher
- CUDA development environment (NVIDIA CUDA Toolkit, and compatible NVIDIA drivers installed)
git clone https://github.com/mgorshkov/np.git
cd np
mkdir build && cd build
cmake ..
cmake --build .
mkdir build && cd build
cmake ..
cmake --build . --config Release
cmake --build . --target doc
Open np/build/doc/html/index.html in your browser.
cmake .. -DCMAKE_INSTALL_PREFIX:PATH=~/np_install
cmake --build . --target install
#include <iostream>

#include <np/Creators.hpp>

int main(int, char **) {
    // Estimate PI with the Monte Carlo method: sample points in the unit
    // square and count those that fall inside the quarter circle.
    using namespace np;
    Size size = 10000000;
    auto rx = random::rand(size);
    auto ry = random::rand(size);
    auto dist = rx * rx + ry * ry;
    // Boolean-mask indexing: keep only the squared distances below 1.
    auto inside = (dist["dist<1"]).size();
    std::cout << "PI=" << 4 * static_cast<double>(inside) / size << "\n";
    return 0;
}
- Clone the repo
git clone https://github.com/mgorshkov/np.git
- Change to the sample directory
cd np/samples/monte-carlo
- Make build dir
mkdir -p build-release && cd build-release
- Configure cmake
cmake -DCMAKE_BUILD_TYPE=Release ..
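- Build (single-configuration generators such as Makefiles or Ninja)
cmake --build .
- Build (multi-configuration generators such as Visual Studio)
cmake --build . --config Release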
- Run the app
$ ./monte_carlo
PI=3.14158
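To make the computation behind the sample explicit, here is a minimal sketch of the same Monte Carlo estimate written with the C++ standard library only. It does not use np; the function name `estimate_pi` and the `seed` parameter are hypothetical names chosen for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

// Hypothetical helper for illustration only; not part of the np API.
// Samples `samples` points uniformly in the unit square and counts how many
// land inside the quarter circle x^2 + y^2 < 1. That fraction approximates
// pi / 4, so the estimate is 4 * inside / samples.
double estimate_pi(std::size_t samples, std::uint64_t seed = 42) {
    if (samples == 0) {
        return 0.0;  // nothing sampled, avoid dividing by zero
    }
    std::mt19937_64 gen(seed);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    std::size_t inside = 0;
    for (std::size_t i = 0; i < samples; ++i) {
        const double x = uniform(gen);
        const double y = uniform(gen);
        if (x * x + y * y < 1.0) {
            ++inside;
        }
    }
    return 4.0 * static_cast<double>(inside) / static_cast<double>(samples);
}
```

Calling `estimate_pi(10000000)` should return a value close to 3.1416. The exact digits depend on the seed and on the standard library's distribution implementation, and the error shrinks only with the square root of the sample count.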
#include <chrono>
#include <cmath>
#include <iostream>

#include <np/Array.hpp>
#include <np/linalg/LstSq.hpp>

int main(int, char **) {
    // Least-squares (LSTSQ) solve with the MRRR method
    using namespace np;
    using namespace np::linalg;
    static constexpr Size rows = 10000;
    static constexpr Size cols = 1000;
    // Generate random matrix A and true solution x_true
    Shape shapeA({rows, cols});
    auto A = random::rand(shapeA);
    Shape shapeX({cols});
    auto x_true = random::rand(shapeX);
    // Add noise
    auto noise = random::rand(Shape{rows}, -0.01, 0.01); // 1% noise
    // Compute b = A * x_true + noise
    auto b = A * x_true + noise;
    // Solve using the MRRR method and time the call
    auto start = std::chrono::high_resolution_clock::now();
    auto x = lstsq_mrrr(A, b);
    auto end = std::chrono::high_resolution_clock::now();
    // Accumulate the squared Euclidean distance between x and x_true
    double error = 0.0;
    for (std::size_t i = 0; i < cols; ++i) {
        error += (x.get(i) - x_true.get(i)) * (x.get(i) - x_true.get(i));
    }
    auto time = std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
    std::cout << "Time: " << time.count() << " ms\n";
    std::cout << "||x - x_true||: " << std::sqrt(error) << "\n";
    return 0;
}
- Clone the repo
git clone https://github.com/mgorshkov/np.git
- Change to the sample directory
cd np/samples/least-squares
- Make build dir
mkdir -p build-release && cd build-release
- Configure cmake
cmake -DCMAKE_BUILD_TYPE=Release ..
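- Build (single-configuration generators such as Makefiles or Ninja)
cmake --build .
- Build (multi-configuration generators such as Visual Studio)
cmake --build . --config Release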
- Run the app
$ ./least-squares
Time: 189 ms
||x - x_true||: 10.1145
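For readers unfamiliar with the problem the sample solves, the sketch below finds the x that minimizes ||A x - b|| on a small dense system using only the C++ standard library. It is not the MRRR-based method used by `lstsq_mrrr`: it swaps in the textbook normal equations (AᵀA) x = Aᵀb, solved by Gaussian elimination with partial pivoting. That approach is simple to read but less numerically robust on ill-conditioned matrices. The names `Matrix` and `lstsq_normal_equations` are hypothetical and not part of np.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical types and names for illustration only; not part of the np API.
using Matrix = std::vector<std::vector<double>>;

// Solves min ||A x - b||_2 via the normal equations (A^T A) x = A^T b.
// Assumes every row of A has the same length and A has full column rank.
std::vector<double> lstsq_normal_equations(const Matrix& A,
                                           const std::vector<double>& b) {
    const std::size_t rows = A.size();
    if (rows == 0 || b.size() != rows) {
        throw std::invalid_argument("A and b have incompatible shapes");
    }
    const std::size_t cols = A[0].size();

    // Build the augmented system [A^T A | A^T b].
    Matrix M(cols, std::vector<double>(cols + 1, 0.0));
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t i = 0; i < cols; ++i) {
            for (std::size_t j = 0; j < cols; ++j) {
                M[i][j] += A[r][i] * A[r][j];
            }
            M[i][cols] += A[r][i] * b[r];
        }
    }

    // Forward elimination with partial pivoting.
    for (std::size_t k = 0; k < cols; ++k) {
        std::size_t pivot = k;
        for (std::size_t i = k + 1; i < cols; ++i) {
            if (std::fabs(M[i][k]) > std::fabs(M[pivot][k])) pivot = i;
        }
        if (std::fabs(M[pivot][k]) < 1e-12) {
            throw std::runtime_error("matrix is rank-deficient");
        }
        std::swap(M[k], M[pivot]);
        for (std::size_t i = k + 1; i < cols; ++i) {
            const double factor = M[i][k] / M[k][k];
            for (std::size_t j = k; j <= cols; ++j) {
                M[i][j] -= factor * M[k][j];
            }
        }
    }

    // Back substitution on the upper-triangular system.
    std::vector<double> x(cols, 0.0);
    for (std::size_t k = cols; k-- > 0;) {
        double sum = M[k][cols];
        for (std::size_t j = k + 1; j < cols; ++j) {
            sum -= M[k][j] * x[j];
        }
        x[k] = sum / M[k][k];
    }
    return x;
}
```

As a quick check, fitting the line y = c0 + c1 t to the points (0, 1), (1, 3), (2, 5) means passing A = {{1, 0}, {1, 1}, {1, 2}} and b = {1, 3, 5}; the solver returns c0 = 1 and c1 = 2 up to rounding.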
- Methods from pandas library on top of NP library: https://github.com/mgorshkov/pd
- Scientific methods on top of NP library: https://github.com/mgorshkov/scipy
- ML Methods from scikit-learn library: https://github.com/mgorshkov/sklearn
- Other LSQR algorithm implementations