Python + CUDA Algebraic Multigrid

High-performance, GPU-accelerated multigrid and multigrid-preconditioned BiCGSTAB solver for structured Cartesian grids.

Overview

Implementation of algebraic multigrid (AMG) as a standalone iterative solver, or as a preconditioner for BiCGSTAB(1), aimed at solving 2D Poisson equations:

∇²u(x,y) = f(x,y) ,

or other diagonally dominant and structured sparse linear systems of equations, Ax=b.
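As a rough illustration of the kind of system being solved, the sketch below discretises the Poisson equation with the standard second-order 5-point stencil on a uniform grid with zero Dirichlet boundaries, and checks it against a manufactured solution. It uses NumPy only; the function name `apply_laplacian`, the grid size and the boundary treatment are assumptions made for this example and are not taken from the repository.

```python
import numpy as np

def apply_laplacian(u, h):
    """Apply the 5-point Laplacian to interior values u (n x n).

    Assumes a uniform spacing h and zero Dirichlet boundary values
    (hypothetical helper, not part of the repository).
    """
    p = np.pad(u, 1)  # zero-valued boundary ring
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u) / h**2

# Manufactured solution on the unit square: u = sin(pi x) sin(pi y),
# for which the continuous Laplacian is f = -2 pi^2 u.
n = 63
h = 1.0 / (n + 1)
x = np.linspace(h, 1.0 - h, n)
X, Y = np.meshgrid(x, x, indexing="ij")
u_exact = np.sin(np.pi * X) * np.sin(np.pi * Y)
f = -2.0 * np.pi**2 * u_exact

# The discrete operator should reproduce f up to an O(h^2) truncation error.
max_err = np.max(np.abs(apply_laplacian(u_exact, h) - f))
print(f"max truncation error: {max_err:.2e}")
```

Halving h should reduce the reported error by roughly a factor of four, which is a quick way to confirm the second-order accuracy of the stencil.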

Multigrid Construction

Bilinear or cubic restriction/prolongation operators
Row-weighted "Sparse Approximate Inverse" Jacobi smoother
Red-black coloured post-smoothing on the fine grid
'V' and 'F' cycles
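To make the role of these components concrete, here is a minimal V-cycle sketch in NumPy. It is a simplification and not the repository's implementation: it uses geometric coarsening (grid sizes of the form 2^k - 1), a plain weighted Jacobi smoother in place of the sparse approximate inverse smoother, full-weighting restriction, bilinear prolongation, and a rediscretised coarse operator. All function names and parameters are hypothetical. The operator is taken as A = -∇² so that it is positive definite.

```python
import numpy as np

def apply_A(u, h):
    """Negative 5-point Laplacian with zero Dirichlet boundaries."""
    p = np.pad(u, 1)
    return (4.0 * u - p[:-2, 1:-1] - p[2:, 1:-1] - p[1:-1, :-2] - p[1:-1, 2:]) / h**2

def jacobi(u, b, h, sweeps, omega=0.8):
    """Weighted Jacobi smoothing; the diagonal of A is 4 / h^2."""
    for _ in range(sweeps):
        u = u + omega * (h**2 / 4.0) * (b - apply_A(u, h))
    return u

def restrict(r):
    """Full weighting from a (2m+1)^2 fine grid to an m^2 coarse grid."""
    centre = r[1::2, 1::2]
    edges = r[:-1:2, 1::2] + r[2::2, 1::2] + r[1::2, :-1:2] + r[1::2, 2::2]
    corners = r[:-1:2, :-1:2] + r[:-1:2, 2::2] + r[2::2, :-1:2] + r[2::2, 2::2]
    return (4.0 * centre + 2.0 * edges + corners) / 16.0

def prolong(e):
    """Bilinear interpolation from an m^2 coarse grid to a (2m+1)^2 fine grid."""
    m = e.shape[0]
    p = np.pad(e, 1)  # zero boundary values around the coarse grid
    fine = np.zeros((2 * m + 1, 2 * m + 1))
    fine[1::2, 1::2] = e
    fine[0::2, 1::2] = 0.5 * (p[:-1, 1:-1] + p[1:, 1:-1])
    fine[1::2, 0::2] = 0.5 * (p[1:-1, :-1] + p[1:-1, 1:])
    fine[0::2, 0::2] = 0.25 * (p[:-1, :-1] + p[:-1, 1:] + p[1:, :-1] + p[1:, 1:])
    return fine

def v_cycle(u, b, h, pre=2, post=2):
    """One V-cycle: smooth, restrict the residual, recurse, correct, smooth."""
    n = u.shape[0]
    if n <= 3:
        return jacobi(u, b, h, sweeps=50)  # cheap approximate coarsest solve
    u = jacobi(u, b, h, pre)
    r = b - apply_A(u, h)
    e = v_cycle(np.zeros(((n - 1) // 2,) * 2), restrict(r), 2.0 * h, pre, post)
    u = u + prolong(e)
    return jacobi(u, b, h, post)

n = 63
h = 1.0 / (n + 1)
rng = np.random.default_rng(0)
b = rng.standard_normal((n, n))
u = np.zeros((n, n))
for _ in range(10):
    u = v_cycle(u, b, h)
rel_res = np.linalg.norm(b - apply_A(u, h)) / np.linalg.norm(b)
print(f"relative residual after 10 V-cycles: {rel_res:.2e}")
```

An F-cycle differs only in the recursion pattern: after the first coarse-grid visit on each level it performs an additional V-cycle from that level, which costs more per cycle but usually reduces the residual further.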

Features

Flexibility of CuPy
CUDA-accelerated fused stencil operations
Warp-level and thread-level SpMV CUDA kernel optimisations
Supports arbitrary (even/odd) grid sizes
Single precision
CPU and GPU versions

Structure

├── BICGSTAB_L/ # Pre-conditioned bicgstab(1) solver and kernels

├── Multigrid/ # Base algebraic multigrid implementation, cycles and kernels

├── Laplacians/ # Example 2nd order Laplacian matrix

├── SparseApproximateInverse/ # Multigrid smoother and kernels

Example_AMG_PoissonProblem.py # Solve the Poisson problem on the CPU and GPU, and compare performance.

Example : Poisson Problem

📊 Performance (AMD Ryzen 9 9950X3D vs NVIDIA RTX 4090)

Takeaways

The CPU is expected to be competitive at very small problem sizes.
Below a grid size of $512^2$, kernel execution overhead dominates the solve time.
At larger problem sizes, the GPU becomes saturated and the kernel overhead is amortised.
At $4096^2$ cells, the GPU is ~45x faster than the CPU (80 ms per solve).

Considerations

Strictly for a Laplacian matrix on a regular grid, the stencil may be known ahead of time. The SpMV operations, e.g. in the smoother, could then be replaced with much faster stencil CUDA kernels.

While this would improve performance, it would also hard-code the problem. Keeping 'A' as a matrix is more flexible: the solver can then be applied to, for example, non-uniform Cartesian grids, modified stencils and other boundary conditions.
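The trade-off can be illustrated on the CPU with NumPy. The sketch below assembles the 5-point Laplacian in CSR form and applies it as a generic sparse matrix-vector product, then applies the same operator as a hard-coded stencil and checks that the two agree. The helper names (`laplacian_csr`, `spmv`, `stencil`) are hypothetical, unit grid spacing and zero Dirichlet boundaries are assumed, and none of this reflects the repository's CUDA kernels.

```python
import numpy as np

def laplacian_csr(n):
    """Assemble the n^2 x n^2 5-point Laplacian in CSR form (row-major cell ordering)."""
    data, indices, indptr = [], [], [0]
    for i in range(n):
        for j in range(n):
            row = [(i * n + j, -4.0)]
            if i > 0:
                row.append(((i - 1) * n + j, 1.0))
            if i < n - 1:
                row.append(((i + 1) * n + j, 1.0))
            if j > 0:
                row.append((i * n + j - 1, 1.0))
            if j < n - 1:
                row.append((i * n + j + 1, 1.0))
            for col, val in sorted(row):  # keep column indices ordered within the row
                indices.append(col)
                data.append(val)
            indptr.append(len(data))
    return np.array(data), np.array(indices), np.array(indptr)

def spmv(data, indices, indptr, x):
    """Generic CSR matrix-vector product; works for any matrix without empty rows."""
    return np.add.reduceat(data * x[indices], indptr[:-1])

def stencil(u):
    """Matrix-free application of the same operator; valid only for this stencil."""
    p = np.pad(u, 1)
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * u

n = 17  # an odd size, to show nothing relies on even dimensions
rng = np.random.default_rng(1)
u = rng.standard_normal((n, n))

data, indices, indptr = laplacian_csr(n)
y_matrix = spmv(data, indices, indptr, u.ravel())
y_stencil = stencil(u)
print("SpMV and stencil agree:", np.allclose(y_matrix, y_stencil.ravel()))
```

The matrix path reads values and column indices from memory for every nonzero, whereas the stencil path needs only the field itself. Changing a coefficient or a boundary condition, however, only requires editing the matrix entries in the first case, while the stencil function would have to be rewritten.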

Installation

Requirements

CUDA >= 11.0
Python >= 3.10
NumPy
Numba
CuPy
