Iterative CUDA

Iterative CUDA is a CUDA-based C++ package containing iterative solvers for sparse linear systems. To use it, you would:

assemble a matrix in memory in Compressed Sparse Row (CSR) format
feed it to Iterative CUDA, which computes a decomposition and copies it onto the GPU
call Iterative CUDA to solve Ax=b on that matrix. (It uses the Conjugate Gradient method, but can easily be extended to other methods.)
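
To make these steps concrete, here is a minimal CPU-only sketch of a CSR matrix and an unpreconditioned Conjugate Gradient loop. It is not Iterative CUDA's API: the names CsrMatrix, spmv, dot, conjugate_gradient, and example_matrix are hypothetical and chosen for illustration only. Everything runs on the host and no decomposition or GPU transfer takes place. See the example directory in the source distribution for real usage.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration; not Iterative CUDA's own type.
struct CsrMatrix {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into col_idx/values
    std::vector<std::size_t> col_idx;  // column index of each stored entry
    std::vector<double> values;        // value of each stored entry
};

// Sparse matrix-vector product y = A * x.
std::vector<double> spmv(const CsrMatrix& A, const std::vector<double>& x) {
    assert(A.row_ptr.size() == A.rows + 1);
    std::vector<double> y(A.rows, 0.0);
    for (std::size_t i = 0; i < A.rows; ++i)
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
            y[i] += A.values[k] * x[A.col_idx[k]];
    return y;
}

double dot(const std::vector<double>& a, const std::vector<double>& b) {
    assert(a.size() == b.size());
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Unpreconditioned Conjugate Gradient for a symmetric positive definite A,
// starting from x = 0 and stopping once the residual norm drops below tol.
std::vector<double> conjugate_gradient(const CsrMatrix& A,
                                       const std::vector<double>& b,
                                       double tol = 1e-10,
                                       std::size_t max_iter = 1000) {
    std::vector<double> x(A.rows, 0.0);
    std::vector<double> r = b;  // residual b - A*x with x = 0
    std::vector<double> p = r;  // initial search direction
    double rr = dot(r, r);
    for (std::size_t it = 0; it < max_iter && std::sqrt(rr) > tol; ++it) {
        const std::vector<double> Ap = spmv(A, p);
        const double alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const double rr_new = dot(r, r);
        const double beta = rr_new / rr;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return x;
}

// The 3x3 tridiagonal matrix [[4,-1,0],[-1,4,-1],[0,-1,4]] in CSR form.
CsrMatrix example_matrix() {
    return CsrMatrix{3,
                     {0, 2, 5, 7},
                     {0, 1, 0, 1, 2, 1, 2},
                     {4, -1, -1, 4, -1, -1, 4}};
}
```

Calling conjugate_gradient(example_matrix(), b) with b = spmv(example_matrix(), {1.0, 2.0, 3.0}) should recover a solution close to (1, 2, 3). The matrix-vector product and the dot products inside the loop are the operations a GPU solver would offload to the device.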

Iterative CUDA is based on the following excellent pieces of software:

CUDA Sparse Matrix-Vector Multiplication by Nathan Bell and Michael Garland
CUDA Parallel reduction by Mark Harris

The goal is to turn Iterative CUDA into “yet another solver library”, except that the solution is actually performed on the GPU (and hence faster than the CPU by a factor between five and ten).

Note: If you are a PyCUDA user, you need not worry: a more flexible version of this functionality is also available in recent development versions of PyCUDA.

Getting the Code

Iterative CUDA is licensed under the MIT/X11 Consortium license. Other software components contained in Iterative CUDA, as indicated above, have slightly different licenses.

Documentation

See the Wiki, which has build instructions. Usage examples are available in the source distribution under example.