[petsc-dev] Improving and stabilizing GPU support

Paul Mullowney paulm at txcorp.com
Fri Jul 19 14:35:44 CDT 2013


>> * Reduce CUSP dependency: The current elementary operations are mainly realized via CUSP. With better support via CUSPARSE and CUBLAS, I'd add a separate 'native' CUDA backend so that we can provide a full set of vector and sparse matrix operations out of the default NVIDIA toolchain. We will still keep CUSP for its preconditioners, yet we no longer depend on it.
Agreed. In the past, I've suggested a -vec_type cuda (not cusp). All the 
CUSP operations can be done with Thrust algorithms. Since Thrust ships 
with CUDA by default, CUDA would then be the only dependency.
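
As a rough illustration of the Thrust point above, here is a minimal sketch of two elementary vector operations (an AXPY and a dot product) written as generic algorithm calls. It uses the C++ standard library rather than Thrust so that it builds without a CUDA toolchain; Thrust provides thrust::transform and thrust::inner_product with the same iterator-based call shape, so swapping std::vector for thrust::device_vector is the assumed path to a device version. The function names axpy and dot are hypothetical and are not PETSc or CUSP API.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper (not PETSc API): y <- alpha * x + y.
// Assumption: a device version would call thrust::transform on
// thrust::device_vector iterators, with a functor or device lambda.
void axpy(double alpha, const std::vector<double>& x, std::vector<double>& y) {
    assert(x.size() == y.size());
    std::transform(x.begin(), x.end(), y.begin(), y.begin(),
                   [alpha](double xi, double yi) { return alpha * xi + yi; });
}

// Hypothetical helper (not PETSc API): returns sum_i x[i] * y[i].
// Assumption: a device version would call thrust::inner_product.
double dot(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size());
    return std::inner_product(x.begin(), x.end(), y.begin(), 0.0);
}
```

Both helpers are single algorithm calls over iterator ranges, which is why this kind of operation would not need anything beyond what ships with the CUDA toolkit.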
>> * Integrate last bits of txpetscgpu package. I assume Paul will provide a helping hand here.
Of course. This will go much faster, as much of the hard work is already 
done. Do people want support for different matrix formats in the CUSP 
classes, i.e. diagonal, ELLPACK, hybrid? I think the CUSP preconditioners 
can be derived from matrices stored in non-CSR formats (although they're 
likely just doing a conversion under the hood).
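
To make the "conversion under the hood" remark concrete, here is a minimal host-side sketch of a CSR to ELLPACK conversion. The struct and function names (Csr, Ell, csr_to_ell) are hypothetical, and the layout choices (column-major storage, -1 as the padding column index) are assumptions made for illustration; this is not the CUSP or txpetscgpu implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container (not a PETSc or CUSP type).
struct Csr {
    std::size_t rows;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<int> col;              // column index per nonzero
    std::vector<double> val;           // value per nonzero
};

// Hypothetical ELLPACK container: every row is padded to `width` entries.
// Assumption: column-major storage, so entry `slot` of row i lives at
// index slot * rows + i; padding slots hold column -1 and value 0.
struct Ell {
    std::size_t rows;
    std::size_t width;
    std::vector<int> col;
    std::vector<double> val;
};

Ell csr_to_ell(const Csr& a) {
    assert(a.row_ptr.size() == a.rows + 1);

    // The ELLPACK width is the largest number of nonzeros in any row.
    std::size_t width = 0;
    for (std::size_t i = 0; i < a.rows; ++i)
        width = std::max(width, a.row_ptr[i + 1] - a.row_ptr[i]);

    Ell e{a.rows, width,
          std::vector<int>(a.rows * width, -1),
          std::vector<double>(a.rows * width, 0.0)};

    // Copy each row's nonzeros into its leading slots; the rest stay padded.
    for (std::size_t i = 0; i < a.rows; ++i) {
        for (std::size_t k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            const std::size_t slot = k - a.row_ptr[i];
            e.col[slot * a.rows + i] = a.col[k];
            e.val[slot * a.rows + i] = a.val[k];
        }
    }
    return e;
}
```

The padding cost grows with the longest row, which is the usual motivation for a hybrid format that keeps the regular part in ELLPACK and the overflow entries in a separate list.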
>> * Better ViennaCL bindings: The OpenCL version of VecMDot() will experience a boost with the ViennaCL 1.5.0 release; the CUDA version was fixed a couple of months back. Also, VecCopySome() will get improved in order to provide better MPI performance (similar to what Paul applied for CUSPARSE).
>>
>> * Documentation: Add a chapter on GPUs to the manual, particularly on what to expect and what not to expect. Update documentation on webpage regarding installation.
I will help with the manual.
>> * Integration of FEM quadrature from SNES ex52. The CUDA part requiring code generation is not very elegant, while the OpenCL approach is better suited for a library integration thanks to JIT. However, this requires user code to be provided as a string (again not very elegant) or loaded from file (more reasonable). How much FEM functionality do we want to provide via PETSc?
Multi-GPU is a highly pressing need, IMO. We need to figure out how to 
make Block Jacobi and ASM run efficiently in that setting.

-Paul
>> Please don't hesitate to post other GPU wishes. Now is the best time for doing so :-)
>>
>> Best regards,
>> Karli



