[petsc-users] GPU local direct solve of penta-diagonal
Karl Rupp
rupp at mcs.anl.gov
Thu Dec 12 15:59:41 CST 2013
Hi,
> Yes, each MPI process is responsible for solving a system of
> nonlinear equations on a number of grid cells.
>
>
> Just to elaborate, and Ed can correct me, each MPI process has a few 100
> to a few 1000 (spatial) cells. We solve a (Fokker-Planck) system in
> velocity space at each grid cell.
Thanks, Mark, this helps. Is there any chance you can collect a couple
of spatial cells together and solve a bigger system consisting of
decoupled subsystems?
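To illustrate the idea (this is a sketch in NumPy/SciPy, not PETSc code): if every cell's velocity-space system is penta-diagonal and the same size, the systems of many cells can be stacked into one block-diagonal matrix that is itself penta-diagonal, so a single large banded solve replaces many tiny ones. The sizes `n_cells` and `n` below are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_banded

rng = np.random.default_rng(0)
n_cells, n = 100, 50  # assumed: 100 cells, one 50x50 penta-diagonal system each

def random_penta_ab(n):
    """One diagonally dominant penta-diagonal system in SciPy's banded
    'ab' layout: 5 rows holding the diagonals at offsets +2, +1, 0, -1, -2."""
    ab = rng.random((5, n))
    ab[2] += 10.0        # strong main diagonal -> well conditioned
    ab[0, :2] = 0.0      # +2 diagonal starts at column 2
    ab[1, :1] = 0.0      # +1 diagonal starts at column 1
    ab[3, -1:] = 0.0     # -1 diagonal ends one column early
    ab[4, -2:] = 0.0     # -2 diagonal ends two columns early
    return ab

blocks = [random_penta_ab(n) for _ in range(n_cells)]

# Concatenating the banded blocks column-wise yields the banded storage of
# the block-diagonal matrix: the zeroed band edges ensure no coupling
# between neighboring cells' blocks.
ab = np.concatenate(blocks, axis=1)
b = rng.random(n_cells * n)

# One big banded solve instead of n_cells small ones.
x = solve_banded((2, 2), ab, b)

# Sanity check against per-cell solves.
x_ref = np.concatenate([solve_banded((2, 2), blk, b[i * n:(i + 1) * n])
                        for i, blk in enumerate(blocks)])
assert np.allclose(x, x_ref)
```

The same aggregation applies on the GPU side: one kernel launch (or one factorization) over the combined system amortizes launch and PCI-Express overhead that would otherwise be paid once per cell.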
Ideally you want more than 100k dofs for GPUs to perform well. Have a
look at this figure (cross-over at about 10k dofs for CUDA):
http://viennacl.sourceforge.net/uploads/pics/cg-timings.png
to get an idea of how GPU solves saturate at smaller system sizes.
PCI-Express latency is the limiting factor here.
Best regards,
Karli