[petsc-dev] Supporting OpenCL matrix assembly

Jed Brown jedbrown at mcs.anl.gov
Tue Sep 24 04:45:08 CDT 2013


Karl Rupp <rupp at mcs.anl.gov> writes:

>>> This can obviously be done incrementally, so storing a batch of
>>> element matrices to global memory is not a problem.
>>
>> If you store element matrices to global memory, you're using a ton of
>> bandwidth (about 20x the size of the matrix if using P1 tets).
>>
>> What if you do the sort/reduce thing within thread blocks, and only
>> write the reduced version to global storage?
>
> My primary metric for GPU kernels is memory transfers from global memory 
> ('flops are free'), hence what I suggest for the assembly stage is to go 
> with something CSR-like rather than COO. Pure CSR may be too expensive 
> in terms of element lookup if there are several fields involved 
> (particularly 3d), so one could push (column-index, value) pairs for 
> each row, making the merge-by-key much cheaper than for arbitrary COO 
> matrices.

I think CSR vs. COO is a second-order optimization to be considered
after the 20x redundancy has been eliminated and a synchronization
strategy has been chosen (e.g., coloring vs. redundant storage and later
compression).
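
To make this concrete, here is a rough sketch of the kind of kernel I
have in mind (CUDA for brevity; the same structure maps directly to an
OpenCL kernel with barrier() in place of __syncthreads()).  Each block
computes a small batch of P1-tet element matrices, keeps all
(row, col, value) contributions in shared memory, sorts them by key,
and reduces duplicates, so only the compacted pairs ever touch global
memory.  Everything here (the cells layout, the all-ones element
matrix, the output arrays) is purely illustrative, not PETSc or
ViennaCL API.

  #include <stdio.h>
  #include <cuda_runtime.h>

  #define ELEMS_PER_BLOCK 32
  #define NPE 4                               /* nodes per P1 tet        */
  #define PAIRS (ELEMS_PER_BLOCK * NPE * NPE) /* raw contributions/block */

  __global__ void assemble_batch(const int *cells, int ncells,
                                 long long *out_key, double *out_val,
                                 int *out_count)
  {
    __shared__ long long key[PAIRS];
    __shared__ double    val[PAIRS];
    int e0 = blockIdx.x * ELEMS_PER_BLOCK;

    /* 1. Emit the 16 (row, col, value) contributions of each element.
          The element matrix is faked as all-ones; a real kernel would
          evaluate the weak form from the vertex coordinates here. */
    for (int e = threadIdx.x; e < ELEMS_PER_BLOCK; e += blockDim.x) {
      for (int a = 0; a < NPE; ++a)
        for (int b = 0; b < NPE; ++b) {
          int p = (e * NPE + a) * NPE + b;
          if (e0 + e < ncells) {
            long long r = cells[(e0 + e) * NPE + a];
            long long c = cells[(e0 + e) * NPE + b];
            key[p] = (r << 32) | c;
            val[p] = 1.0;                     /* placeholder value    */
          } else {
            key[p] = -1;                      /* padding, sorts first */
            val[p] = 0.0;
          }
        }
    }
    __syncthreads();

    /* 2. Block-local odd-even transposition sort by (row, col) key. */
    for (int phase = 0; phase < PAIRS; ++phase) {
      for (int i = threadIdx.x; i < PAIRS / 2; i += blockDim.x) {
        int j = 2 * i + (phase & 1);
        if (j + 1 < PAIRS && key[j] > key[j + 1]) {
          long long tk = key[j]; key[j] = key[j + 1]; key[j + 1] = tk;
          double    tv = val[j]; val[j] = val[j + 1]; val[j + 1] = tv;
        }
      }
      __syncthreads();
    }

    /* 3. The first thread of each run of equal keys sums the run and
          writes one compacted pair, so only the reduced data ever
          reaches global memory. */
    for (int i = threadIdx.x; i < PAIRS; i += blockDim.x)
      if (key[i] >= 0 && (i == 0 || key[i] != key[i - 1])) {
        double s = 0.0;
        for (int j = i; j < PAIRS && key[j] == key[i]; ++j) s += val[j];
        int slot = atomicAdd(out_count, 1);
        out_key[slot] = key[i];
        out_val[slot] = s;
      }
  }

  int main(void)
  {
    int h_cells[2 * NPE] = {0, 1, 2, 3,  1, 2, 3, 4}; /* shared face */
    int *d_cells, *d_count;
    long long *d_key; double *d_val;
    cudaMalloc(&d_cells, sizeof h_cells);
    cudaMalloc(&d_key, PAIRS * sizeof(long long));
    cudaMalloc(&d_val, PAIRS * sizeof(double));
    cudaMalloc(&d_count, sizeof(int));
    cudaMemcpy(d_cells, h_cells, sizeof h_cells, cudaMemcpyHostToDevice);
    cudaMemset(d_count, 0, sizeof(int));
    assemble_batch<<<1, 128>>>(d_cells, 2, d_key, d_val, d_count);
    int n;
    cudaMemcpy(&n, d_count, sizeof(int), cudaMemcpyDeviceToHost);
    printf("reduced %d raw pairs to %d unique entries\n", 2 * NPE * NPE, n);
    return 0;
  }

For the two tets above, the 32 raw contributions compact to 23 unique
entries before the single pass over global memory.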

> This, of course, requires knowledge of the nonzero pattern and the 
> couplings among elements, yet these are reasonably cheap to extract 
> for a large class of problems (for example, (non)linear PDEs without 
> adaptivity). Also, the nonzero pattern is rather cheap to obtain if 
> one uses coloring to avoid expensive atomic writes to global memory.

At this point, I don't mind having the nonzero pattern set ahead of time
using CPU code.  It's reassembly in time-dependent problems with no or
only occasional adaptivity that I'm more concerned with.
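
For that reassembly case, here is a rough sketch of the workflow I have
in mind (again CUDA, and purely illustrative: elem_dst, the color
lists, and the placeholder element matrix are assumptions, not existing
PETSc API).  The CPU builds the CSR pattern and a per-element scatter
map once; every time step the GPU only rewrites values into the fixed
pattern, one launch per color so that no atomics are needed:

  /* Values-only reassembly into a fixed CSR pattern.  elem_dst[16*e+k]
     is the offset in csr_val where contribution k of element e lands;
     it is computed once on the CPU together with the pattern.
     Elements of one color share no matrix entries, so plain
     (non-atomic) updates are safe. */
  __global__ void reassemble_color(const int *color_elems, int nelems,
                                   const int *elem_dst,
                                   const double *coeff, double *csr_val)
  {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= nelems) return;
    int e = color_elems[i];
    double Ke[16];
    /* placeholder element matrix; a real kernel evaluates the weak
       form from the geometry and time-dependent coefficient here */
    for (int k = 0; k < 16; ++k) Ke[k] = coeff[e];
    /* scatter straight into the CSR value array: no element matrix is
       ever stored to global memory, and the coloring prevents races */
    for (int k = 0; k < 16; ++k) csr_val[elem_dst[16 * e + k]] += Ke[k];
  }

  /* each step: zero the values, then one launch per color, e.g.
     cudaMemset(csr_val, 0, nnz * sizeof(double));
     for (int c = 0; c < ncolors; ++c)
       reassemble_color<<<(n_in_color[c] + 255) / 256, 256>>>(
           elems_of_color[c], n_in_color[c], elem_dst, coeff, csr_val); */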