[petsc-dev] Supporting OpenCL matrix assembly

Matthew Knepley knepley at gmail.com
Tue Sep 24 09:37:06 CDT 2013


On Tue, Sep 24, 2013 at 7:07 AM, Karl Rupp <rupp at mcs.anl.gov> wrote:

> Hey,
>
>
> On 09/24/2013 03:53 PM, Jed Brown wrote:
>
>> Karl Rupp <rupp at mcs.anl.gov> writes:
>>
>>> I'm not talking about CSR vs. COO from the SpMV point of view, but
>>> rather on how to store the actual data in global memory without
>>> expensive subsequent sorts.
>>>
>>
>> Sure, but this seems like such a minor detail.  With PetscScalar=double
>> and PetscInt=int, we have 16 bytes/entry for COO and (nominally) 12
>> bytes/entry for CSR, and it only needs to go to GPU global memory and
>> back, not across to the CPU.  I doubt the difference between 12 and 16
>> bytes/entry during assembly is a bottleneck.
>>
>
> I'm not worried about 12 bytes vs. 16 bytes, but rather about the ordering
> of entries as a whole. If one assembles into something CSR-like, then one
> can either run the SpMV right away, or merge entries in each row of the
> matrix which have the same column indices. Merging such entries can usually
> be done in shared memory, so the memory cost is one read and one write of
> the matrix nonzero entries in the worst case.
>
> In contrast, if everything is assembled into a general COO format, then
> one needs to sort the triplets by row first to be able to run SpMVs at
> all. The memory transactions required for this are O(N log(N)), with N
> being the number of nonzeros. N is in almost all cases larger than 10^6,
> so the log(N) hurts...
>

Here I believe strongly that we need tests. Nathan assured me that nothing
is faster on the GPU than sort+reduce-by-key, since these primitives are
highly optimized. I think they will be hard to beat, and my initial timings
bear this out. I am willing to be wrong, but I am not willing to
overengineer based on supposition.
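
To make the comparison concrete, here is a minimal CPU-side sketch of what
sort+reduce-by-key does during assembly: sort the COO triplets by their
(row, col) key, sum the values that share a key, and build CSR row pointers
with a prefix sum. It is plain Python for illustration only, not GPU code and
not PETSc's implementation. The function name assemble_coo_to_csr is
hypothetical. On a GPU the two main steps would presumably map onto library
primitives such as Thrust's sort_by_key and reduce_by_key, which is an
assumption, since the thread does not name a library.

```python
from itertools import groupby
from operator import itemgetter


def assemble_coo_to_csr(n_rows, rows, cols, vals):
    """Turn unsorted COO triplets (duplicates allowed) into CSR arrays.

    Hypothetical helper for illustration; returns (indptr, indices, data).
    """
    key = itemgetter(0, 1)

    # Step 1 (sort): order the triplets by their (row, col) key.
    # This is the O(N log N) step discussed in the thread.
    triplets = sorted(zip(rows, cols, vals), key=key)

    # Step 2 (reduce-by-key): sum values of consecutive equal keys,
    # counting the unique entries that land in each row.
    indptr = [0] * (n_rows + 1)
    indices, data = [], []
    for (r, c), group in groupby(triplets, key=key):
        indices.append(c)
        data.append(sum(v for _, _, v in group))
        indptr[r + 1] += 1

    # Step 3 (prefix sum): per-row counts become CSR row pointers.
    for r in range(n_rows):
        indptr[r + 1] += indptr[r]

    return indptr, indices, data
```

For example, the triplets (0,0,1.0), (0,1,2.0), (1,0,3.0), (1,1,4.0),
(0,1,5.0) collapse to four stored entries, with the two contributions to
entry (0,1) summed. Whether this beats assembling directly into a CSR-like
layout on a real GPU is exactly what the timings would have to show.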

  Thanks,

       Matt


> Best regards,
> Karli
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener