[petsc-dev] Proper matrix size to choose when evaluating MatMult?

Fri Feb 21 18:40:42 CST 2020

I think Karl goes into these issues here:
https://arxiv.org/pdf/1410.4054.pdf

  Thanks,

    Matt

On Fri, Feb 21, 2020 at 5:58 PM Junchao Zhang via petsc-dev <
petsc-dev at mcs.anl.gov> wrote:

>
>
> On Fri, Feb 21, 2020 at 4:38 PM Mark Adams <mfadams at lbl.gov> wrote:
>
>>
>>
>> On Fri, Feb 21, 2020 at 4:51 PM Junchao Zhang via petsc-dev <
>> petsc-dev at mcs.anl.gov> wrote:
>>
>>> Hello,
>>>
>>> I want to evaluate MatMult on GPU.  I took a 2M x 2M matrix and ran with
>>> 6 mpi ranks and 6 GPUs.  It took about 0.9 seconds.  A kernel launch or a
>>> stream synchronization took about 10us.
>>>
>>
>> Your call, but you should run the code once and then run it again under a
>> new timer. I've seen some big "warm-up costs" on GPUs today.
>>
>
> Yes, I usually run hundreds of iterations and skip the first few. Thanks.
>
>>
>>
>>> Compared with MatMult, they are tiny. Does it mean we can ignore them?
>>> What is a proper size to evaluate MatMult?
>>>
>>
>> It depends on the purpose/audience for the study. There is no right size
>> other than being much larger than the launch cost, perhaps.
>>
>>
>>> I heard it is a few thousand rows per MPI rank.  Why?
>>> Thanks.
>>> --Junchao Zhang
>>>
>>
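
The two points in the quoted exchange (skip the warm-up iterations, and pick
a size whose run time is much larger than the launch cost) can be sketched
roughly as follows. This is only an illustration in plain Python, not PETSc
code: time_steady_state, overhead_fraction, and the stand-in workload are
hypothetical names, and the 1 ms per-call time is invented. Only the 10 us
launch/synchronization cost comes from the thread. On a real GPU, kernel
launches are asynchronous, so the device would need to be synchronized before
each timer reading.

```python
import time


def time_steady_state(fn, iters=100, warmup=5):
    """Return mean seconds per call of fn, excluding the first `warmup` calls."""
    if iters <= warmup:
        raise ValueError("iters must exceed warmup")
    for _ in range(warmup):
        fn()  # untimed: absorbs one-time setup ("warm-up") costs
    start = time.perf_counter()
    for _ in range(iters - warmup):
        fn()
    return (time.perf_counter() - start) / (iters - warmup)


def overhead_fraction(launch_s, total_s):
    """Fraction of one call's time taken by a fixed per-call overhead."""
    return launch_s / total_s


if __name__ == "__main__":
    # Hypothetical stand-in for the operation being measured (e.g. MatMult).
    work = lambda: sum(range(10000))
    per_call = time_steady_state(work, iters=200, warmup=10)
    print(f"mean time per call: {per_call:.3e} s")
    # 10 us is the launch/sync cost quoted above; 1 ms per call is assumed.
    print(f"overhead fraction: {overhead_fraction(10e-6, 1e-3):.3f}")
```

If the fraction is small, the fixed launch cost can reasonably be ignored at
that problem size; as the per-rank size shrinks, the fraction grows and the
measurement starts to reflect latency rather than MatMult throughput.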

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>