<meta http-equiv="Content-Type" content="text/html; charset=utf-8"><div dir="ltr"><div dir="ltr"><br><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Feb 22, 2020 at 11:05 PM Karl Rupp &lt;<a href="mailto:rupp@iue.tuwien.ac.at">rupp@iue.tuwien.ac.at</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Junchao,<br>
<br>
> I want to evaluate MatMult on GPU. I took a 2M x 2M matrix and ran with <br>
> 6 mpi ranks and 6 GPUs. It took about 0.9 seconds. <br>
<br>
How many nonzeros per row? With 0.9 seconds you should either have many <br>
runs of MatMult, or a fairly dense matrix; or a really slow MatMult <br>
kernel ;-)<br></blockquote><div>I had a typo. It should be 0.9e-3 seconds. I ran with 6 GPUs and 6 MPI ranks. The matrix has about 100 nonzeros per row. 2M x 2M is the whole matrix size. Thanks for the explanation.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
A 2M-by-2M matrix for a 5-point stencil is probably still on the small <br>
side (I'm assuming that you run 2M-by-2M for *each* GPU), but should <br>
suffice. Expect communication costs to be significant (i.e. the <br>
bookkeeping and data exchange between GPUs are on the order of the cost <br>
of running the MatMult kernel for the respective diagonal block).<br>
<br>
<br>
> A kernel launch or <br>
> a stream synchronization took about 10us. Compared with MatMult, they <br>
> are tiny. Does it mean we can ignore them? What is a proper size to <br>
> evaluate MatMult? I heard it is a few thousand rows per MPI rank. Why?<br>
<br>
That would be a typical strong scaling limit for a CPU-based run on a <br>
well-tuned BlueGene-type system. With GPUs you will probably need at <br>
least 100k unknowns (or ~1M nonzeros) per rank in the strong scaling <br>
limit. Add a factor of ~10 to make latency costs small in comparison.<br>
<br>
Best regards,<br>
Karli<br>
</blockquote></div></div>
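<div dir="ltr"><br>As a rough sanity check on the numbers in this thread, here is a small back-of-envelope sketch in Python. It assumes that 0.9e-3 seconds is the time of a single MatMult, that the matrix is split evenly over the 6 ranks, that each nonzero costs 2 flops and moves 12 bytes (an 8-byte double plus a 4-byte column index), and that vector and communication traffic can be ignored. The function names are made up for illustration and are not part of PETSc.<br></div>

```python
# Back-of-envelope estimate for the MatMult timing quoted in this thread.
# All inputs come from the emails; everything marked "assumption" does not.

ROWS = 2_000_000        # global matrix size (2M x 2M)
NNZ_PER_ROW = 100       # about 100 nonzeros per row
RANKS = 6               # 6 MPI ranks, one GPU each
T_MATMULT = 0.9e-3      # seconds; assumption: time of ONE MatMult
T_LAUNCH = 10e-6        # seconds; kernel launch or stream synchronization

# Assumption: double-precision values with 32-bit column indices.
BYTES_PER_NNZ = 8 + 4


def matmult_estimate(rows, nnz_per_row, ranks, seconds):
    """Per-rank size and throughput, assuming an even row distribution."""
    nnz = rows * nnz_per_row
    flops = 2 * nnz                      # one multiply and one add per nonzero
    bytes_moved = nnz * BYTES_PER_NNZ    # matrix data only, vectors ignored
    return {
        "rows_per_rank": rows // ranks,
        "nnz_per_rank": nnz // ranks,
        "gflops_per_rank": flops / seconds / ranks / 1e9,
        "gbytes_per_s_per_rank": bytes_moved / seconds / ranks / 1e9,
    }


def latency_fraction(t_latency, t_kernel):
    """Share of the kernel time taken by one launch or synchronization."""
    return t_latency / t_kernel


est = matmult_estimate(ROWS, NNZ_PER_ROW, RANKS, T_MATMULT)
frac = latency_fraction(T_LAUNCH, T_MATMULT)

print(f"rows per rank:       {est['rows_per_rank']}")
print(f"nonzeros per rank:   {est['nnz_per_rank']}")
print(f"GFlop/s per rank:    {est['gflops_per_rank']:.1f}")
print(f"GB/s per rank:       {est['gbytes_per_s_per_rank']:.1f}")
print(f"latency / MatMult:   {frac:.1%}")
```

<div dir="ltr">Under these assumptions each rank holds about 333k rows and 33M nonzeros, which is above the ~100k unknowns (or ~1M nonzeros) per rank mentioned above even after the extra factor of ~10, and a single 10us launch or synchronization comes out at roughly 1% of the MatMult time. The estimate says nothing about the communication cost between GPUs, which may well be significant.<br></div>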