[petsc-users] Poor speed up for KSP example 45

Wed Mar 25 13:08:56 CDT 2020

Thank you, Matt and Mark, for the explanation. That makes sense. Please
correct me if I'm wrong: instead of requesting a whole node with 32 cores,
if I request more nodes, say 4 or 8, each with 8 cores, then I should see
much better speedups. Is that correct?
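
To make the reasoning behind that question concrete, here is a rough sketch
of a bandwidth-bound speedup model. It assumes that runtime scales only with
usable memory bandwidth and ignores inter-node communication entirely. The
function name and the `saturating_cores` value are hypothetical; 9 is chosen
only to mimic the ~9X plateau reported below, not measured on any machine.

```python
def bandwidth_bound_speedup(nodes, cores_per_node, saturating_cores):
    """Idealized speedup over one core when memory bandwidth is the only limit.

    saturating_cores is a hypothetical, machine-specific value: the number of
    cores needed to saturate one node's memory bandwidth. Adding cores beyond
    that on the same node gives no further gain in this model.
    """
    useful_cores_per_node = min(cores_per_node, saturating_cores)
    return nodes * useful_cores_per_node


# Assumption: about 9 cores saturate one node (illustrative only).
one_full_node = bandwidth_bound_speedup(nodes=1, cores_per_node=32, saturating_cores=9)
four_partial_nodes = bandwidth_bound_speedup(nodes=4, cores_per_node=8, saturating_cores=9)

print("1 node  x 32 cores:", one_full_node)
print("4 nodes x  8 cores:", four_partial_nodes)
```

Under these assumptions the same 32 processes spread over 4 nodes are no
longer limited by a single node's bandwidth, but a real run would also pay
for communication between nodes, so the actual gain would be smaller.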

On Wed, Mar 25, 2020 at 2:04 PM Mark Adams <mfadams at lbl.gov> wrote:

> I would guess that you are saturating the memory bandwidth. After you make
> PETSc (make all) it will suggest that you test it (make test) and suggest
> that you run streams (make streams).
>
> I see Matt answered but let me add that when you make streams you will
> see the memory rate for 1, 2, 3, ... NP processes. If your machine is decent
> you should see very good speed up at the beginning and then it will start
> to saturate. You are seeing about 50% of perfect speedup at 16 processes. I
> would expect that you will see something similar with streams. Without
> knowing your machine, your results look typical.
>
> On Wed, Mar 25, 2020 at 1:05 PM Amin Sadeghi <aminthefresh at gmail.com>
> wrote:
>
>> Hi,
>>
>> I ran KSP example 45 on a single node with 32 cores and 125GB memory
>> using 1, 16 and 32 MPI processes. Here's a comparison of the time spent
>> during KSP.solve:
>>
>> - 1 MPI process: ~98 sec, speedup: 1X
>> - 16 MPI processes: ~12 sec, speedup: ~8X
>> - 32 MPI processes: ~11 sec, speedup: ~9X
>>
>> Since the problem size is large enough (8M unknowns), I expected a
>> speedup much closer to 32X, rather than 9X. Is this expected? If yes, how
>> can it be improved?
>>
>> I've attached three log files for more details.
>>
>> Sincerely,
>> Amin
>>
>
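
For reference, the speedup and parallel efficiency implied by the timings
quoted above can be computed with a few lines of Python. This is only a
sketch: the times are the approximate values from the original message, and
the function name is made up for illustration.

```python
# Approximate KSP.solve times (seconds) from the original message.
timings = {1: 98.0, 16: 12.0, 32: 11.0}


def speedup_and_efficiency(timings):
    """Return {nprocs: (speedup, efficiency)} relative to the 1-process run."""
    t1 = timings[1]
    return {p: (t1 / t, t1 / t / p) for p, t in sorted(timings.items())}


results = speedup_and_efficiency(timings)
for p, (s, e) in results.items():
    print(f"{p:>2} procs: speedup {s:4.1f}x, efficiency {e:5.1%}")
```

With these numbers the efficiency is roughly 51% at 16 processes and roughly
28% at 32, which matches the "about 50% of perfect speedup" estimate above.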