[petsc-users] Strange strong scaling result
Ce Qin
qince168 at gmail.com
Tue Jul 12 10:32:11 CDT 2022
For your reference, I also calculated the speedups for other procedures:
NProcessors  NNodes  CoresPerNode  VecAXPY    MatMult    SetupAMS   PCApply    Assembly   Solving
1            1       1             1.0        1.0        1.0        1.0        1.0        1.0
2            1       2             1.640502   1.945753   1.418709   1.898884   1.995246   1.898756
2            2       1             2.297125   2.614508   1.600718   2.419798   2.121401   2.436149
4            1       4             4.456256   6.821532   3.614451   5.991256   4.658187   6.004539
4            2       2             4.539748   6.779151   3.619661   5.926112   4.666667   5.942085
4            4       1             4.480902   7.210629   3.471541   6.082946   4.65272    6.101214
8            2       4             10.584189  17.519901  8.59046    16.615395  9.380985   16.581135
8            4       2             10.980687  18.674113  8.612347   17.273229  9.308575   17.258891
8            8       1             11.096298  18.210245  8.456557   17.430586  9.314449   17.380612
16           2       8             21.929795  37.04392   18.135278  34.5448    18.575953  34.483058
16           4       4             22.00331   39.581504  18.011148  34.793732  18.745129  34.854409
16           8       2             22.692779  41.38289   18.354949  36.388144  18.828393  36.45509
32           4       8             43.935774  80.003087  34.963997  70.085728  37.140626  70.175879
32           8       4             44.387091  80.807608  35.62153   71.471289  37.166421  71.533865
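
Dividing each speedup by the process count gives the parallel efficiency, which makes the superlinear entries easier to spot. Below is a minimal sketch in Python using a few MatMult and SetupAMS values copied from the table; the helper name `parallel_efficiency` is only illustrative and is not part of PETSc.

```python
def parallel_efficiency(speedup, nprocs):
    """Return speedup divided by process count; values above 1.0 are superlinear."""
    return speedup / nprocs

# (NProcessors, NNodes, CoresPerNode) -> (MatMult speedup, SetupAMS speedup),
# copied from the speedup table.
rows = {
    (2, 1, 2): (1.945753, 1.418709),
    (8, 2, 4): (17.519901, 8.59046),
    (32, 8, 4): (80.807608, 35.62153),
}

for (nprocs, nnodes, cores), (matmult, setup_ams) in rows.items():
    print(f"{nprocs:3d} procs on {nnodes} node(s): "
          f"MatMult eff = {parallel_efficiency(matmult, nprocs):.2f}, "
          f"SetupAMS eff = {parallel_efficiency(setup_ams, nprocs):.2f}")
```

An efficiency well above 1.0, as MatMult shows at 8 and 32 processes, is what one would expect if a single process cannot use the full memory bandwidth of a node, though cache effects could contribute as well.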
Here is the streams result on the compute node:
1 8291.4887 Rate (MB/s)
2 8739.3219 Rate (MB/s) 1.05401
3 24769.5868 Rate (MB/s) 2.98735
4 31962.0242 Rate (MB/s) 3.8548
5 39603.8828 Rate (MB/s) 4.77645
6 47777.7385 Rate (MB/s) 5.76226
7 54557.5363 Rate (MB/s) 6.57994
8 62769.3910 Rate (MB/s) 7.57034
9 38649.9160 Rate (MB/s) 4.6614
10 58976.9536 Rate (MB/s) 7.11295
11 48108.7801 Rate (MB/s) 5.80219
12 49506.8213 Rate (MB/s) 5.9708
13 54810.5266 Rate (MB/s) 6.61046
14 62471.5234 Rate (MB/s) 7.53441
15 63968.0218 Rate (MB/s) 7.7149
16 69644.8615 Rate (MB/s) 8.39956
17 60791.9544 Rate (MB/s) 7.33185
18 65476.5162 Rate (MB/s) 7.89683
19 60127.0683 Rate (MB/s) 7.25166
20 72052.5175 Rate (MB/s) 8.68994
21 62045.7745 Rate (MB/s) 7.48307
22 64517.7771 Rate (MB/s) 7.7812
23 69570.2935 Rate (MB/s) 8.39057
24 69673.8328 Rate (MB/s) 8.40305
25 75196.7514 Rate (MB/s) 9.06915
26 72304.2685 Rate (MB/s) 8.7203
27 73234.1616 Rate (MB/s) 8.83245
28 74041.3842 Rate (MB/s) 8.9298
29 77117.3751 Rate (MB/s) 9.30079
30 78293.8496 Rate (MB/s) 9.44268
31 81377.0870 Rate (MB/s) 9.81453
32 84097.0813 Rate (MB/s) 10.1426
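
The last column of the streams output is the rate relative to the single-process rate. As a rough sketch, the same ratio and the bandwidth left per process can be recomputed from a few of the rates above. This only describes one node; the multi-node runs in the speedup table are not covered by it.

```python
# Rates in MB/s copied from the streams output (process count -> rate).
rates = {1: 8291.4887, 2: 8739.3219, 8: 62769.3910, 16: 69644.8615, 32: 84097.0813}

base = rates[1]
for nprocs, rate in rates.items():
    speedup = rate / base       # matches the last column of the streams output
    per_proc = rate / nprocs    # aggregate bandwidth divided among the processes
    print(f"{nprocs:2d} procs: speedup {speedup:6.3f}, {per_proc:8.1f} MB/s per process")
```

Read this way, the aggregate rate on the node keeps growing up to 32 processes, while the share available to each process drops well below the single-process rate.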
Best,
Ce
On Tue, Jul 12, 2022 at 22:11, Mark Adams <mfadams at lbl.gov> wrote:
> You may get more memory bandwidth with 32 processors vs 1, as Ce mentioned.
> Depends on the architecture.
> Do you get the whole memory bandwidth on one processor on this machine?
>
> On Tue, Jul 12, 2022 at 8:53 AM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Tue, Jul 12, 2022 at 7:32 AM Ce Qin <qince168 at gmail.com> wrote:
>>
>>>
>>>
>>>>>> The linear system is complex-valued. We rewrite it into its real form
>>>>>> and solve it using FGMRES and an optimal block-diagonal
>>>>>> preconditioner.
>>>>>> We use CG and the AMS preconditioner implemented in HYPRE to solve the
>>>>>> smaller real linear system arising from applying the block
>>>>>> preconditioner.
>>>>>> The iteration counts of FGMRES and CG stay almost constant in all the
>>>>>> runs.
>>>>>>
>>>>>
>>>>> So those blocks decrease in size as you add more processes?
>>>>>
>>>>>
>>>>
>>> I am sorry for the unclear description of the block-diagonal
>>> preconditioner.
>>> Let K be the original complex system matrix; A = [Kr, -Ki; -Ki, -Kr] is
>>> the equivalent
>>> real form of K. Let P = [Kr+Ki, 0; 0, Kr+Ki]; it can be proved that P is
>>> an optimal
>>> preconditioner for A. In our implementation, only Kr, Ki and Kr+Ki
>>> are explicitly stored as MATMPIAIJ. We use MATSHELL to represent A and P.
>>> We use FGMRES + P to solve Ax=b, and CG + AMS to
>>> solve (Kr+Ki)y=c. So the block size never changes.
>>>
>>
>> Then we have to break down the timings further. I suspect AMS is not
>> taking as long, since
>> all other operations scale like N.
>>
>> Thanks,
>>
>> Matt
>>
>>
>>
>>> Best,
>>> Ce
>>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>
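
The real form quoted above can be checked on a small example. The following is a sketch in plain Python, not the PETSc MATSHELL implementation: it applies A = [Kr, -Ki; -Ki, -Kr] using only products with Kr and Ki and compares the result with the complex product K x. The 2x2 data are hypothetical, and the right-hand-side convention [Re(b); -Im(b)] is an assumption that follows from taking the unknown as [xr; xi]; the thread does not state it.

```python
def matvec(M, v):
    """Dense matrix-vector product for small list-of-lists matrices."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def apply_A(Kr, Ki, xr, xi):
    """Apply A = [Kr, -Ki; -Ki, -Kr] to [xr; xi] without forming A."""
    top = [a - b for a, b in zip(matvec(Kr, xr), matvec(Ki, xi))]
    bottom = [-a - b for a, b in zip(matvec(Ki, xr), matvec(Kr, xi))]
    return top, bottom

# Hypothetical 2x2 test data, not taken from the thread.
Kr = [[4.0, 1.0], [1.0, 3.0]]
Ki = [[2.0, 0.5], [0.5, 1.0]]
x = [1.0 + 2.0j, -1.0 + 0.5j]

# Reference: complex product b = K x with K = Kr + i*Ki.
K = [[complex(r, i) for r, i in zip(rr, ri)] for rr, ri in zip(Kr, Ki)]
b = matvec(K, x)

top, bottom = apply_A(Kr, Ki, [z.real for z in x], [z.imag for z in x])

# With unknown [xr; xi], this A reproduces the right-hand side [Re(b); -Im(b)].
assert all(abs(t - z.real) < 1e-12 for t, z in zip(top, b))
assert all(abs(s + z.imag) < 1e-12 for s, z in zip(bottom, b))
```

Applying P in the same style would need two solves with Kr+Ki per application, one for each diagonal block, which is where the inner CG + AMS solve described in the quoted message enters.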