[petsc-users] Parallel processes run significantly slower

Junchao Zhang junchao.zhang at gmail.com
Fri Jan 12 09:41:39 CST 2024


Hi Steffen,
  Might this be an MPI process binding issue?  Could you try running with

mpiexec --bind-to core -n N python parallel_example.py
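
If the MPI implementation is Open MPI (an assumption, it is not stated in the
thread), adding --report-bindings prints where each rank actually ends up,
for example

mpiexec --bind-to core --report-bindings -n N python parallel_example.py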


--Junchao Zhang


On Fri, Jan 12, 2024 at 8:52 AM Steffen Wilksen | Universitaet Bremen <
swilksen at itp.uni-bremen.de> wrote:

> Thank you for your feedback.
> @Stefano: the use of my communicator was intentional, since I later intend
> to distribute M independent calculations over N processes, each process then
> only needing to do M/N calculations. Of course I don't expect any speedup in
> my example, since the number of calculations is constant and does not depend
> on N, but I would hope that the time each process takes does not increase
> too drastically with N.
> @Barry: I tried to do the STREAMS benchmark, these are my results:
>   np   Rate (MB/s)   Speedup
>    1   23467.9961    1
>    2   26852.0536    1.1442
>    3   29715.4762    1.26621
>    4   34132.2490    1.45442
>    5   34924.3020    1.48817
>    6   34315.5290    1.46223
>    7   33134.9545    1.41192
>    8   33234.9141    1.41618
>    9   32584.3349    1.38846
>   10   32582.3962    1.38838
>   11   32098.2903    1.36775
>   12   32064.8779    1.36632
>   13   31692.0541    1.35044
>   14   31274.2421    1.33263
>   15   31574.0196    1.34541
>   16   30906.7773    1.31698
>
> I also attached the resulting plot. It seems I get very poor MPI speedup
> (the red curve, right?), which even decreases if I use too many processes.
> I don't fully understand the reasons given in the discussion you linked,
> since this is all very new to me, but I take it that this is a problem with
> my computer which I can't easily fix, right?
>
>
> ----- Message from Barry Smith <bsmith at petsc.dev> ---------
>    Date: Thu, 11 Jan 2024 11:56:24 -0500
>    From: Barry Smith <bsmith at petsc.dev>
> Subject: Re: [petsc-users] Parallel processes run significantly slower
>      To: Steffen Wilksen | Universitaet Bremen <swilksen at itp.uni-bremen.de>
>      Cc: PETSc users list <petsc-users at mcs.anl.gov>
>
>
>    Take a look at the discussion in
> https://petsc.gitlab.io/-/petsc/-/jobs/5814862879/artifacts/public/html/manual/streams.html and
> I suggest you run the streams benchmark from the branch barry/2023-09-15/fix-log-pcmpi
> on your machine to get a baseline for what kind of speedup you can expect.
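>
>    (Assuming a configured PETSc source tree with PETSC_DIR and PETSC_ARCH
> set, the benchmark can be run from the top-level PETSc directory with,
> roughly,
>
> make streams NPMAX=16
>
> which runs it on 1 up to NPMAX MPI processes and prints the kind of
> rate/speedup table reported above.)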
>
>     Then let us know your thoughts.
>
>    Barry
>
>
>
> On Jan 11, 2024, at 11:37 AM, Stefano Zampini <stefano.zampini at gmail.com>
> wrote:
>
> You are creating the matrix on the wrong communicator if you want it
> parallel. You are using PETSc.COMM_SELF
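>
> For illustration, a minimal petsc4py sketch of the difference (MATSIZE and
> the AIJ format here are placeholders, not taken from the attached script):
>
> from petsc4py import PETSc
>
> MATSIZE = 1000  # hypothetical global size
>
> # On COMM_SELF every rank builds and stores its own full copy of the matrix.
> A_self = PETSc.Mat().createAIJ([MATSIZE, MATSIZE], comm=PETSc.COMM_SELF)
> A_self.setUp()
>
> # On COMM_WORLD the rows are split across the ranks, so a mat-vec product
> # becomes a genuinely parallel (and communicating) operation.
> A_world = PETSc.Mat().createAIJ([MATSIZE, MATSIZE], comm=PETSc.COMM_WORLD)
> A_world.setUp()
> rstart, rend = A_world.getOwnershipRange()  # this rank's block of rows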
>
> On Thu, Jan 11, 2024, 19:28 Steffen Wilksen | Universitaet Bremen <
> swilksen at itp.uni-bremen.de> wrote:
>
>>
>> Hi all,
>>
>> I'm trying to do repeated matrix-vector multiplication of large sparse
>> matrices in Python using petsc4py. Even the most simple method of
>> parallelization, dividing up the calculation to run on multiple processes
>> independently, does not seem to give a significant speedup for large
>> matrices. I constructed a minimal working example, which I run using
>> mpiexec -n N python parallel_example.py, where N is the number of
>> processes. Instead of taking approximately the same time irrespective of
>> the number of processes used, the calculation is much slower when starting
>> more MPI processes. This translates to little to no speedup when splitting
>> up a fixed number of calculations over N processes. As an example, running
>> with N=1 takes 9 s, while running with N=4 takes 34 s. When running with
>> smaller matrices, the problem is not as severe (only slower by a factor of
>> 1.5 when setting MATSIZE=1e+5 instead of MATSIZE=1e+6). I get the same
>> problems when just starting the script four times manually without using
>> MPI. I attached both the script and the log file for running the script
>> with N=4. Any help would be greatly appreciated.
>>
>> Calculations are done on my laptop, running Arch Linux (kernel 6.6.8) and
>> PETSc version 3.20.2.
>>
>> Kind regards
>> Steffen
>>
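>> The attached parallel_example.py is not reproduced in the archive. As a
>> rough, hypothetical sketch of this kind of per-process repeated
>> matrix-vector product in petsc4py (matrix contents and sizes here are
>> placeholders, not the actual script):
>>
>> from petsc4py import PETSc
>>
>> MATSIZE = 10**5   # placeholder; the mail mentions 1e+5 and 1e+6
>> NSTEPS = 100      # placeholder number of repeated products
>>
>> # Each process works independently on its own matrix (PETSc.COMM_SELF),
>> # matching the "independent calculations" setup described above.
>> A = PETSc.Mat().createAIJ([MATSIZE, MATSIZE], nnz=3, comm=PETSc.COMM_SELF)
>> rstart, rend = A.getOwnershipRange()
>> for i in range(rstart, rend):      # fill a simple tridiagonal test matrix
>>     A[i, i] = 2.0
>>     if i > 0:
>>         A[i, i - 1] = -1.0
>>     if i < MATSIZE - 1:
>>         A[i, i + 1] = -1.0
>> A.assemble()
>>
>> x, y = A.createVecs()
>> x.set(1.0)
>> for _ in range(NSTEPS):
>>     A.mult(x, y)                   # y = A*x, the repeated operation
>>     y.copy(x)                      # feed the result back as the next input
>>
>> A loop like this is dominated by memory bandwidth rather than arithmetic,
>> which is why the STREAMS numbers further up in the thread are the relevant
>> baseline for the achievable speedup.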
>
>
>
> ----- End message from Barry Smith <bsmith at petsc.dev> -----
>
>
>