[petsc-users] GAMG Parallel Performance

Matthew Knepley knepley at gmail.com
Fri Nov 16 05:51:29 CST 2018


On Fri, Nov 16, 2018 at 5:35 AM Karin&NiKo via petsc-users <
petsc-users at mcs.anl.gov> wrote:

> Dear PETSc team,
>
> I have run the same test on the same numbers of processes as before (1000,
> 1500 and 2000), but spread over more nodes. The results are much
> better!
> If I focus on the KSPSolve event, I have the following timings:
> 1000 => 1.2681e+02
> 1500 => 8.7030e+01
> 2000 => 7.8904e+01
> The parallel efficiency between 1000 and 1500 processes is around 96%, but it
> decreases drastically when using 2000 processes. I think my problem is too
> small and the communication costs begin to dominate.
>
> I have an extra question: in the profiling section, what exactly is
> measured by "Time (sec):"? Is it the time between
> PetscInitialize and PetscFinalize?
>

Yep. Also, your communication could get more expensive at 2000 processes if
the job spans an additional cabinet or something similar.
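
As a rough illustration (a minimal sketch, not code from this thread): everything
between the two calls below falls inside the window that -log_view reports as
"Time (sec):".

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  /* ... assembly, KSPSolve, etc.: all of this is inside the measured window ... */
  ierr = PetscFinalize();   /* run the executable with -log_view to print the summary */
  return ierr;
}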

  Thanks,

     Matt
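
P.S. As an illustrative strong-scaling check on the KSPSolve timings quoted above
(my arithmetic, not figures from the thread), taking the 1000-process run as the
baseline:

  E(p)    = T(1000) * 1000 / (T(p) * p)
  E(1500) = 126.81 * 1000 / (87.03  * 1500) ≈ 0.97
  E(2000) = 126.81 * 1000 / (78.904 * 2000) ≈ 0.80

so the loss between 1500 and 2000 processes is indeed much larger than between
1000 and 1500.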


> Thanks again for your help,
> Nicolas
>
>
> On Fri, Nov 16, 2018 at 00:24, Karin&NiKo <niko.karin at gmail.com> wrote:
>
>> Ok. I will do that soon and I will let you know.
>> Thanks again,
>> Nicolas
>>
>> On Thu, Nov 15, 2018 at 20:50, Smith, Barry F. <bsmith at mcs.anl.gov>
>> wrote:
>>
>>>
>>>
>>> > On Nov 15, 2018, at 1:02 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>> >
>>> > There is a lot of load imbalance in VecMAXPY also. The partitioning
>>> could be bad, and if not, it's the machine.
>>>
>>>
>>> >
>>> > On Thu, Nov 15, 2018 at 1:56 PM Smith, Barry F. via petsc-users <
>>> petsc-users at mcs.anl.gov> wrote:
>>> >
>>> >     Something is odd about your configuration. Just consider the time
>>> for VecMAXPY, which is an embarrassingly parallel operation. On 1000 MPI
>>> processes it produces
>>> >
>>> >                                                 Time                                             flop rate
>>> >  VecMAXPY             575 1.0 8.4132e-01 1.5 1.36e+09 1.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,600,021
>>> >
>>> > on 1500 processes it produces
>>> >
>>> >  VecMAXPY             583 1.0 1.0786e+00 3.4 9.38e+08 1.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,289,187
>>> >
>>> > That is, it actually takes longer (the time goes from 0.84 seconds to
>>> 1.08 seconds and the flop rate drops from 1,600,021 to 1,289,187). You would
>>> never expect this kind of behavior.
>>> >
>>> > And on 2000 processes it produces
>>> >
>>> > VecMAXPY             583 1.0 7.1103e-01 2.7 7.03e+08 1.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 1,955,563
>>> >
>>> > So it speeds up again, but not by very much. This is very mysterious
>>> and not what you would expect.
>>> >
>>> >    I'm inclined to believe something is out of whack on your computer.
>>> Are you sure all nodes of the machine are equivalent? Same processors,
>>> same clock speeds? What happens if you run the 1000-process case several
>>> times? Do you get very similar numbers for VecMAXPY()? You should, but I am
>>> guessing you may not.
>>> >
>>> >     Barry
>>> >
>>> >   Note that this performance issue doesn't really have anything to do
>>> with the preconditioner you are using.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > > On Nov 15, 2018, at 10:50 AM, Karin&NiKo via petsc-users <
>>> petsc-users at mcs.anl.gov> wrote:
>>> > >
>>> > > Dear PETSc team,
>>> > >
>>> > > I am solving a linear transient dynamics problem discretized with
>>> finite elements. To do that, I am using FGMRES with GAMG as a
>>> preconditioner, and I consider 10 time steps here.
>>> > > The problem has around 118e6 dof and I am running on 1000, 1500
>>> and 2000 procs, so I have something like 100e3, 78e3 and 50e3 dof/proc.
>>> > > I notice that the performance deteriorates when I increase the
>>> number of processes.
>>> > > You can find attached the log_view output of the execution and the
>>> detailed definition of the KSP.
>>> > >
>>> > > Is the problem too small to run on that number of processes or is
>>> there something wrong with my use of GAMG?
>>> > >
>>> > > I thank you in advance for your help,
>>> > > Nicolas
>>> > >
>>> <FGMRES_GAMG_1000procs.txt><FGMRES_GAMG_2000procs.txt><FGMRES_GAMG_1500procs.txt>
>>> >
>>>
>>>
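
For anyone wanting to try Barry's suggestion of re-timing VecMAXPY in isolation,
here is a minimal standalone sketch (illustrative only; the local vector size,
number of vectors, and repeat count are placeholders, not values from this thread):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            y, x[4];
  PetscScalar    alpha[4] = {1.0, 2.0, 3.0, 4.0};
  PetscInt       i, it, n = 100000;   /* local size per process (placeholder) */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = PetscOptionsGetInt(NULL, NULL, "-n", &n, NULL);CHKERRQ(ierr);

  ierr = VecCreateMPI(PETSC_COMM_WORLD, n, PETSC_DETERMINE, &y);CHKERRQ(ierr);
  ierr = VecSet(y, 1.0);CHKERRQ(ierr);
  for (i = 0; i < 4; i++) {
    ierr = VecDuplicate(y, &x[i]);CHKERRQ(ierr);
    ierr = VecSet(x[i], (PetscScalar)(i + 1));CHKERRQ(ierr);
  }

  /* Repeat the operation so -log_view accumulates a meaningful VecMAXPY time */
  for (it = 0; it < 500; it++) {
    ierr = VecMAXPY(y, 4, alpha, x);CHKERRQ(ierr);
  }

  for (i = 0; i < 4; i++) {ierr = VecDestroy(&x[i]);CHKERRQ(ierr);}
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

Run it several times on the same node allocation, e.g. with
mpiexec -n 1000 ./vecmaxpy_test -n 100000 -log_view, and compare the VecMAXPY
line across runs; large run-to-run variation would point at the machine rather
than at the preconditioner.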

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/