[petsc-users] Scaling problem when cores > 600

Mark Adams mfadams at lbl.gov
Sat Apr 21 14:19:36 CDT 2018


looks fine to me.

On Sat, Apr 21, 2018 at 11:34 AM, TAY wee-beng <zonexo at gmail.com> wrote:

> Hi,
>
> I have found some time to work on this scaling problem again. I am now
> using:
>
> mpirun ./a.out -log_view -poisson_pc_type gamg
> -poisson_pc_gamg_agg_nsmooths 1
>
> I have attached the log_view output for 288, 600, 960, 1440 procs for
> comparison.
>
> Please give some comments.
>
>
> Thank you very much
>
> Yours sincerely,
>
> ================================================
> TAY Wee-Beng 郑伟明 (Zheng Weiming)
> Personal research webpage: http://tayweebeng.wixsite.com/website
> Youtube research showcase: https://www.youtube.com/channel/UC72ZHtvQNMpNs2uRTSToiLA
> linkedin: www.linkedin.com/in/tay-weebeng
> ================================================
>
> On 7/3/2018 11:58 PM, Smith, Barry F. wrote:
>
>>     What are you using for the "Poisson log" case?
>>
>>     If it is a Poisson problem then almost for sure you should be using
>> Hypre BoomerAMG?.
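>>
>>     For reference, a minimal sketch of what selecting BoomerAMG from the
>> command line might look like, assuming PETSc was configured with hypre and
>> that -poisson_ is the option prefix of your Poisson solve:
>>
>>     mpirun ./a.out -log_view -poisson_pc_type hypre -poisson_pc_hypre_type boomeramg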
>>
>>     It sounds like your matrix does not change. You will need to discuss
>> the scaling with the hypre people.
>>
>>     Barry
>>
>>
>> On Mar 7, 2018, at 5:38 AM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>
>>>
>>> On 7/3/2018 6:22 AM, Smith, Barry F. wrote:
>>>
>>>>     The speed up for "Poisson log" is 1.6425364214878704 =
>>>> 5.0848e+02/3.0957e+02
>>>>
>>>>      This is lower than I would expect for Hypre BoomerAMG.
>>>>
>>>>      Are you doing multiple solves with the same matrix with hypre or
>>>> is each solve a new matrix? If each solve is a new matrix then you may be
>>>> getting expected behavior since the multigrid AMG construction process does
>>>> not scale as well as the application of AMG once it is constructed.
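>>>>
>>>>      If the matrix really is the same for every solve and the Poisson
>>>> solve goes through a PETSc KSP, one thing worth checking (a sketch, not
>>>> necessarily needed in your setup; "ksp" stands for your KSP object) is
>>>> that the preconditioner is not being rebuilt at every time step:
>>>>
>>>>          /* keep the already-built preconditioner across repeated solves */
>>>>          KSPSetReusePreconditioner(ksp, PETSC_TRUE);
>>>>          /* ierr/CHKERRQ error checking omitted for brevity */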
>>>>
>>>>      I am forwarding to the hypre team since this is their expertise
>>>> not ours.
>>>>
>>>>     Barry
>>>>
>>> Hi,
>>>
>>> The LHS of the eqn does not change; only the RHS changes at each time
>>> step. So is this behavior to be expected?
>>>
>>> So maybe I should change to BoomerAMG and compare?
>>>
>>> Will PETSc GAMG give better performance?
>>>
>>> Also, I must add that I only partition in the x and y directions. Will
>>> this be a factor?
>>>
>>> Thanks.
>>>
>>> On Mar 5, 2018, at 11:19 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>
>>>>>
>>>>> On 5/3/2018 11:43 AM, Smith, Barry F. wrote:
>>>>>
>>>>>> 360 process
>>>>>>
>>>>>> KSPSolve              99 1.0 2.6403e+02 1.0 6.67e+10 1.1 2.7e+05
>>>>>> 9.9e+05 5.1e+02 15100 17 42 19  15100 17 42 19 87401
>>>>>>
>>>>>> 1920 processes
>>>>>>
>>>>>> KSPSolve              99 1.0 2.3184e+01 1.0 1.32e+10 1.2 1.5e+06
>>>>>> 4.3e+05 5.1e+02  4100 17 42 19   4100 17 42 19 967717
>>>>>>
>>>>>>
>>>>>> The ratio of the number of processes is 5.33, while the ratio of the
>>>>>> times for KSPSolve is 11.388, so the time for the solve is scaling very
>>>>>> well (extremely well, actually). The problem is due to "other" time
>>>>>> that is not in KSPSolve. Note that the percentage of the total runtime
>>>>>> spent in KSPSolve went from 15 percent to 4 percent. This means
>>>>>> something outside of KSPSolve is scaling very poorly. You will need to
>>>>>> profile the rest of the code to determine where the time is being
>>>>>> spent; PetscLogEventRegister() and PetscLogEventBegin()/PetscLogEventEnd()
>>>>>> will be needed in your code. Already with 360 processes the linear
>>>>>> solver is taking only 15 percent of the time.
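>>>>>>
>>>>>>    For reference, a minimal C sketch of registering and timing a custom
>>>>>> event that will then appear as its own row in the -log_view table (the
>>>>>> event name and the section being timed are only illustrative):
>>>>>>
>>>>>>        PetscLogEvent USER_EVENT;
>>>>>>        /* 0 = no associated object class; error checking omitted for brevity */
>>>>>>        PetscLogEventRegister("MySection", 0, &USER_EVENT);
>>>>>>        PetscLogEventBegin(USER_EVENT, 0, 0, 0, 0);
>>>>>>        /* ... the part of the code to be timed separately ... */
>>>>>>        PetscLogEventEnd(USER_EVENT, 0, 0, 0, 0);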
>>>>>>
>>>>>>    Barry
>>>>>>
>>>>> Hi,
>>>>>
>>>>> I have attached the new logging results with the HYPRE Poisson eqn
>>>>> solver. However, due to some problems, I am now using Intel 2018, which
>>>>> should be quite similar to 2016 in terms of runtime. Running with 360
>>>>> processes does not work this time, and I'm not sure why.
>>>>>
>>>>>> On Mar 4, 2018, at 9:23 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 1/3/2018 12:14 PM, Smith, Barry F. wrote:
>>>>>>>
>>>>>>>> On Feb 28, 2018, at 8:01 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 1/3/2018 12:10 AM, Matthew Knepley wrote:
>>>>>>>>>
>>>>>>>>>> On Wed, Feb 28, 2018 at 10:45 AM, TAY wee-beng <zonexo at gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I have a CFD code which uses PETSc and HYPRE. I found that for a
>>>>>>>>>> certain case with a grid size of 192,570,048, I encounter a scaling
>>>>>>>>>> problem when my cores > 600. At 600 cores, the code took 10 min for 100
>>>>>>>>>> time steps. At 960, 1440 and 2880 cores, it still takes around 10 min. At
>>>>>>>>>> 360 cores, it took 15 min.
>>>>>>>>>>
>>>>>>>>>> So how can I find the bottleneck? Any recommended steps?
>>>>>>>>>>
>>>>>>>>>> For any performance question, we need to see the output of
>>>>>>>>>> -log_view for all test cases.
>>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> To be more specific, I use PETSc KSPBCGS and HYPRE geometric
>>>>>>>>> multigrid (entirely based on HYPRE, no PETSc) for the momentum and Poisson
>>>>>>>>> eqns in my code.
>>>>>>>>>
>>>>>>>>> So can -log_view be used in this case to give meaningful results,
>>>>>>>>> since part of the code uses HYPRE?
>>>>>>>>>
>>>>>>>>    Yes, just send the logs.
>>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have attached the logs, with the number indicating the no. of
>>>>>>> cores used. Some of the new results are different from the previous runs,
>>>>>>> although I'm using the same cluster.
>>>>>>>
>>>>>>> Thanks for the help.
>>>>>>>
>>>>>>>>> I also programmed another subroutine in the past which uses PETSc to
>>>>>>>>> solve the Poisson eqn. It uses either HYPRE's boomeramg, KSPBCGS or
>>>>>>>>> KSPGMRES.
>>>>>>>>>
>>>>>>>>> If I use boomeramg, can log_view be used in this case?
>>>>>>>>>
>>>>>>>>> Or do I have to use KSPBCGS or KSPGMRES, which are directly from
>>>>>>>>> PETSc? However, I ran KSPGMRES yesterday with the Poisson eqn and the
>>>>>>>>> solution didn't converge.
>>>>>>>>>
>>>>>>>>> Thanks.
>>>>>>>>>
>>>>>>>>>>   I must also mention that I partition my grid only in the x and
>>>>>>>>>> y directions. There is no partitioning in the z direction due to limited
>>>>>>>>>> code development. I wonder if there is a strong effect in this case.
>>>>>>>>>>
>>>>>>>>>> Maybe. Usually what happens is you fill up memory with a z-column
>>>>>>>>>> and cannot scale further.
>>>>>>>>>>
>>>>>>>>>>    Thanks,
>>>>>>>>>>
>>>>>>>>>>       Matt
>>>>>>>>>>   --
>>>>>>>>>> Thank you very much
>>>>>>>>>>
>>>>>>>>>> Yours sincerely,
>>>>>>>>>>
>>>>>>>>>> ================================================
>>>>>>>>>> TAY Wee-Beng 郑伟明 (Zheng Weiming)
>>>>>>>>>> Personal research webpage: http://tayweebeng.wixsite.com/website
>>>>>>>>>> Youtube research showcase: https://www.youtube.com/channel/UC72ZHtvQNMpNs2uRTSToiLA
>>>>>>>>>> linkedin: www.linkedin.com/in/tay-weebeng
>>>>>>>>>> ================================================
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>>>>> experiments lead.
>>>>>>>>>> -- Norbert Wiener
>>>>>>>>>>
>>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>>>
>>>>>>>>> <log960.txt><log600.txt><log360.txt><log1920.txt>
>>>>>>>
>>>>>> <log1920_2.txt><log600_2.txt><log960_2.txt><log1440_2.txt>
>>>>>
>>>>
>