[petsc-users] [petsc-maint] Correct use of PCFactorSetMatOrderingType

Edoardo alinovi edoardo.alinovi at gmail.com
Thu Nov 8 11:03:11 CST 2018


Yes, it is as you say. This is mostly due to the time KSP takes to solve
the pressure equation. However, I have worked a lot on the problem and
found that the default configuration is far from optimal, at least in
this case.

In fact, my CPU time has decreased by more than a factor of two with respect
to the default configuration. I list the changes here; maybe they can be
useful to other users in the future:

-pc_hypre_boomeramg_no_CF
-pc_hypre_boomeramg_agg_nl 1
-pc_hypre_boomeramg_coarsen_type HMIS
-pc_hypre_boomeramg_interp_type FF1

The last two seem to be essential for improving performance.
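
A minimal sketch of one way to apply the options above without retyping them on every run. It assumes the pressure KSP has no options prefix, that the code calls KSPSetFromOptions(), and that BoomerAMG is selected with -pc_type hypre -pc_hypre_type boomeramg. PETSc reads the PETSC_OPTIONS environment variable at startup; the executable name ./my_solver is hypothetical.

```shell
# Collect the BoomerAMG settings from this message in PETSC_OPTIONS,
# which PETSc reads during PetscInitialize().
export PETSC_OPTIONS="-pc_type hypre -pc_hypre_type boomeramg \
-pc_hypre_boomeramg_no_CF \
-pc_hypre_boomeramg_agg_nl 1 \
-pc_hypre_boomeramg_coarsen_type HMIS \
-pc_hypre_boomeramg_interp_type FF1"

# Show the options that will be picked up.
echo "$PETSC_OPTIONS"

# Then launch as usual (hypothetical executable, 16 ranks as in this thread):
# mpiexec -n 16 ./my_solver -ksp_monitor -log_view
```

Note that options without a prefix apply to every KSP/PC that shares the default prefix, so if the momentum solver also uses it, a separate options prefix for the pressure solver may be needed.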

Do you know of other configurations that are worth testing? Also, I would
like to know whether a list of command-line options for hypre is available
somewhere. Looking at hypre's documentation there are a lot of options, but
I do not know exactly which ones are available in PETSc or what they are called.
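
One possible way to find this out, sketched under the assumption that the application calls KSPSetFromOptions() (or PCSetFromOptions()) on the pressure solver: PETSc prints the options it queries when run with -help, so selecting hypre and filtering the output should list the BoomerAMG options exposed by the installed PETSc build. The helper name list_boomeramg_options and the executable ./my_solver are hypothetical.

```shell
# Print the BoomerAMG-related lines from a PETSc program's -help output.
# Usage: list_boomeramg_options <command> [args...]
list_boomeramg_options() {
  # Selecting the hypre PC makes its options appear in the -help listing.
  "$@" -pc_type hypre -pc_hypre_type boomeramg -help 2>/dev/null | grep -i boomeramg
}

# Example (hypothetical executable):
# list_boomeramg_options ./my_solver
```

Depending on the PETSc version, the program may keep running after printing the help text, so a small test case is advisable.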

Thank you very much for the kind support, and sorry for the long email!

On Thu, Nov 8, 2018, 17:32 Mark Adams <mfadams at lbl.gov> wrote:

> To repeat:
>
> You seem to be saying that OpenFOAM solves the problem in 10 seconds and
> PETSc solves it in 14 seconds. Is that correct?
>
>
>
> On Thu, Nov 8, 2018 at 3:42 AM Edoardo alinovi via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
>> Hello Mark,
>>
>> Yes, there are 5 KSP calls within a time step (3 for the solution of the
>> momentum equation + 2 for the solution of the pressure). This is the
>> classical non-iterative PISO by Issa (the exact sequence of operations is:
>> solve momentum implicitly, solve pressure correction, momentum explicitly,
>> pressure correction). The pressure-correction equation, which is
>> similar to a Poisson equation for incompressible flows, is the one that
>> determines the overall performance in my code, as in others.
>> Usually, when the pressure is solved for the second time, the
>> solution is faster since there is a better initial guess and, as in my case,
>> the preconditioner is not recomputed.
>>
>> Do you have any advice on a non-default multigrid configuration for this
>> scenario, in order to improve performance?
>>
>> I do not know whether this has a drastic impact on performance, but I am
>> running on an E4 workstation with 16 Intel Xeon processors (2.3 GHz, 12 MB
>> cache) and 128 GB of RAM.
>>
>> Thank you very much for your helpful comments,
>>
>>
>> Edoardo
>> ------
>>
>> Edoardo Alinovi, Ph.D.
>>
>> DICCA, Scuola Politecnica
>> Universita' di Genova
>> 1, via Montallegro
>> 16145 Genova, Italy
>>
>> email: edoardo.alinovi at dicca.unige.it
>> Tel: +39 010 353 2540
>>
>>
>>
>>
>> On Wed, Nov 7, 2018 at 17:59 Mark Adams <mfadams at lbl.gov> wrote:
>>
>>> please respond to petsc-users.
>>>
>>> You are doing 5 solves here in 14 seconds. You seem to be saying that
>>> the two pressure solves are taking all of this time. I don't know why the
>>> two solves are different.
>>>
>>> You seem to be saying that OpenFOAM solves the problem in 10 seconds and
>>> PETSc solves it in 14 seconds. Is that correct? Hypre seems to be running
>>> fine.
>>>
>>>
>>>
>>> On Wed, Nov 7, 2018 at 11:24 AM Edoardo alinovi <
>>> edoardo.alinovi at gmail.com> wrote:
>>>
>>>> Thanks a lot, Mark, for your kind reply. The solver is mine and I use
>>>> PETSc for the solution of momentum and pressure. The former is solved very
>>>> quickly by a standard bcgs + bjacobi, but the pressure is the source of all
>>>> evils and, unfortunately, I am pretty sure that almost all the time within
>>>> the time step is spent by KSP solving the pressure (see log attached). I
>>>> have also verified this by putting a couple of MPI_Wtime calls around the
>>>> KSPSolve call. The pressure is solved twice (1 prediction + 1 correction):
>>>> the prediction takes around 11 s and the correction around 4 s (here I avoid
>>>> recomputing the preconditioner), while all the rest of the code (flux
>>>> assembly + momentum solution + others) takes around 1 s. OpenFOAM does the
>>>> same procedure with the same tolerance in 10 s using its GAMG version (50
>>>> iterations to converge). Solving the pressure with hypre requires 12
>>>> iterations. GAMG performs similarly to hypre in terms of speed, but needs
>>>> 50 iterations to converge. In your opinion, am I missing something in the setup?
>>>>
>>>> thanks a lot,
>>>>
>>>> Edo
>>>>
>>>>
>>>> On Wed, Nov 7, 2018 at 16:50 Mark Adams <mfadams at lbl.gov> wrote:
>>>>
>>>>> You can try -pc_type gamg, but hypre is a pretty good solver for the
>>>>> Laplacian. If hypre is just a little faster than LU on a 3D problem (that
>>>>> takes 10 seconds to solve) then AMG is not doing well. I would expect that
>>>>> AMG is taking a lot of iterations (eg, >> 10). You can check that with
>>>>> -ksp_monitor.
>>>>>
>>>>> The PISO algorithm is a multistage algorithm with a pressure
>>>>> correction in it. It also has a solve for the velocity, from what I can
>>>>> tell. Are you building PISO yourself and using PETSc just for the pressure
>>>>> correction? Are you sure the time is spent in this solver? You can use
>>>>> -log_view to see performance numbers and look for KSPSolve to see how much
>>>>> time is spent in the PETSc solver.
>>>>>
>>>>> Mark
>>>>>
>>>>>
>>>>> On Wed, Nov 7, 2018 at 10:26 AM Zhang, Hong via petsc-maint <
>>>>> petsc-maint at mcs.anl.gov> wrote:
>>>>>
>>>>>> Edoardo:
>>>>>> Forwarding your request to petsc-maint, where you can get fast and
>>>>>> expert advice. I do not have a suggestion for your application, but
>>>>>> someone on our team will likely make one.
>>>>>> Hong
>>>>>>
>>>>>> Hello Hong,
>>>>>>>
>>>>>>> Well, using -sub_pc_type lu it is super slow. I am
>>>>>>> desperately trying to improve the performance of my code (CFD, finite
>>>>>>> volume, PISO algorithm); in particular, I have a strong bottleneck in the
>>>>>>> solution of the pressure-correction equation, which takes almost 90% of the
>>>>>>> computational time. Using multigrid as a preconditioner (hypre with default
>>>>>>> options) is slightly better, but compared with the multigrid used in
>>>>>>> OpenFOAM, my code loses 10 s/iteration, which is a huge amount of time. Now,
>>>>>>> since all the time is spent in KSPSolve, I feel a bit powerless.
>>>>>>> Do you have any helpful advice?
>>>>>>>
>>>>>>> Thank you very much!
>>>>>>>
>>>>>>> On Tue, Nov 6, 2018 at 17:15 Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>>>>>>>
>>>>>>>> Edoardo:
>>>>>>>> Interesting. I thought it would not affect performance much. What
>>>>>>>> happens if you use '-sub_pc_type lu'?
>>>>>>>> Hong
>>>>>>>>
>>>>>>>> Dear Hong and Matt,
>>>>>>>>>
>>>>>>>>> thank you for your kind reply. I have just tested your
>>>>>>>>> suggestions and applied "-sub_pc_type ilu -sub_pc_factor_mat_ordering_type
>>>>>>>>> nd/rcm" and, in both cases, I found a deterioration in
>>>>>>>>> performance with respect to doing nothing (thus just using the default
>>>>>>>>> PCBJACOBI). Is this normal? However, I guess this is very problem dependent.
>>>>>>>>>
>>>>>>>>> On Tue, Nov 6, 2018 at 16:04 Zhang, Hong <hzhang at mcs.anl.gov> wrote:
>>>>>>>>>
>>>>>>>>>> Edoardo:
>>>>>>>>>> You can test runtime option '-sub_pc_factor_mat_ordering_type'
>>>>>>>>>> and use '-log_view' to get performance on different orderings,
>>>>>>>>>> e.g., petsc/src/ksp/ksp/examples/tutorials/ex2.c:
>>>>>>>>>> mpiexec -n 2 ./ex2 -ksp_view -sub_pc_type ilu
>>>>>>>>>> -sub_pc_factor_mat_ordering_type nd
>>>>>>>>>>
>>>>>>>>>> I do not think the ordering inside a block for ilu would affect
>>>>>>>>>> performance much. Let us know what you get.
>>>>>>>>>> Hong
>>>>>>>>>>
>>>>>>>>>> Dear users,
>>>>>>>>>>>
>>>>>>>>>>> I have a question about the correct use of the option
>>>>>>>>>>> "PCFactorSetMatOrderingType" in PETSc.
>>>>>>>>>>>
>>>>>>>>>>> I am solving a problem with 2.5M unknowns distributed across
>>>>>>>>>>> 16 processors, and I am using the block Jacobi preconditioner and the
>>>>>>>>>>> MPIAIJ matrix format. I cannot figure out whether the above option can be
>>>>>>>>>>> useful in decreasing the computational time. Any suggestions or tips?
>>>>>>>>>>>
>>>>>>>>>>> Thank you very much for the kind help
>>>>>>>>>>>