[petsc-users] mg pre-conditioner default setup from PETSc-3.4 to PETSc-3.7

Federico Golfrè Andreasi federico.golfre at gmail.com
Wed Sep 13 10:56:11 CDT 2017


Hi Barry,

I understand and fully agree that default behavior changes from one release
to the next as the defaults are tuned.

In my case the difference in the solution is negligible, but the runtime
increases by up to 70% (with the same number of KSP iterations).
So I was wondering whether there are flags, for example related to memory
preallocation or reuse of intermediate solutions, that were previously
enabled by default.
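
One way to look for such changed defaults is to diff the two -ksp_view
dumps mechanically. The following is only a minimal sketch: the two
excerpts are copied from the outputs quoted below, and the helper names
normalize and changed_lines are hypothetical, not part of PETSc.

```python
import difflib


def normalize(view_text):
    """Strip mail quoting and indentation so only the setting text remains."""
    lines = []
    for raw in view_text.splitlines():
        line = raw.lstrip("> ").strip()
        if line:
            lines.append(line)
    return lines


def changed_lines(old_view, new_view):
    """Return (removed, added) lines between two -ksp_view dumps."""
    diff = list(difflib.ndiff(normalize(old_view), normalize(new_view)))
    removed = [d[2:] for d in diff if d.startswith("- ")]
    added = [d[2:] for d in diff if d.startswith("+ ")]
    return removed, added


# Excerpt of the eigenvalue-estimation KSP and smoother lines (PETSc-3.4).
OLD_34 = """\
maximum iterations=10, initial guess is zero
tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
maximum iterations=1, initial guess is zero
"""

# The same lines as printed by PETSc-3.7.
NEW_37 = """\
maximum iterations=10, initial guess is zero
tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
left preconditioning
using PRECONDITIONED norm type for convergence test
maximum iterations=2, initial guess is zero
"""

removed, added = changed_lines(OLD_34, NEW_37)
for line in removed:
    print("3.4 only:", line)
for line in added:
    print("3.7 only:", line)
```

On these excerpts the diff isolates the relative tolerance, the norm type
and the smoother iteration count, which are the three settings I tried to
restore with the options listed below.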

Thank you,
Federico



On 13 September 2017 at 17:29, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
>    There will likely always be slight differences in convergence over that
> many releases. Lots of little defaults etc get changed over time as we
> learn from users and increase the robustness of the defaults.
>
>     So in your case do the differences matter?
>
> 1) What is the time to solution in both cases? Is it a few percent
> different, or is it now much slower?
>
> 2) What about the number of iterations? Almost identical (say 1 or 2
> different) or does it now take 30 iterations when it used to take 5?
>
>   Barry
>
> > On Sep 13, 2017, at 10:25 AM, Federico Golfrè Andreasi <federico.golfre at gmail.com> wrote:
> >
> > Dear PETSc users/developers,
> >
> > I recently switched from PETSc-3.4 to PETSc-3.7 and found that some
> > default settings of the "mg" (multigrid) preconditioner have changed.
> >
> > We were solving a linear system, passing the following options through
> > the command line:
> > -ksp_type      fgmres
> > -ksp_max_it    100000
> > -ksp_rtol      0.000001
> > -pc_type       mg
> > -ksp_view
> >
> > The output of the KSP view is as follows:
> >
> > KSP Object: 128 MPI processes
> >   type: fgmres
> >     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> >     GMRES: happy breakdown tolerance 1e-30
> >   maximum iterations=100000, initial guess is zero
> >   tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
> >   right preconditioning
> >   using UNPRECONDITIONED norm type for convergence test
> > PC Object: 128 MPI processes
> >   type: mg
> >     MG: type is MULTIPLICATIVE, levels=1 cycles=v
> >       Cycles per PCApply=1
> >       Not using Galerkin computed coarse grid matrices
> >   Coarse grid solver -- level -------------------------------
> >     KSP Object:    (mg_levels_0_)     128 MPI processes
> >       type: chebyshev
> >         Chebyshev: eigenvalue estimates:  min = 0.223549, max = 2.45903
> >         Chebyshev: estimated using:  [0 0.1; 0 1.1]
> >         KSP Object:        (mg_levels_0_est_)         128 MPI processes
> >           type: gmres
> >             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> >             GMRES: happy breakdown tolerance 1e-30
> >           maximum iterations=10, initial guess is zero
> >           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
> >           left preconditioning
> >           using NONE norm type for convergence test
> >         PC Object:        (mg_levels_0_)         128 MPI processes
> >           type: sor
> >             SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
> >           linear system matrix followed by preconditioner matrix:
> >           Matrix Object:           128 MPI processes
> >             type: mpiaij
> >             rows=279669, cols=279669
> >             total: nonzeros=6427943, allocated nonzeros=6427943
> >             total number of mallocs used during MatSetValues calls =0
> >               not using I-node (on process 0) routines
> >           Matrix Object:           128 MPI processes
> >             type: mpiaij
> >             rows=279669, cols=279669
> >             total: nonzeros=6427943, allocated nonzeros=6427943
> >             total number of mallocs used during MatSetValues calls =0
> >               not using I-node (on process 0) routines
> >       maximum iterations=1, initial guess is zero
> >       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
> >       left preconditioning
> >       using NONE norm type for convergence test
> >     PC Object:    (mg_levels_0_)     128 MPI processes
> >       type: sor
> >         SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1
> >       linear system matrix followed by preconditioner matrix:
> >       Matrix Object:       128 MPI processes
> >         type: mpiaij
> >         rows=279669, cols=279669
> >         total: nonzeros=6427943, allocated nonzeros=6427943
> >         total number of mallocs used during MatSetValues calls =0
> >           not using I-node (on process 0) routines
> >       Matrix Object:       128 MPI processes
> >         type: mpiaij
> >         rows=279669, cols=279669
> >         total: nonzeros=6427943, allocated nonzeros=6427943
> >         total number of mallocs used during MatSetValues calls =0
> >           not using I-node (on process 0) routines
> >   linear system matrix followed by preconditioner matrix:
> >   Matrix Object:   128 MPI processes
> >     type: mpiaij
> >     rows=279669, cols=279669
> >     total: nonzeros=6427943, allocated nonzeros=6427943
> >     total number of mallocs used during MatSetValues calls =0
> >       not using I-node (on process 0) routines
> >   Matrix Object:   128 MPI processes
> >     type: mpiaij
> >     rows=279669, cols=279669
> >     total: nonzeros=6427943, allocated nonzeros=6427943
> >     total number of mallocs used during MatSetValues calls =0
> >       not using I-node (on process 0) routines
> >
> > When I build the same program with PETSc-3.7 and run it with the same
> > options, the runtime increases and the convergence is slightly
> > different. The output of the KSP view is:
> >
> > KSP Object: 128 MPI processes
> >   type: fgmres
> >     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> >     GMRES: happy breakdown tolerance 1e-30
> >   maximum iterations=100000, initial guess is zero
> >   tolerances:  relative=1e-06, absolute=1e-50, divergence=10000.
> >   right preconditioning
> >   using UNPRECONDITIONED norm type for convergence test
> > PC Object: 128 MPI processes
> >   type: mg
> >     MG: type is MULTIPLICATIVE, levels=1 cycles=v
> >       Cycles per PCApply=1
> >       Not using Galerkin computed coarse grid matrices
> >   Coarse grid solver -- level -------------------------------
> >     KSP Object:    (mg_levels_0_)     128 MPI processes
> >       type: chebyshev
> >         Chebyshev: eigenvalue estimates:  min = 0.223549, max = 2.45903
> >         Chebyshev: eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1]
> >         KSP Object:        (mg_levels_0_esteig_)         128 MPI processes
> >           type: gmres
> >             GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
> >             GMRES: happy breakdown tolerance 1e-30
> >           maximum iterations=10, initial guess is zero
> >           tolerances:  relative=1e-12, absolute=1e-50, divergence=10000.
> >           left preconditioning
> >           using PRECONDITIONED norm type for convergence test
> >       maximum iterations=2, initial guess is zero
> >       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
> >       left preconditioning
> >       using NONE norm type for convergence test
> >     PC Object:    (mg_levels_0_)     128 MPI processes
> >       type: sor
> >         SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1.
> >       linear system matrix followed by preconditioner matrix:
> >       Mat Object:       128 MPI processes
> >         type: mpiaij
> >         rows=279669, cols=279669
> >         total: nonzeros=6427943, allocated nonzeros=6427943
> >         total number of mallocs used during MatSetValues calls =0
> >           not using I-node (on process 0) routines
> >       Mat Object:       128 MPI processes
> >         type: mpiaij
> >         rows=279669, cols=279669
> >         total: nonzeros=6427943, allocated nonzeros=6427943
> >         total number of mallocs used during MatSetValues calls =0
> >           not using I-node (on process 0) routines
> >   linear system matrix followed by preconditioner matrix:
> >   Mat Object:   128 MPI processes
> >     type: mpiaij
> >     rows=279669, cols=279669
> >     total: nonzeros=6427943, allocated nonzeros=6427943
> >     total number of mallocs used during MatSetValues calls =0
> >       not using I-node (on process 0) routines
> >   Mat Object:   128 MPI processes
> >     type: mpiaij
> >     rows=279669, cols=279669
> >     total: nonzeros=6427943, allocated nonzeros=6427943
> >     total number of mallocs used during MatSetValues calls =0
> >       not using I-node (on process 0) routines
> >
> > I was able to get a closer solution by adding the following options:
> > -mg_levels_0_esteig_ksp_norm_type   none
> > -mg_levels_0_esteig_ksp_rtol        1.0e-5
> > -mg_levels_ksp_max_it               1
> >
> > But I still cannot reach the runtime we were observing with PETSc-3.4.
> > Could you please advise me whether I should specify any other options?
> >
> > Thank you very much for your support,
> > Federico Golfre' Andreasi
> >
>
>
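
Both -ksp_view outputs above report the same Chebyshev bounds
(min = 0.223549, max = 2.45903), so in this run the interval targeted by
the smoother is identical in the two versions. As a rough check of how
those numbers relate to the printed transform [0 0.1; 0 1.1], the sketch
below assumes the rule min = a*emin_est + b*emax_est and
max = c*emin_est + d*emax_est. The function name chebyshev_bounds and the
backed-out value emax_est are illustrative only and do not appear in the
PETSc output.

```python
def chebyshev_bounds(emin_est, emax_est, a, b, c, d):
    """Map estimated extreme eigenvalues to the Chebyshev target interval.

    Assumed rule: min = a*emin_est + b*emax_est, max = c*emin_est + d*emax_est.
    """
    return a * emin_est + b * emax_est, c * emin_est + d * emax_est


# Values printed by -ksp_view (rounded to six significant digits).
REPORTED_MIN, REPORTED_MAX = 0.223549, 2.45903

# With c = 0 and d = 1.1 the reported max determines the estimate of the
# largest eigenvalue; this is a back-calculation, not a logged quantity.
emax_est = REPORTED_MAX / 1.1

# a = c = 0, so the smallest-eigenvalue estimate does not enter; 0.0 is a
# placeholder for it.
lo, hi = chebyshev_bounds(0.0, emax_est, 0.0, 0.1, 0.0, 1.1)

print(f"emax_est ~ {emax_est:.6f}")
print(f"bounds   ~ [{lo:.6f}, {hi:.6f}]")
print(f"reported   [{REPORTED_MIN}, {REPORTED_MAX}]")
```

The recomputed lower bound agrees with the reported one to within the
rounding of the printed values, which is consistent with the lower bound
being one tenth of the estimated largest eigenvalue.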