[petsc-users] GAMG speed

Michele Rosso mrosso at uci.edu
Thu Aug 1 13:35:11 CDT 2013


Barry,

I am using a finite difference Cartesian uniform grid and DMDA and so 
far it has not given me any problem.
I am using a ksp solver (not snes). In a previous thread, I was told an 
odd number of grid points was needed for the geometric multigrid, is 
that correct?
I tried to run my case with


-pc_type mg -da_refine 4



but it does not seem to use the -da_refine option:

mpiexec   -np 4 ./test  -pc_type mg -da_refine 4  -ksp_view -options_left


KSP Object: 4 MPI processes
   type: cg
   maximum iterations=10000
   tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
   left preconditioning
   using nonzero initial guess
   using UNPRECONDITIONED norm type for convergence test
PC Object: 4 MPI processes
   type: mg
     MG: type is MULTIPLICATIVE, levels=1 cycles=v
       Cycles per PCApply=1
       Not using Galerkin computed coarse grid matrices
   Coarse grid solver -- level -------------------------------
     KSP Object:    (mg_levels_0_)     4 MPI processes
       type: chebyshev
         Chebyshev: eigenvalue estimates:  min = 0.134543, max = 1.47998
         Chebyshev: estimated using:  [0 0.1; 0 1.1]
         KSP Object:        (mg_levels_0_est_)         4 MPI processes
           type: gmres
             GMRES: restart=30, using Classical (unmodified) 
Gram-Schmidt Orthogonalization with no iterative refinement
             GMRES: happy breakdown tolerance 1e-30
           maximum iterations=10, initial guess is zero
           tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
           left preconditioning
           using NONE norm type for convergence test
         PC Object:        (mg_levels_0_)         4 MPI processes
           type: sor
             SOR: type = local_symmetric, iterations = 1, local 
iterations = 1, omega = 1
           linear system matrix = precond matrix:
           Matrix Object:           4 MPI processes
             type: mpiaij
             rows=262144, cols=262144
             total: nonzeros=1835008, allocated nonzeros=1835008
             total number of mallocs used during MatSetValues calls =0
       maximum iterations=1, initial guess is zero
       tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
       left preconditioning
       using NONE norm type for convergence test
     PC Object:    (mg_levels_0_)     4 MPI processes
       type: sor
         SOR: type = local_symmetric, iterations = 1, local iterations = 
1, omega = 1
       linear system matrix = precond matrix:
       Matrix Object:       4 MPI processes
         type: mpiaij
         rows=262144, cols=262144
         total: nonzeros=1835008, allocated nonzeros=1835008
         total number of mallocs used during MatSetValues calls =0
   linear system matrix = precond matrix:
   Matrix Object:   4 MPI processes
     type: mpiaij
     rows=262144, cols=262144
     total: nonzeros=1835008, allocated nonzeros=1835008
     total number of mallocs used during MatSetValues calls =0
  Solution       =    1.53600013     sec
#PETSc Option Table entries:
-da_refine 4
-ksp_view
-options_left
-pc_type mg
#End of PETSc Option Table entries
There is one unused database option. It is:
Option left: name:-da_refine value: 4

Michele

On 08/01/2013 11:21 AM, Barry Smith wrote:
>     What kind of mesh are you using? Are you using DMDA? If you are using DMDA (and have written your code to use it "correctly") then it should be trivial to run with geometric multigrid and geometric multigrid should be a bit faster.
>
>     For example on src/snes/examples/tutorials/ex19.c   I run with ./ex19 -pc_type mg -da_refine 4 and it refines the original DMDA 4 times and uses geometric multigrid with 5 levels.
>
>
>     Barry
>
>
> On Aug 1, 2013, at 1:14 PM, Michele Rosso <mrosso at uci.edu> wrote:
>
>> Hi,
>>
>> I am successfully using PETSc (v3.4.2)  to solve a 3D Poisson's equation with CG + GAMG as I was suggested to do in a previous thread.
>> So far I am using GAMG with the default settings, i.e.
>>
>> -pc_type gamg -pc_gamg_agg_nsmooths 1
>>
>> The speed of the solution is satisfactory, but I would like to know if you have any suggestions to further speed it up, particularly
>> if there is any parameters worth looking into to achieve an even faster solution, for example number of levels and so on.
>> So far I am using Dirichlet's BCs for my test case, but I will soon have periodic conditions: in this case, does GAMG require particular settings?
>> Finally, I did not try geometric multigrid: do you think it is worth a shot?
>>
>> Here are my current settings:
>>
>> I run with
>>
>> -pc_type gamg -pc_gamg_agg_nsmooths 1 -ksp_view -options_left
>>
>> and the output is:
>>
>> KSP Object: 4 MPI processes
>>    type: cg
>>    maximum iterations=10000
>>    tolerances:  relative=1e-08, absolute=1e-50, divergence=10000
>>    left preconditioning
>>    using nonzero initial guess
>>    using UNPRECONDITIONED norm type for convergence test
>> PC Object: 4 MPI processes
>>    type: gamg
>>      MG: type is MULTIPLICATIVE, levels=3 cycles=v
>>        Cycles per PCApply=1
>>        Using Galerkin computed coarse grid matrices
>>    Coarse grid solver -- level -------------------------------
>>      KSP Object:    (mg_coarse_)     4 MPI processes
>>        type: preonly
>>        maximum iterations=1, initial guess is zero
>>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>        left preconditioning
>>        using NONE norm type for convergence test
>>      PC Object:    (mg_coarse_)     4 MPI processes
>>        type: bjacobi
>>          block Jacobi: number of blocks = 4
>>          Local solve info for each block is in the following KSP and PC objects:
>>        [0] number of local blocks = 1, first local block number = 0
>>                  [0] local block number 0
>> KSP Object:          (mg_coarse_sub_)         1 MPI processes
>>            type: preonly
>>            maximum iterations=1, initial guess is zero
>>                  tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>> KSP Object:        (mg_coarse_sub_)            left preconditioning
>>            using NONE norm type for convergence test
>>            PC Object:        (mg_coarse_sub_)       1 MPI processes
>>            type: preonly
>>           1 MPI processes
>>            type: lu
>>            maximum iterations=1, initial guess is zero
>>            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>            LU: out-of-place factorization
>>              left preconditioning
>>            using NONE norm type for convergence test
>>            PC Object:        (mg_coarse_sub_)         1 MPI processes
>>            type: lu
>>            tolerance for zero pivot 2.22045e-14
>>              using diagonal shift on blocks to prevent zero pivot
>>              matrix ordering: nd
>>              LU: out-of-place factorization
>>              tolerance for zero pivot 2.22045e-14
>>              using diagonal shift on blocks to prevent zero pivot
>>              matrix ordering: nd
>>              factor fill ratio given 5, needed 0
>>                Factored matrix follows:
>>              factor fill ratio given 5, needed 4.13207
>>                Factored matrix follows:
>>                    Matrix Object:              Matrix Object:                 1 MPI processes
>>                    type: seqaij
>>                      rows=395, cols=395
>>                      package used to perform factorization: petsc
>>                    total: nonzeros=132379, allocated nonzeros=132379
>>                    total number of mallocs used during MatSetValues calls =0
>>                          not using I-node routines
>>             1 MPI processes
>>                    type: seqaij
>>            linear system matrix = precond matrix:
>>                      rows=0, cols=0
>>                      package used to perform factorization: petsc
>>                    total: nonzeros=1, allocated nonzeros=1
>>                      total number of mallocs used during MatSetValues calls =0
>>                        not using I-node routines
>>                linear system matrix = precond matrix:
>>    Matrix Object:             1 MPI processes
>>              type: seqaij
>>            Matrix Object:KSP Object:           1 MPI processes
>>              type: seqaij
>>              rows=0, cols=0
>>              total: nonzeros=0, allocated nonzeros=0
>>              total number of mallocs used during MatSetValues calls =0
>>                  not using I-node routines
>>            rows=395, cols=395
>>              total: nonzeros=32037, allocated nonzeros=32037
>>              total number of mallocs used during MatSetValues calls =0
>>                not using I-node routines
>>            - - - - - - - - - - - - - - - - - -
>>            KSP Object:        (mg_coarse_sub_)         1 MPI processes
>>            type: preonly
>>            maximum iterations=1, initial guess is zero
>>            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>            left preconditioning
>>            using NONE norm type for convergence test
>>          PC Object:        (mg_coarse_sub_)         1 MPI processes
>>            type: lu
>>              LU: out-of-place factorization
>>              tolerance for zero pivot 2.22045e-14
>>              using diagonal shift on blocks to prevent zero pivot
>>              matrix ordering: nd
>>              factor fill ratio given 5, needed 0
>>                Factored matrix follows:
>>                  Matrix Object:                 1 MPI processes
>>                    type: seqaij
>>                    rows=0, cols=0
>>                    package used to perform factorization: petsc
>>                    total: nonzeros=1, allocated nonzeros=1
>>                    total number of mallocs used during MatSetValues calls =0
>>                      not using I-node routines
>>            linear system matrix = precond matrix:
>>            Matrix Object:           1 MPI processes
>>              type: seqaij
>>              rows=0, cols=0
>>              total: nonzeros=0, allocated nonzeros=0
>>              total number of mallocs used during MatSetValues calls =0
>>                not using I-node routines
>>    (mg_coarse_sub_)         1 MPI processes
>>            type: preonly
>>            maximum iterations=1, initial guess is zero
>>            tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>            left preconditioning
>>            using NONE norm type for convergence test
>>          PC Object:        (mg_coarse_sub_)         1 MPI processes
>>            type: lu
>>              LU: out-of-place factorization
>>              tolerance for zero pivot 2.22045e-14
>>              using diagonal shift on blocks to prevent zero pivot
>>              matrix ordering: nd
>>              factor fill ratio given 5, needed 0
>>                Factored matrix follows:
>>                  Matrix Object:                 1 MPI processes
>>                    type: seqaij
>>                    rows=0, cols=0
>>                    package used to perform factorization: petsc
>>                    total: nonzeros=1, allocated nonzeros=1
>>                    total number of mallocs used during MatSetValues calls =0
>>                      not using I-node routines
>>            linear system matrix = precond matrix:
>>            Matrix Object:           1 MPI processes
>>              type: seqaij
>>              rows=0, cols=0
>>              total: nonzeros=0, allocated nonzeros=0
>>              total number of mallocs used during MatSetValues calls =0
>>                not using I-node routines
>>        [1] number of local blocks = 1, first local block number = 1
>>          [1] local block number 0
>>          - - - - - - - - - - - - - - - - - -
>>        [2] number of local blocks = 1, first local block number = 2
>>          [2] local block number 0
>>          - - - - - - - - - - - - - - - - - -
>>        [3] number of local blocks = 1, first local block number = 3
>>          [3] local block number 0
>>          - - - - - - - - - - - - - - - - - -
>>        linear system matrix = precond matrix:
>>        Matrix Object:       4 MPI processes
>>          type: mpiaij
>>          rows=395, cols=395
>>          total: nonzeros=32037, allocated nonzeros=32037
>>          total number of mallocs used during MatSetValues calls =0
>>            not using I-node (on process 0) routines
>>    Down solver (pre-smoother) on level 1 -------------------------------
>>      KSP Object:    (mg_levels_1_)     4 MPI processes
>>        type: chebyshev
>>          Chebyshev: eigenvalue estimates:  min = 0.0636225, max = 1.33607
>>        maximum iterations=2
>>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>        left preconditioning
>>        using nonzero initial guess
>>        using NONE norm type for convergence test
>>      PC Object:    (mg_levels_1_)     4 MPI processes
>>        type: jacobi
>>        linear system matrix = precond matrix:
>>        Matrix Object:       4 MPI processes
>>          type: mpiaij
>>          rows=23918, cols=23918
>>          total: nonzeros=818732, allocated nonzeros=818732
>>          total number of mallocs used during MatSetValues calls =0
>>            not using I-node (on process 0) routines
>>    Up solver (post-smoother) same as down solver (pre-smoother)
>>    Down solver (pre-smoother) on level 2 -------------------------------
>>      KSP Object:    (mg_levels_2_)     4 MPI processes
>>        type: chebyshev
>>          Chebyshev: eigenvalue estimates:  min = 0.0971369, max = 2.03987
>>        maximum iterations=2
>>        tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>>        left preconditioning
>>        using nonzero initial guess
>>        using NONE norm type for convergence test
>>      PC Object:    (mg_levels_2_)     4 MPI processes
>>        type: jacobi
>>        linear system matrix = precond matrix:
>>        Matrix Object:       4 MPI processes
>>          type: mpiaij
>>          rows=262144, cols=262144
>>          total: nonzeros=1835008, allocated nonzeros=1835008
>>          total number of mallocs used during MatSetValues calls =0
>>    Up solver (post-smoother) same as down solver (pre-smoother)
>>    linear system matrix = precond matrix:
>>    Matrix Object:   4 MPI processes
>>      type: mpiaij
>>      rows=262144, cols=262144
>>      total: nonzeros=1835008, allocated nonzeros=1835008
>>      total number of mallocs used during MatSetValues calls =0
>> #PETSc Option Table entries:
>> -ksp_view
>> -options_left
>> -pc_gamg_agg_nsmooths 1
>> -pc_type gamg
>> #End of PETSc Option Table entries
>> There are no unused options.
>>
>>
>> Thank you,
>> Michele
>



More information about the petsc-users mailing list