[petsc-users] Multigrid coarse grid solver

Garth N. Wells gnw20 at cam.ac.uk
Thu Apr 27 00:59:02 CDT 2017


On 27 April 2017 at 00:30, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>   Yes, you asked for LU so it used LU!
>
>    Of course for smaller coarse grids and large numbers of processes this is very inefficient.
>
>    The default behavior for GAMG is probably what you want. In that case it is equivalent to
> -mg_coarse_pc_type bjacobi -mg_coarse_sub_pc_type lu.  But GAMG tries hard to put all the coarse grid degrees
> of freedom on the first process and none on the rest, so you do end up with the exact equivalent of a direct solver.
> Try -ksp_view in that case.
>

Thanks, Barry.
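
As a minimal sketch, the default Barry describes above spelled out explicitly
(reusing the ex56 command from the original message quoted below and the
options he names) would be something like:

   mpirun -np 2 ./ex56 -ne 16 -pc_type gamg \
       -mg_coarse_pc_type bjacobi -mg_coarse_sub_pc_type lu -ksp_view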

I'm struggling a little to understand the matrix data structure for
the coarse grid. Is it just an mpiaij matrix, with all entries
(usually) on one process?

Is there an options key prefix for the matrix on each level, e.g. to
turn on a viewer?

If I get GAMG to use more than one process for the coarse grid (a GAMG
setting), can I get a parallel LU (exact) solver to solve it using
only the processes that store parts of the coarse grid matrix?

Related to all this, do the parallel LU solvers internally
re-distribute a matrix over the whole MPI communicator as part of
their re-ordering phase?

Garth

>    There is also -mg_coarse_pc_type redundant -mg_coarse_redundant_pc_type lu. In that case it makes a copy
> of the coarse matrix on EACH process and each process does its own factorization and solve. This saves one
> phase of the communication for each V cycle, since every process has the entire solution; it just grabs from
> itself the values it needs without communication.
>
>
>
>
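A minimal sketch of the redundant variant described above, again reusing the
ex56 command and the two options Barry names, would be something like:

   mpirun -np 2 ./ex56 -ne 16 -pc_type gamg \
       -mg_coarse_pc_type redundant -mg_coarse_redundant_pc_type lu -ksp_view
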
>> On Apr 26, 2017, at 5:25 PM, Garth N. Wells <gnw20 at cam.ac.uk> wrote:
>>
>> I'm a bit confused by the selection of the coarse grid solver for
>> multigrid. For the demo ksp/ex56, if I do:
>>
>>    mpirun -np 1 ./ex56 -ne 16 -ksp_view -pc_type gamg
>> -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu
>>
>> I see
>>
>>  Coarse grid solver -- level -------------------------------
>>    KSP Object: (mg_coarse_) 1 MPI processes
>>      type: preonly
>>      maximum iterations=10000, initial guess is zero
>>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>      left preconditioning
>>      using NONE norm type for convergence test
>>    PC Object: (mg_coarse_) 1 MPI processes
>>      type: lu
>>        out-of-place factorization
>>        tolerance for zero pivot 2.22045e-14
>>        matrix ordering: nd
>>        factor fill ratio given 5., needed 1.
>>          Factored matrix follows:
>>            Mat Object: 1 MPI processes
>>              type: seqaij
>>              rows=6, cols=6, bs=6
>>              package used to perform factorization: petsc
>>              total: nonzeros=36, allocated nonzeros=36
>>              total number of mallocs used during MatSetValues calls =0
>>                using I-node routines: found 2 nodes, limit used is 5
>>      linear system matrix = precond matrix:
>>      Mat Object: 1 MPI processes
>>        type: seqaij
>>        rows=6, cols=6, bs=6
>>        total: nonzeros=36, allocated nonzeros=36
>>        total number of mallocs used during MatSetValues calls =0
>>          using I-node routines: found 2 nodes, limit used is 5
>>
>> which is what I expect. Increasing from 1 to 2 processes:
>>
>>    mpirun -np 2 ./ex56 -ne 16 -ksp_view -pc_type gamg
>> -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu
>>
>> I see
>>
>>  Coarse grid solver -- level -------------------------------
>>    KSP Object: (mg_coarse_) 2 MPI processes
>>      type: preonly
>>      maximum iterations=10000, initial guess is zero
>>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000.
>>      left preconditioning
>>      using NONE norm type for convergence test
>>    PC Object: (mg_coarse_) 2 MPI processes
>>      type: lu
>>        out-of-place factorization
>>        tolerance for zero pivot 2.22045e-14
>>        matrix ordering: natural
>>        factor fill ratio given 0., needed 0.
>>          Factored matrix follows:
>>            Mat Object: 2 MPI processes
>>              type: superlu_dist
>>              rows=6, cols=6
>>              package used to perform factorization: superlu_dist
>>              total: nonzeros=0, allocated nonzeros=0
>>              total number of mallocs used during MatSetValues calls =0
>>                SuperLU_DIST run parameters:
>>                  Process grid nprow 2 x npcol 1
>>                  Equilibrate matrix TRUE
>>                  Matrix input mode 1
>>                  Replace tiny pivots FALSE
>>                  Use iterative refinement FALSE
>>                  Processors in row 2 col partition 1
>>                  Row permutation LargeDiag
>>                  Column permutation METIS_AT_PLUS_A
>>                  Parallel symbolic factorization FALSE
>>                  Repeated factorization SamePattern
>>      linear system matrix = precond matrix:
>>      Mat Object: 2 MPI processes
>>        type: mpiaij
>>        rows=6, cols=6, bs=6
>>        total: nonzeros=36, allocated nonzeros=36
>>        total number of mallocs used during MatSetValues calls =0
>>          using I-node (on process 0) routines: found 2 nodes, limit used is 5
>>
>> Note that the coarse grid is now using superlu_dist. Is the coarse
>> grid being solved in parallel?
>>
>> Garth
>

