[petsc-users] Multigrid coarse grid solver
Mark Adams
mfadams at lbl.gov
Thu Apr 27 07:45:12 CDT 2017
Barry, we seem to get an error when you explicitly set this.
Garth, maybe to set the default explicitly you need to use -pc_type asm
-sub_pc_type lu; that is the true default.
More below, but first the error message:
17:46 knepley/feature-plasma-example *= ~/Codes/petsc/src/ksp/ksp/examples/tutorials$
/Users/markadams/Codes/petsc/arch-macosx-gnu-g/bin/mpiexec -np 2 ./ex56 -ne 16 -pc_type gamg -ksp_view -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu -mg_coarse_pc_factor_mat_solver_package petsc
[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/linearsolvertable.html for possible LU and Cholesky solvers
[0]PETSC ERROR: MatSolverPackage petsc does not support matrix type mpiaij
[0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
[0]PETSC ERROR: Petsc Development GIT revision: v3.7.6-3658-g99fa2798da  GIT Date: 2017-04-25 12:56:20 -0500
[0]PETSC ERROR: ./ex56 on a arch-macosx-gnu-g named MarksMac-5.local by markadams Wed Apr 26 17:46:28 2017
[0]PETSC ERROR: Configure options --with-cc=clang --with-cc++=clang++ COPTFLAGS="-g -O0 -mavx2" CXXOPTFLAGS="-g -O0 -mavx2" F
On Thu, Apr 27, 2017 at 1:59 AM, Garth N. Wells <gnw20 at cam.ac.uk> wrote:
> On 27 April 2017 at 00:30, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> > Yes, you asked for LU so it used LU!
> >
> > Of course for smaller coarse grids and large numbers of processes
> > this is very inefficient.
> >
> > The default behavior for GAMG is probably what you want. In that case
> > it is equivalent to -mg_coarse_pc_type bjacobi -mg_coarse_sub_pc_type lu.
> > But GAMG tries hard to put all the coarse grid degrees of freedom on the
> > first process and none on the rest, so you do end up with the exact
> > equivalent of a direct solver.
> > Try -ksp_view in that case.
> >
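Spelled out for ex56, that explicit-default form would look something like the following (untested sketch, using only the options Barry names above):

mpiexec -np 2 ./ex56 -ne 16 -pc_type gamg -ksp_view -mg_coarse_ksp_type preonly -mg_coarse_pc_type bjacobi -mg_coarse_sub_pc_type lu

The -ksp_view output should then show bjacobi on the coarse level with a sequential LU sub-PC, with all the coarse rows on the first process.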
>
> Thanks, Barry.
>
> I'm struggling a little to understand the matrix data structure for
> the coarse grid. Is it just an mpiaij matrix, with all entries
> (usually) on one process?
>
Yes.
>
> Is there an options key prefix for the matrix on different levels?
> E.g., to turn on a viewer?
>
Something like -mg_levels_1_ksp_view should work (run with -help to get the
correct syntax).
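For instance, grepping the -help output for the level prefix should list the exact per-level option names (the prefix here is a guess; untested):

mpiexec -np 2 ./ex56 -ne 16 -pc_type gamg -help | grep mg_levels_1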
>
> If I get GAMG to use more than one process for the coarse grid (a GAMG
> setting), can I get a parallel LU (exact) solver to solve it using
> only the processes that store parts of the coarse grid matrix?
>
No. We should make a subcommunicator for the active processes only, but I
am not too motivated to do this, because the only reasons it matters are
1) a solver (i.e., the parallel direct solver) is lazy and puts reductions
everywhere for no good reason, or 2) you use a Krylov solver on the coarse
grid (very uncommon). All of the communication in a non-Krylov solver is
point to point, and there is no win that I know of with a subcommunicator.
Note that the redundant coarse grid solver does use a subcommunicator,
obviously, though I think it is hardwired to PETSC_COMM_SELF (but maybe not?).
>
> Related to all this, do the parallel LU solvers internally
> re-distribute a matrix over the whole MPI communicator as part of
> their re-ordering phase?
>
They better not!
I doubt any solver would be that eager by default.
>
> Garth
>
> > There is also -mg_coarse_pc_type redundant
> > -mg_coarse_redundant_pc_type lu. In that case it makes a copy of the
> > coarse matrix on EACH process, and each process does its own
> > factorization and solve. This saves one phase of communication for each
> > V cycle: since every process has the entire solution, it just grabs the
> > values it needs from itself without communication.
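For ex56 that redundant variant would be something like (untested; options exactly as Barry gives them):

mpiexec -np 2 ./ex56 -ne 16 -pc_type gamg -ksp_view -mg_coarse_pc_type redundant -mg_coarse_redundant_pc_type lu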
> >
> >
> >
> >
> >> On Apr 26, 2017, at 5:25 PM, Garth N. Wells <gnw20 at cam.ac.uk> wrote:
> >>
> >> I'm a bit confused by the selection of the coarse grid solver for
> >> multigrid. For the demo ksp/ex56, if I do:
> >>
> >> mpirun -np 1 ./ex56 -ne 16 -ksp_view -pc_type gamg
> >> -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu
> >>
> >> I see
> >>
> >> Coarse grid solver -- level -------------------------------
> >> KSP Object: (mg_coarse_) 1 MPI processes
> >> type: preonly
> >> maximum iterations=10000, initial guess is zero
> >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> >> left preconditioning
> >> using NONE norm type for convergence test
> >> PC Object: (mg_coarse_) 1 MPI processes
> >> type: lu
> >> out-of-place factorization
> >> tolerance for zero pivot 2.22045e-14
> >> matrix ordering: nd
> >> factor fill ratio given 5., needed 1.
> >> Factored matrix follows:
> >> Mat Object: 1 MPI processes
> >> type: seqaij
> >> rows=6, cols=6, bs=6
> >> package used to perform factorization: petsc
> >> total: nonzeros=36, allocated nonzeros=36
> >> total number of mallocs used during MatSetValues calls =0
> >> using I-node routines: found 2 nodes, limit used is 5
> >> linear system matrix = precond matrix:
> >> Mat Object: 1 MPI processes
> >> type: seqaij
> >> rows=6, cols=6, bs=6
> >> total: nonzeros=36, allocated nonzeros=36
> >> total number of mallocs used during MatSetValues calls =0
> >> using I-node routines: found 2 nodes, limit used is 5
> >>
> >> which is what I expect. Increasing from 1 to 2 processes:
> >>
> >> mpirun -np 2 ./ex56 -ne 16 -ksp_view -pc_type gamg
> >> -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu
> >>
> >> I see
> >>
> >> Coarse grid solver -- level -------------------------------
> >> KSP Object: (mg_coarse_) 2 MPI processes
> >> type: preonly
> >> maximum iterations=10000, initial guess is zero
> >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
> >> left preconditioning
> >> using NONE norm type for convergence test
> >> PC Object: (mg_coarse_) 2 MPI processes
> >> type: lu
> >> out-of-place factorization
> >> tolerance for zero pivot 2.22045e-14
> >> matrix ordering: natural
> >> factor fill ratio given 0., needed 0.
> >> Factored matrix follows:
> >> Mat Object: 2 MPI processes
> >> type: superlu_dist
> >> rows=6, cols=6
> >> package used to perform factorization: superlu_dist
> >> total: nonzeros=0, allocated nonzeros=0
> >> total number of mallocs used during MatSetValues calls =0
> >> SuperLU_DIST run parameters:
> >> Process grid nprow 2 x npcol 1
> >> Equilibrate matrix TRUE
> >> Matrix input mode 1
> >> Replace tiny pivots FALSE
> >> Use iterative refinement FALSE
> >> Processors in row 2 col partition 1
> >> Row permutation LargeDiag
> >> Column permutation METIS_AT_PLUS_A
> >> Parallel symbolic factorization FALSE
> >> Repeated factorization SamePattern
> >> linear system matrix = precond matrix:
> >> Mat Object: 2 MPI processes
> >> type: mpiaij
> >> rows=6, cols=6, bs=6
> >> total: nonzeros=36, allocated nonzeros=36
> >> total number of mallocs used during MatSetValues calls =0
> >> using I-node (on process 0) routines: found 2 nodes, limit used is 5
> >>
> >> Note that the coarse grid is now using superlu_dist. Is the coarse
> >> grid being solved in parallel?
> >>
> >> Garth
> >
>