<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Apr 27, 2017 at 9:07 AM, Mark Adams <span dir="ltr"><<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><div class="gmail_quote"><span class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="m_-1152435755099480566HOEnZb"><div class="m_-1152435755099480566h5"><br>
</div></div>Do the matrix operator(s) associated with the KSP have an options prefix?<br>
<span><br></span></blockquote><div><br></div></span><div>I don't think so. Run with -help to check.</div><span class=""><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span>
>><br>
>><br>
>> If I get GAMG to use more than one process for the coarse grid (a GAMG<br>
>> setting), can I get a parallel LU (exact) solver to solve it using<br>
>> only the processes that store parts of the coarse grid matrix?<br>
><br>
><br>
> No, we should make a subcommunicator for the active processes only, but I<br>
> am not too motivated to do this because the only reason that this matters is<br>
> if 1) a solver (i.e., the parallel direct solver) is lazy and puts reductions<br>
> everywhere for no good reason, or 2) you use a Krylov solver (very<br>
> uncommon). All of the communication in a non-Krylov solver is point-to-point,<br>
> and there is no win that I know of with a subcommunicator.<br>
><br>
> Note, the redundant coarse grid solver does use a subcommunicator, obviously,<br>
> but I think it is hardwired to PETSC_COMM_SELF, but maybe not?<br>
><br>
>><br>
>><br>
>> Related to all this, do the parallel LU solvers internally<br>
>> re-distribute a matrix over the whole MPI communicator as part of<br>
>> their re-ordering phase?<br>
><br>
><br>
> They better not!<br>
><br>
<br>
</span>I did a test with MUMPS, and from the MUMPS diagnostics (memory use<br>
per process) it appears that it does split the matrix across all<br>
processes.</blockquote></span></div></div></div></blockquote><div><br></div><div>1) Can we motivate why you would ever want a parallel coarse grid? I cannot think of a reason.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Yikes! That is your problem with strong speedup. Use SuperLU.</div><div><br></div><div>I think making a subcommunicator for the coarse grid in GAMG would wreak havoc. </div></div></div></div></blockquote><div><br></div><div>2) I do not see why a subcommunicator is a problem. In fact, this is exactly what PCTELESCOPE is designed to do.</div><div> GAMG does a good job of reducing, but if you want completely custom reductions, TELESCOPE is for that.</div><div><br></div><div> Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>Could we turn that option off in MUMPS from GAMG? Or just turn it off by default? PETSc does not usually get that eager about partitioning.</div><div><div class="h5"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<span class="m_-1152435755099480566HOEnZb"><font color="#888888"><br>
Garth<br>
</font></span><div class="m_-1152435755099480566HOEnZb"><div class="m_-1152435755099480566h5"><br>
> I doubt any solver would be that eager by default.<br>
><br>
>><br>
>><br>
>> Garth<br>
>><br>
>> > There is also -mg_coarse_pc_type redundant<br>
>> > -mg_coarse_redundant_pc_type lu. In that case it makes a copy of the coarse<br>
>> > matrix on EACH process and each process does its own factorization and<br>
>> > solve. This saves one phase of the communication for each V cycle: since<br>
>> > every process has the entire solution, it just grabs from itself the values<br>
>> > it needs without communication.<br>
>> ><br>
>> ><br>
>> ><br>
>> ><br>
>> >> On Apr 26, 2017, at 5:25 PM, Garth N. Wells <<a href="mailto:gnw20@cam.ac.uk" target="_blank">gnw20@cam.ac.uk</a>> wrote:<br>
>> >><br>
>> >> I'm a bit confused by the selection of the coarse grid solver for<br>
>> >> multigrid. For the demo ksp/ex56, if I do:<br>
>> >><br>
>> >> mpirun -np 1 ./ex56 -ne 16 -ksp_view -pc_type gamg<br>
>> >> -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu<br>
>> >><br>
>> >> I see<br>
>> >><br>
>> >> Coarse grid solver -- level ------------------------------<wbr>-<br>
>> >> KSP Object: (mg_coarse_) 1 MPI processes<br>
>> >> type: preonly<br>
>> >> maximum iterations=10000, initial guess is zero<br>
>> >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
>> >> left preconditioning<br>
>> >> using NONE norm type for convergence test<br>
>> >> PC Object: (mg_coarse_) 1 MPI processes<br>
>> >> type: lu<br>
>> >> out-of-place factorization<br>
>> >> tolerance for zero pivot 2.22045e-14<br>
>> >> matrix ordering: nd<br>
>> >> factor fill ratio given 5., needed 1.<br>
>> >> Factored matrix follows:<br>
>> >> Mat Object: 1 MPI processes<br>
>> >> type: seqaij<br>
>> >> rows=6, cols=6, bs=6<br>
>> >> package used to perform factorization: petsc<br>
>> >> total: nonzeros=36, allocated nonzeros=36<br>
>> >> total number of mallocs used during MatSetValues calls =0<br>
>> >> using I-node routines: found 2 nodes, limit used is 5<br>
>> >> linear system matrix = precond matrix:<br>
>> >> Mat Object: 1 MPI processes<br>
>> >> type: seqaij<br>
>> >> rows=6, cols=6, bs=6<br>
>> >> total: nonzeros=36, allocated nonzeros=36<br>
>> >> total number of mallocs used during MatSetValues calls =0<br>
>> >> using I-node routines: found 2 nodes, limit used is 5<br>
>> >><br>
>> >> which is what I expect. Increasing from 1 to 2 processes:<br>
>> >><br>
>> >> mpirun -np 2 ./ex56 -ne 16 -ksp_view -pc_type gamg<br>
>> >> -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu<br>
>> >><br>
>> >> I see<br>
>> >><br>
>> >> Coarse grid solver -- level ------------------------------<wbr>-<br>
>> >> KSP Object: (mg_coarse_) 2 MPI processes<br>
>> >> type: preonly<br>
>> >> maximum iterations=10000, initial guess is zero<br>
>> >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
>> >> left preconditioning<br>
>> >> using NONE norm type for convergence test<br>
>> >> PC Object: (mg_coarse_) 2 MPI processes<br>
>> >> type: lu<br>
>> >> out-of-place factorization<br>
>> >> tolerance for zero pivot 2.22045e-14<br>
>> >> matrix ordering: natural<br>
>> >> factor fill ratio given 0., needed 0.<br>
>> >> Factored matrix follows:<br>
>> >> Mat Object: 2 MPI processes<br>
>> >> type: superlu_dist<br>
>> >> rows=6, cols=6<br>
>> >> package used to perform factorization: superlu_dist<br>
>> >> total: nonzeros=0, allocated nonzeros=0<br>
>> >> total number of mallocs used during MatSetValues calls =0<br>
>> >> SuperLU_DIST run parameters:<br>
>> >> Process grid nprow 2 x npcol 1<br>
>> >> Equilibrate matrix TRUE<br>
>> >> Matrix input mode 1<br>
>> >> Replace tiny pivots FALSE<br>
>> >> Use iterative refinement FALSE<br>
>> >> Processors in row 2 col partition 1<br>
>> >> Row permutation LargeDiag<br>
>> >> Column permutation METIS_AT_PLUS_A<br>
>> >> Parallel symbolic factorization FALSE<br>
>> >> Repeated factorization SamePattern<br>
>> >> linear system matrix = precond matrix:<br>
>> >> Mat Object: 2 MPI processes<br>
>> >> type: mpiaij<br>
>> >> rows=6, cols=6, bs=6<br>
>> >> total: nonzeros=36, allocated nonzeros=36<br>
>> >> total number of mallocs used during MatSetValues calls =0<br>
>> >> using I-node (on process 0) routines: found 2 nodes, limit<br>
>> >> used is 5<br>
>> >><br>
>> >> Note that the coarse grid is now using superlu_dist. Is the coarse<br>
>> >> grid being solved in parallel?<br>
>> >><br>
>> >> Garth<br>
>> ><br>
><br>
><br>
</div></div></blockquote></div></div></div><br></div></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature">What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div>
</div></div>