[petsc-users] ML and -pc_factor_shift_nonzero

Jed Brown jed at 59A2.org
Mon Apr 19 06:52:19 CDT 2010


On Mon, 19 Apr 2010 13:29:40 +0200, tribur at vision.ee.ethz.ch wrote:
> >> ML works now using, e.g., -mg_coarse_redundant_pc_factor_shift_type
> >> POSITIVE_DEFINITE. However, it converges very slowly using the default
> >> REDUNDANT for the coarse solve.
> >
> > "Converges slowly" or "the coarse-level solve is expensive"?
> 
> Hm, rather "converges slowly". Using ML inside a preconditioner for
> the Schur complement system, the overall outer system, preconditioned
> with the approximate Schur complement preconditioner, converges
> slowly, if you understand what I mean.

Sure, but the redundant coarse solve is a direct solve.  It may be that
the shift (to make it nonsingular) makes it ineffective (and thus the
outer system converges slowly), but this is the same behavior you would
get with a non-redundant solve.  That is, it is the shift that causes
the problem, not REDUNDANT.

I don't know which flavor of Schur complement iteration you are
currently using.  It is true that pure Schur complement reduction
requires high-accuracy inner solves; you may of course get away with
inexact inner solves if they are part of a full-space iteration.  It's
worth comparing the number of iterations required to solve the inner
(advection-diffusion) block to a given tolerance in parallel and in serial.
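
For example (the prefix "inner_" below is only a placeholder; use
whatever options prefix your inner advection-diffusion KSP actually
has), monitoring options such as

  -inner_ksp_monitor_true_residual -inner_ksp_converged_reason

run once on 1 process and once on 10 would show directly whether it is
the inner solve itself that degrades in parallel.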

> My particular problem is that the convergence rate depends strongly on
> the number of processors.  With one processor, using ML to precondition
> the innermost system, the outer system converges in, e.g., 39
> iterations.  With np=10, however, it needs 69 iterations.

ML with the default options behaves quite differently in serial and in
parallel.  Usually the scalability is acceptable from 2 processors up,
but the difference between one and two can be quite significant.  You
can make the preconditioner stronger, e.g. with

  -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 1 -mg_levels_pc_type asm -mg_levels_sub_pc_type ilu
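
For instance, a complete invocation might look like the following
sketch, assuming ML is selected on the default options prefix (as your
-mg_coarse_redundant_... option suggests); the executable name and
process count are placeholders for your own application:

  mpiexec -n 10 ./your_app -pc_type ml \
      -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 1 \
      -mg_levels_pc_type asm -mg_levels_sub_pc_type ilu \
      -mg_coarse_redundant_pc_factor_shift_type POSITIVE_DEFINITE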

> This number of iterations is independent of the number of processes
> when using HYPRE (at least if np<80), but the latter is (applied to
> this inner system, not in general) slower and scales very badly.
> That's why I would like to use ML.
> 
> Thinking about it, all this shouldn't have anything to do with the
> choice of direct solver for the coarse system inside ML (MUMPS or
> PETSc's own), should it?  The direct solver solves the system
> completely, independently of the number of processes, and shouldn't
> influence the effectiveness of ML, or am I wrong?

A shift makes it solve a somewhat different system.  How different that
perturbed system is depends on the problem and on the size of the shift.
MUMPS has more sophisticated ordering/pivoting schemes, so use it if the
coarse system demands it (you can also try different ordering schemes in
PETSc, -mg_coarse_redundant_pc_factor_mat_ordering_type).
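
For example, to hand the redundant coarse factorization to MUMPS, or to
try a different PETSc ordering, options along these lines should work
(the exact spelling of the solver-package option can vary between PETSc
versions, so check the -help output of your version):

  -mg_coarse_redundant_pc_factor_mat_solver_package mumps

  -mg_coarse_redundant_pc_factor_mat_ordering_type nd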

> Thanks.  So MUMPS is usually expected to be the fastest parallel
> direct solver?

Usually.

> > Depending on what problem you are solving, ML could be producing a
> > (nearly) singular coarse-level operator, in which case you can expect
> > very confusing and inconsistent behavior.
> 
> Could that also be the reason for the degraded convergence when going
> from 1 to 10 processors?  Even if the equation system remains the
> same?

ML's aggregates change somewhat in parallel (I don't know how much; I
haven't investigated precisely what is different) and the smoothers are
all different.  With a "normal" discretization of an elliptic system, it
would be surprising for ML to produce nearly singular coarse-level
operators, in parallel or otherwise.  But src/snes/examples/tutorials/ex48
exhibits pretty bad ML behavior: the coarse level isn't singular, but the
parallel aggregates with the default smoothers don't converge even though
the system is SPD.  ML is informed of the translational modes but not the
full set of rigid body modes; I haven't investigated ML's troublesome
modes for this problem, so I don't know whether they are rigid body modes
or something else.

Jed
