[petsc-users] ML and -pc_factor_shift_nonzero
tribur at vision.ee.ethz.ch
Mon Apr 19 06:29:40 CDT 2010
Hi Jed,
>> ML works now using, e.g., -mg_coarse_redundant_pc_factor_shift_type
>> POSITIVE_DEFINITE. However, it converges very slowly using the default
>> REDUNDANT for the coarse solve.
>
> "Converges slowly" or "the coarse-level solve is expensive"?
Hm, rather "converges slowly". I am using ML inside the preconditioner
for a Schur complement system; the outer system, preconditioned with
this approximate Schur complement, is what converges slowly.
My particular problem is that the convergence rate depends strongly on
the number of processes. With one process, using ML to precondition
the deeply nested inner system, the outer system converges in, e.g.,
39 iterations. With np=10, however, it needs 69 iterations. Using
HYPRE instead, the iteration count is independent of the number of
processes (at least for np<80), but HYPRE (applied to this inner
system, not in general) is slower and scales very badly. That's why I
would like to use ML.
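For reference, the two option sets being compared are roughly the
following (the executable name and process count are placeholders for
my actual setup, and the solver prefix may differ if the inner solve
uses its own options prefix):

```sh
# ML, with the shifted redundant coarse factorization that made it run:
mpiexec -n 10 ./app -pc_type ml \
    -mg_coarse_redundant_pc_factor_shift_type POSITIVE_DEFINITE

# BoomerAMG via HYPRE: iteration count stays flat up to ~80 processes,
# but each application is slower for this inner system:
mpiexec -n 10 ./app -pc_type hypre -pc_hypre_type boomeramg
```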
Thinking about it, none of this should have anything to do with the
choice of direct solver for the coarse system inside ML (MUMPS or
PETSc's own), should it? The direct solver solves the coarse problem
exactly, independently of the number of processes, so it shouldn't
affect the effectiveness of ML. Or am I wrong?
> I suggest
> starting with
>
> -mg_coarse_pc_type lu -mg_coarse_pc_factor_mat_solver_package mumps
>
> or varying parameters in ML to see if you can make the coarse level
> problem smaller without hurting convergence rate. You can do
> semi-redundant solves if you scale processor counts beyond what MUMPS
> works well with.
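Spelled out as a command line, the suggestion would look something like
this (the executable name is a placeholder, and it assumes PETSc was
configured with MUMPS support, e.g. via --download-mumps):

```sh
# Replace the redundant coarse solve with a parallel LU via MUMPS:
mpiexec -n 10 ./app -pc_type ml \
    -mg_coarse_pc_type lu \
    -mg_coarse_pc_factor_mat_solver_package mumps
```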
Thanks. So MUMPS is generally considered the fastest parallel direct
solver?
> Depending on what problem you are solving, ML could be producing a
> (nearly) singular coarse level operator in which case you can expect
> very confusing and inconsistent behavior.
Could that also explain the degraded convergence rate when going from
1 to 10 processes, even though the equation system remains the same?
Thanks a lot,
Kathrin