Smoother settings for AMG
BAYRAKTAR Harun
Harun.BAYRAKTAR at 3ds.com
Fri Jul 31 13:55:49 CDT 2009
Barry,
On Monday I'll use ex10.c to reproduce and send you the full options.
Thanks,
Harun
-----Original Message-----
From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Friday, July 31, 2009 2:25 PM
To: PETSc users list
Subject: Re: Smoother settings for AMG
On Jul 31, 2009, at 1:15 PM, BAYRAKTAR Harun wrote:
> Barry,
>
> Thanks a lot for looking in to this. One thing I want to clarify is
> that the 43 (should have been 46 sorry for the typo) iterations on 1
> cpu seems like a real convergence to me. I do look at the
> unpreconditioned residual norm to determine convergence. For this I
> use:
>
> ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED);
> CHKERRQ(ierr);
>
> Then I check convergence through KSPSetConvergenceTest. As an
> experiment I commented out the line above where I tell KSP to use
> the unpreconditioned norm and while the ||r|| values changed
> (naturally), it still converged in slightly more number of
> iterations (56).
>
> I am familiar with the preconditioned norm going down 6 orders while
> the true relative norm is 0.1 or so (i.e., problem not solved at
> all). This usually happens to me in structural mechanics problems
> with ill conditioned systems and I use a KSP method that does not
> allow for the unpreconditioned residual to be monitored. However,
> this does not seem to be one of those cases though, maybe I am
> missing something.
Ok. I didn't see what you report (I saw it just iterating away for a
long time with the unpreconditioned norm) but then you never sent the
command line options for the solver you used so I may have run it
differently.
>
> Out of curiosity did you use ksp/ksp/examples/tutorials/ex10.c to
> solve this?
Yes.
>
> Thanks again,
> Harun
>
>
>
> -----Original Message-----
> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov
> ] On Behalf Of Barry Smith
> Sent: Thursday, July 30, 2009 7:45 PM
> To: PETSc users list
> Subject: Re: Smoother settings for AMG
>
>
> Harun,
>
> I have played around with this matrix. It is a nasty matrix; I
> think it is really beyond the normal capacity of ML (and hypre's
> boomerAMG).
>
> Even the "convergence" you were getting below is BOGUS. If you run
> with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual
> you'll see that the "true" residual norm is actually creeping to zero
> and at the converged 43 iterations below the true residual norm has
> decreased by like less than 1/10. (The preconditioned residual norm
> has decreased by 1.e 5 so the iteration stops and you think it has
> converged. In really hard problems preconditioners sometimes scales
> things in a funky way so a large decrease in preconditioned residual
> norm does not mean a large decrease in true residual norm). In other
> words the "answer" you got out of the runs below is garbage.
>
> I suggest,
> 1) check carefully that the matrix being created actually matches the
> model's equations, if they seem right then
> 2) see if you can change the model so it does not generate such
> hopeless matrices. If you MUST solve this nasty matrix
> 3) bite the bullet and use a parallel direct solver from PETSc. Try
> both MUMPS and SuperLU_dist
>
> Good luck,
>
> Barry
>
>
>
>
> On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote:
>
>> Hi,
>>
>> I am trying to solve a system of equations and I am having difficulty
>> picking the right smoothers for AMG (using ML as pc_type) in PETSc
>> for
>> parallel execution. First here is what happens in terms of CG
>> (ksp_type)
>> iteration counts (both columns use block jacobi):
>>
>> cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4
>> ------------------------------------------------------
>> 1 | 43 | 243
>> 4 | 699 | 379
>>
>> x1 or x4 means 1 or 4 iterations of smoother application at each AMG
>> level (all details from ksp view for the 4 cpu run are below). The
>> main
>> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but
>> falls
>> apart in parallel. SOR on the other hand experiences a 1.5X increase
>> in
>> iteration count which is totally expected from the quality of
>> coarsening
>> ML delivers in parallel.
>>
>> I basically would like to find a way (if possible) to have the
>> number of
>> iterations in parallel stay with 1-2X of 1 cpu iteration count for
>> the
>> AMG w/ ICC case. Is there a way to achieve this?
>>
>> Thanks,
>> Harun
>>
>> %%%%%%%%%%%%%%%%%%%%%%%%%
>> AMG w/ ICC(0) x1 ksp_view
>> %%%%%%%%%%%%%%%%%%%%%%%%%
>> KSP Object:
>> type: cg
>> maximum iterations=10000
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,
>> post-smooths=1
>> Coarse gride solver -- level 0 -------------------------------
>> KSP Object:(mg_coarse_)
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_)
>> type: redundant
>> Redundant preconditioner: First (color=0) of 4 PCs follows
>> KSP Object:(mg_coarse_redundant_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_redundant_)
>> type: lu
>> LU: out-of-place factorization
>> matrix ordering: nd
>> LU: tolerance for zero pivot 1e-12
>> LU: factor fill ratio needed 2.17227
>> Factored matrix follows
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=21651, allocated nonzeros=21651
>> using I-node routines: found 186 nodes, limit used is
>> 5
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=14150
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=9967
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC
>> objects:
>> KSP Object:(mg_levels_1_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.514899
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=2813, cols=2813
>> total: nonzeros=48609, allocated nonzeros=48609
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=2813, cols=2813
>> total: nonzeros=94405, allocated nonzeros=94405
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC
>> objects:
>> KSP Object:(mg_levels_1_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.514899
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=2813, cols=2813
>> total: nonzeros=48609, allocated nonzeros=48609
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=2813, cols=2813
>> total: nonzeros=94405, allocated nonzeros=94405
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC
>> objects:
>> KSP Object:(mg_levels_2_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.519045
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=101164, cols=101164
>> total: nonzeros=1378558, allocated nonzeros=1378558
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=101164, cols=101164
>> total: nonzeros=2655952, allocated nonzeros=5159364
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC
>> objects:
>> KSP Object:(mg_levels_2_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.519045
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=101164, cols=101164
>> total: nonzeros=1378558, allocated nonzeros=1378558
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=101164, cols=101164
>> total: nonzeros=2655952, allocated nonzeros=5159364
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>>
>> %%%%%%%%%%%%%%%%%%%%%%
>> AMG w/ SOR x4 ksp_view
>> %%%%%%%%%%%%%%%%%%%%%%
>>
>> KSP Object:
>> type: cg
>> maximum iterations=10000
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,
>> post-smooths=1
>> Coarse gride solver -- level 0 -------------------------------
>> KSP Object:(mg_coarse_)
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_)
>> type: redundant
>> Redundant preconditioner: First (color=0) of 4 PCs follows
>> KSP Object:(mg_coarse_redundant_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_redundant_)
>> type: lu
>> LU: out-of-place factorization
>> matrix ordering: nd
>> LU: tolerance for zero pivot 1e-12
>> LU: factor fill ratio needed 2.17227
>> Factored matrix follows
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=21651, allocated nonzeros=21651
>> using I-node routines: found 186 nodes, limit used is
>> 5
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=14150
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=9967
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>>
>>
>
More information about the petsc-users
mailing list