Smoother settings for AMG
BAYRAKTAR Harun
Harun.BAYRAKTAR at 3ds.com
Thu Jul 30 06:28:35 CDT 2009
I always see a jump going from sequential to parallel runs with ML for all types of problems I have tried (structural mechanics, incompressible flow). But once parallel the iteration counts stay roughly constant given I use a true parallel smoother like Chebychev. My experience is that the iteration count jump is somewhere between 1.5-2.0X.
-----Original Message-----
From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Lisandro Dalcin
Sent: Wednesday, July 29, 2009 6:40 PM
To: PETSc users list
Subject: Re: Smoother settings for AMG
On Wed, Jul 29, 2009 at 6:48 PM, BAYRAKTAR Harun<Harun.BAYRAKTAR at 3ds.com> wrote:
>
> For a uniform mesh I see 20 iterations or so.
>
That's around the iteration count I would expect for for ML on uniform
meshes... Do this iteration count stays at about 20 when you run in 4
CPU's while still using an uniform mesh?
> From: petsc-users-bounces at mcs.anl.gov
> [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley
> Sent: Wednesday, July 29, 2009 5:23 PM
>
> To: PETSc users list
> Subject: Re: Smoother settings for AMG
>
>
>
> On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun <Harun.BAYRAKTAR at 3ds.com>
> wrote:
>
> Matt,
>
> It is from the pressure poisson equation for incompressible navier-stokes so
> it is elliptic. Also on 1 cpu, I am able to solve it with reason able
> iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the
> parallel runs that really concern me.
>
> Actually, it was the 43 that really concerned me. In my experience, an MG
> that is doing what it is supposed
> to on Poisson takes < 10 iterations. However, if your grid is pretty
> distorted, maybe it can get this bad.
>
> Matt
>
>
>
> Thanks,
>
> Harun
>
>
>
>
>
> From: petsc-users-bounces at mcs.anl.gov
> [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley
> Sent: Wednesday, July 29, 2009 5:00 PM
> To: PETSc users list
> Subject: Re: Smoother settings for AMG
>
>
>
> On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun <Harun.BAYRAKTAR at 3ds.com>
> wrote:
>
> Hi,
>
> I am trying to solve a system of equations and I am having difficulty
> picking the right smoothers for AMG (using ML as pc_type) in PETSc for
> parallel execution. First here is what happens in terms of CG (ksp_type)
> iteration counts (both columns use block jacobi):
>
> Are you sure you have an elliptic system? These iteration counts are
> extremely
> high.
>
> Matt
>
>
> cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4
> ------------------------------------------------------
> 1 | 43 | 243
> 4 | 699 | 379
>
> x1 or x4 means 1 or 4 iterations of smoother application at each AMG
> level (all details from ksp view for the 4 cpu run are below). The main
> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls
> apart in parallel. SOR on the other hand experiences a 1.5X increase in
> iteration count which is totally expected from the quality of coarsening
> ML delivers in parallel.
>
> I basically would like to find a way (if possible) to have the number of
> iterations in parallel stay with 1-2X of 1 cpu iteration count for the
> AMG w/ ICC case. Is there a way to achieve this?
>
> Thanks,
> Harun
>
> %%%%%%%%%%%%%%%%%%%%%%%%%
> AMG w/ ICC(0) x1 ksp_view
> %%%%%%%%%%%%%%%%%%%%%%%%%
> KSP Object:
> type: cg
> maximum iterations=10000
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:
> type: ml
> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,
> post-smooths=1
> Coarse gride solver -- level 0 -------------------------------
> KSP Object:(mg_coarse_)
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_coarse_)
> type: redundant
> Redundant preconditioner: First (color=0) of 4 PCs follows
> KSP Object:(mg_coarse_redundant_)
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_coarse_redundant_)
> type: lu
> LU: out-of-place factorization
> matrix ordering: nd
> LU: tolerance for zero pivot 1e-12
> LU: factor fill ratio needed 2.17227
> Factored matrix follows
> Matrix Object:
> type=seqaij, rows=283, cols=283
> total: nonzeros=21651, allocated nonzeros=21651
> using I-node routines: found 186 nodes, limit used is
> 5
> linear system matrix = precond matrix:
> Matrix Object:
> type=seqaij, rows=283, cols=283
> total: nonzeros=9967, allocated nonzeros=14150
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=283, cols=283
> total: nonzeros=9967, allocated nonzeros=9967
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 1 -------------------------------
> KSP Object:(mg_levels_1_)
> type: richardson
> Richardson: damping factor=0.9
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_1_)
> type: bjacobi
> block Jacobi: number of blocks = 4
> Local solve is same for all blocks, in the following KSP and PC
> objects:
> KSP Object:(mg_levels_1_sub_)
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_1_sub_)
> type: icc
> ICC: 0 levels of fill
> ICC: factor fill ratio allocated 1
> ICC: using Manteuffel shift
> ICC: factor fill ratio needed 0.514899
> Factored matrix follows
> Matrix Object:
> type=seqsbaij, rows=2813, cols=2813
> total: nonzeros=48609, allocated nonzeros=48609
> block size is 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=seqaij, rows=2813, cols=2813
> total: nonzeros=94405, allocated nonzeros=94405
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=10654, cols=10654
> total: nonzeros=376634, allocated nonzeros=376634
> not using I-node (on process 0) routines
> Up solver (post-smoother) on level 1 -------------------------------
> KSP Object:(mg_levels_1_)
> type: richardson
> Richardson: damping factor=0.9
> maximum iterations=1
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_1_)
> type: bjacobi
> block Jacobi: number of blocks = 4
> Local solve is same for all blocks, in the following KSP and PC
> objects:
> KSP Object:(mg_levels_1_sub_)
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_1_sub_)
> type: icc
> ICC: 0 levels of fill
> ICC: factor fill ratio allocated 1
> ICC: using Manteuffel shift
> ICC: factor fill ratio needed 0.514899
> Factored matrix follows
> Matrix Object:
> type=seqsbaij, rows=2813, cols=2813
> total: nonzeros=48609, allocated nonzeros=48609
> block size is 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=seqaij, rows=2813, cols=2813
> total: nonzeros=94405, allocated nonzeros=94405
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=10654, cols=10654
> total: nonzeros=376634, allocated nonzeros=376634
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 2 -------------------------------
> KSP Object:(mg_levels_2_)
> type: richardson
> Richardson: damping factor=0.9
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_2_)
> type: bjacobi
> block Jacobi: number of blocks = 4
> Local solve is same for all blocks, in the following KSP and PC
> objects:
> KSP Object:(mg_levels_2_sub_)
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_2_sub_)
> type: icc
> ICC: 0 levels of fill
> ICC: factor fill ratio allocated 1
> ICC: using Manteuffel shift
> ICC: factor fill ratio needed 0.519045
> Factored matrix follows
> Matrix Object:
> type=seqsbaij, rows=101164, cols=101164
> total: nonzeros=1378558, allocated nonzeros=1378558
> block size is 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=seqaij, rows=101164, cols=101164
> total: nonzeros=2655952, allocated nonzeros=5159364
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=411866, cols=411866
> total: nonzeros=10941434, allocated nonzeros=42010332
> not using I-node (on process 0) routines
> Up solver (post-smoother) on level 2 -------------------------------
> KSP Object:(mg_levels_2_)
> type: richardson
> Richardson: damping factor=0.9
> maximum iterations=1
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_2_)
> type: bjacobi
> block Jacobi: number of blocks = 4
> Local solve is same for all blocks, in the following KSP and PC
> objects:
> KSP Object:(mg_levels_2_sub_)
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_2_sub_)
> type: icc
> ICC: 0 levels of fill
> ICC: factor fill ratio allocated 1
> ICC: using Manteuffel shift
> ICC: factor fill ratio needed 0.519045
> Factored matrix follows
> Matrix Object:
> type=seqsbaij, rows=101164, cols=101164
> total: nonzeros=1378558, allocated nonzeros=1378558
> block size is 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=seqaij, rows=101164, cols=101164
> total: nonzeros=2655952, allocated nonzeros=5159364
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=411866, cols=411866
> total: nonzeros=10941434, allocated nonzeros=42010332
> not using I-node (on process 0) routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=411866, cols=411866
> total: nonzeros=10941434, allocated nonzeros=42010332
> not using I-node (on process 0) routines
>
> %%%%%%%%%%%%%%%%%%%%%%
> AMG w/ SOR x4 ksp_view
> %%%%%%%%%%%%%%%%%%%%%%
>
> KSP Object:
> type: cg
> maximum iterations=10000
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:
> type: ml
> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1,
> post-smooths=1
> Coarse gride solver -- level 0 -------------------------------
> KSP Object:(mg_coarse_)
> type: preonly
> maximum iterations=1, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_coarse_)
> type: redundant
> Redundant preconditioner: First (color=0) of 4 PCs follows
> KSP Object:(mg_coarse_redundant_)
> type: preonly
> maximum iterations=10000, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_coarse_redundant_)
> type: lu
> LU: out-of-place factorization
> matrix ordering: nd
> LU: tolerance for zero pivot 1e-12
> LU: factor fill ratio needed 2.17227
> Factored matrix follows
> Matrix Object:
> type=seqaij, rows=283, cols=283
> total: nonzeros=21651, allocated nonzeros=21651
> using I-node routines: found 186 nodes, limit used is
> 5
> linear system matrix = precond matrix:
> Matrix Object:
> type=seqaij, rows=283, cols=283
> total: nonzeros=9967, allocated nonzeros=14150
> not using I-node routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=283, cols=283
> total: nonzeros=9967, allocated nonzeros=9967
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 1 -------------------------------
> KSP Object:(mg_levels_1_)
> type: richardson
> Richardson: damping factor=1
> maximum iterations=4, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_1_)
> type: sor
> SOR: type = local_symmetric, iterations = 1, omega = 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=10654, cols=10654
> total: nonzeros=376634, allocated nonzeros=376634
> not using I-node (on process 0) routines
> Up solver (post-smoother) on level 1 -------------------------------
> KSP Object:(mg_levels_1_)
> type: richardson
> Richardson: damping factor=1
> maximum iterations=4
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_1_)
> type: sor
> SOR: type = local_symmetric, iterations = 1, omega = 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=10654, cols=10654
> total: nonzeros=376634, allocated nonzeros=376634
> not using I-node (on process 0) routines
> Down solver (pre-smoother) on level 2 -------------------------------
> KSP Object:(mg_levels_2_)
> type: richardson
> Richardson: damping factor=1
> maximum iterations=4, initial guess is zero
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_2_)
> type: sor
> SOR: type = local_symmetric, iterations = 1, omega = 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=411866, cols=411866
> total: nonzeros=10941434, allocated nonzeros=42010332
> not using I-node (on process 0) routines
> Up solver (post-smoother) on level 2 -------------------------------
> KSP Object:(mg_levels_2_)
> type: richardson
> Richardson: damping factor=1
> maximum iterations=4
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
> left preconditioning
> PC Object:(mg_levels_2_)
> type: sor
> SOR: type = local_symmetric, iterations = 1, omega = 1
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=411866, cols=411866
> total: nonzeros=10941434, allocated nonzeros=42010332
> not using I-node (on process 0) routines
> linear system matrix = precond matrix:
> Matrix Object:
> type=mpiaij, rows=411866, cols=411866
> total: nonzeros=10941434, allocated nonzeros=42010332
> not using I-node (on process 0) routines
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
More information about the petsc-users
mailing list