[petsc-dev] [petsc-users] Problem with AMG packages
Mark F. Adams
mfadams at lbl.gov
Tue Oct 8 22:42:08 CDT 2013
On Oct 8, 2013, at 8:18 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> MatView 6 1.0 3.4042e+01269.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 18 0 0 0 0 25 0 0 0 0 0
>
> Something is seriously wrong with the default matview (or pcview) for PC GAMG? It is printing way to much for the default view and thus totally hosing the timings. The default PCView() is suppose to be very light weight (not do excessive communication) and provide very high level information.
>
Oh, I think the problem is that GAMG sets the coarse grid solver explicitly as a block jacobi with LU local. GAMG insures that all equation are on one PE for the coarsest grid. ML uses redundant. You should be able to use redundant in GAMG, it is just not the default. This is not tested. So I'm guessing the problem is that block Jacobi is noisy.
>
> Barry
>
>
> On Oct 8, 2013, at 6:50 PM, "Mark F. Adams" <mfadams at lbl.gov> wrote:
>
>> Something is going terrible wrong with the setup in hypre and ML. hypre's default parameters are not setup well for 3D. I use:
>>
>> -pc_hypre_boomeramg_no_CF
>> -pc_hypre_boomeramg_agg_nl 1
>> -pc_hypre_boomeramg_coarsen_type HMIS
>> -pc_hypre_boomeramg_interp_type ext+i
>>
>> I'm not sure what is going wrong with ML's setup.
>>
>> GAMG is converging terribly. Is this just a simple 7-point Laplacian? It looks like you the eigen estimate is low on the finest grid, which messes up the smoother. Try running with these parameters and send the output:
>>
>> -pc_gamg_agg_nsmooths 1
>> -pc_gamg_verbose 2
>> -mg_levels_ksp_type richardson
>> -mg_levels_pc_type sor
>>
>> Mark
>>
>> On Oct 8, 2013, at 5:46 PM, Pierre Jolivet <jolivet at ann.jussieu.fr> wrote:
>>
>>> Please find the log for BoomerAMG, ML and GAMG attached. The set up for
>>> GAMG doesn't look so bad compared to the other packages, so I'm wondering
>>> what is going on with those ?
>>>
>>>>
>>>> We need the output from running with -log_summary -pc_mg_log
>>>>
>>>> Also you can run with PETSc's AMG called GAMG (run with -pc_type gamg)
>>>> This will give the most useful information about where it is spending
>>>> the time.
>>>>
>>>>
>>>> Barry
>>>>
>>>>
>>>> On Oct 8, 2013, at 4:11 PM, Pierre Jolivet <jolivet at ann.jussieu.fr> wrote:
>>>>
>>>>> Dear all,
>>>>> I'm trying to compare linear solvers for a simple Poisson equation in
>>>>> 3D.
>>>>> I thought that MG was the way to go, but looking at my log, the
>>>>> performance looks abysmal (I know that the matrices are way too small
>>>>> but
>>>>> if I go bigger, it just never performs a single iteration ..). Even
>>>>> though
>>>>> this is neither the BoomerAMG nor the ML mailing list, could you please
>>>>> tell me if PETSc sets some default flags that make the setup for those
>>>>> solvers so slow for this simple problem ? The performance of (G)ASM is
>>>>> in
>>>>> comparison much better.
>>>>>
>>>>> Thanks in advance for your help.
>>>>>
>>>>> PS: first the BoomerAMG log, then ML (much more verbose, sorry).
>>>>>
>>>>> 0 KSP Residual norm 1.599647112604e+00
>>>>> 1 KSP Residual norm 5.450838232404e-02
>>>>> 2 KSP Residual norm 3.549673478318e-03
>>>>> 3 KSP Residual norm 2.901826808841e-04
>>>>> 4 KSP Residual norm 2.574235778729e-05
>>>>> 5 KSP Residual norm 2.253410171682e-06
>>>>> 6 KSP Residual norm 1.871067784877e-07
>>>>> 7 KSP Residual norm 1.681162800670e-08
>>>>> 8 KSP Residual norm 2.120841512414e-09
>>>>> KSP Object: 2048 MPI processes
>>>>> type: gmres
>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>>>> Orthogonalization with no iterative refinement
>>>>> GMRES: happy breakdown tolerance 1e-30
>>>>> maximum iterations=200, initial guess is zero
>>>>> tolerances: relative=1e-08, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using PRECONDITIONED norm type for convergence test
>>>>> PC Object: 2048 MPI processes
>>>>> type: hypre
>>>>> HYPRE BoomerAMG preconditioning
>>>>> HYPRE BoomerAMG: Cycle type V
>>>>> HYPRE BoomerAMG: Maximum number of levels 25
>>>>> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>>>>> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
>>>>> HYPRE BoomerAMG: Threshold for strong coupling 0.25
>>>>> HYPRE BoomerAMG: Interpolation truncation factor 0
>>>>> HYPRE BoomerAMG: Interpolation: max elements per row 0
>>>>> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>>>>> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>>>>> HYPRE BoomerAMG: Maximum row sums 0.9
>>>>> HYPRE BoomerAMG: Sweeps down 1
>>>>> HYPRE BoomerAMG: Sweeps up 1
>>>>> HYPRE BoomerAMG: Sweeps on coarse 1
>>>>> HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi
>>>>> HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi
>>>>> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination
>>>>> HYPRE BoomerAMG: Relax weight (all) 1
>>>>> HYPRE BoomerAMG: Outer relax weight (all) 1
>>>>> HYPRE BoomerAMG: Using CF-relaxation
>>>>> HYPRE BoomerAMG: Measure type local
>>>>> HYPRE BoomerAMG: Coarsen type Falgout
>>>>> HYPRE BoomerAMG: Interpolation type classical
>>>>> linear system matrix = precond matrix:
>>>>> Matrix Object: 2048 MPI processes
>>>>> type: mpiaij
>>>>> rows=4173281, cols=4173281
>>>>> total: nonzeros=102576661, allocated nonzeros=102576661
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> not using I-node (on process 0) routines
>>>>> --- system solved with PETSc (in 1.005199e+02 seconds)
>>>>>
>>>>> 0 KSP Residual norm 2.368804472986e-01
>>>>> 1 KSP Residual norm 5.676430019132e-02
>>>>> 2 KSP Residual norm 1.898005876002e-02
>>>>> 3 KSP Residual norm 6.193922902926e-03
>>>>> 4 KSP Residual norm 2.008448794493e-03
>>>>> 5 KSP Residual norm 6.390465670228e-04
>>>>> 6 KSP Residual norm 2.157709394389e-04
>>>>> 7 KSP Residual norm 7.295973819979e-05
>>>>> 8 KSP Residual norm 2.358343271482e-05
>>>>> 9 KSP Residual norm 7.489696222066e-06
>>>>> 10 KSP Residual norm 2.390946857593e-06
>>>>> 11 KSP Residual norm 8.068086385140e-07
>>>>> 12 KSP Residual norm 2.706607789749e-07
>>>>> 13 KSP Residual norm 8.636910863376e-08
>>>>> 14 KSP Residual norm 2.761981175852e-08
>>>>> 15 KSP Residual norm 8.755459874369e-09
>>>>> 16 KSP Residual norm 2.708848598341e-09
>>>>> 17 KSP Residual norm 8.968748876265e-10
>>>>> KSP Object: 2048 MPI processes
>>>>> type: gmres
>>>>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>>>>> Orthogonalization with no iterative refinement
>>>>> GMRES: happy breakdown tolerance 1e-30
>>>>> maximum iterations=200, initial guess is zero
>>>>> tolerances: relative=1e-08, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using PRECONDITIONED norm type for convergence test
>>>>> PC Object: 2048 MPI processes
>>>>> type: ml
>>>>> MG: type is MULTIPLICATIVE, levels=3 cycles=v
>>>>> Cycles per PCApply=1
>>>>> Using Galerkin computed coarse grid matrices
>>>>> Coarse grid solver -- level -------------------------------
>>>>> KSP Object: (mg_coarse_) 2048 MPI processes
>>>>> type: preonly
>>>>> maximum iterations=1, initial guess is zero
>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using NONE norm type for convergence test
>>>>> PC Object: (mg_coarse_) 2048 MPI processes
>>>>> type: redundant
>>>>> Redundant preconditioner: First (color=0) of 2048 PCs follows
>>>>> KSP Object: (mg_coarse_redundant_) 1 MPI processes
>>>>> type: preonly
>>>>> maximum iterations=10000, initial guess is zero
>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using NONE norm type for convergence test
>>>>> PC Object: (mg_coarse_redundant_) 1 MPI processes
>>>>> type: lu
>>>>> LU: out-of-place factorization
>>>>> tolerance for zero pivot 2.22045e-14
>>>>> using diagonal shift on blocks to prevent zero pivot
>>>>> matrix ordering: nd
>>>>> factor fill ratio given 5, needed 4.38504
>>>>> Factored matrix follows:
>>>>> Matrix Object: 1 MPI processes
>>>>> type: seqaij
>>>>> rows=2055, cols=2055
>>>>> package used to perform factorization: petsc
>>>>> total: nonzeros=2476747, allocated nonzeros=2476747
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> using I-node routines: found 1638 nodes, limit used is
>>>>> 5
>>>>> linear system matrix = precond matrix:
>>>>> Matrix Object: 1 MPI processes
>>>>> type: seqaij
>>>>> rows=2055, cols=2055
>>>>> total: nonzeros=564817, allocated nonzeros=1093260
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> not using I-node routines
>>>>> linear system matrix = precond matrix:
>>>>> Matrix Object: 2048 MPI processes
>>>>> type: mpiaij
>>>>> rows=2055, cols=2055
>>>>> total: nonzeros=564817, allocated nonzeros=564817
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> not using I-node (on process 0) routines
>>>>> Down solver (pre-smoother) on level 1 -------------------------------
>>>>> KSP Object: (mg_levels_1_) 2048 MPI processes
>>>>> type: richardson
>>>>> Richardson: damping factor=1
>>>>> maximum iterations=2
>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using nonzero initial guess
>>>>> using NONE norm type for convergence test
>>>>> PC Object: (mg_levels_1_) 2048 MPI processes
>>>>> type: sor
>>>>> SOR: type = local_symmetric, iterations = 1, local iterations =
>>>>> 1,
>>>>> omega = 1
>>>>> linear system matrix = precond matrix:
>>>>> Matrix Object: 2048 MPI processes
>>>>> type: mpiaij
>>>>> rows=30194, cols=30194
>>>>> total: nonzeros=3368414, allocated nonzeros=3368414
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> not using I-node (on process 0) routines
>>>>> Up solver (post-smoother) same as down solver (pre-smoother)
>>>>> Down solver (pre-smoother) on level 2 -------------------------------
>>>>> KSP Object: (mg_levels_2_) 2048 MPI processes
>>>>> type: richardson
>>>>> Richardson: damping factor=1
>>>>> maximum iterations=2
>>>>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>>>>> left preconditioning
>>>>> using nonzero initial guess
>>>>> using NONE norm type for convergence test
>>>>> PC Object: (mg_levels_2_) 2048 MPI processes
>>>>> type: sor
>>>>> SOR: type = local_symmetric, iterations = 1, local iterations =
>>>>> 1,
>>>>> omega = 1
>>>>> linear system matrix = precond matrix:
>>>>> Matrix Object: 2048 MPI processes
>>>>> type: mpiaij
>>>>> rows=531441, cols=531441
>>>>> total: nonzeros=12476324, allocated nonzeros=12476324
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> not using I-node (on process 0) routines
>>>>> Up solver (post-smoother) same as down solver (pre-smoother)
>>>>> linear system matrix = precond matrix:
>>>>> Matrix Object: 2048 MPI processes
>>>>> type: mpiaij
>>>>> rows=531441, cols=531441
>>>>> total: nonzeros=12476324, allocated nonzeros=12476324
>>>>> total number of mallocs used during MatSetValues calls =0
>>>>> not using I-node (on process 0) routines
>>>>> --- system solved with PETSc (in 2.407844e+02 seconds)
>>>>>
>>>>>
>>>>
>>>>
>>> <log-GAMG><log-ML><log-BoomerAMG>
>>
>
More information about the petsc-dev
mailing list