[petsc-users] Help with ML/BoomerAMG
Gaetan Kenway
gaetank at gmail.com
Sun Apr 28 18:01:29 CDT 2013
Hello

I am in the process of trying out some of the multigrid functionality in PETSc
and not having much luck. The system I am trying to solve is the adjoint
system of equations resulting from the finite-volume discretization of the
Euler equations on a 147,456-cell mesh, giving a linear system of size
5*147,456 = 737,280 (five conservative variables per cell). All of the tests
are done on a single processor and use petsc-3.2-p7. My current technique for
solving this is to use the following options:
-matload_block_size 5 -mat_type seqbaij -ksp_type gmres -ksp_max_it 100
-ksp_gmres_restart 50 -ksp_monitor -pc_type asm -pc_asm_overlap 1
-sub_pc_factor_mat_ordering_type rcm -sub_pc_factor_levels 1
This results in ~1e-6 convergence in ~50 iterations. Next I naively tried
the following options:
-mat_type seqaij -ksp_type gmres -ksp_max_it 100 -ksp_gmres_restart 50
-ksp_monitor -ksp_view -pc_type ml
The output of -ksp_monitor and -ksp_view is:
0 KSP Residual norm 7.366926114851e+70
1 KSP Residual norm 1.744597669120e+61
KSP Object: 1 MPI processes
type: gmres
GMRES: restart=50, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=100, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: ml
MG: type is MULTIPLICATIVE, levels=5 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=1, cols=1
package used to perform factorization: petsc
total: nonzeros=1, allocated nonzeros=1
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=1, cols=1
total: nonzeros=1, allocated nonzeros=1
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_1_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2, cols=2
total: nonzeros=4, allocated nonzeros=4
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_2_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=93, cols=93
total: nonzeros=3861, allocated nonzeros=3861
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_3_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=8303, cols=8303
total: nonzeros=697606, allocated nonzeros=697606
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) same as down solver (pre-smoother)
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_4_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=737280, cols=737280
total: nonzeros=46288425, allocated nonzeros=46288425
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 147456 nodes, limit used is 5
Up solver (post-smoother) same as down solver (pre-smoother)
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=737280, cols=737280
total: nonzeros=46288425, allocated nonzeros=46288425
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 147456 nodes, limit used is 5
Time: 11.8599750995636
Actual res norm: 0.132693075876745
It stopped after the second iteration because the preconditioned residual
"converged", dropping more than nine orders of magnitude (from 7e70 to 2e61),
far past the 1e-5 relative tolerance. Of course, it didn't actually converge,
and the actual residual remained unchanged. Given that the coarse solver is LU
and all the smoothers are KSPRichardson with 1 iteration and SOR
preconditioning, the preconditioner is a fixed linear operator, so I should be
able to use GMRES. I also tried it with -ksp_type fgmres and all the other
options the same. This time it doesn't blow up, but it also doesn't make
significant progress: the actual residual at the end of 100 iterations is
larger than the true initial residual. (The ksp_view output for this run is
identical except that the KSP type changes to fgmres and right preconditioning
is used.)
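For completeness, that fgmres run corresponds to:

-mat_type seqaij -ksp_type fgmres -ksp_max_it 100 -ksp_gmres_restart 50
-ksp_monitor -ksp_view -pc_type ml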
0 KSP Residual norm 1.326930758772e-01
10 KSP Residual norm 1.326906693969e-01
20 KSP Residual norm 1.326879007861e-01
30 KSP Residual norm 1.326851758083e-01
40 KSP Residual norm 1.326824509983e-01
50 KSP Residual norm 1.333174206652e-01
60 KSP Residual norm 1.279860083890e-01
70 KSP Residual norm 1.227523658429e-01
80 KSP Residual norm 1.181123717224e-01
90 KSP Residual norm 1.139616660987e-01
100 KSP Residual norm 1.102198901480e-01
Time: 67.6706080436707
Actual res norm: 0.443508987516716
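As a sanity check I may also try pinning the smoothers down explicitly and
monitoring the true residual; a sketch of the options, with the per-level
option names inferred from the mg_levels_ prefixes in the ksp_view output
above:

-mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1
-mg_levels_pc_type sor -ksp_monitor_true_residual_norm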
I then started trying some of the ML options. Specifically,
-pc_mg_smoothup 3
-pc_mg_smoothdown 3
Now it returns a floating point exception:
0 KSP Residual norm 1.326930758772e-01
[0]PETSC ERROR: --------------------- Error Message
------------------------------------
[0]PETSC ERROR: Floating point exception!
[0]PETSC ERROR: Infinite or not-a-number generated in norm!
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 3.2.0, Patch 7, Thu Mar 15 09:30:51
CDT 2012
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: ./main on a real-opt named mica by kenway Sun Apr 28
18:32:22 2013
[0]PETSC ERROR: Libraries linked from
/home/kenway/Downloads/petsc-3.2-p7/real-opt/lib
[0]PETSC ERROR: Configure run at Sun Apr 28 15:16:05 2013
[0]PETSC ERROR: Configure options --with-shared-libraries
--download-superlu_dist=yes --download-parmetis=yes
--with-fortran-interfaces=1 --with-debuggig=no -with-scalar-type=real
--PETSC_ARCH=real-opt --download-hypre=yes --download-spai=yes
--download-sundials=yes --download-ml=yes
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: VecNorm() line 167 in src/vec/vec/interface/rvector.c
[0]PETSC ERROR: VecNormalize() line 261 in src/vec/vec/interface/rvector.c
[0]PETSC ERROR: GMREScycle() line 128 in src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: KSPSolve_GMRES() line 231 in src/ksp/ksp/impls/gmres/gmres.c
[0]PETSC ERROR: KSPSolve() line 423 in src/ksp/ksp/interface/itfunc.c
[0]PETSC ERROR: PCMGMCycle_Private() line 55 in src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: PCMGMCycle_Private() line 49 in src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: PCMGMCycle_Private() line 49 in src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: PCMGMCycle_Private() line 49 in src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: PCApply_MG() line 320 in src/ksp/pc/impls/mg/mg.c
[0]PETSC ERROR: PCApply() line 383 in src/ksp/pc/interface/precon.c
[0]PETSC ERROR: FGMREScycle() line 174 in
src/ksp/ksp/impls/gmres/fgmres/fgmres.c
[0]PETSC ERROR: KSPSolve_FGMRES() line 299 in
src/ksp/ksp/impls/gmres/fgmres/fgmres.c
[0]PETSC ERROR: KSPSolve() line 423 in src/ksp/ksp/interface/itfunc.c
Time: 20.1171851158142
Actual res norm: 0.132693075877161
So I tried running with
-pc_mg_smoothup 1
-pc_mg_smoothdown 1
This runs, but somehow it does not result in the same sequence of KSP objects
on the various levels as before.
0 KSP Residual norm 1.326930758772e-01
10 KSP Residual norm 1.310779571550e-01
20 KSP Residual norm 1.290623886203e-01
30 KSP Residual norm 1.271370286802e-01
40 KSP Residual norm 1.252953431171e-01
50 KSP Residual norm 3.224565651498e-01
60 KSP Residual norm 2.974317106914e-01
70 KSP Residual norm 2.708573182063e-01
80 KSP Residual norm 2.503317725548e-01
90 KSP Residual norm 2.338612668565e-01
100 KSP Residual norm 2.202653375200e-01
KSP Object: 1 MPI processes
type: fgmres
GMRES: restart=50, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=100, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
right preconditioning
using UNPRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
type: ml
MG: type is MULTIPLICATIVE, levels=5 cycles=v
Cycles per PCApply=1
Using Galerkin computed coarse grid matrices
Coarse grid solver -- level -------------------------------
KSP Object: (mg_coarse_) 1 MPI processes
type: preonly
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using NONE norm type for convergence test
PC Object: (mg_coarse_) 1 MPI processes
type: lu
LU: out-of-place factorization
tolerance for zero pivot 1e-12
matrix ordering: nd
factor fill ratio given 5, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=1, cols=1
package used to perform factorization: petsc
total: nonzeros=1, allocated nonzeros=1
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=1, cols=1
total: nonzeros=1, allocated nonzeros=1
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Down solver (pre-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_1_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2, cols=2
total: nonzeros=4, allocated nonzeros=4
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1 nodes, limit used is 5
Up solver (post-smoother) on level 1 -------------------------------
KSP Object: (mg_levels_1_) 1 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_1_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 1e-12
using diagonal shift to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=2, cols=2
package used to perform factorization: petsc
total: nonzeros=4, allocated nonzeros=4
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=2, cols=2
total: nonzeros=4, allocated nonzeros=4
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 1 nodes, limit used is 5
Down solver (pre-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_2_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=93, cols=93
total: nonzeros=3861, allocated nonzeros=3861
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) on level 2 -------------------------------
KSP Object: (mg_levels_2_) 1 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_2_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 1e-12
using diagonal shift to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=93, cols=93
package used to perform factorization: petsc
total: nonzeros=3861, allocated nonzeros=3861
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=93, cols=93
total: nonzeros=3861, allocated nonzeros=3861
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Down solver (pre-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_3_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=8303, cols=8303
total: nonzeros=697606, allocated nonzeros=697606
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Up solver (post-smoother) on level 3 -------------------------------
KSP Object: (mg_levels_3_) 1 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_3_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 1e-12
using diagonal shift to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=8303, cols=8303
package used to perform factorization: petsc
total: nonzeros=697606, allocated nonzeros=697606
total number of mallocs used during MatSetValues calls =0
not using I-node routines
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=8303, cols=8303
total: nonzeros=697606, allocated nonzeros=697606
total number of mallocs used during MatSetValues calls =0
not using I-node routines
Down solver (pre-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 1 MPI processes
type: richardson
Richardson: damping factor=1
maximum iterations=1, initial guess is zero
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_4_) 1 MPI processes
type: sor
SOR: type = local_symmetric, iterations = 1, local iterations = 1,
omega = 1
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=737280, cols=737280
total: nonzeros=46288425, allocated nonzeros=46288425
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 147456 nodes, limit used is 5
Up solver (post-smoother) on level 4 -------------------------------
KSP Object: (mg_levels_4_) 1 MPI processes
type: gmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
maximum iterations=1
tolerances: relative=1e-05, absolute=1e-50, divergence=10000
left preconditioning
using nonzero initial guess
using PRECONDITIONED norm type for convergence test
PC Object: (mg_levels_4_) 1 MPI processes
type: ilu
ILU: out-of-place factorization
0 levels of fill
tolerance for zero pivot 1e-12
using diagonal shift to prevent zero pivot
matrix ordering: natural
factor fill ratio given 1, needed 1
Factored matrix follows:
Matrix Object: 1 MPI processes
type: seqaij
rows=737280, cols=737280
package used to perform factorization: petsc
total: nonzeros=46288425, allocated nonzeros=46288425
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 147456 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=737280, cols=737280
total: nonzeros=46288425, allocated nonzeros=46288425
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 147456 nodes, limit used is 5
linear system matrix = precond matrix:
Matrix Object: 1 MPI processes
type: seqaij
rows=737280, cols=737280
total: nonzeros=46288425, allocated nonzeros=46288425
total number of mallocs used during MatSetValues calls =0
using I-node routines: found 147456 nodes, limit used is 5
Time: 118.043510913849
Actual res norm: 1.11054776641920
Now the down solver (pre-smoother) is still KSPRichardson, but the up solvers
(post-smoothers) have turned into a default GMRES/ILU solver; presumably
setting -pc_mg_smoothup and -pc_mg_smoothdown separately forces distinct up
and down smoothers to be created, with the up smoothers getting the defaults.
In either case, this didn't help the convergence, and the final residual is
larger than the initial one.

I have also experimented with BoomerAMG, and have run into similar problems:
none of the various options I tried produced a preconditioner that is
significantly better than not using any preconditioner at all.
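For reference, the BoomerAMG runs were of the following form; the particular
parameter values shown are just one illustrative combination among those I
varied:

-ksp_type fgmres -ksp_max_it 100 -ksp_monitor -pc_type hypre
-pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.5
-pc_hypre_boomeramg_coarsen_type Falgout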
Any suggestions would be greatly appreciated.
Thank you,
Gaetan Kenway
The complete source code listing is below:
program main
  ! Test different PETSc solution techniques on the adjoint system
  implicit none
#include "include/finclude/petsc.h"
  KSP ksp
  Vec RHS, x, res
  Mat A
  PetscViewer fd
  integer :: ierr, rank
  real*8 :: timeA, timeB, err_nrm

  ! Initialize PETSc; the options listed above are read from the
  ! file 'petsc_options'
  call PetscInitialize('petsc_options', ierr)
  call MPI_Comm_rank(PETSC_COMM_WORLD, rank, ierr)

  ! Load the matrix
  call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "drdw.bin", FILE_MODE_READ, &
       fd, ierr)
  call MatCreate(PETSC_COMM_WORLD, A, ierr)
  call MatSetFromOptions(A, ierr)
  call MatLoad(A, fd, ierr)
  call PetscViewerDestroy(fd, ierr)

  ! Create solution and residual vectors compatible with A
  call MatGetVecs(A, RHS, x, ierr)
  call VecDuplicate(x, res, ierr)

  ! Load the RHS
  call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "didw.bin", FILE_MODE_READ, &
       fd, ierr)
  call VecSetFromOptions(RHS, ierr)
  call VecLoad(RHS, fd, ierr)
  call PetscViewerDestroy(fd, ierr)

  ! Create the KSP object
  call KSPCreate(PETSC_COMM_WORLD, ksp, ierr)
  call KSPSetFromOptions(ksp, ierr)
  call KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN, ierr)

  ! Solve the system and time it
  timeA = MPI_Wtime()
  call KSPSolve(ksp, RHS, x, ierr)
  timeB = MPI_Wtime()
  if (rank == 0) then
     print *, 'Time:', timeB - timeA
  end if

  ! Check the actual (unpreconditioned) residual: res = A*x - RHS
  call MatMult(A, x, res, ierr)
  call VecAXPY(res, -1.0_8, RHS, ierr)
  call VecNorm(res, NORM_2, err_nrm, ierr)
  if (rank == 0) then
     print *, 'Actual res norm:', err_nrm
  end if

  ! Destroy objects and finalize
  call KSPDestroy(ksp, ierr)
  call MatDestroy(A, ierr)
  call VecDestroy(RHS, ierr)
  call VecDestroy(x, ierr)
  call VecDestroy(res, ierr)
  call PetscFinalize(ierr)
end program main
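All of the option sets shown above are placed in the petsc_options file that
PetscInitialize reads, and the driver is run on one processor as simply

./main

with drdw.bin and didw.bin in the working directory.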