PETSc CG solver uses more iterations than other CG solver
Knut Erik Teigen
knutert at stud.ntnu.no
Tue Mar 13 04:04:36 CDT 2007
On Sat, 2007-03-10 at 21:39 +0100, Berend van Wachem wrote:
> Hi,
>
> I have also used the VOF/level set model in our PETSc-based code, and it does
> require a lot more iterations than a single-phase model because of the
> stencil. I guess you are solving
>
> d/dx ( 1/rho d/dx p ) = 0
>
> ? That can result in a terrible coefficient structure.
That's the equation I'm solving, yes.
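For reference, written out, the pressure equation from the projection step is
(a sketch in LaTeX notation; u* is the intermediate velocity from the explicit
Runge-Kutta step, \Delta t the time step, and Berend wrote the homogeneous form
above):

    \nabla \cdot \left( \frac{1}{\rho} \, \nabla p \right)
        = \frac{1}{\Delta t} \, \nabla \cdot \mathbf{u}^*

Since rho jumps by roughly three orders of magnitude across a gas-liquid
interface, the 1/rho coefficients do the same, which is presumably what makes
this system so much harder for CG/ICC than the constant-coefficient Poisson
equation.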
>
> Are you sure the solver and preconditioner you use with PETSc are the same
> as in your other code?
Both use CG with an incomplete Cholesky preconditioner. I don't know enough
about the solvers to compare the source code, so the implementations might
differ, but I would expect the PETSc solver to be at least as efficient as
the other one. The performance is equal when solving a simpler Poisson
equation.
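For reference, the PETSc side is set up roughly as follows (a simplified
sketch against the PETSc 2.3.x API, not the actual code; A, b and x are
assumed to be the assembled matrix, right-hand side and solution vector,
and error checking is omitted):

    /* CG with ICC(0), matching the tolerances shown in the -ksp_view output */
    KSP ksp;
    PC  pc;
    KSPCreate(PETSC_COMM_SELF, &ksp);
    KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);
    KSPSetType(ksp, KSPCG);
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCICC);                        /* ICC with 0 levels of fill by default */
    KSPSetTolerances(ksp, 1e-6, 1e-50, 1e4, 10000);
    KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);  /* reuse the previous pressure field */
    KSPSetFromOptions(ksp);                      /* so -ksp_* / -pc_* options still apply */
    KSPSolve(ksp, b, x);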
>
> I think that is an awful lot of iterations for the problem size! Are
> you sure the coefficients are put in the right locations in the matrix?
>
I've compared the matrices and right-hand sides for the two solvers, and
they are equal.
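For anyone who wants to repeat the comparison, one way to dump the PETSc
matrix and right-hand side to a file is roughly the following (a minimal
sketch against PETSc 2.3.x; the file name is arbitrary):

    /* Write A and b in MATLAB-readable ASCII form for entry-by-entry
       comparison with the other solver (error checking omitted). */
    PetscViewer viewer;
    PetscViewerASCIIOpen(PETSC_COMM_SELF, "poisson_system.m", &viewer);
    PetscViewerSetFormat(viewer, PETSC_VIEWER_ASCII_MATLAB);
    MatView(A, viewer);   /* system matrix */
    VecView(b, viewer);   /* right-hand side */
    PetscViewerDestroy(viewer);   /* later PETSc versions take &viewer here */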
-Knut Erik-
> Berend.
>
> >
> > One boundary cell is defined to have constant pressure, since that
> > makes the equation system have a unique solution. I tried your
> > command, and it lowered the number of iterations for most of the time
> > steps, but for some it reached the maximum number of iterations
> > (10000) without converging.
> >
> > > I also tried making all the boundaries Neumann and using your
> > > command. That made the number of iterations more constant: instead of
> > > varying between 700 and 2000, it stayed at around 1200. But it
> > > actually increased the average number of iterations somewhat, still
> > > far from the performance of the other solver.
> > > I've also checked the convergence criterion, and it is the same for
> > > both solvers.
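One thing worth noting here: with left preconditioning (the default, as the
-ksp_view output below shows), PETSc's CG tests convergence on the
preconditioned residual norm, while many stand-alone CG codes test the true
residual ||b - Ax||, so identical tolerances do not necessarily mean the same
stopping point. A sketch of how to make the comparison on the true residual
(assuming recent PETSc naming; the 2.3.x releases used slightly different
enum and option names):

    /* Make CG monitor and test the unpreconditioned residual norm,
       so the stopping criterion is comparable to an external CG solver. */
    KSPSetNormType(ksp, KSP_NORM_UNPRECONDITIONED);

Running with -ksp_monitor_true_residual (in recent versions) prints both the
preconditioned and the true residual norms at every iteration.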
> >
> > > Quoting Lisandro Dalcin <dalcinl at gmail.com>:
> > > On 3/9/07, Knut Erik Teigen <knutert at stud.ntnu.no> wrote:
> > >> To solve the Navier-Stokes equations, I use an explicit Runge-Kutta
> > >> method with Chorin's projection method,
> > >> so a Poisson equation with Neumann boundary conditions for the
> > >> pressure has to be solved at every time step.
> > >
> > > Are all of the boundary conditions of Neumann type? If so, please try
> > > running your program with the following command-line option:
> > >
> > > -ksp_constant_null_space
> > >
> > > and let me know if this corrected your problem.
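(For reference, the in-code equivalent of that command line option is roughly
the following sketch, assuming a KSP object named ksp and the PETSc 2.3.x-era
calls; newer releases attach the null space to the matrix with
MatSetNullSpace() instead and pass pointers to the destroy routine.)

    /* Tell the Krylov solver that the all-Neumann pressure Poisson problem
       has the constant vector in its null space (error checking omitted). */
    MatNullSpace nullsp;
    MatNullSpaceCreate(PETSC_COMM_SELF, PETSC_TRUE, 0, PETSC_NULL, &nullsp);
    KSPSetNullSpace(ksp, nullsp);
    MatNullSpaceDestroy(nullsp);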
> > >
> > >
> > >> The equation system is
> > >> positive definite, so I use the CG solver with the ICC preconditioner.
> > >> The problem is that the PETSc solver seems to need a lot more iterations
> > >> to reach the solution than another CG solver I'm using. On a small test
> > >> problem (a rising bubble) with a 60x40 grid, the PETSc solver needs over
> > >> 1000 iterations on average, while the other solver needs less than 100.
> > >> I am using KSPSetInitialGuessNonzero; without it, the number of
> > >> iterations is even higher.
> > >> I have also tried applying PETSc to a similar problem, solving the
> > >> Poisson equation with Neumann boundaries and a forcing function of
> > >> f = sin(pi*x) + sin(pi*y). For this problem, the number of iterations is
> > >> almost exactly the same for PETSc and the other solver.
> > >>
> > >> Does anyone know what the problem might be? Any help is greatly
> > >> appreciated. I've included the -ksp_view output for one of the time steps
> > >> and the -log_summary output below.
> > >>
> > >> Regards,
> > >> Knut Erik Teigen
> > >> MSc student
> > >> Norwegian University of Science and Technology
> > >>
> > >> Output from -ksp_view:
> > >>
> > >> KSP Object:
> > >> type: cg
> > >> maximum iterations=10000
> > >> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> > >> left preconditioning
> > >> PC Object:
> > >> type: icc
> > >> ICC: 0 levels of fill
> > >> ICC: factor fill ratio allocated 1
> > >> ICC: factor fill ratio needed 0.601695
> > >> Factored matrix follows
> > >> Matrix Object:
> > >> type=seqsbaij, rows=2400, cols=2400
> > >> total: nonzeros=7100, allocated nonzeros=7100
> > >> block size is 1
> > >> linear system matrix = precond matrix:
> > >> Matrix Object:
> > >> type=seqaij, rows=2400, cols=2400
> > >> total: nonzeros=11800, allocated nonzeros=12000
> > >> not using I-node routines
> > >> Poisson converged after 1403 iterations
> > >>
> > >> Output from -log_summary
> > >>
> > >> ---------------------------------------------- PETSc Performance
> > >> Summary: ----------------------------------------------
> > >>
> > >> ./run on a gcc-ifc-d named iept0415 with 1 processor, by knutert Fri Mar
> > >> 9 17:06:05 2007
> > >> Using Petsc Release Version 2.3.2, Patch 8, Tue Jan 2 14:33:59 PST 2007
> > >> HG revision: ebeddcedcc065e32fc252af32cf1d01ed4fc7a80
> > >>
> > >> Max Max/Min Avg Total
> > >> Time (sec): 5.425e+02 1.00000 5.425e+02
> > >> Objects: 7.000e+02 1.00000 7.000e+02
> > >> Flops: 6.744e+10 1.00000 6.744e+10 6.744e+10
> > >> Flops/sec: 1.243e+08 1.00000 1.243e+08 1.243e+08
> > >> Memory: 4.881e+05 1.00000 4.881e+05
> > >> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
> > >> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
> > >> MPI Reductions: 1.390e+03 1.00000
> > >>
> > >> Flop counting convention: 1 flop = 1 real number operation of type
> > >> (multiply/divide/add/subtract)
> > >> e.g., VecAXPY() for real vectors of length N
> > >> --> 2N flops
> > >> and VecAXPY() for complex vectors of length
> > >> N --> 8N flops
> > >>
> > >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages
> > >> --- -- Message Lengths -- -- Reductions --
> > >> Avg %Total Avg %Total counts %
> > >> Total Avg %Total counts %Total
> > >> 0: Main Stage: 5.4246e+02 100.0% 6.7437e+10 100.0% 0.000e+00
> > >> 0.0% 0.000e+00 0.0% 1.390e+03 100.0%
> > >>
> > >> ------------------------------------------------------------------------
> > >>------------------------------------------------ See the 'Profiling'
> > >> chapter of the users' manual for details on interpreting output.
> > >> Phase summary info:
> > >> Count: number of times phase was executed
> > >> Time and Flops/sec: Max - maximum over all processors
> > >> Ratio - ratio of maximum to minimum over all
> > >> processors
> > >> Mess: number of messages sent
> > >> Avg. len: average message length
> > >> Reduct: number of global reductions
> > >> Global: entire computation
> > >> Stage: stages of a computation. Set stages with PetscLogStagePush()
> > >> and PetscLogStagePop().
> > >> %T - percent time in this phase %F - percent flops in this
> > >> phase
> > >> %M - percent messages in this phase %L - percent message
> > >> lengths in this phase
> > >> %R - percent reductions in this phase
> > >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> > >> over all processors)
> > >> ------------------------------------------------------------------------
> > >>------------------------------------------------
> > >>
> > >>
> > >> ##########################################################
> > >> # #
> > >> # WARNING!!! #
> > >> # #
> > >> # This code was compiled with a debugging option, #
> > >> # To get timing results run config/configure.py #
> > >> # using --with-debugging=no, the performance will #
> > >> # be generally two or three times faster. #
> > >> # #
> > >> ##########################################################
> > >>
> > >>
> > >>
> > >>
> > >> ##########################################################
> > >> # #
> > >> # WARNING!!! #
> > >> # #
> > >> # This code was run without the PreLoadBegin() #
> > >> # macros. To get timing results we always recommend #
> > >> # preloading. otherwise timing numbers may be #
> > >> # meaningless. #
> > >> ##########################################################
> > >>
> > >>
> > >> Event Count Time (sec) Flops/sec
> > >> --- Global --- --- Stage --- Total
> > >> Max Ratio Max Ratio Max Ratio Mess Avg len
> > >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
> > >> ------------------------------------------------------------------------
> > >>------------------------------------------------
> > >>
> > >> --- Event Stage 0: Main Stage
> > >>
> > >> VecDot 1668318 1.0 1.9186e+01 1.0 4.17e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 4 12 0 0 0 4 12 0 0 0 417
> > >> VecNorm 835191 1.0 1.0935e+01 1.0 3.67e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 2 6 0 0 0 2 6 0 0 0 367
> > >> VecCopy 688 1.0 1.6126e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> VecAXPY 1667630 1.0 2.4141e+01 1.0 3.32e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 4 12 0 0 0 4 12 0 0 0 332
> > >> VecAYPX 833815 1.0 1.9062e+01 1.0 2.10e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 4 6 0 0 0 4 6 0 0 0 210
> > >> VecAssemblyBegin 688 1.0 8.5354e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> VecAssemblyEnd 688 1.0 7.8177e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> MatMult 834503 1.0 1.5044e+02 1.0 1.18e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 28 26 0 0 0 28 26 0 0 0 118
> > >> MatSolve 835191 1.0 1.8130e+02 1.0 1.42e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 33 38 0 0 0 33 38 0 0 0 142
> > >> MatCholFctrNum 1 1.0 1.1630e-03 1.0 2.06e+06 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 2
> > >> MatICCFactorSym 1 1.0 2.9588e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> MatAssemblyBegin 688 1.0 2.6343e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> MatAssemblyEnd 688 1.0 1.0964e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> MatGetOrdering 1 1.0 2.6798e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> KSPSetup 1 1.0 2.7585e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > >> 6.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > >> KSPSolve 688 1.0 4.2809e+02 1.0 1.58e+08 1.0 0.0e+00 0.0e+00
> > >> 1.4e+03 79100 0 0100 79100 0 0100 158
> > >> PCSetUp 1 1.0 1.7900e-03 1.0 1.34e+06 1.0 0.0e+00 0.0e+00
> > >> 4.0e+00 0 0 0 0 0 0 0 0 0 0 1
> > >> PCApply 835191 1.0 1.8864e+02 1.0 1.36e+08 1.0 0.0e+00 0.0e+00
> > >> 0.0e+00 35 38 0 0 0 35 38 0 0 0 136
> > >> ------------------------------------------------------------------------
> > >>------------------------------------------------
> > >>
> > >> Memory usage is given in bytes:
> > >>
> > >> Object Type Creations Destructions Memory Descendants'
> > >> Mem.
> > >>
> > >> --- Event Stage 0: Main Stage
> > >>
> > >> Index Set 2 2 19640 0
> > >> Vec 694 693 299376 0
> > >> Matrix 2 2 56400 0
> > >> Krylov Solver 1 1 36 0
> > >> Preconditioner 1 1 108 0
> > >> ========================================================================
> > >>================================================ Average time to get
> > >> PetscTime(): 2.86102e-07
> > >> OptionTable: -ksp_type cg
> > >> OptionTable: -log_summary -ksp_view
> > >> OptionTable: -pc_type icc
> > >> Compiled without FORTRAN kernels
> > >> Compiled with full precision matrices (default)
> > >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4
> > >> sizeof(PetscScalar) 8
> > >> Configure run at: Thu Mar 8 11:54:22 2007
> > >> Configure options: --with-cc=gcc --with-fc=ifort
> > >> --download-f-blas-lapack=1 --download-mpich=1 --with-debugging=1
> > >> --with-shared=0
> > >> -----------------------------------------
> > >> Libraries compiled on Thu Mar 8 12:08:22 CET 2007 on iept0415
> > >> Machine characteristics: Linux iept0415 2.6.16.21-0.8-default #1 Mon Jul
> > >> 3 18:25:39 UTC 2006 i686 i686 i386 GNU/Linux
> > >> Using PETSc directory: /opt/petsc-2.3.2-p8
> > >> Using PETSc arch: gcc-ifc-debug
> > >> -----------------------------------------
> > >> Using C compiler: gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing
> > >> -g3
> > >> Using Fortran compiler: ifort -fPIC -g
> > >> -----------------------------------------
> > >> Using include paths: -I/opt/petsc-2.3.2-p8
> > >> -I/opt/petsc-2.3.2-p8/bmake/gcc-ifc-debug -I/opt/petsc-2.3.2-p8/include
> > >> -I/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/incl
> > >>ude -I/usr/X11R6/include
> > >> ------------------------------------------
> > >> Using C linker: gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3
> > >> Using Fortran linker: ifort -fPIC -g
> > >> Using libraries: -Wl,-rpath,/opt/petsc-2.3.2-p8/lib/gcc-ifc-debug
> > >> -L/opt/petsc-2.3.2-p8/lib/gcc-ifc-debug -lpetscts -lpetscsnes -lpetscksp
> > >> -lpetscdm -lpetscmat -lpetscvec -lpetsc
> > >> -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-d
> > >>ebug/lib
> > >> -L/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/lib
> > >> -lmpich -lnsl -lrt -L/usr/X11R6/lib -lX11
> > >> -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debu
> > >>g -L/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug
> > >> -lflapack
> > >> -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debu
> > >>g -L/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug
> > >> -lfblas -lm -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linu
> > >>x/lib
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -ldl -lgcc_s
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linu
> > >>x/lib -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > >> -Wl,-rpath,/opt/intel/fc/9.1.036/lib
> > >> -L/opt/intel/fc/9.1.036/lib
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../ -lifport -lifcore
> > >> -limf -lm -lipgo -lirc -lirc_s
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linu
> > >>x/lib -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -lm
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linu
> > >>x/lib
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -ldl -lgcc_s -ldl
> > >
> > > --
> > > Lisandro Dalcín
> > > ---------------
> > > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> > > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> > > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> > > PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> > > Tel/Fax: +54-(0)342-451.1594
>
>