PETSc CG solver uses more iterations than other CG solver
Matthew Knepley
knepley at gmail.com
Tue Mar 13 07:39:49 CDT 2007
On 3/13/07, Knut Erik Teigen <knutert at stud.ntnu.no> wrote:
> On Sat, 2007-03-10 at 11:55 -0600, Matthew Knepley wrote:
> > Something is obviously wrong. CG has a simple definition and so does ICC(0).
> > So I would suggest
> >
> > 1) Make sure you are comparing apples to apples. Usually, the other solver
> > will have something you have not turned on, like scaling.
> >
> > 2) Compare the iterates exactly. I do this all the time, and it is
> > the best way
> > to compare differences.
>
> Is there an easy way to do this with PETSc, or do I have to add print
> statements to the source code and recompile?
Here is how I might do it:
a) Compare the solver configuration (-ksp_view). Are they both using
symmetric preconditioning, or is one using left preconditioning, etc.?
b) Compare the residual at each iterate (-ksp_monitor)
c) Compare the iterates themselves. You can register your own monitor routine
http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPSetMonitor.html
and then get the current residual
http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPBuildResidual.html
and output it with VecView().
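
For c), here is a minimal sketch of such a monitor in C, written against the
2.3.x calls linked above (the routine name is just a placeholder, and
VecDestroy/KSPSetMonitor are spelled slightly differently in later releases):

  #include "petscksp.h"

  /* Dump the current residual vector at every iterate so the iterates of
     the two solvers can be compared entry by entry. */
  PetscErrorCode DumpResidual(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
  {
    Vec            r;
    PetscErrorCode ierr;

    /* PETSC_NULL lets KSPBuildResidual create the work and result vectors */
    ierr = KSPBuildResidual(ksp, PETSC_NULL, PETSC_NULL, &r);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "iterate %d  rnorm %g\n", (int)it, (double)rnorm);CHKERRQ(ierr);
    ierr = VecView(r, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    ierr = VecDestroy(r);CHKERRQ(ierr);  /* destroy the vector created above */
    return 0;
  }

Register it before KSPSolve() with

  ierr = KSPSetMonitor(ksp, DumpResidual, PETSC_NULL, PETSC_NULL);CHKERRQ(ierr);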
Matt
> -Knut Erik-
> >
> > Matt
> >
> > On 3/10/07, knutert at stud.ntnu.no <knutert at stud.ntnu.no> wrote:
> > > Thank you for your reply.
> > >
> > > One boundary cell is defined to have constant pressure, since that
> > > makes the equation system have a unique solution. I tried your
> > > command, and it lowered the number of iterations for most of the time
> > > steps, but for some it reached the maximum number of iterations
> > > (10000) without converging.
> > >
> > > I also tried making all the boundaries Neumann and using your
> > > command. That made the number of iterations more consistent: instead of
> > > varying between 700 and 2000, it stayed at around 1200. But it
> > > actually increased the average number of iterations somewhat, so it is
> > > still far from the performance of the other solver.
> > > I've also checked the convergence criterion, and it is the same for
> > > both solvers.
> > >
> > >
> > > Quoting Lisandro Dalcin <dalcinl at gmail.com>:
> > >
> > > > On 3/9/07, Knut Erik Teigen <knutert at stud.ntnu.no> wrote:
> > > >> To solve the Navier-Stokes equations, I use an explicit Runge-Kutta
> > > >> method with Chorin's projection method,
> > > >> so a Poisson equation with Neumann boundary conditions for the
> > > >> pressure has to be solved at every time-step.
> > > >
> > > > All boundary conditions are Neumann type? In that case, please try to
> > > > run your program with the following command line option:
> > > >
> > > > -ksp_constant_null_space
> > > >
> > > > and let me know if this corrected your problem.
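
For reference, the same thing can be set up in code rather than from the
command line; a minimal sketch, assuming the PETSc 2.3.x C interface (recent
releases attach the null space to the matrix with MatSetNullSpace instead):

  MatNullSpace   nullsp;
  PetscErrorCode ierr;

  /* declare that the constant vector spans the null space of the operator */
  ierr = MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE, 0, PETSC_NULL, &nullsp);CHKERRQ(ierr);
  ierr = KSPSetNullSpace(ksp, nullsp);CHKERRQ(ierr);
  ierr = MatNullSpaceDestroy(nullsp);CHKERRQ(ierr);  /* the KSP keeps its own reference */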
> > > >
> > > >
> > > >> The equation system is
> > > >> positive definite, so I use the CG solver with the ICC preconditioner.
> > > >> The problem is that the PETSc solver seems to need a lot more iterations
> > > >> to reach the solution than another CG solver I'm using. On a small test
> > > >> problem (a rising bubble) with a 60x40 grid, the PETSc solver needs over
> > > >> 1000 iterations on average, while the other solver needs less than 100.
> > > >> I am using KSPSetInitialGuessNonzero, without this the number of
> > > >> iterations is even higher.
> > > >> I have also tried applying PETSc to a similar problem, solving the
> > > >> Poisson equation with Neumann boundaries and a forcing function of
> > > >> f=sin(pi * x)+sin(pi *y). For this problem, the number of iterations is
> > > >> almost exactly the same for PETSc and the other solver.
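
As a point of reference, the configuration described above (CG with ICC(0)
and a nonzero initial guess) would look roughly like the following sketch in
the PETSc 2.3.x C interface; A, b and x are placeholders for whatever the
application assembles:

  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPCG);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCICC);CHKERRQ(ierr);
  /* reuse the previous time step's pressure as the starting guess */
  ierr = KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);CHKERRQ(ierr);
  /* rtol 1e-6, atol 1e-50, dtol 1e4, maxits 10000, matching the -ksp_view below */
  ierr = KSPSetTolerances(ksp, 1.0e-6, 1.0e-50, 1.0e4, 10000);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);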
> > > >>
> > > >> Does anyone know what the problem might be? Any help is greatly
> > > >> appreciated. I've included the -ksp_view of one of the time steps
> > > >> and the -log_summary below.
> > > >>
> > > >> Regards,
> > > >> Knut Erik Teigen
> > > >> MSc student
> > > >> Norwegian University of Science and Technology
> > > >>
> > > >> Output from -ksp_view:
> > > >>
> > > >> KSP Object:
> > > >> type: cg
> > > >> maximum iterations=10000
> > > >> tolerances: relative=1e-06, absolute=1e-50, divergence=10000
> > > >> left preconditioning
> > > >> PC Object:
> > > >> type: icc
> > > >> ICC: 0 levels of fill
> > > >> ICC: factor fill ratio allocated 1
> > > >> ICC: factor fill ratio needed 0.601695
> > > >> Factored matrix follows
> > > >> Matrix Object:
> > > >> type=seqsbaij, rows=2400, cols=2400
> > > >> total: nonzeros=7100, allocated nonzeros=7100
> > > >> block size is 1
> > > >> linear system matrix = precond matrix:
> > > >> Matrix Object:
> > > >> type=seqaij, rows=2400, cols=2400
> > > >> total: nonzeros=11800, allocated nonzeros=12000
> > > >> not using I-node routines
> > > >> Poisson converged after 1403 iterations
> > > >>
> > > >> Output from -log_summary
> > > >>
> > > >> ---------------------------------------------- PETSc Performance
> > > >> Summary: ----------------------------------------------
> > > >>
> > > >> ./run on a gcc-ifc-d named iept0415 with 1 processor, by knutert Fri Mar
> > > >> 9 17:06:05 2007
> > > >> Using Petsc Release Version 2.3.2, Patch 8, Tue Jan 2 14:33:59 PST 2007
> > > >> HG revision: ebeddcedcc065e32fc252af32cf1d01ed4fc7a80
> > > >>
> > > >> Max Max/Min Avg Total
> > > >> Time (sec): 5.425e+02 1.00000 5.425e+02
> > > >> Objects: 7.000e+02 1.00000 7.000e+02
> > > >> Flops: 6.744e+10 1.00000 6.744e+10 6.744e+10
> > > >> Flops/sec: 1.243e+08 1.00000 1.243e+08 1.243e+08
> > > >> Memory: 4.881e+05 1.00000 4.881e+05
> > > >> MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
> > > >> MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
> > > >> MPI Reductions: 1.390e+03 1.00000
> > > >>
> > > >> Flop counting convention: 1 flop = 1 real number operation of type
> > > >> (multiply/divide/add/subtract)
> > > >> e.g., VecAXPY() for real vectors of length N
> > > >> --> 2N flops
> > > >> and VecAXPY() for complex vectors of length
> > > >> N --> 8N flops
> > > >>
> > > >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages
> > > >> --- -- Message Lengths -- -- Reductions --
> > > >> Avg %Total Avg %Total counts %
> > > >> Total Avg %Total counts %Total
> > > >> 0: Main Stage: 5.4246e+02 100.0% 6.7437e+10 100.0% 0.000e+00
> > > >> 0.0% 0.000e+00 0.0% 1.390e+03 100.0%
> > > >>
> > > >> ------------------------------------------------------------------------------------------------------------------------
> > > >> See the 'Profiling' chapter of the users' manual for details on
> > > >> interpreting output.
> > > >> Phase summary info:
> > > >> Count: number of times phase was executed
> > > >> Time and Flops/sec: Max - maximum over all processors
> > > >> Ratio - ratio of maximum to minimum over all
> > > >> processors
> > > >> Mess: number of messages sent
> > > >> Avg. len: average message length
> > > >> Reduct: number of global reductions
> > > >> Global: entire computation
> > > >> Stage: stages of a computation. Set stages with PetscLogStagePush()
> > > >> and PetscLogStagePop().
> > > >> %T - percent time in this phase %F - percent flops in this
> > > >> phase
> > > >> %M - percent messages in this phase %L - percent message
> > > >> lengths in this phase
> > > >> %R - percent reductions in this phase
> > > >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> > > >> over all processors)
> > > >> ------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >>
> > > >> ##########################################################
> > > >> # #
> > > >> # WARNING!!! #
> > > >> # #
> > > >> # This code was compiled with a debugging option, #
> > > >> # To get timing results run config/configure.py #
> > > >> # using --with-debugging=no, the performance will #
> > > >> # be generally two or three times faster. #
> > > >> # #
> > > >> ##########################################################
> > > >>
> > > >>
> > > >>
> > > >>
> > > >> ##########################################################
> > > >> # #
> > > >> # WARNING!!! #
> > > >> # #
> > > >> # This code was run without the PreLoadBegin() #
> > > >> # macros. To get timing results we always recommend #
> > > >> # preloading. otherwise timing numbers may be #
> > > >> # meaningless. #
> > > >> ##########################################################
> > > >>
> > > >>
> > > >> Event Count Time (sec) Flops/sec
> > > >> --- Global --- --- Stage --- Total
> > > >> Max Ratio Max Ratio Max Ratio Mess Avg len
> > > >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
> > > >> ------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> --- Event Stage 0: Main Stage
> > > >>
> > > >> VecDot 1668318 1.0 1.9186e+01 1.0 4.17e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 4 12 0 0 0 4 12 0 0 0 417
> > > >> VecNorm 835191 1.0 1.0935e+01 1.0 3.67e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 2 6 0 0 0 2 6 0 0 0 367
> > > >> VecCopy 688 1.0 1.6126e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> VecAXPY 1667630 1.0 2.4141e+01 1.0 3.32e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 4 12 0 0 0 4 12 0 0 0 332
> > > >> VecAYPX 833815 1.0 1.9062e+01 1.0 2.10e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 4 6 0 0 0 4 6 0 0 0 210
> > > >> VecAssemblyBegin 688 1.0 8.5354e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> VecAssemblyEnd 688 1.0 7.8177e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> MatMult 834503 1.0 1.5044e+02 1.0 1.18e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 28 26 0 0 0 28 26 0 0 0 118
> > > >> MatSolve 835191 1.0 1.8130e+02 1.0 1.42e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 33 38 0 0 0 33 38 0 0 0 142
> > > >> MatCholFctrNum 1 1.0 1.1630e-03 1.0 2.06e+06 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 2
> > > >> MatICCFactorSym 1 1.0 2.9588e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> MatAssemblyBegin 688 1.0 2.6343e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> MatAssemblyEnd 688 1.0 1.0964e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> MatGetOrdering 1 1.0 2.6798e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> KSPSetup 1 1.0 2.7585e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > >> 6.0e+00 0 0 0 0 0 0 0 0 0 0 0
> > > >> KSPSolve 688 1.0 4.2809e+02 1.0 1.58e+08 1.0 0.0e+00 0.0e+00
> > > >> 1.4e+03 79100 0 0100 79100 0 0100 158
> > > >> PCSetUp 1 1.0 1.7900e-03 1.0 1.34e+06 1.0 0.0e+00 0.0e+00
> > > >> 4.0e+00 0 0 0 0 0 0 0 0 0 0 1
> > > >> PCApply 835191 1.0 1.8864e+02 1.0 1.36e+08 1.0 0.0e+00 0.0e+00
> > > >> 0.0e+00 35 38 0 0 0 35 38 0 0 0 136
> > > >> ------------------------------------------------------------------------------------------------------------------------
> > > >>
> > > >> Memory usage is given in bytes:
> > > >>
> > > >> Object Type Creations Destructions Memory Descendants'
> > > >> Mem.
> > > >>
> > > >> --- Event Stage 0: Main Stage
> > > >>
> > > >> Index Set 2 2 19640 0
> > > >> Vec 694 693 299376 0
> > > >> Matrix 2 2 56400 0
> > > >> Krylov Solver 1 1 36 0
> > > >> Preconditioner 1 1 108 0
> > > >> ========================================================================================================================
> > > >> Average time to get PetscTime(): 2.86102e-07
> > > >> OptionTable: -ksp_type cg
> > > >> OptionTable: -log_summary -ksp_view
> > > >> OptionTable: -pc_type icc
> > > >> Compiled without FORTRAN kernels
> > > >> Compiled with full precision matrices (default)
> > > >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4
> > > >> sizeof(PetscScalar) 8
> > > >> Configure run at: Thu Mar 8 11:54:22 2007
> > > >> Configure options: --with-cc=gcc --with-fc=ifort
> > > >> --download-f-blas-lapack=1 --download-mpich=1 --with-debugging=1
> > > >> --with-shared=0
> > > >> -----------------------------------------
> > > >> Libraries compiled on Thu Mar 8 12:08:22 CET 2007 on iept0415
> > > >> Machine characteristics: Linux iept0415 2.6.16.21-0.8-default #1 Mon Jul
> > > >> 3 18:25:39 UTC 2006 i686 i686 i386 GNU/Linux
> > > >> Using PETSc directory: /opt/petsc-2.3.2-p8
> > > >> Using PETSc arch: gcc-ifc-debug
> > > >> -----------------------------------------
> > > >> Using C compiler: gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing
> > > >> -g3
> > > >> Using Fortran compiler: ifort -fPIC -g
> > > >> -----------------------------------------
> > > >> Using include paths: -I/opt/petsc-2.3.2-p8
> > > >> -I/opt/petsc-2.3.2-p8/bmake/gcc-ifc-debug -I/opt/petsc-2.3.2-p8/include
> > > >> -I/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/include
> > > >> -I/usr/X11R6/include
> > > >> ------------------------------------------
> > > >> Using C linker: gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3
> > > >> Using Fortran linker: ifort -fPIC -g
> > > >> Using libraries: -Wl,-rpath,/opt/petsc-2.3.2-p8/lib/gcc-ifc-debug
> > > >> -L/opt/petsc-2.3.2-p8/lib/gcc-ifc-debug -lpetscts -lpetscsnes -lpetscksp
> > > >> -lpetscdm -lpetscmat -lpetscvec -lpetsc
> > > >> -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/lib -L/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/lib -lmpich -lnsl -lrt -L/usr/X11R6/lib -lX11 -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug -L/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug -lflapack -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug -L/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug -lfblas -lm -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0 -L/usr/lib/gcc/i586-suse-linux/4.1.0 -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -ldl -lgcc_s -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0 -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > > >> -Wl,-rpath,/opt/intel/fc/9.1.036/lib
> > > >> -L/opt/intel/fc/9.1.036/lib
> > > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/
> > > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/
> > > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../
> > > >> -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../ -lifport -lifcore
> > > >> -limf -lm -lipgo -lirc -lirc_s
> > > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > > >> -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -lm -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0 -L/usr/lib/gcc/i586-suse-linux/4.1.0 -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -ldl -lgcc_s
> > > >> -ldl
> > > >>
> > > >>
> > > >>
> > > >>
> > > >
> > > >
> > > > --
> > > > Lisandro Dalcín
> > > > ---------------
> > > > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> > > > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> > > > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> > > > PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> > > > Tel/Fax: +54-(0)342-451.1594
> > >
> > >
> > >
> > >
> >
> >
>
>
--
One trouble is that despite this system, anyone who reads journals widely
and critically is forced to realize that there are scarcely any bars to eventual
publication. There seems to be no study too fragmented, no hypothesis too
trivial, no literature citation too biased or too egotistical, no design too
warped, no methodology too bungled, no presentation of results too
inaccurate, too obscure, and too contradictory, no analysis too self-serving,
no argument too circular, no conclusions too trifling or too unjustified, and
no grammar and syntax too offensive for a paper to end up in print. --
Drummond Rennie