PETSc CG solver uses more iterations than other CG solver

Knut Erik Teigen knutert at stud.ntnu.no
Mon Mar 12 02:37:25 CDT 2007


On Sat, 2007-03-10 at 11:54 -0600, Barry Smith wrote:
> 
> 
> On Sat, 10 Mar 2007, knutert at stud.ntnu.no wrote:
> 
> > Thank you for your reply.
> > 
> > One boundary cell is defined to have constant pressure, since that makes the
> > equation system have a unique solution. 
> 
>   This is really not the best way to do it; it can produce very ill-conditioned
> matrices.
> 
So the best way is to use Neumann conditions on all boundaries, and then use
-ksp_constant_null_space?
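
To make the question concrete, here is a minimal sketch in plain Python (not
PETSc code, and not the model code discussed in this thread) of what removing
the constant null space amounts to for an all-Neumann Poisson problem. The 1D
operator and the helper names neumann_laplacian_apply, remove_constant and
cg_with_constant_null_space are made up for illustration, and the sketch uses
unpreconditioned CG rather than CG + ICC.

```python
# Sketch: CG on a singular all-Neumann 1D Poisson matrix. The constant
# vector is in the null space, so its component is removed from the
# right-hand side, from every residual, and from the final solution.

def neumann_laplacian_apply(x):
    """Apply a 1D Laplacian with Neumann ends (every row sums to zero)."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        if i > 0:
            y[i] += x[i] - x[i - 1]
        if i < n - 1:
            y[i] += x[i] - x[i + 1]
    return y

def remove_constant(v):
    """Project out the constant component by subtracting the mean."""
    mean = sum(v) / len(v)
    return [vi - mean for vi in v]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cg_with_constant_null_space(b, rtol=1e-6, max_it=10000):
    """Return (x, iterations) for the singular system A x = b."""
    b = remove_constant(b)          # make the right-hand side consistent
    x = [0.0] * len(b)
    r = list(b)                     # residual for the zero initial guess
    p = list(r)
    rr = dot(r, r)
    b_norm = rr ** 0.5
    if b_norm == 0.0:
        return x, 0
    for it in range(1, max_it + 1):
        Ap = neumann_laplacian_apply(p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        # Keep round-off from re-introducing a constant component.
        r = remove_constant([ri - alpha * api for ri, api in zip(r, Ap)])
        rr_new = dot(r, r)
        if rr_new ** 0.5 <= rtol * b_norm:
            return remove_constant(x), it
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return remove_constant(x), max_it
```

For example, cg_with_constant_null_space([1.0, 0.0, 0.0, -1.0]) returns a
zero-mean solution without any cell being pinned to a fixed pressure.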

> > I tried your command, and it lowered
> > the number of iterations for most of the time steps, but for some it reached
> > the maximum number of iterations (10000) without converging.
> > 
> > I also tried making all the boundaries Neumann and using your command.
> > That made the number of iterations more constant: instead of varying between
> > 700 and 2000, it stayed at around 1200. But it actually increased the average
> > number of iterations somewhat. It is still far from the performance of the
> > other solver.
> > I've also checked the convergence criteria, and it is the same for both
> > solvers.
> 
>   There is something most definitely wrong. Are you sure your matrix is 
> COMPLETELY symmetric?
> 

Yes, I've checked it several times.
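
For reference, the kind of check meant here can be sketched in a few lines of
plain Python. This is only an illustration, not the code that was actually
used: the function name max_asymmetry and the dictionary-of-entries matrix
format are assumptions made for the example.

```python
# Sketch: measure how far a sparse matrix is from being symmetric.

def max_asymmetry(entries):
    """Return the largest |a_ij - a_ji| over all stored entries.

    entries maps (row, col) -> value; entries that are not stored count
    as zero, so a one-sided nonzero is reported as an asymmetry too.
    """
    worst = 0.0
    for (i, j), a_ij in entries.items():
        a_ji = entries.get((j, i), 0.0)
        worst = max(worst, abs(a_ij - a_ji))
    return worst

# A symmetric 2x2 stencil and a copy with one perturbed off-diagonal.
symmetric = {(0, 0): 2.0, (0, 1): -1.0, (1, 0): -1.0, (1, 1): 2.0}
perturbed = dict(symmetric)
perturbed[(1, 0)] = -1.0 + 1e-3
```

A result of exactly zero for every time step is what "completely symmetric"
would mean; a small nonzero value is worth investigating, since CG assumes a
symmetric operator.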

>    Can you send your "model" code to petsc-maint at mcs.anl.gov so we can check it
> out?  
> 
Done. Hope you can find the problem.

>    Since you are on a structured grid, eventually you'll want to simply use
> geometric multigrid; it is really the "right" way to solve the problem 
> and will be much better than CG + ICC. With your code we can check out how
> difficult it is to do this.
> 
I am currently working my way through the tutorial programs on
multigrid, so hopefully I'm getting there.

Thanks a lot for the help!

-Knut Erik-

>   Barry
> > 
> > 
> > Quoting Lisandro Dalcin <dalcinl at gmail.com>:
> > 
> > > On 3/9/07, Knut Erik Teigen <knutert at stud.ntnu.no> wrote:
> > > > To solve the Navier-Stokes equations, I use an explicit Runge-Kutta
> > > > method with Chorin's projection method,
> > > > so a Poisson equation with Neumann boundary conditions for the
> > > > pressure has to be solved at every time-step.
> > > 
> > > All boundary conditions are Neumann type? In that case, please try to
> > > run your program with the following command line option:
> > > 
> > > -ksp_constant_null_space
> > > 
> > > and let me know if this corrected your problem.
> > > 
> > > 
> > > > The equation system is
> > > > positive definite, so I use the CG solver with the ICC preconditioner.
> > > > The problem is that the PETSc solver seems to need a lot more iterations
> > > > to reach the solution than another CG solver I'm using. On a small test
> > > > problem (a rising bubble) with a 60x40 grid, the PETSc solver needs over
> > > > 1000 iterations on average, while the other solver needs fewer than 100.
> > > > I am using KSPSetInitialGuessNonzero; without this the number of
> > > > iterations is even higher.
> > > > I have also tried applying PETSc to a similar problem, solving the
> > > > Poisson equation with Neumann boundaries and a forcing function of
> > > > f = sin(pi*x) + sin(pi*y). For this problem, the number of iterations is
> > > > almost exactly the same for PETSc and the other solver.
> > > > 
> > > > Does anyone know what the problem might be? Any help is greatly
> > > > appreciated. I've included the -ksp_view of one of the time steps
> > > > and the -log_summary below.
> > > > 
> > > > Regards,
> > > > Knut Erik Teigen
> > > > MSc student
> > > > Norwegian University of Science and Technology
> > > > 
> > > > Output from -ksp_view:
> > > > 
> > > > KSP Object:
> > > > type: cg
> > > > maximum iterations=10000
> > > > tolerances:  relative=1e-06, absolute=1e-50, divergence=10000
> > > > left preconditioning
> > > > PC Object:
> > > > type: icc
> > > >   ICC: 0 levels of fill
> > > >   ICC: factor fill ratio allocated 1
> > > >   ICC: factor fill ratio needed 0.601695
> > > >        Factored matrix follows
> > > >       Matrix Object:
> > > >         type=seqsbaij, rows=2400, cols=2400
> > > >         total: nonzeros=7100, allocated nonzeros=7100
> > > >             block size is 1
> > > > linear system matrix = precond matrix:
> > > > Matrix Object:
> > > >   type=seqaij, rows=2400, cols=2400
> > > >   total: nonzeros=11800, allocated nonzeros=12000
> > > >     not using I-node routines
> > > > Poisson converged after 1403 iterations
> > > > 
> > > > Output from -log_summary
> > > > 
> > > > ---------------------------------------------- PETSc Performance
> > > > Summary: ----------------------------------------------
> > > > 
> > > > ./run on a gcc-ifc-d named iept0415 with 1 processor, by knutert Fri Mar
> > > > 9 17:06:05 2007
> > > > Using Petsc Release Version 2.3.2, Patch 8, Tue Jan  2 14:33:59 PST 2007
> > > > HG revision: ebeddcedcc065e32fc252af32cf1d01ed4fc7a80
> > > > 
> > > >                        Max       Max/Min        Avg      Total
> > > > Time (sec):           5.425e+02      1.00000   5.425e+02
> > > > Objects:              7.000e+02      1.00000   7.000e+02
> > > > Flops:                6.744e+10      1.00000   6.744e+10  6.744e+10
> > > > Flops/sec:            1.243e+08      1.00000   1.243e+08  1.243e+08
> > > > Memory:               4.881e+05      1.00000              4.881e+05
> > > > MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> > > > MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> > > > MPI Reductions:       1.390e+03      1.00000
> > > > 
> > > > Flop counting convention: 1 flop = 1 real number operation of type
> > > > (multiply/divide/add/subtract)
> > > >                           e.g., VecAXPY() for real vectors of length N
> > > > --> 2N flops
> > > >                           and VecAXPY() for complex vectors of length
> > > > N --> 8N flops
> > > > 
> > > > Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages
> > > > ---  -- Message Lengths --  -- Reductions --
> > > >                       Avg     %Total     Avg     %Total   counts   %
> > > > Total     Avg         %Total   counts   %Total
> > > > 0:      Main Stage: 5.4246e+02 100.0%  6.7437e+10 100.0%  0.000e+00
> > > > 0.0%  0.000e+00        0.0%  1.390e+03 100.0%
> > > > 
> > > > ------------------------------------------------------------------------------------------------------------------------
> > > > See the 'Profiling' chapter of the users' manual for details on
> > > > interpreting output.
> > > > Phase summary info:
> > > >  Count: number of times phase was executed
> > > >  Time and Flops/sec: Max - maximum over all processors
> > > >                      Ratio - ratio of maximum to minimum over all
> > > > processors
> > > >  Mess: number of messages sent
> > > >  Avg. len: average message length
> > > >  Reduct: number of global reductions
> > > >  Global: entire computation
> > > >  Stage: stages of a computation. Set stages with PetscLogStagePush()
> > > > and PetscLogStagePop().
> > > >     %T - percent time in this phase         %F - percent flops in this
> > > > phase
> > > >     %M - percent messages in this phase     %L - percent message
> > > > lengths in this phase
> > > >     %R - percent reductions in this phase
> > > >  Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> > > > over all processors)
> > > > ------------------------------------------------------------------------------------------------------------------------
> > > > 
> > > > 
> > > >     ##########################################################
> > > >     #                                                        #
> > > >     #                          WARNING!!!                    #
> > > >     #                                                        #
> > > >     #   This code was compiled with a debugging option,      #
> > > >     #   To get timing results run config/configure.py        #
> > > >     #   using --with-debugging=no, the performance will      #
> > > >     #   be generally two or three times faster.              #
> > > >     #                                                        #
> > > >     ##########################################################
> > > > 
> > > > 
> > > > 
> > > > 
> > > >     ##########################################################
> > > >     #                                                        #
> > > >     #                          WARNING!!!                    #
> > > >     #                                                        #
> > > >     #   This code was run without the PreLoadBegin()         #
> > > >     #   macros. To get timing results we always recommend    #
> > > >     #   preloading. otherwise timing numbers may be          #
> > > >     #   meaningless.                                         #
> > > >     ##########################################################
> > > > 
> > > > 
> > > > Event                Count      Time (sec)     Flops/sec
> > > > --- Global ---  --- Stage ---   Total
> > > >                  Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> > > > Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> > > > ------------------------------------------------------------------------------------------------------------------------
> > > > 
> > > > --- Event Stage 0: Main Stage
> > > > 
> > > > VecDot           1668318 1.0 1.9186e+01 1.0 4.17e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00  4 12  0  0  0   4 12  0  0  0   417
> > > > VecNorm           835191 1.0 1.0935e+01 1.0 3.67e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00  2  6  0  0  0   2  6  0  0  0   367
> > > > VecCopy              688 1.0 1.6126e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > VecAXPY          1667630 1.0 2.4141e+01 1.0 3.32e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00  4 12  0  0  0   4 12  0  0  0   332
> > > > VecAYPX           833815 1.0 1.9062e+01 1.0 2.10e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00  4  6  0  0  0   4  6  0  0  0   210
> > > > VecAssemblyBegin     688 1.0 8.5354e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > VecAssemblyEnd       688 1.0 7.8177e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > MatMult           834503 1.0 1.5044e+02 1.0 1.18e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00 28 26  0  0  0  28 26  0  0  0   118
> > > > MatSolve          835191 1.0 1.8130e+02 1.0 1.42e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00 33 38  0  0  0  33 38  0  0  0   142
> > > > MatCholFctrNum         1 1.0 1.1630e-03 1.0 2.06e+06 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00  0  0  0  0  0   0  0  0  0  0     2
> > > > MatICCFactorSym        1 1.0 2.9588e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > MatAssemblyBegin     688 1.0 2.6343e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > MatAssemblyEnd       688 1.0 1.0964e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > MatGetOrdering         1 1.0 2.6798e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > KSPSetup               1 1.0 2.7585e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > > > 6.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > KSPSolve             688 1.0 4.2809e+02 1.0 1.58e+08 1.0 0.0e+00 0.0e+00
> > > > 1.4e+03 79100  0  0100  79100  0  0100   158
> > > > PCSetUp                1 1.0 1.7900e-03 1.0 1.34e+06 1.0 0.0e+00 0.0e+00
> > > > 4.0e+00  0  0  0  0  0   0  0  0  0  0     1
> > > > PCApply           835191 1.0 1.8864e+02 1.0 1.36e+08 1.0 0.0e+00 0.0e+00
> > > > 0.0e+00 35 38  0  0  0  35 38  0  0  0   136
> > > > ------------------------------------------------------------------------------------------------------------------------
> > > > 
> > > > Memory usage is given in bytes:
> > > > 
> > > > Object Type          Creations   Destructions   Memory  Descendants'
> > > > Mem.
> > > > 
> > > > --- Event Stage 0: Main Stage
> > > > 
> > > >          Index Set     2              2      19640     0
> > > >                Vec   694            693     299376     0
> > > >             Matrix     2              2      56400     0
> > > >      Krylov Solver     1              1         36     0
> > > >     Preconditioner     1              1        108     0
> > > > ========================================================================================================================
> > > > Average time to get PetscTime(): 2.86102e-07
> > > > OptionTable: -ksp_type cg
> > > > OptionTable: -log_summary -ksp_view
> > > > OptionTable: -pc_type icc
> > > > Compiled without FORTRAN kernels
> > > > Compiled with full precision matrices (default)
> > > > sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4
> > > > sizeof(PetscScalar) 8
> > > > Configure run at: Thu Mar  8 11:54:22 2007
> > > > Configure options: --with-cc=gcc --with-fc=ifort
> > > > --download-f-blas-lapack=1 --download-mpich=1 --with-debugging=1
> > > > --with-shared=0
> > > > -----------------------------------------
> > > > Libraries compiled on Thu Mar  8 12:08:22 CET 2007 on iept0415
> > > > Machine characteristics: Linux iept0415 2.6.16.21-0.8-default #1 Mon Jul
> > > > 3 18:25:39 UTC 2006 i686 i686 i386 GNU/Linux
> > > > Using PETSc directory: /opt/petsc-2.3.2-p8
> > > > Using PETSc arch: gcc-ifc-debug
> > > > -----------------------------------------
> > > > Using C compiler: gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing
> > > > -g3
> > > > Using Fortran compiler: ifort -fPIC -g
> > > > -----------------------------------------
> > > > Using include paths: -I/opt/petsc-2.3.2-p8
> > > > -I/opt/petsc-2.3.2-p8/bmake/gcc-ifc-debug -I/opt/petsc-2.3.2-p8/include
> > > > -I/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/include
> > > > -I/usr/X11R6/include
> > > > ------------------------------------------
> > > > Using C linker: gcc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -g3
> > > > Using Fortran linker: ifort -fPIC -g
> > > > Using libraries: -Wl,-rpath,/opt/petsc-2.3.2-p8/lib/gcc-ifc-debug
> > > > -L/opt/petsc-2.3.2-p8/lib/gcc-ifc-debug -lpetscts -lpetscsnes -lpetscksp
> > > > -lpetscdm -lpetscmat -lpetscvec -lpetsc
> > > > -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/lib
> > > > -L/opt/petsc-2.3.2-p8/externalpackages/mpich2-1.0.4p1/gcc-ifc-debug/lib
> > > > -lmpich -lnsl -lrt -L/usr/X11R6/lib -lX11
> > > > -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug
> > > > -L/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug -lflapack
> > > > -Wl,-rpath,/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug
> > > > -L/opt/petsc-2.3.2-p8/externalpackages/fblaslapack/gcc-ifc-debug -lfblas
> > > > -lm -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -ldl -lgcc_s
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > > > -Wl,-rpath,/opt/intel/fc/9.1.036/lib
> > > > -L/opt/intel/fc/9.1.036/lib
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0/
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../ -lifport -lifcore  -limf
> > > > -lm -lipgo -lirc -lirc_s  -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -lm
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../../../i586-suse-linux/lib
> > > > -Wl,-rpath,/usr/lib/gcc/i586-suse-linux/4.1.0/../../..
> > > > -L/usr/lib/gcc/i586-suse-linux/4.1.0/../../.. -ldl -lgcc_s  -ldl
> > > > 
> > > > 
> > > > 
> > > > 
> > > 
> > > 
> > > -- 
> > > Lisandro Dalcín
> > > ---------------
> > > Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
> > > Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
> > > Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
> > > PTLC - Güemes 3450, (3000) Santa Fe, Argentina
> > > Tel/Fax: +54-(0)342-451.1594
> > 
> > 
> > 



