dmmg_grid_sequence: KSP not functional?

Xuefeng Li li at
Wed Apr 22 21:44:29 CDT 2009

Hello, everyone.

I am running src/snes/examples/tutorials/ex19.c
to test multi-level DMMG in PETSc with the
option -dmmg_grid_sequence.

In every test I have run, KSPSolve() on the coarsest
level converges in one iteration with reason 4 and
residual 0, and the option -ksp_monitor produces no
output on this level.

Attached is the output from one test run with a two-level
DMMG, refinement factor 2, and meshes of 33x33 (coarse) and
65x65 (fine). The lines containing "step in LS:" are printed
from src/snes/impls/ls/ls.c to report the KSP iteration count
(kiter), converged reason (kreason), and residual (kres) for
every Newton step.

It looks as if the KSP on the coarsest level is either not
functional or a direct solver, whereas the KSPs on the finer
levels are iterative solvers. What KSP type is associated
with SNES on the coarsest level? Is the behavior described
above by design in PETSc?


--Xuefeng Li, (504)865-3340(phone)
   Like floating clouds, the heart rests easy
   Like flowing water, the spirit stays free
   New Orleans, Louisiana (504)865-2051(fax)
-------------- next part --------------
lid velocity = 0, prandtl # = 1, grashof # = 10
    0 SNES Function norm 3.027343750000e-01 
       1-st step in LS: kiter=   1; kreason=4; kres= 0.000000000000e+00;
    1 SNES Function norm 2.715281504674e-04 
       2-nd step in LS: kiter=   1; kreason=4; kres= 0.000000000000e+00;
    2 SNES Function norm 6.154554861599e-11 
  0 SNES Function norm 1.459106922253e-01 
    0 KSP Residual norm 1.459106922253e-01 
    1 KSP Residual norm 1.371836659970e-01 
    2 KSP Residual norm 2.540661848461e-02 
    3 KSP Residual norm 6.181889597814e-03 
    4 KSP Residual norm 1.572134257147e-03 
    5 KSP Residual norm 2.287065092537e-04 
    6 KSP Residual norm 3.184241572285e-05 
    7 KSP Residual norm 4.332199061241e-06 
    8 KSP Residual norm 8.673786649533e-07 
       1-st step in LS: kiter=   8; kreason=2; kres= 8.673786649533e-07;
  1 SNES Function norm 9.473925324366e-07 
    0 KSP Residual norm 9.473925324366e-07 
    1 KSP Residual norm 1.125539671715e-07 
    2 KSP Residual norm 1.532200439274e-08 
    3 KSP Residual norm 4.962833769150e-09 
    4 KSP Residual norm 1.093531357780e-09 
    5 KSP Residual norm 1.096111146965e-10 
    6 KSP Residual norm 2.341320410628e-11 
    7 KSP Residual norm 3.853426393455e-12 
       2-nd step in LS: kiter=   7; kreason=2; kres= 3.853426393455e-12;
  2 SNES Function norm 3.854044974807e-12 
Number of Newton iterations = 2
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex19 on a cygwin_bl named with 16 processors, by Friend Wed Apr 22 21:12:05 2009
Using Petsc Release Version 3.0.0, Patch 2, Wed Jan 14 22:57:05 CST 2009

                         Max       Max/Min        Avg      Total 
Time (sec):           4.090e+01      1.00345   4.085e+01
Objects:              4.610e+02      1.00000   4.610e+02
Flops:                3.517e+08      1.00787   3.497e+08  5.596e+09
Flops/sec:            8.598e+06      1.00727   8.563e+06  1.370e+08
Memory:               1.730e+07      1.01353              2.744e+08
MPI Messages:         2.664e+03      1.37594   2.313e+03  3.700e+04
MPI Message Lengths:  9.758e+06      1.12054   3.939e+03  1.458e+08
MPI Reductions:       8.862e+01      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.4873e-01   0.6%  0.0000e+00   0.0%  1.500e+01   0.0%  1.621e-03        0.0%  0.000e+00   0.0% 
 1:           SetUp: 2.6383e+00   6.5%  1.3020e+05   0.0%  5.790e+02   1.6%  4.420e-01        0.0%  1.400e+02   9.9% 
 2:           Solve: 1.6828e+01  41.2%  2.7978e+09  50.0%  1.792e+04  48.4%  1.969e+03       50.0%  5.030e+02  35.5% 

See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)

      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run config/        #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #

Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s

--- Event Stage 0: Main Stage

PetscBarrier           2 1.0 1.0574e-0111.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  29  0  0  0  0     0

--- Event Stage 1: SetUp

VecSet                 3 1.0 3.1624e-0424.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        1 1.0 5.5172e-03790.0 0.00e+00 0.0 3.3e+01 1.3e+02 0.0e+00  0  0  0  0  0   0  0  6 25  0     0
VecScatterEnd          1 1.0 9.1836e-032191.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMultTranspose       1 1.0 2.3881e-02 2.8 5.00e+03 1.1 3.3e+01 1.3e+02 2.0e+00  0  0  0  0  0   1 58  6 25  1     3
MatAssemblyBegin       3 1.0 9.4317e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   3  0  0  0  4     0
MatAssemblyEnd         3 1.0 4.5032e-01 1.0 0.00e+00 0.0 2.6e+02 2.3e+01 3.0e+01  1  0  1  0  2  17  0 45 37 21     0
MatFDColorCreate       2 1.0 7.6469e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+01  2  0  0  0  3  29  0  0  0 29     0

--- Event Stage 2: Solve

VecDot                 4 1.0 7.6557e-02 1.8 5.92e+03 1.2 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  1     1
VecMDot               45 1.0 7.7941e-01 2.0 2.17e+05 1.1 0.0e+00 0.0e+00 4.5e+01  1  0  0  0  3   3  0  0  0  9     4
VecNorm              123 1.0 1.7008e+00 1.4 2.69e+05 1.1 0.0e+00 0.0e+00 1.2e+02  3  0  0  0  8   8  0  0  0 23     2
VecScale              77 1.0 1.7737e-03 2.5 8.90e+04 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   734
VecCopy              164 1.0 1.1524e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               250 1.0 2.4710e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              182 1.0 2.2335e-03 1.8 2.88e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1856
VecAYPX               15 1.0 3.7295e-04 2.5 3.47e+04 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1359
VecWAXPY               4 1.0 6.9087e-0420.1 2.96e+03 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    62
VecMAXPY              77 1.0 2.5467e-03 2.2 3.21e+05 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1845
VecPointwiseMult       2 1.0 1.1175e-05 1.3 6.48e+02 1.3 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   780
VecScatterBegin      233 1.0 1.2532e+00 2.9 0.00e+00 0.0 1.5e+04 1.3e+03 0.0e+00  2  0 40 13  0   5  0 83 25  0     0
VecScatterEnd        233 1.0 3.2011e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0  16  0  0  0  0     0
VecReduceArith         4 1.0 6.7606e-05 1.7 5.92e+03 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1257
VecReduceComm          2 1.0 5.9115e-02 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          60 1.0 9.8417e-01 1.8 2.08e+05 1.1 0.0e+00 0.0e+00 6.0e+01  2  0  0  0  4   4  0  0  0 12     3
MatMult              110 1.0 2.0059e+00 1.6 4.13e+06 1.1 5.0e+03 3.0e+02 0.0e+00  4  1 14  1  0  10  2 28  2  0    30
MatMultAdd            15 1.0 5.0544e-01 9.0 7.50e+04 1.1 5.0e+02 1.3e+02 0.0e+00  1  0  1  0  0   2  0  3  0  0     2
MatMultTranspose      32 1.0 7.8889e-01 1.4 1.60e+05 1.1 1.1e+03 1.3e+02 6.4e+01  2  0  3  0  5   4  0  6  0 13     3
MatSolve             122 1.0 1.4828e-01 1.3 3.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   1 18  0  0  0  3320
MatLUFactorSym         2 1.0 3.5973e-01 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   1  0  0  0  1     0
MatLUFactorNum         6 1.0 1.6635e+00 2.0 1.37e+08 1.0 0.0e+00 0.0e+00 0.0e+00  3 39  0  0  0   8 79  0  0  0  1322
MatILUFactorSym        1 1.0 2.5755e-03 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatAssemblyBegin      10 1.0 3.1545e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  1  0  0  0  1   1  0  0  0  2     0
MatAssemblyEnd        10 1.0 7.6078e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatGetRowIJ            3 1.0 9.6632e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         3 1.0 5.7125e-02 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  1   0  0  0  0  2     0
MatZeroEntries         6 1.0 3.1484e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatFDColorApply        6 1.0 3.9806e-01 1.7 1.74e+06 1.2 0.0e+00 0.0e+00 2.4e+01  1  0  0  0  2   2  1  0  0  5    62
MatFDColorFunc       126 1.0 4.4332e-03 1.5 1.59e+06 1.2 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  1  0  0  0  5096
MatGetRedundant        4 1.0 3.7212e+00 3.9 0.00e+00 0.0 1.9e+03 2.7e+04 4.0e+00  6  0  5 36  0  14  0 11 72  1     0
SNESSolve              2 1.0 1.6165e+01 1.0 1.76e+08 1.0 1.8e+04 4.1e+03 5.0e+02 40 50 48 50 35  96100 99100100   173
SNESLineSearch         4 1.0 3.4570e-01 1.3 1.98e+05 1.2 3.8e+02 2.4e+02 1.6e+01  1  0  1  0  1   2  0  2  0  3     8
SNESFunctionEval       6 1.0 9.3704e-02 2.9 9.32e+04 1.2 2.9e+02 2.4e+02 2.0e+00  0  0  1  0  0   0  0  2  0  0    14
SNESJacobianEval       4 1.0 5.3182e-01 1.1 1.75e+06 1.2 3.5e+02 2.0e+02 2.8e+01  1  0  1  0  2   3  1  2  0  6    47
KSPGMRESOrthog        45 1.0 7.8182e-01 2.0 4.35e+05 1.1 0.0e+00 0.0e+00 4.5e+01  1  0  0  0  3   3  0  0  0  9     8
KSPSetup              14 1.0 5.7125e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   3  0  0  0  0     0
KSPSolve               4 1.0 1.5273e+01 1.0 1.74e+08 1.0 1.7e+04 4.3e+03 4.5e+02 37 49 46 50 32  91 99 95100 90   181
PCSetUp                6 1.0 6.3247e+00 1.6 1.37e+08 1.0 3.0e+03 1.8e+04 5.5e+01 12 39  8 37  4  30 79 17 75 11   348
PCSetUpOnBlocks       30 1.0 9.0615e-03 2.0 2.80e+05 1.1 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  0   0  0  0  0  1   468
PCApply               17 1.0 6.7339e+00 1.0 3.56e+07 1.0 1.3e+04 1.4e+03 3.7e+02 16 10 36 12 26  40 20 74 25 73    83

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

              Viewer     0              1        336     0

--- Event Stage 1: SetUp

   Distributed array     4              0          0     0
                 Vec    52             18      32088     0
         Vec Scatter    16              0          0     0
           Index Set   112             32      19808     0
   IS L to G Mapping     8              0          0     0
              Matrix    24              0          0     0
  Matrix FD Coloring     4              0          0     0
                SNES     4              0          0     0
       Krylov Solver    10              2       1024     0
      Preconditioner    10              2        664     0
              Viewer     1              0          0     0

--- Event Stage 2: Solve

   Distributed array     0              4      24928     0
                 Vec   136            170    1474968     0
         Vec Scatter     8             24      12960     0
           Index Set    46            126     494032     0
   IS L to G Mapping     0              8      18912     0
              Matrix    10             34   26220032     0
  Matrix FD Coloring     0              4     477456     0
                SNES     0              4       2640     0
       Krylov Solver     6             14      75520     0
      Preconditioner     6             14       6744     0
           Container     4              4        944     0
Average time to get PetscTime(): 3.91111e-06
Average time for MPI_Barrier(): 0.00701251
Average time for zero size MPI_Send(): 0.000452013
#PETSc Option Table entries:
-da_grid_x 33
-da_grid_y 33
-grashof 1.0E01
-lidvelocity 0
-nlevels 2
-prandtl 1
#End o PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8
Configure run at: Wed Mar 25 18:05:57 2009
Configure options: --download-f-blas-lapack=1 --download-mpich=1 --with-debugging=1 --with-cc=gcc --with-fc=g77 PETSC_ARCH=cygwin_blas_lapack --useThreads=0 --with-shared=0
Libraries compiled on Wed Mar 25 18:08:45 CDT 2009 on amoeba 
Machine characteristics: CYGWIN_NT-5.1 amoeba 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin 
Using PETSc directory: /home/Friend/software/petsc-3.0.0-p2
Using PETSc arch: cygwin_blas_lapack
Using C compiler: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3   
Using Fortran compiler: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpif77 -Wall -Wno-unused-variable -g   
Using include paths: -I/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/include -I/home/Friend/software/petsc-3.0.0-p2/include -I/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/include   
Using C linker: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 
Using Fortran linker: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpif77 -Wall -Wno-unused-variable -g 
Using libraries: -Wl,-rpath,/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc        -Wl,-rpath,/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lflapack -lfblas -ldl -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lpmpich -lmpich -lfrtbegin -lg2c -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -lcygwin -luser32 -ladvapi32 -lshell32 -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl 
