dmmg_grid_sequence: KSP not functional?

Xuefeng Li li at loyno.edu
Wed Apr 22 21:44:29 CDT 2009


Hello, everyone.

I am running src/snes/examples/tutorials/ex19.c
to test the use of multi-level DMMG in PETSc with
the option -dmmg_grid_sequence.
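
For reference, the run reported below was launched
essentially as follows (reconstructed from the option table
at the end of the attached log; the MPI launcher invocation
may differ on other systems):

    mpiexec -n 16 ./ex19 -da_grid_x 33 -da_grid_y 33 -nlevels 2 \
            -dmmg_grid_sequence -lidvelocity 0 -prandtl 1 \
            -grashof 1.0E01 -snes_monitor -ksp_monitor -log_summary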

In all the tests I've run, I have observed that on the
coarsest level KSPSolve() always converges in one
iteration with reason 4 and residual 0, and the option
-ksp_monitor produces no output on that level.

Attached is the output from one test run with a two-level
DMMG, a refinement factor of 2, and meshes of 33x33 (coarse)
and 65x65 (fine). The lines containing "step in LS:" are
printed from src/snes/impls/ls/ls.c to report KSP activity,
in particular the KSP convergence reason (kreason), on every
Newton iteration; the instrumentation is sketched below.
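
The extra print is essentially the following, inserted right
after the linear solve inside the Newton loop in ls.c (a
sketch, not the verbatim code; "i" is assumed to be the
Newton iteration counter of the surrounding loop, "ierr" is
in scope, and the reason-code comments assume PETSc 3.0's
KSPConvergedReason numbering; the ordinal suffix in the
actual output is cosmetic):

    /* Diagnostic after the KSPSolve() call in SNESSolve_LS() */
    KSPConvergedReason kreason;
    PetscInt           kiter;
    PetscReal          kres;

    ierr = KSPGetConvergedReason(snes->ksp,&kreason);CHKERRQ(ierr);
    ierr = KSPGetIterationNumber(snes->ksp,&kiter);CHKERRQ(ierr);
    ierr = KSPGetResidualNorm(snes->ksp,&kres);CHKERRQ(ierr);
    /* In PETSc 3.0, kreason=2 is KSP_CONVERGED_RTOL and kreason=4 is
       KSP_CONVERGED_ITS (what, e.g., -ksp_type preonly returns). */
    ierr = PetscPrintf(((PetscObject)snes)->comm,
                       "%D-th step in LS: kiter=%4D; kreason=%d; kres=%18.12e;\n",
                       i+1,kiter,(int)kreason,kres);CHKERRQ(ierr);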

It feels as if the KSP on the coarsest level is either not
functional or a direct solver, whereas the KSPs on the finer
levels are iterative solvers. What is the KSP type
associated with the SNES on the coarsest level? Is the above
behavior by design in PETSc?
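
For completeness, this is how I would try to query it in
code (a sketch assuming the PETSc 3.0 DMMG layout, where
dmmg[0] is the coarsest level of the DMMG array returned by
DMMGCreate()):

    SNES           csnes = dmmg[0]->snes;   /* coarsest-level SNES */
    KSP            cksp;
    const char     *ktype;
    PetscErrorCode ierr;

    ierr = SNESGetKSP(csnes,&cksp);CHKERRQ(ierr);
    ierr = KSPGetType(cksp,&ktype);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,
                       "coarsest-level KSP type: %s\n",ktype);CHKERRQ(ierr);

Running with -snes_view should also print the full solver
configuration for each level's solve without code changes.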

Regards,

--Xuefeng Li, (504)865-3340(phone)
   Like floating clouds, the heart rests easy
   Like flowing water, the spirit stays free
        http://www.loyno.edu/~li/home
   New Orleans, Louisiana (504)865-2051(fax)
-------------- next part --------------
lid velocity = 0, prandtl # = 1, grashof # = 10
    0 SNES Function norm 3.027343750000e-01 
       1-st step in LS: kiter=   1; kreason=4; kres= 0.000000000000e+00;
    1 SNES Function norm 2.715281504674e-04 
       2-nd step in LS: kiter=   1; kreason=4; kres= 0.000000000000e+00;
    2 SNES Function norm 6.154554861599e-11 
  0 SNES Function norm 1.459106922253e-01 
    0 KSP Residual norm 1.459106922253e-01 
    1 KSP Residual norm 1.371836659970e-01 
    2 KSP Residual norm 2.540661848461e-02 
    3 KSP Residual norm 6.181889597814e-03 
    4 KSP Residual norm 1.572134257147e-03 
    5 KSP Residual norm 2.287065092537e-04 
    6 KSP Residual norm 3.184241572285e-05 
    7 KSP Residual norm 4.332199061241e-06 
    8 KSP Residual norm 8.673786649533e-07 
       1-st step in LS: kiter=   8; kreason=2; kres= 8.673786649533e-07;
  1 SNES Function norm 9.473925324366e-07 
    0 KSP Residual norm 9.473925324366e-07 
    1 KSP Residual norm 1.125539671715e-07 
    2 KSP Residual norm 1.532200439274e-08 
    3 KSP Residual norm 4.962833769150e-09 
    4 KSP Residual norm 1.093531357780e-09 
    5 KSP Residual norm 1.096111146965e-10 
    6 KSP Residual norm 2.341320410628e-11 
    7 KSP Residual norm 3.853426393455e-12 
       2-nd step in LS: kiter=   7; kreason=2; kres= 3.853426393455e-12;
  2 SNES Function norm 3.854044974807e-12 
Number of Newton iterations = 2
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex19 on a cygwin_bl named AMOEBA.no.cox.net with 16 processors, by Friend Wed Apr 22 21:12:05 2009
Using Petsc Release Version 3.0.0, Patch 2, Wed Jan 14 22:57:05 CST 2009

                         Max       Max/Min        Avg      Total 
Time (sec):           4.090e+01      1.00345   4.085e+01
Objects:              4.610e+02      1.00000   4.610e+02
Flops:                3.517e+08      1.00787   3.497e+08  5.596e+09
Flops/sec:            8.598e+06      1.00727   8.563e+06  1.370e+08
Memory:               1.730e+07      1.01353              2.744e+08
MPI Messages:         2.664e+03      1.37594   2.313e+03  3.700e+04
MPI Message Lengths:  9.758e+06      1.12054   3.939e+03  1.458e+08
MPI Reductions:       8.862e+01      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.4873e-01   0.6%  0.0000e+00   0.0%  1.500e+01   0.0%  1.621e-03        0.0%  0.000e+00   0.0% 
 1:           SetUp: 2.6383e+00   6.5%  1.3020e+05   0.0%  5.790e+02   1.6%  4.420e-01        0.0%  1.400e+02   9.9% 
 2:           Solve: 1.6828e+01  41.2%  2.7978e+09  50.0%  1.792e+04  48.4%  1.969e+03       50.0%  5.030e+02  35.5% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run config/configure.py        #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

PetscBarrier           2 1.0 1.0574e-0111.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  29  0  0  0  0     0

--- Event Stage 1: SetUp

VecSet                 3 1.0 3.1624e-0424.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        1 1.0 5.5172e-03790.0 0.00e+00 0.0 3.3e+01 1.3e+02 0.0e+00  0  0  0  0  0   0  0  6 25  0     0
VecScatterEnd          1 1.0 9.1836e-032191.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMultTranspose       1 1.0 2.3881e-02 2.8 5.00e+03 1.1 3.3e+01 1.3e+02 2.0e+00  0  0  0  0  0   1 58  6 25  1     3
MatAssemblyBegin       3 1.0 9.4317e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   3  0  0  0  4     0
MatAssemblyEnd         3 1.0 4.5032e-01 1.0 0.00e+00 0.0 2.6e+02 2.3e+01 3.0e+01  1  0  1  0  2  17  0 45 37 21     0
MatFDColorCreate       2 1.0 7.6469e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+01  2  0  0  0  3  29  0  0  0 29     0

--- Event Stage 2: Solve

VecDot                 4 1.0 7.6557e-02 1.8 5.92e+03 1.2 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  1     1
VecMDot               45 1.0 7.7941e-01 2.0 2.17e+05 1.1 0.0e+00 0.0e+00 4.5e+01  1  0  0  0  3   3  0  0  0  9     4
VecNorm              123 1.0 1.7008e+00 1.4 2.69e+05 1.1 0.0e+00 0.0e+00 1.2e+02  3  0  0  0  8   8  0  0  0 23     2
VecScale              77 1.0 1.7737e-03 2.5 8.90e+04 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   734
VecCopy              164 1.0 1.1524e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               250 1.0 2.4710e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              182 1.0 2.2335e-03 1.8 2.88e+05 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1856
VecAYPX               15 1.0 3.7295e-04 2.5 3.47e+04 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1359
VecWAXPY               4 1.0 6.9087e-0420.1 2.96e+03 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    62
VecMAXPY              77 1.0 2.5467e-03 2.2 3.21e+05 1.1 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1845
VecPointwiseMult       2 1.0 1.1175e-05 1.3 6.48e+02 1.3 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   780
VecScatterBegin      233 1.0 1.2532e+00 2.9 0.00e+00 0.0 1.5e+04 1.3e+03 0.0e+00  2  0 40 13  0   5  0 83 25  0     0
VecScatterEnd        233 1.0 3.2011e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0  16  0  0  0  0     0
VecReduceArith         4 1.0 6.7606e-05 1.7 5.92e+03 1.2 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1257
VecReduceComm          2 1.0 5.9115e-02 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          60 1.0 9.8417e-01 1.8 2.08e+05 1.1 0.0e+00 0.0e+00 6.0e+01  2  0  0  0  4   4  0  0  0 12     3
MatMult              110 1.0 2.0059e+00 1.6 4.13e+06 1.1 5.0e+03 3.0e+02 0.0e+00  4  1 14  1  0  10  2 28  2  0    30
MatMultAdd            15 1.0 5.0544e-01 9.0 7.50e+04 1.1 5.0e+02 1.3e+02 0.0e+00  1  0  1  0  0   2  0  3  0  0     2
MatMultTranspose      32 1.0 7.8889e-01 1.4 1.60e+05 1.1 1.1e+03 1.3e+02 6.4e+01  2  0  3  0  5   4  0  6  0 13     3
MatSolve             122 1.0 1.4828e-01 1.3 3.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   1 18  0  0  0  3320
MatLUFactorSym         2 1.0 3.5973e-01 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   1  0  0  0  1     0
MatLUFactorNum         6 1.0 1.6635e+00 2.0 1.37e+08 1.0 0.0e+00 0.0e+00 0.0e+00  3 39  0  0  0   8 79  0  0  0  1322
MatILUFactorSym        1 1.0 2.5755e-03 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatAssemblyBegin      10 1.0 3.1545e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  1  0  0  0  1   1  0  0  0  2     0
MatAssemblyEnd        10 1.0 7.6078e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatGetRowIJ            3 1.0 9.6632e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         3 1.0 5.7125e-02 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  0  0  0  0  1   0  0  0  0  2     0
MatZeroEntries         6 1.0 3.1484e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatFDColorApply        6 1.0 3.9806e-01 1.7 1.74e+06 1.2 0.0e+00 0.0e+00 2.4e+01  1  0  0  0  2   2  1  0  0  5    62
MatFDColorFunc       126 1.0 4.4332e-03 1.5 1.59e+06 1.2 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  0   0  1  0  0  0  5096
MatGetRedundant        4 1.0 3.7212e+00 3.9 0.00e+00 0.0 1.9e+03 2.7e+04 4.0e+00  6  0  5 36  0  14  0 11 72  1     0
SNESSolve              2 1.0 1.6165e+01 1.0 1.76e+08 1.0 1.8e+04 4.1e+03 5.0e+02 40 50 48 50 35  96100 99100100   173
SNESLineSearch         4 1.0 3.4570e-01 1.3 1.98e+05 1.2 3.8e+02 2.4e+02 1.6e+01  1  0  1  0  1   2  0  2  0  3     8
SNESFunctionEval       6 1.0 9.3704e-02 2.9 9.32e+04 1.2 2.9e+02 2.4e+02 2.0e+00  0  0  1  0  0   0  0  2  0  0    14
SNESJacobianEval       4 1.0 5.3182e-01 1.1 1.75e+06 1.2 3.5e+02 2.0e+02 2.8e+01  1  0  1  0  2   3  1  2  0  6    47
KSPGMRESOrthog        45 1.0 7.8182e-01 2.0 4.35e+05 1.1 0.0e+00 0.0e+00 4.5e+01  1  0  0  0  3   3  0  0  0  9     8
KSPSetup              14 1.0 5.7125e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   3  0  0  0  0     0
KSPSolve               4 1.0 1.5273e+01 1.0 1.74e+08 1.0 1.7e+04 4.3e+03 4.5e+02 37 49 46 50 32  91 99 95100 90   181
PCSetUp                6 1.0 6.3247e+00 1.6 1.37e+08 1.0 3.0e+03 1.8e+04 5.5e+01 12 39  8 37  4  30 79 17 75 11   348
PCSetUpOnBlocks       30 1.0 9.0615e-03 2.0 2.80e+05 1.1 0.0e+00 0.0e+00 7.0e+00  0  0  0  0  0   0  0  0  0  1   468
PCApply               17 1.0 6.7339e+00 1.0 3.56e+07 1.0 1.3e+04 1.4e+03 3.7e+02 16 10 36 12 26  40 20 74 25 73    83
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

              Viewer     0              1        336     0

--- Event Stage 1: SetUp

   Distributed array     4              0          0     0
                 Vec    52             18      32088     0
         Vec Scatter    16              0          0     0
           Index Set   112             32      19808     0
   IS L to G Mapping     8              0          0     0
              Matrix    24              0          0     0
  Matrix FD Coloring     4              0          0     0
                SNES     4              0          0     0
       Krylov Solver    10              2       1024     0
      Preconditioner    10              2        664     0
              Viewer     1              0          0     0

--- Event Stage 2: Solve

   Distributed array     0              4      24928     0
                 Vec   136            170    1474968     0
         Vec Scatter     8             24      12960     0
           Index Set    46            126     494032     0
   IS L to G Mapping     0              8      18912     0
              Matrix    10             34   26220032     0
  Matrix FD Coloring     0              4     477456     0
                SNES     0              4       2640     0
       Krylov Solver     6             14      75520     0
      Preconditioner     6             14       6744     0
           Container     4              4        944     0
========================================================================================================================
Average time to get PetscTime(): 3.91111e-06
Average time for MPI_Barrier(): 0.00701251
Average time for zero size MPI_Send(): 0.000452013
#PETSc Option Table entries:
-da_grid_x 33
-da_grid_y 33
-dmmg_grid_sequence
-grashof 1.0E01
-ksp_monitor
-lidvelocity 0
-log_summary
-nlevels 2
-prandtl 1
-snes_monitor
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8
Configure run at: Wed Mar 25 18:05:57 2009
Configure options: --download-f-blas-lapack=1 --download-mpich=1 --with-debugging=1 --with-cc=gcc --with-fc=g77 PETSC_ARCH=cygwin_blas_lapack --useThreads=0 --with-shared=0
-----------------------------------------
Libraries compiled on Wed Mar 25 18:08:45 CDT 2009 on amoeba 
Machine characteristics: CYGWIN_NT-5.1 amoeba 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin 
Using PETSc directory: /home/Friend/software/petsc-3.0.0-p2
Using PETSc arch: cygwin_blas_lapack
-----------------------------------------
Using C compiler: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3   
Using Fortran compiler: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpif77 -Wall -Wno-unused-variable -g   
-----------------------------------------
Using include paths: -I/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/include -I/home/Friend/software/petsc-3.0.0-p2/include -I/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/include   
------------------------------------------
Using C linker: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 
Using Fortran linker: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpif77 -Wall -Wno-unused-variable -g 
Using libraries: -Wl,-rpath,/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc        -Wl,-rpath,/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lflapack -lfblas -ldl -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lpmpich -lmpich -lfrtbegin -lg2c -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -lcygwin -luser32 -ladvapi32 -lshell32 -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl 
------------------------------------------

