dmmg_grid_sequence: KSP not functional?
Xuefeng Li
li at loyno.edu
Wed Apr 22 21:44:29 CDT 2009
Hello, everyone.

I am running src/snes/examples/tutorials/ex19.c
to test multi-level DMMG in PETSc with the
option -dmmg_grid_sequence.

In every test I have run, I observe that on the
coarsest level KSPSolve() always converges in one
iteration with reason 4 (KSP_CONVERGED_ITS) and a
residual of 0, and the option -ksp_monitor produces
no output on that level.

Attached is the output from one test run with a
two-level DMMG, a refinement factor of 2, and
meshes of 33x33 (coarse) and 65x65 (fine).

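For reference, the run was launched along these lines
(the options and the 16 processes are taken from the
log below; the exact mpiexec invocation is my guess):

    mpiexec -n 16 ./ex19 -nlevels 2 -dmmg_grid_sequence \
        -da_grid_x 33 -da_grid_y 33 \
        -lidvelocity 0 -prandtl 1 -grashof 1.0E01 \
        -snes_monitor -ksp_monitor -log_summary
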
The lines containing "step in LS:" are printed from
src/snes/impls/ls/ls.c to report, at each Newton
iteration, the KSP iteration count (kiter), converged
reason (kreason), and residual norm (kres).

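The instrumentation is along these lines (a sketch
rather than the verbatim patch; snes->ksp is the KSP
owned by the SNES inside ls.c):

    /* after the KSPSolve() call in SNESSolve_LS() */
    PetscInt           kiter;
    KSPConvergedReason kreason;
    PetscReal          kres;

    ierr = KSPGetIterationNumber(snes->ksp,&kiter);CHKERRQ(ierr);
    ierr = KSPGetConvergedReason(snes->ksp,&kreason);CHKERRQ(ierr);
    ierr = KSPGetResidualNorm(snes->ksp,&kres);CHKERRQ(ierr);
    ierr = PetscPrintf(((PetscObject)snes)->comm,
             "step in LS: kiter=%D; kreason=%d; kres=%e;\n",
             kiter,(int)kreason,kres);CHKERRQ(ierr);
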
It feels like the KSP on the coarsest level is either
not functional or a direct solver (one iteration with
reason KSP_CONVERGED_ITS and a zero reported residual
norm is exactly what a KSPPREONLY solve reports),
whereas the KSPs on the finer levels are clearly
iterative, stopping with reason 2 (KSP_CONVERGED_RTOL).
What KSP type is associated with the SNES on the
coarsest level? Is the observation above by design
in PETSc?

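(Is there a way to query this programmatically? The
following is roughly what I have in mind; it is only a
sketch, and the assumption that dmmg[0] is the coarsest
level of the DMMG array is mine:

    PetscErrorCode ierr;
    KSP            ksp;
    const KSPType  ktype;

    /* grab the KSP owned by the coarse-level SNES */
    ierr = SNESGetKSP(dmmg[0]->snes,&ksp);CHKERRQ(ierr);
    ierr = KSPGetType(ksp,&ktype);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,
             "coarsest-level KSP type: %s\n",ktype);CHKERRQ(ierr);

Or perhaps -snes_view would print it for every level.)
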
Regards,
--Xuefeng Li, (504)865-3340(phone)
Like floating clouds, the heart rests easy
Like flowing water, the spirit stays free
http://www.loyno.edu/~li/home
New Orleans, Louisiana (504)865-2051(fax)
-------------- next part --------------
lid velocity = 0, prandtl # = 1, grashof # = 10
0 SNES Function norm 3.027343750000e-01
1-st step in LS: kiter= 1; kreason=4; kres= 0.000000000000e+00;
1 SNES Function norm 2.715281504674e-04
2-nd step in LS: kiter= 1; kreason=4; kres= 0.000000000000e+00;
2 SNES Function norm 6.154554861599e-11
0 SNES Function norm 1.459106922253e-01
0 KSP Residual norm 1.459106922253e-01
1 KSP Residual norm 1.371836659970e-01
2 KSP Residual norm 2.540661848461e-02
3 KSP Residual norm 6.181889597814e-03
4 KSP Residual norm 1.572134257147e-03
5 KSP Residual norm 2.287065092537e-04
6 KSP Residual norm 3.184241572285e-05
7 KSP Residual norm 4.332199061241e-06
8 KSP Residual norm 8.673786649533e-07
1-st step in LS: kiter= 8; kreason=2; kres= 8.673786649533e-07;
1 SNES Function norm 9.473925324366e-07
0 KSP Residual norm 9.473925324366e-07
1 KSP Residual norm 1.125539671715e-07
2 KSP Residual norm 1.532200439274e-08
3 KSP Residual norm 4.962833769150e-09
4 KSP Residual norm 1.093531357780e-09
5 KSP Residual norm 1.096111146965e-10
6 KSP Residual norm 2.341320410628e-11
7 KSP Residual norm 3.853426393455e-12
2-nd step in LS: kiter= 7; kreason=2; kres= 3.853426393455e-12;
2 SNES Function norm 3.854044974807e-12
Number of Newton iterations = 2
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./ex19 on a cygwin_bl named AMOEBA.no.cox.net with 16 processors, by Friend Wed Apr 22 21:12:05 2009
Using Petsc Release Version 3.0.0, Patch 2, Wed Jan 14 22:57:05 CST 2009
                         Max       Max/Min        Avg      Total
Time (sec):           4.090e+01      1.00345   4.085e+01
Objects:              4.610e+02      1.00000   4.610e+02
Flops:                3.517e+08      1.00787   3.497e+08  5.596e+09
Flops/sec:            8.598e+06      1.00727   8.563e+06  1.370e+08
Memory:               1.730e+07      1.01353              2.744e+08
MPI Messages:         2.664e+03      1.37594   2.313e+03  3.700e+04
MPI Message Lengths:  9.758e+06      1.12054   3.939e+03  1.458e+08
MPI Reductions:       8.862e+01      1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.4873e-01   0.6%  0.0000e+00   0.0%  1.500e+01   0.0%  1.621e-03        0.0%  0.000e+00   0.0%
 1:           SetUp: 2.6383e+00   6.5%  1.3020e+05   0.0%  5.790e+02   1.6%  4.420e-01        0.0%  1.400e+02   9.9%
 2:           Solve: 1.6828e+01  41.2%  2.7978e+09  50.0%  1.792e+04  48.4%  1.969e+03       50.0%  5.030e+02  35.5%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
      ##########################################################
      #                                                        #
      #                      WARNING!!!                        #
      #                                                        #
      #   This code was compiled with a debugging option,     #
      #   To get timing results run config/configure.py       #
      #   using --with-debugging=no, the performance will     #
      #   be generally two or three times faster.             #
      #                                                        #
      ##########################################################
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
PetscBarrier 2 1.0 1.0574e-0111.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 29 0 0 0 0 0
--- Event Stage 1: SetUp
VecSet 3 1.0 3.1624e-0424.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 1 1.0 5.5172e-03790.0 0.00e+00 0.0 3.3e+01 1.3e+02 0.0e+00 0 0 0 0 0 0 0 6 25 0 0
VecScatterEnd 1 1.0 9.1836e-032191.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMultTranspose 1 1.0 2.3881e-02 2.8 5.00e+03 1.1 3.3e+01 1.3e+02 2.0e+00 0 0 0 0 0 1 58 6 25 1 3
MatAssemblyBegin 3 1.0 9.4317e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 3 0 0 0 4 0
MatAssemblyEnd 3 1.0 4.5032e-01 1.0 0.00e+00 0.0 2.6e+02 2.3e+01 3.0e+01 1 0 1 0 2 17 0 45 37 21 0
MatFDColorCreate 2 1.0 7.6469e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+01 2 0 0 0 3 29 0 0 0 29 0
--- Event Stage 2: Solve
VecDot 4 1.0 7.6557e-02 1.8 5.92e+03 1.2 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 1 1
VecMDot 45 1.0 7.7941e-01 2.0 2.17e+05 1.1 0.0e+00 0.0e+00 4.5e+01 1 0 0 0 3 3 0 0 0 9 4
VecNorm 123 1.0 1.7008e+00 1.4 2.69e+05 1.1 0.0e+00 0.0e+00 1.2e+02 3 0 0 0 8 8 0 0 0 23 2
VecScale 77 1.0 1.7737e-03 2.5 8.90e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 734
VecCopy 164 1.0 1.1524e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 250 1.0 2.4710e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAXPY 182 1.0 2.2335e-03 1.8 2.88e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1856
VecAYPX 15 1.0 3.7295e-04 2.5 3.47e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1359
VecWAXPY 4 1.0 6.9087e-0420.1 2.96e+03 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 62
VecMAXPY 77 1.0 2.5467e-03 2.2 3.21e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1845
VecPointwiseMult 2 1.0 1.1175e-05 1.3 6.48e+02 1.3 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 780
VecScatterBegin 233 1.0 1.2532e+00 2.9 0.00e+00 0.0 1.5e+04 1.3e+03 0.0e+00 2 0 40 13 0 5 0 83 25 0 0
VecScatterEnd 233 1.0 3.2011e+00 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 16 0 0 0 0 0
VecReduceArith 4 1.0 6.7606e-05 1.7 5.92e+03 1.2 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1257
VecReduceComm 2 1.0 5.9115e-02 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 60 1.0 9.8417e-01 1.8 2.08e+05 1.1 0.0e+00 0.0e+00 6.0e+01 2 0 0 0 4 4 0 0 0 12 3
MatMult 110 1.0 2.0059e+00 1.6 4.13e+06 1.1 5.0e+03 3.0e+02 0.0e+00 4 1 14 1 0 10 2 28 2 0 30
MatMultAdd 15 1.0 5.0544e-01 9.0 7.50e+04 1.1 5.0e+02 1.3e+02 0.0e+00 1 0 1 0 0 2 0 3 0 0 2
MatMultTranspose 32 1.0 7.8889e-01 1.4 1.60e+05 1.1 1.1e+03 1.3e+02 6.4e+01 2 0 3 0 5 4 0 6 0 13 3
MatSolve 122 1.0 1.4828e-01 1.3 3.11e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 9 0 0 0 1 18 0 0 0 3320
MatLUFactorSym 2 1.0 3.5973e-01 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 1 0 0 0 1 0
MatLUFactorNum 6 1.0 1.6635e+00 2.0 1.37e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 39 0 0 0 8 79 0 0 0 1322
MatILUFactorSym 1 1.0 2.5755e-03 4.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 1 0
MatAssemblyBegin 10 1.0 3.1545e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 1 0 0 0 1 1 0 0 0 2 0
MatAssemblyEnd 10 1.0 7.6078e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 1 0
MatGetRowIJ 3 1.0 9.6632e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 3 1.0 5.7125e-02 6.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 1 0 0 0 0 2 0
MatZeroEntries 6 1.0 3.1484e-04 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatFDColorApply 6 1.0 3.9806e-01 1.7 1.74e+06 1.2 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 2 2 1 0 0 5 62
MatFDColorFunc 126 1.0 4.4332e-03 1.5 1.59e+06 1.2 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 1 0 0 0 5096
MatGetRedundant 4 1.0 3.7212e+00 3.9 0.00e+00 0.0 1.9e+03 2.7e+04 4.0e+00 6 0 5 36 0 14 0 11 72 1 0
SNESSolve 2 1.0 1.6165e+01 1.0 1.76e+08 1.0 1.8e+04 4.1e+03 5.0e+02 40 50 48 50 35 96100 99100100 173
SNESLineSearch 4 1.0 3.4570e-01 1.3 1.98e+05 1.2 3.8e+02 2.4e+02 1.6e+01 1 0 1 0 1 2 0 2 0 3 8
SNESFunctionEval 6 1.0 9.3704e-02 2.9 9.32e+04 1.2 2.9e+02 2.4e+02 2.0e+00 0 0 1 0 0 0 0 2 0 0 14
SNESJacobianEval 4 1.0 5.3182e-01 1.1 1.75e+06 1.2 3.5e+02 2.0e+02 2.8e+01 1 0 1 0 2 3 1 2 0 6 47
KSPGMRESOrthog 45 1.0 7.8182e-01 2.0 4.35e+05 1.1 0.0e+00 0.0e+00 4.5e+01 1 0 0 0 3 3 0 0 0 9 8
KSPSetup 14 1.0 5.7125e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 3 0 0 0 0 0
KSPSolve 4 1.0 1.5273e+01 1.0 1.74e+08 1.0 1.7e+04 4.3e+03 4.5e+02 37 49 46 50 32 91 99 95100 90 181
PCSetUp 6 1.0 6.3247e+00 1.6 1.37e+08 1.0 3.0e+03 1.8e+04 5.5e+01 12 39 8 37 4 30 79 17 75 11 348
PCSetUpOnBlocks 30 1.0 9.0615e-03 2.0 2.80e+05 1.1 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 0 0 0 0 0 1 468
PCApply 17 1.0 6.7339e+00 1.0 3.56e+07 1.0 1.3e+04 1.4e+03 3.7e+02 16 10 36 12 26 40 20 74 25 73 83
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

              Viewer     0              1            336     0

--- Event Stage 1: SetUp

   Distributed array     4              0              0     0
                 Vec    52             18          32088     0
         Vec Scatter    16              0              0     0
           Index Set   112             32          19808     0
   IS L to G Mapping     8              0              0     0
              Matrix    24              0              0     0
  Matrix FD Coloring     4              0              0     0
                SNES     4              0              0     0
       Krylov Solver    10              2           1024     0
      Preconditioner    10              2            664     0
              Viewer     1              0              0     0

--- Event Stage 2: Solve

   Distributed array     0              4          24928     0
                 Vec   136            170        1474968     0
         Vec Scatter     8             24          12960     0
           Index Set    46            126         494032     0
   IS L to G Mapping     0              8          18912     0
              Matrix    10             34       26220032     0
  Matrix FD Coloring     0              4         477456     0
                SNES     0              4           2640     0
       Krylov Solver     6             14          75520     0
      Preconditioner     6             14           6744     0
           Container     4              4            944     0
========================================================================================================================
Average time to get PetscTime(): 3.91111e-06
Average time for MPI_Barrier(): 0.00701251
Average time for zero size MPI_Send(): 0.000452013
#PETSc Option Table entries:
-da_grid_x 33
-da_grid_y 33
-dmmg_grid_sequence
-grashof 1.0E01
-ksp_monitor
-lidvelocity 0
-log_summary
-nlevels 2
-prandtl 1
-snes_monitor
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8
Configure run at: Wed Mar 25 18:05:57 2009
Configure options: --download-f-blas-lapack=1 --download-mpich=1 --with-debugging=1 --with-cc=gcc --with-fc=g77 PETSC_ARCH=cygwin_blas_lapack --useThreads=0 --with-shared=0
-----------------------------------------
Libraries compiled on Wed Mar 25 18:08:45 CDT 2009 on amoeba
Machine characteristics: CYGWIN_NT-5.1 amoeba 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin
Using PETSc directory: /home/Friend/software/petsc-3.0.0-p2
Using PETSc arch: cygwin_blas_lapack
-----------------------------------------
Using C compiler: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3
Using Fortran compiler: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpif77 -Wall -Wno-unused-variable -g
-----------------------------------------
Using include paths: -I/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/include -I/home/Friend/software/petsc-3.0.0-p2/include -I/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/include
------------------------------------------
Using C linker: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3
Using Fortran linker: /home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/bin/mpif77 -Wall -Wno-unused-variable -g
Using libraries: -Wl,-rpath,/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -Wl,-rpath,/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lflapack -lfblas -ldl -L/home/Friend/software/petsc-3.0.0-p2/cygwin_blas_lapack/lib -lpmpich -lmpich -lfrtbegin -lg2c -L/usr/lib/gcc/i686-pc-cygwin/3.4.4 -lcygwin -luser32 -ladvapi32 -lshell32 -lgdi32 -luser32 -ladvapi32 -lkernel32 -ldl
------------------------------------------