Linear solve converged due to CONVERGED_RTOL iterations 7
 1 step time:      3.4139130115509033
 norm1 error:      1.0111411107219927E-006
 norm inf error:   6.3750093157516246E-003

Summary of Memory Usage in PETSc
Maximum (over computational time) process memory:  total 6.4179e+08  max 1.6358e+05  min 1.5422e+05
Current process memory:                            total 6.4179e+08  max 1.6358e+05  min 1.5422e+05

************************************************************************************************************************
***          WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document              ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./test_ksp.exe on a gnu-opt named ¯ÿÿÿ with 4096 processors, by wang11 Tue Oct  4 04:58:50 2016
Using Petsc Development GIT revision: v3.6.3-2059-geab7831  GIT Date: 2016-01-20 10:58:35 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           4.443e+00      1.00317   4.436e+00
Objects:              4.270e+02      1.77917   2.429e+02
Flops:                3.033e+08      1.10481   2.750e+08  1.126e+12
Flops/sec:            6.836e+07      1.10612   6.198e+07  2.539e+11
MPI Messages:         5.868e+03      2.31089   2.611e+03  1.070e+07
MPI Message Lengths:  9.808e+06      1.17191   3.217e+03  3.440e+10
MPI Reductions:       5.270e+02      1.66246

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.4362e+00 100.0%  1.1263e+12 100.0%  1.070e+07 100.0%  3.217e+03      100.0%  3.193e+02  60.6%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase            %F - percent flops in this phase
      %M - percent messages in this phase        %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
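All events in the table that follows are logged under the default "Main Stage". User-defined stages would show up as additional sections of the table; a minimal sketch of how such a stage is set up with the calls named in the legend above (the stage name "MyAssembly" and the bracketed work are illustrative placeholders, not part of this run):

  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscLogStage  stage;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
    /* Register a named stage, then bracket the code to be profiled. */
    ierr = PetscLogStageRegister("MyAssembly", &stage);CHKERRQ(ierr);
    ierr = PetscLogStagePush(stage);CHKERRQ(ierr);
    /* ... work done here is attributed to the "MyAssembly" stage in -log_view ... */
    ierr = PetscLogStagePop();CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }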
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)      Flops                              --- Global ---   --- Stage ---  Total
                   Max Ratio  Max       Ratio  Max      Ratio  Mess   Avg len Reduct  %T %F %M %L %R   %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSidedF           1 1.0 4.8035e-02 10.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1  0  0  0  0    1  0  0  0  0      0
VecTDot                 14 1.0 5.9161e-02  1.5 7.34e+06 1.0 0.0e+00 0.0e+00 1.4e+01   1  3  0  0  3    1  3  0  0  4 508187
VecNorm                  8 1.0 8.5229e-02  7.6 4.19e+06 1.0 0.0e+00 0.0e+00 8.0e+00   1  2  0  0  2    1  2  0  0  3 201573
VecScale                42 2.0 3.1710e-04  2.7 7.80e+04 1.3 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0 795439
VecCopy                  9 1.0 2.9515e-02  1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1  0  0  0  0    1  0  0  0  0      0
VecSet                 245 1.8 1.5988e-02  1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
VecAXPY                 28 1.0 1.0060e-01  1.1 1.47e+07 1.0 0.0e+00 0.0e+00 0.0e+00   2  5  0  0  0    2  5  0  0  0 597681
VecAYPX                 62 1.5 6.3811e-02  1.9 7.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00   1  3  0  0  0    1  3  0  0  0 454328
VecAssemblyBegin         1 1.0 4.8053e-02 10.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1  0  0  0  0    1  0  0  0  0      0
VecAssemblyEnd           1 1.0 1.1683e-04 61.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
VecScatterBegin        252 1.7 3.8301e-02  1.8 0.00e+00 0.0 7.1e+06 3.2e+03 0.0e+00   1  0 67 67  0    1  0 67 67  0      0
VecScatterEnd          252 1.7 2.1207e-01  4.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   4  0  0  0  0    4  0  0  0  0      0
MatMult                 70 1.4 4.7869e-01  1.1 8.72e+07 1.0 2.9e+06 6.5e+03 0.0e+00  10 31 27 55  0   10 31 27 55  0 731256
MatMultAdd              49 1.8 8.4502e-02  1.2 1.44e+07 1.0 8.1e+05 1.2e+03 0.0e+00   2  5  8  3  0    2  5  8  3  0 686166
MatMultTranspose        63 1.6 1.2481e-01  1.5 1.64e+07 1.0 1.0e+06 1.1e+03 0.0e+00   2  6 10  3  0    2  6 10  3  0 530972
MatSolve                 7 0.0 2.2941e-03  0.0 1.68e+06 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0  46851
MatSOR                  98 1.8 7.4330e-01  1.1 8.90e+07 1.0 2.3e+06 8.9e+02 2.2e-01  16 31 21  6  0   16 31 21  6  0 468147
MatLUFactorSym           1 0.0 5.1272e-03  0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
MatLUFactorNum           1 0.0 2.1876e-02  0.0 1.95e+07 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0  57092
MatConvert               1 0.0 2.1219e-04  0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
MatResidual             49 1.8 2.2687e-01  1.2 4.16e+07 1.0 2.4e+06 3.1e+03 0.0e+00   5 14 23 22  0    5 14 23 22  0 719386
MatAssemblyBegin        37 1.6 1.0935e-01  1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01   2  0  0  0  5    2  0  0  0  8      0
MatAssemblyEnd          37 1.6 3.8233e-01  1.0 0.00e+00 0.0 1.4e+06 4.5e+02 8.9e+01   8  0 13  2 17    8  0 13  2 28      0
MatGetRowIJ              1 0.0 1.4997e-04  0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
MatGetSubMatrice         2 2.0 7.9824e-02  4.8 0.00e+00 0.0 2.0e+04 4.3e+03 3.1e+00   1  0  0  0  1    1  0  0  0  1      0
MatGetOrdering           1 0.0 6.4588e-04  0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
MatPtAP                  8 1.6 1.1264e+00  1.0 4.18e+07 1.0 3.0e+06 3.4e+03 8.6e+01  25 15 28 30 16   25 15 28 30 27 147659
MatPtAPSymbolic          8 1.6 5.6683e-01  1.0 0.00e+00 0.0 1.8e+06 4.3e+03 3.5e+01  12  0 16 22  7   12  0 16 22 11      0
MatPtAPNumeric           8 1.6 5.6430e-01  1.0 4.18e+07 1.0 1.2e+06 2.2e+03 5.0e+01  12 15 11  8 10   12 15 11  8 16 294747
MatRedundantMat          1 0.0 1.8220e-03  0.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.2e-02   0  0  0  0  0    0  0  0  0  0      0
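As a worked check of the rightmost column (reader's arithmetic, not part of the log): for the VecTDot row the flop max/min ratio is 1.0, so total flops ~ 4096 ranks x 7.34e+06 flops/rank ~ 3.01e+10; dividing by the max time 5.9161e-02 s gives ~ 5.08e+11 flop/s, i.e. ~ 5.08e+05 Mflop/s, matching the printed 508187.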
MatMPIConcateSeq         1 0.0 6.3102e-02  0.0 0.00e+00 0.0 3.3e+03 1.4e+02 2.3e-01   0  0  0  0  0    0  0  0  0  0      0
MatGetLocalMat           8 1.6 5.3521e-02  1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1  0  0  0  0    1  0  0  0  0      0
MatGetBrAoCol            8 1.6 4.7969e-02  1.4 0.00e+00 0.0 1.4e+06 4.5e+03 0.0e+00   1  0 13 18  0    1  0 13 18  0      0
MatGetSymTrans          16 1.6 1.4296e-02  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0  0  0  0  0    0  0  0  0  0      0
DMCoarsen                7 1.8 6.9678e-02  1.0 0.00e+00 0.0 2.0e+05 6.7e+02 4.5e+01   2  0  2  0  8    2  0  2  0 14      0
DMCreateInterpolation    7 1.8 2.9990e-01  1.0 2.05e+06 1.0 3.5e+05 6.0e+02 6.5e+01   7  1  3  1 12    7  1  3  1 20  27620
KSPSetUp                12 2.0 2.7985e-02  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01   1  0  0  0  3    1  0  0  0  4      0
KSPSolve                 1 1.0 3.4141e+00  1.0 3.03e+08 1.1 1.1e+07 3.2e+03 2.9e+02  77 100 99 98 55  77 100 99 98 91 329900
PCSetUp                  2 2.0 1.7149e+00  1.1 6.34e+07 1.5 3.6e+06 3.0e+03 2.4e+02  36 16 34 31 46   36 16 34 31 75 102547
PCApply                  7 1.0 1.4816e+00  1.0 2.15e+08 1.2 6.9e+06 2.5e+03 1.7e+01  33 68 64 50  3   33 68 64 50  5 515597
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                   Creations   Destructions       Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

                     Vector       177            177       29081536     0.
             Vector Scatter        30             30        2466912     0.
                     Matrix        74             74       79719844     0.
          Matrix Null Space         1              1            592     0.
           Distributed Mesh         9              9          44928     0.
Star Forest Bipartite Graph        18             18          15264     0.
            Discrete System         9              9           7704     0.
                  Index Set        66             66        1578724     0.
          IS L to G Mapping         9              9        1369456     0.
              Krylov Solver        13             13          16200     0.
             DMKSP interface        7              7           4536     0.
             Preconditioner        13             13          12960     0.
                     Viewer         1              0              0     0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 4.80175e-05
Average time for zero size MPI_Send(): 9.98733e-06
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_initial_guess_nonzero yes
-ksp_norm_type unpreconditioned
-ksp_rtol 1e-7
-ksp_type cg
-log_view
-matptap_scalable
-matrap 0
-memory_view
-mg_coarse_ksp_type preonly
-mg_coarse_pc_telescope_reduction_factor 64
-mg_coarse_pc_type telescope
-mg_coarse_telescope_ksp_type preonly
-mg_coarse_telescope_mg_coarse_ksp_type preonly
-mg_coarse_telescope_mg_coarse_pc_type redundant
-mg_coarse_telescope_mg_levels_ksp_max_it 1
-mg_coarse_telescope_mg_levels_ksp_type richardson
-mg_coarse_telescope_pc_mg_galerkin
-mg_coarse_telescope_pc_mg_levels 4
-mg_coarse_telescope_pc_type mg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_type richardson
-N 1024
-options_left 1
-pc_mg_galerkin
-pc_mg_levels 5
-pc_type mg
-ppe_max_iter 20
-px 16
-py 16
-pz 16
#End of PETSc Option Table entries
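The option table describes a CG solve preconditioned by 5-level Galerkin multigrid whose coarse problem is handed to PCTELESCOPE (reducing the 4096-rank communicator by a factor of 64) and solved there by a further 4-level multigrid with a redundant coarse solve. A sketch of how the same database entries could be seeded from source before KSPSetFromOptions() picks them up (a representative subset only; PetscOptionsSetValue is shown with the two-argument signature of the PETSc revision above, and error checking is elided for brevity):

  #include <petscksp.h>

  static PetscErrorCode SetSolverOptions(void)
  {
    PetscOptionsSetValue("-ksp_type", "cg");
    PetscOptionsSetValue("-ksp_rtol", "1e-7");
    PetscOptionsSetValue("-ksp_norm_type", "unpreconditioned");
    PetscOptionsSetValue("-pc_type", "mg");
    PetscOptionsSetValue("-pc_mg_levels", "5");
    PetscOptionsSetValue("-pc_mg_galerkin", NULL);          /* flag option: NULL value */
    PetscOptionsSetValue("-mg_levels_ksp_type", "richardson");
    PetscOptionsSetValue("-mg_levels_ksp_max_it", "1");
    /* coarse grid: gather onto 4096/64 = 64 ranks and run another MG there */
    PetscOptionsSetValue("-mg_coarse_pc_type", "telescope");
    PetscOptionsSetValue("-mg_coarse_pc_telescope_reduction_factor", "64");
    PetscOptionsSetValue("-mg_coarse_telescope_pc_type", "mg");
    PetscOptionsSetValue("-mg_coarse_telescope_pc_mg_levels", "4");
    return 0;
  }

The nested mg_coarse_telescope_ prefixes work because each inner solver appends its own prefix to that of the solver that owns it, so the same options shown here can equally be given on the command line, as they were for this run.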
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2  sizeof(int) 4  sizeof(long) 8  sizeof(void*) 8  sizeof(PetscScalar) 8  sizeof(PetscInt) 4
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-has-attribute-aligned=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib=/opt/acml/5.3.1/gfortran64/lib/libacml.a --COPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC " --FOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC " --CXXOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC " --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --download-hypre="1 " --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " PETSC_ARCH=gnu-opt
-----------------------------------------
Libraries compiled on Tue Feb 16 12:57:46 2016 on h2ologin3
Machine characteristics: Linux-3.0.101-0.46-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/wang11/Sftw/petsc
Using PETSc arch: gnu-opt
-----------------------------------------
Using C compiler: cc -march=bdver1 -O3 -ffast-math -fPIC ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn -march=bdver1 -O3 -ffast-math -fPIC ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/include -I/mnt/a/u/sciteam/wang11/Sftw/petsc/include -I/mnt/a/u/sciteam/wang11/Sftw/petsc/include -I/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/include
-----------------------------------------
Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -L/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -L/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -lsuperlu_dist_4.3 -lHYPRE -lscalapack -Wl,-rpath,/opt/acml/5.3.1/gfortran64/lib -L/opt/acml/5.3.1/gfortran64/lib -lacml -lparmetis -lmetis -lssl -lcrypto -ldl
-----------------------------------------
There is one unused database option. It is:
Option left: name:-ppe_max_iter value: 20
Application 48712749 resources: utime ~20161s, stime ~6093s, Rss ~163584, inblocks ~6310249, outblocks ~4045805
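A closing consistency check on the memory summary at the top (reader's arithmetic, not part of the log): the maximum process memory total of 6.4179e+08 spread over 4096 processes averages ~ 1.567e+05 per process, which falls between the reported per-process min of 1.5422e+05 and max of 1.6358e+05.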