Linear solve converged due to CONVERGED_RTOL iterations 8
 1 step time:    6.5219831466674805
 norm1 error:    3.9698462589050823E-008
 norm inf error: 3.7289930100947552E-003
Summary of Memory Usage in PETSc
Maximum (over computational time) process memory:  total 7.3027e+09  max 1.6857e+05  min 7.6072e+04
Current process memory:                            total 7.3027e+09  max 1.6857e+05  min 7.6072e+04
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./test_ksp.exe on a gnu-opt named ¯ÿÿÿ with 65536 processors, by wang11 Fri Sep 30 21:43:33 2016
Using Petsc Development GIT revision: v3.6.3-2059-geab7831  GIT Date: 2016-01-20 10:58:35 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           7.144e+00      1.00132   7.139e+00
Objects:              3.850e+02      1.58436   2.441e+02
Flops:                2.253e+09    118.53368   3.646e+07  2.389e+12
Flops/sec:            3.155e+08    118.57474   5.106e+06  3.346e+11
MPI Messages:         1.095e+04      4.00640   2.818e+03  1.847e+08
MPI Message Lengths:  2.151e+06      1.36793   5.598e+02  1.034e+11
MPI Reductions:       4.870e+02      1.50774

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 7.1385e+00 100.0%  2.3892e+12 100.0%  1.847e+08 100.0%  5.598e+02      100.0%  3.233e+02  66.4%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                    Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                       Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

BuildTwoSidedF            1 1.0 1.0173e-01 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0       0
VecTDot                  16 1.0 3.3837e-02 1.7 5.24e+05 1.0 0.0e+00 0.0e+00 1.6e+01  0  1  0  0  3   0  1  0  0  5 1015408
VecNorm                   9 1.0 1.7193e-0125.7 2.95e+05 1.0 0.0e+00 0.0e+00 9.0e+00  2  1  0  0  2   2  1  0  0  3  112416
VecScale                 40 1.7 7.2098e-0414.4 1.83e+04 1.4 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 1202045
VecCopy                  10 1.0 1.1430e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
VecSet                  245 1.6 1.8079e-03 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
VecAXPY                  32 1.0 1.0626e-02 4.7 1.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0 6467060
VecAYPX                  63 1.3 3.5503e-03 3.1 5.15e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0 9418763
VecAssemblyBegin          1 1.0 1.0175e-01 4.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0       0
VecAssemblyEnd            1 1.0 2.0599e-04108.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
VecScatterBegin         254 1.5 3.3792e-02 3.5 0.00e+00 0.0 1.3e+08 5.6e+02 0.0e+00  0  0 69 70  0   0  0 69 70  0       0
VecScatterEnd           254 1.5 4.4103e+0029.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 61  0  0  0  0  61  0  0  0  0       0
MatMult                  72 1.3 1.4929e-01 3.1 6.35e+06 1.0 5.2e+07 1.1e+03 0.0e+00  1 17 28 57  0   1 17 28 57  0 2679633
MatMultAdd               48 1.5 2.4098e-02 4.4 1.04e+06 1.0 1.5e+07 2.2e+02 0.0e+00  0  3  8  3  0   0  3  8  3  0 2749821
MatMultTranspose         62 1.4 5.6703e-02 8.7 1.17e+06 1.0 1.8e+07 2.0e+02 0.0e+00  0  3 10  3  0   0  3 10  3  0 1314789
MatSolve                  8 0.0 6.3036e-02 0.0 5.01e+07 0.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  406961
MatSOR                   96 1.5 6.1739e-02 1.8 6.36e+06 1.1 4.1e+07 1.7e+02 1.2e-01  1 16 22  7  0   1 16 22  7  0 6185560
MatLUFactorSym            1 0.0 1.3201e-01 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
MatLUFactorNum            1 0.0 3.0439e+00 0.0 2.18e+09 0.0 0.0e+00 0.0e+00 0.0e+00  0 47  0  0  0   0 47  0  0  0  367132
MatConvert                1 0.0 6.2990e-04 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
MatResidual              48 1.5 1.1101e-01 4.3 3.09e+06 1.1 4.4e+07 5.4e+02 0.0e+00  0  8 24 23  0   0  8 24 23  0 1680246
MatAssemblyBegin         33 1.4 1.8515e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01  2  0  0  0  5   2  0  0  0  7       0
MatAssemblyEnd           33 1.4 5.6487e-01 1.1 0.00e+00 0.0 2.2e+07 8.1e+01 8.8e+01  7  0 12  2 18   7  0 12  2 27       0
MatGetRowIJ               1 0.0 1.1489e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
MatGetSubMatrice          2 2.0 1.7173e-01 4.6 0.00e+00 0.0 3.3e+05 2.7e+02 3.0e+00  1  0  0  0  1   1  0  0  0  1       0
MatGetOrdering            1 0.0 8.0841e-03 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
MatPtAP                   7 1.4 7.2457e-01 1.0 2.71e+06 1.1 4.7e+07 6.0e+02 8.5e+01 10  7 25 27 18  10  7 25 27 26  231076
MatPtAPSymbolic           7 1.4 3.8619e-01 1.0 0.00e+00 0.0 2.8e+07 7.5e+02 3.5e+01  5  0 15 20  7   5  0 15 20 11       0
MatPtAPNumeric            7 1.4 3.4655e-01 1.1 2.71e+06 1.1 1.9e+07 3.9e+02 5.0e+01  5  7 10  7 10   5  7 10  7 16  483142
MatRedundantMat           1 0.0 2.3213e-02 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.1e-02  0  0  0  0  0   0  0  0  0  0       0
MatMPIConcateSeq          1 0.0 1.5960e-01 0.0 0.00e+00 0.0 2.7e+04 4.0e+01 1.2e-01  0  0  0  0  0   0  0  0  0  0       0
MatGetLocalMat            7 1.4 2.6488e-03 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
MatGetBrAoCol             7 1.4 3.1543e-02 2.0 0.00e+00 0.0 2.2e+07 7.7e+02 0.0e+00  0  0 12 16  0   0  0 12 16  0       0
MatGetSymTrans           14 1.4 1.2188e-03 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0       0
DMCoarsen                 6 1.5 1.6075e-01 1.0 0.00e+00 0.0 3.2e+06 1.1e+02 4.4e+01  2  0  2  0  9   2  0  2  0 14       0
DMCreateInterpolation     6 1.5 3.4492e-01 1.0 1.30e+05 1.0 5.5e+06 1.1e+02 6.4e+01  5  0  3  1 13   5  0  3  1 20   24014
KSPSetUp                 11 1.8 5.5076e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01  1  0  0  0  3   1  0  0  0  4       0
KSPSolve                  1 1.0 6.5222e+00 1.0 2.25e+09118.5 1.8e+08 5.5e+02 2.9e+02 91100 99 98 60  91100 99 98 91  366313
PCSetUp                   2 2.0 6.0733e+00 3.5 2.19e+09815.5 5.7e+07 5.2e+02 2.4e+02 25 54 31 28 49  25 54 31 28 74  212935
PCApply                   8 1.0 4.5051e+00 1.0 2.25e+09172.2 1.2e+08 4.4e+02 1.7e+01 63 84 67 53  4  63 84 67 53  5  443673
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   160            160      2256232     0.
      Vector Scatter    27             27       183568     0.
              Matrix    66             66     45702228     0.
   Matrix Null Space     1              1          592     0.
    Distributed Mesh     8              8        39936     0.
Star Forest Bipartite Graph    16       16        13568     0.
     Discrete System     8              8         6848     0.
           Index Set    60             60       242668     0.
   IS L to G Mapping     8              8       109472     0.
       Krylov Solver    12             12        14920     0.
     DMKSP interface     6              6         3888     0.
      Preconditioner    12             12        11984     0.
              Viewer     1              0            0     0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 0.000321198
Average time for zero size MPI_Send(): 1.1118e-05
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_initial_guess_nonzero yes
-ksp_norm_type unpreconditioned
-ksp_rtol 1e-7
-ksp_type cg
-log_view
-matptap_scalable
-matrap 0
-memory_view
-mg_coarse_ksp_type preonly
-mg_coarse_pc_telescope_reduction_factor 128
-mg_coarse_pc_type telescope
-mg_coarse_telescope_ksp_type preonly
-mg_coarse_telescope_mg_coarse_ksp_type preonly
-mg_coarse_telescope_mg_coarse_pc_type redundant
-mg_coarse_telescope_mg_levels_ksp_max_it 1
-mg_coarse_telescope_mg_levels_ksp_type richardson
-mg_coarse_telescope_pc_mg_galerkin
-mg_coarse_telescope_pc_mg_levels 3
-mg_coarse_telescope_pc_type mg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_type richardson
-N 1024
-options_left 1
-pc_mg_galerkin
-pc_mg_levels 5
-pc_type mg
-ppe_max_iter 20
-px 32
-py 32
-pz 64
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --known-level1-dcache-size=16384 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=4 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-memcmp-ok=1 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-int64_t=1 --known-mpi-c-double-complex=1 --known-sdot-returns-double=0 --known-snrm2-returns-double=0 --known-has-attribute-aligned=1 --with-batch="1 " --known-mpi-shared="0 " --known-mpi-shared-libraries=0 --known-memcmp-ok --with-blas-lapack-lib=/opt/acml/5.3.1/gfortran64/lib/libacml.a --COPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC " --FOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC " --CXXOPTFLAGS="-march=bdver1 -O3 -ffast-math -fPIC " --with-x="0 " --with-debugging="0 " --with-clib-autodetect="0 " --with-cxxlib-autodetect="0 " --with-fortranlib-autodetect="0 " --with-shared-libraries="0 " --with-mpi-compilers="1 " --with-cc="cc " --with-cxx="CC " --with-fc="ftn " --download-hypre="1 " --download-blacs="1 " --download-scalapack="1 " --download-superlu_dist="1 " --download-metis="1 " --download-parmetis="1 " PETSC_ARCH=gnu-opt
-----------------------------------------
Libraries compiled on Tue Feb 16 12:57:46 2016 on h2ologin3
Machine characteristics: Linux-3.0.101-0.46-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /mnt/a/u/sciteam/wang11/Sftw/petsc
Using PETSc arch: gnu-opt
-----------------------------------------

Using C compiler: cc -march=bdver1 -O3 -ffast-math -fPIC ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn -march=bdver1 -O3 -ffast-math -fPIC ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/include -I/mnt/a/u/sciteam/wang11/Sftw/petsc/include -I/mnt/a/u/sciteam/wang11/Sftw/petsc/include -I/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -L/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -lpetsc -Wl,-rpath,/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -L/mnt/a/u/sciteam/wang11/Sftw/petsc/gnu-opt/lib -lsuperlu_dist_4.3 -lHYPRE -lscalapack -Wl,-rpath,/opt/acml/5.3.1/gfortran64/lib -L/opt/acml/5.3.1/gfortran64/lib -lacml -lparmetis -lmetis -lssl -lcrypto -ldl
-----------------------------------------
#PETSc Option Table entries:
-ksp_converged_reason
-ksp_initial_guess_nonzero yes
-ksp_norm_type unpreconditioned
-ksp_rtol 1e-7
-ksp_type cg
-log_view
-matptap_scalable
-matrap 0
-memory_view
-mg_coarse_ksp_type preonly
-mg_coarse_pc_telescope_reduction_factor 128
-mg_coarse_pc_type telescope
-mg_coarse_telescope_ksp_type preonly
-mg_coarse_telescope_mg_coarse_ksp_type preonly
-mg_coarse_telescope_mg_coarse_pc_type redundant
-mg_coarse_telescope_mg_levels_ksp_max_it 1
-mg_coarse_telescope_mg_levels_ksp_type richardson
-mg_coarse_telescope_pc_mg_galerkin
-mg_coarse_telescope_pc_mg_levels 3
-mg_coarse_telescope_pc_type mg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_type richardson
-N 1024
-options_left 1
-pc_mg_galerkin
-pc_mg_levels 5
-pc_type mg
-ppe_max_iter 20
-px 32
-py 32
-pz 64
#End of PETSc Option Table entries
There is one unused database option. It is:
Option left: name:-ppe_max_iter value: 20
Application 48685337 resources: utime ~714792s, stime ~77932s, Rss ~168568, inblocks ~73804863, outblocks ~64402237
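
For context, the option table above describes a CG solve with unpreconditioned-norm convergence testing and a 5-level Galerkin multigrid preconditioner whose coarse level is handled by PCTELESCOPE (reduction factor 128) wrapping a further 3-level multigrid with a redundant coarse solve. The sketch below shows, in the PETSc C API, how a solver matching those options could be configured; it is not the original test_ksp source. The helper name solve_pressure is hypothetical, the matrix A and vectors b, x are assumed to be assembled elsewhere, error checking is omitted for brevity, and the multigrid/telescope hierarchy is left to the runtime options from the table rather than hard-coded.

/* Minimal sketch (assumptions noted above), matching the option table:
 *   -ksp_type cg -ksp_rtol 1e-7 -ksp_norm_type unpreconditioned
 *   -ksp_initial_guess_nonzero yes -pc_type mg -pc_mg_levels 5 -pc_mg_galerkin
 *   -mg_coarse_pc_type telescope ...
 */
#include <petscksp.h>

PetscErrorCode solve_pressure(Mat A, Vec b, Vec x)   /* hypothetical helper; A, b, x assembled elsewhere */
{
  KSP ksp;
  PC  pc;

  PetscFunctionBeginUser;
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetType(ksp, KSPCG);                            /* -ksp_type cg */
  KSPSetTolerances(ksp, 1e-7, PETSC_DEFAULT,
                   PETSC_DEFAULT, PETSC_DEFAULT);    /* -ksp_rtol 1e-7 */
  KSPSetNormType(ksp, KSP_NORM_UNPRECONDITIONED);    /* -ksp_norm_type unpreconditioned */
  KSPSetInitialGuessNonzero(ksp, PETSC_TRUE);        /* -ksp_initial_guess_nonzero yes */
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCMG);                               /* -pc_type mg */
  KSPSetFromOptions(ksp);                            /* picks up -pc_mg_levels, -pc_mg_galerkin,
                                                        -mg_levels_* and -mg_coarse_* (telescope) options */
  KSPSolve(ksp, b, x);
  KSPDestroy(&ksp);
  PetscFunctionReturn(0);
}

Keeping the level count, smoother, and coarse-level (telescope) settings on the command line and applying them through KSPSetFromOptions() matches how the run above was driven, and lets the coarse-grid strategy (e.g. the telescope reduction factor) be changed without recompiling.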