ModuleCmd_Switch.c(179):ERROR:152: Module 'PrgEnv-intel/6.0.3' is currently not loaded
Level 1 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 80 x 80 x 9 (57600), size (m) 125. x 125. x 125.
Level 0 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 40 x 40 x 5 (8000), size (m) 250. x 250. x 250.
Solution statistics after solve: Full
CONVERGED_FNORM_RELATIVE: Number of SNES iterations = 7, total linear iterations = 26
|X|_2 3996.69   -3.3156e-11 <= u <= 24.479   -3.10124 <= v <= 3.10124   3.50294e-13 <= c <= 24.479
Surface statistics: u in [1.214710e+01, 2.447899e+01] mean 2.010957e+01
Global eta range 2.96251e+10 to 9.2273e+12 converged range 2.96251e+10 to 2.44973e+12
Global beta2 range 1e+100 to 0. converged range 1e+100 to 0.
Wall-clock time: 1.238e+02 seconds
Degrees-of-freedom: 115200
FLOPS: 2.217e+12
L1 misses: 1.506e+10
Intensity: 1.473e+02
Rate: 9.308e+02
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex48cori on a arch-cori-c-opt named nid11247 with 64 processors, by jychang Tue Apr 4 11:23:23 2017
Using Petsc Development GIT revision: v3.7.5-3418-ge372536  GIT Date: 2017-03-30 13:35:15 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           1.247e+02     1.00003   1.247e+02
Objects:              1.640e+02     1.00000   1.640e+02
Flop:                 3.465e+10     1.00000   3.465e+10  2.217e+12
Flop/sec:             2.777e+08     1.00003   2.777e+08  1.778e+10
MPI Messages:         5.313e+03     1.00000   5.313e+03  3.400e+05
MPI Message Lengths:  1.081e+07     1.00000   2.034e+03  6.916e+08
MPI Reductions:       4.440e+02     1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.2473e+02 100.0%  2.2174e+12 100.0%  3.400e+05 100.0%  2.034e+03      100.0%  4.430e+02  99.8%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event             Count     Time (sec)      Flop                                  --- Global ---       --- Stage ---       Total
                    Max Ratio  Max     Ratio   Max  Ratio  Mess Avg len  Reduct  %T  %F  %M  %L  %R   %T  %F  %M  %L  %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot                7 1.0 3.2120e-03 1.9 2.52e+04 1.0 0.0e+00 0.0e+00 7.0e+00   0   0   0   0   2    0   0   0   0   2    502
VecMDot              96 1.0 1.6959e-01 1.3 1.61e+06 1.0 0.0e+00 0.0e+00 9.6e+01   0   0   0   0  22    0   0   0   0  22    607
VecNorm             125 1.0 2.7249e+00 4.6 4.50e+05 1.0 0.0e+00 0.0e+00 1.2e+02   2   0   0   0  28    2   0   0   0  28     11
VecScale            110 1.0 4.0441e-01 1.0 1.98e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0     31
VecCopy              61 1.0 3.2041e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecSet              270 1.0 1.6516e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecAXPY              14 1.0 3.0682e-01 1.1 5.04e+04 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0     11
VecAYPX             264 1.0 6.6946e-03 1.4 5.94e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   5679
VecAXPBYCZ          132 1.0 5.2545e-03 1.9 1.19e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  14470
VecWAXPY              7 1.0 8.1062e-05 1.5 1.26e+04 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   9948
VecMAXPY            110 1.0 4.2089e-02 1.1 1.95e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   2972
VecPointwiseMult      7 1.0 7.3166e-02 1.0 1.75e+03 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      2
VecScatterBegin     471 1.0 6.2749e-01 1.0 0.00e+00 0.0 3.2e+05 1.3e+03 0.0e+00   0   0  93  58   0    0   0  93  58   0      0
VecScatterEnd       471 1.0 3.4508e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecReduceArith       14 1.0 1.3230e-02 1.4 5.04e+04 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0    244
VecReduceComm         7 1.0 1.7258e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00   0   0   0   0   2    0   0   0   0   2      0
VecNormalize        110 1.0 2.6996e+00 4.9 5.94e+05 1.0 0.0e+00 0.0e+00 1.1e+02   2   0   0   0  25    2   0   0   0  25     14
MatMult             301 1.0 6.8363e-01 1.1 5.36e+07 1.0 1.5e+05 7.9e+02 0.0e+00   1   0  45  18   0    1   0  45  18   0   5021
MatMultAdd           33 1.0 1.3318e-01 1.3 3.86e+05 1.0 6.3e+03 2.2e+02 0.0e+00   0   0   2   0   0    0   0   2   0   0    186
MatMultTranspose     41 1.0 1.7809e-01 2.0 4.80e+05 1.0 7.9e+03 2.2e+02 0.0e+00   0   0   2   0   0    0   0   2   0   0    172
MatSolve             33 1.0 1.5525e+00 1.0 7.17e+08 1.0 0.0e+00 0.0e+00 0.0e+00   1   2   0   0   0    1   2   0   0   0  29540
MatSOR              275 1.0 5.5730e-01 1.1 4.31e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   4952
MatLUFactorSym        1 1.0 2.6266e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   2   0   0   0   0    2   0   0   0   0      0
MatLUFactorNum        7 1.0 7.5081e+01 1.0 3.38e+10 1.0 0.0e+00 0.0e+00 0.0e+00  59  98   0   0   0   59  98   0   0   0  28834
MatCopy               6 1.0 7.6939e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatResidual          33 1.0 3.5461e-02 1.4 5.94e+06 1.0 1.7e+04 7.9e+02 0.0e+00   0   0   5   2   0    0   0   5   2   0  10720
MatAssemblyBegin     19 1.0 3.4786e-01 1.2 0.00e+00 0.0 8.1e+03 3.3e+04 3.4e+01   0   0   2  39   8    0   0   2  39   8      0
MatAssemblyEnd       19 1.0 8.9293e-01 1.0 0.00e+00 0.0 2.4e+03 2.3e+02 2.4e+01   1   0   1   0   5    1   0   1   0   5      0
MatGetRowIJ           1 1.0 9.5682e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatCreateSubMats      7 1.0 3.9239e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01   0   0   0   0   2    0   0   0   0   2      0
MatGetOrdering        1 1.0 5.0396e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatZeroEntries       14 1.0 5.0514e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatRedundantMat       7 1.0 1.4950e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01   1   0   0   0   2    1   0   0   0   2      0
MatMPIConcateSeq      7 1.0 6.6202e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1   0   0   0   0    1   0   0   0   0      0
DMCoarsen             1 1.0 4.1387e-01 1.0 0.00e+00 0.0 2.0e+03 3.8e+01 2.2e+01   0   0   1   0   5    0   0   1   0   5      0
DMCreateInterp        1 1.0 1.0247e+00 1.0 1.17e+04 1.0 9.6e+02 1.0e+02 2.5e+01   1   0   0   0   6    1   0   0   0   6      1
SNESSolve             1 1.0 9.5076e+01 1.0 3.46e+10 1.0 3.4e+05 2.1e+03 4.0e+02  76 100  99 100  90   76 100  99 100  90  23323
SNESFunctionEval      8 1.0 9.6542e-01 1.0 0.00e+00 0.0 8.2e+03 5.9e+02 0.0e+00   1   0   2   1   0    1   0   2   1   0      0
SNESJacobianEval     14 1.0 9.7495e-01 1.1 0.00e+00 0.0 1.5e+04 1.8e+04 2.8e+01   1   0   4  39   6    1   0   4  39   6      0
SNESLineSearch        7 1.0 7.6288e-01 1.0 1.39e+06 1.0 1.1e+04 6.6e+02 2.8e+01   1   0   3   1   6    1   0   3   1   6    116
KSPGMRESOrthog       96 1.0 3.3612e-01 1.1 3.22e+06 1.0 0.0e+00 0.0e+00 9.6e+01   0   0   0   0  22    0   0   0   0  22    613
KSPSetUp             29 1.0 2.5424e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01   0   0   0   0   3    0   0   0   0   3      0
KSPSolve              7 1.0 9.2040e+01 1.0 3.46e+10 1.0 3.2e+05 1.5e+03 3.5e+02  74 100  93  67  79   74 100  93  67  79  24091
PCSetUp               7 1.0 8.3138e+01 1.0 3.38e+10 1.0 2.1e+04 3.7e+03 1.2e+02  65  98   6  11  26   65  98   6  11  26  26040
PCApply              33 1.0 1.0625e+01 1.3 8.13e+08 1.0 2.8e+05 1.3e+03 1.5e+02   8   2  83  55  33    8   2  83  55  33   4900
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                  Creations   Destructions      Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

Toy Hydrostatic Ice                  1              1         832  0.
Vector                              75             75     1196400  0.
Vector Scatter                      10             10       38480  0.
Matrix                              15             15   200751840  0.
Distributed Mesh                     4              4       21744  0.
Index Set                           27             27      685816  0.
IS L to G Mapping                    4              4       16720  0.
Star Forest Bipartite Graph          8              8        6960  0.
Discrete System                      4              4        3720  0.
SNES                                 1              1        1488  0.
SNESLineSearch                       1              1        1040  0.
DMSNES                               2              2        1504  0.
Krylov Solver                        5              5       53208  0.
DMKSP interface                      2              2        1392  0.
Preconditioner                       4              4        4192  0.
Viewer                               1              0           0  0.
========================================================================================================================
Average time to get PetscTime(): 6.19888e-07
Average time for MPI_Barrier(): 6.62804e-06
Average time for zero size MPI_Send(): 0.000123281
#PETSc Option Table entries:
-M 40
-N 40
-P 5
-da_refine 1
-log_view
-pc_type mg
-thi_mat_type aij
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-64-bit-indices=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-memalign=64 --with-mpiexec=srun COPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" CXXOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" PETSC_ARCH=arch-cori-c-opt
-----------------------------------------
Libraries compiled on Mon Apr 3 16:17:18 2017 on cori04
Machine characteristics: Linux-3.12.60-52.63.1.12215.0.PTF.1017941-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /global/homes/j/jychang/Software/petsc
Using PETSc arch: arch-cori-c-opt
-----------------------------------------

Using C compiler: cc ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -L/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -lpetsc -ldl
-----------------------------------------

Level 2 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 160 x 160 x 17 (435200), size (m) 62.5 x 62.5 x 62.5
Level 1 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 80 x 80 x 9 (57600), size (m) 125. x 125. x 125.
Level 0 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 40 x 40 x 5 (8000), size (m) 250. x 250. x 250.
Solution statistics after solve: Full
CONVERGED_FNORM_RELATIVE: Number of SNES iterations = 8, total linear iterations = 29
|X|_2 11109.6   -1.95866e-16 <= u <= 24.5864   -3.11696 <= v <= 3.11696   1.39164e-19 <= c <= 24.5864
Surface statistics: u in [1.222055e+01, 2.458639e+01] mean 2.020187e+01
Global eta range 2.76911e+10 to 9.2273e+12 converged range 2.76911e+10 to 4.45572e+12
Global beta2 range 1e+100 to 0. converged range 1e+100 to 0.
Wall-clock time: 1.199e+02 seconds
Degrees-of-freedom: 870400
FLOPS: 2.593e+12
L1 misses: 1.833e+10
Intensity: 1.415e+02
Rate: 7.257e+03
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex48cori on a arch-cori-c-opt named nid11247 with 64 processors, by jychang Tue Apr 4 11:25:42 2017
Using Petsc Development GIT revision: v3.7.5-3418-ge372536  GIT Date: 2017-03-30 13:35:15 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           1.206e+02     1.00004   1.206e+02
Objects:              2.420e+02     1.00000   2.420e+02
Flop:                 4.051e+10     1.00000   4.051e+10  2.593e+12
Flop/sec:             3.360e+08     1.00004   3.359e+08  2.150e+10
MPI Messages:         8.807e+03     1.00000   8.807e+03  5.636e+05
MPI Message Lengths:  3.580e+07     1.00000   4.065e+03  2.291e+09
MPI Reductions:       7.740e+02     1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.2058e+02 100.0%  2.5929e+12 100.0%  5.636e+05 100.0%  4.065e+03      100.0%  7.730e+02  99.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event             Count     Time (sec)      Flop                                  --- Global ---       --- Stage ---       Total
                    Max Ratio  Max     Ratio   Max  Ratio  Mess Avg len  Reduct  %T  %F  %M  %L  %R   %T  %F  %M  %L  %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot                8 1.0 9.2020e-03 3.1 2.18e+05 1.0 0.0e+00 0.0e+00 8.0e+00   0   0   0   0   1    0   0   0   0   1   1513
VecMDot             189 1.0 1.2104e-01 1.8 1.54e+07 1.0 0.0e+00 0.0e+00 1.9e+02   0   0   0   0  24    0   0   0   0  24   8143
VecNorm             230 1.0 1.2401e+00 3.9 4.18e+06 1.0 0.0e+00 0.0e+00 2.3e+02   0   0   0   0  30    0   0   0   0  30    216
VecScale            213 1.0 2.1293e-01 1.0 1.86e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0    559
VecCopy             114 1.0 3.5626e-02 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecSet              453 1.0 9.1018e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecAXPY              24 1.0 1.4475e-01 1.1 4.64e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0    205
VecAYPX             592 1.0 1.3454e-02 1.2 5.70e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  27105
VecAXPBYCZ          296 1.0 9.6307e-03 1.2 1.14e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  75731
VecWAXPY              8 1.0 4.9782e-04 1.2 1.09e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  13987
VecMAXPY            213 1.0 3.6830e-02 1.1 1.87e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  32416
VecPointwiseMult     16 1.0 8.4910e-03 1.4 1.64e+04 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0    124
VecScatterBegin     923 1.0 6.4633e-01 1.0 0.00e+00 0.0 5.3e+05 1.8e+03 0.0e+00   1   0  94  42   0    1   0  94  42   0      0
VecScatterEnd       923 1.0 3.5276e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecReduceArith       16 1.0 1.5760e-02 1.4 4.35e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   1767
VecReduceComm         8 1.0 1.7638e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00   0   0   0   0   1    0   0   0   0   1      0
VecNormalize        213 1.0 1.2557e+00 3.8 5.58e+06 1.0 0.0e+00 0.0e+00 2.1e+02   1   0   0   0  28    1   0   0   0  28    284
MatMult             641 1.0 1.4887e+00 1.0 5.28e+08 1.0 3.3e+05 1.9e+03 0.0e+00   1   1  58  27   0    1   1  58  27   0  22682
MatMultAdd           74 1.0 5.9382e-02 1.9 3.76e+06 1.0 1.4e+04 4.9e+02 0.0e+00   0   0   3   0   0    0   0   3   0   0   4056
MatMultTranspose     92 1.0 1.8601e-01 2.3 4.68e+06 1.0 1.8e+04 4.9e+02 0.0e+00   0   0   3   0   0    0   0   3   0   0   1610
MatSolve             37 1.0 1.6484e+00 1.0 8.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00   1   2   0   0   0    1   2   0   0   0  31193
MatSOR              620 1.0 2.2525e+00 1.1 4.57e+08 1.0 0.0e+00 0.0e+00 0.0e+00   2   1   0   0   0    2   1   0   0   0  12996
MatLUFactorSym        1 1.0 2.5897e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   2   0   0   0   0    2   0   0   0   0      0
MatLUFactorNum        8 1.0 9.2021e+01 1.0 3.87e+10 1.0 0.0e+00 0.0e+00 0.0e+00  76  95   0   0   0   76  95   0   0   0  26887
MatCopy               7 1.0 3.7559e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatResidual          74 1.0 1.2738e-01 1.1 5.89e+07 1.0 3.8e+04 1.8e+03 0.0e+00   0   0   7   3   0    0   0   7   3   0  29581
MatAssemblyBegin     31 1.0 4.1213e-01 1.2 0.00e+00 0.0 1.4e+04 9.4e+04 5.8e+01   0   0   2  57   7    0   0   2  57   8      0
MatAssemblyEnd       31 1.0 5.4225e-01 1.2 0.00e+00 0.0 3.8e+03 5.5e+02 4.0e+01   0   0   1   0   5    0   0   1   0   5      0
MatGetRowIJ           1 1.0 3.9160e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatCreateSubMats      8 1.0 1.7472e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01   0   0   0   0   1    0   0   0   0   1      0
MatGetOrdering        1 1.0 1.1031e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatZeroEntries       24 1.0 5.7229e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatRedundantMat       8 1.0 6.3932e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01   1   0   0   0   1    1   0   0   0   1      0
MatMPIConcateSeq      8 1.0 3.1772e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
DMCoarsen             2 1.0 7.4563e-02 1.0 0.00e+00 0.0 4.1e+03 7.5e+01 4.4e+01   0   0   1   0   6    0   0   1   0   6      0
DMCreateInterp        2 1.0 1.3233e-01 1.0 1.02e+05 1.0 1.9e+03 2.3e+02 5.0e+01   0   0   0   0   6    0   0   0   0   6     49
SNESSolve             1 1.0 1.1059e+02 1.0 4.05e+10 1.0 5.6e+05 4.1e+03 7.3e+02  92 100  99 100  94   92 100  99 100  94  23446
SNESFunctionEval      9 1.0 1.3480e+00 1.0 0.00e+00 0.0 9.2e+03 2.1e+03 0.0e+00   1   0   2   1   0    1   0   2   1   0      0
SNESJacobianEval     24 1.0 3.4835e+00 1.0 0.00e+00 0.0 2.6e+04 5.0e+04 4.8e+01   3   0   5  57   6    3   0   5  57   6      0
SNESLineSearch        8 1.0 1.2974e+00 1.0 1.24e+07 1.0 1.2e+04 2.4e+03 3.2e+01   1   0   2   1   4    1   0   2   1   4    611
KSPGMRESOrthog      189 1.0 2.0608e-01 1.4 3.08e+07 1.0 0.0e+00 0.0e+00 1.9e+02   0   0   0   0  24    0   0   0   0  24   9566
KSPSetUp             42 1.0 1.3820e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01   0   0   0   0   4    0   0   0   0   4      0
KSPSolve              8 1.0 1.0564e+02 1.0 4.05e+10 1.0 5.4e+05 2.3e+03 6.8e+02  88 100  96  55  87   88 100  96  55  87  24539
PCSetUp               8 1.0 9.7099e+01 1.0 3.87e+10 1.0 3.7e+04 9.1e+03 2.1e+02  80  95   7  15  28   80  95   7  15  28  25482
PCApply              37 1.0 9.3793e+00 1.1 1.80e+09 1.0 4.9e+05 1.8e+03 3.7e+02   7   4  86  38  47    7   4  86  38  47  12250
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                  Creations   Destructions      Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

Toy Hydrostatic Ice                  1              1         832  0.
Vector                             114            114     5900296  0.
Vector Scatter                      15             15      159720  0.
Matrix                              24             24   213760336  0.
Distributed Mesh                     6              6       32576  0.
Index Set                           37             37      789832  0.
IS L to G Mapping                    6              6       87840  0.
Star Forest Bipartite Graph         12             12       10384  0.
Discrete System                      6              6        5576  0.
SNES                                 1              1        1488  0.
SNESLineSearch                       1              1        1040  0.
DMSNES                               3              3        2296  0.
Krylov Solver                        7              7       85144  0.
DMKSP interface                      3              3        2088  0.
Preconditioner                       5              5        5232  0.
Viewer                               1              0           0  0.
========================================================================================================================
Average time to get PetscTime(): 5.00679e-07
Average time for MPI_Barrier(): 6.38962e-06
Average time for zero size MPI_Send(): 0.000109408
#PETSc Option Table entries:
-M 40
-N 40
-P 5
-da_refine 2
-log_view
-pc_type mg
-thi_mat_type aij
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-64-bit-indices=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-memalign=64 --with-mpiexec=srun COPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" CXXOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" PETSC_ARCH=arch-cori-c-opt
-----------------------------------------
Libraries compiled on Mon Apr 3 16:17:18 2017 on cori04
Machine characteristics: Linux-3.12.60-52.63.1.12215.0.PTF.1017941-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /global/homes/j/jychang/Software/petsc
Using PETSc arch: arch-cori-c-opt
-----------------------------------------

Using C compiler: cc ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -L/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -lpetsc -ldl
-----------------------------------------

Level 3 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 320 x 320 x 33 (3379200), size (m) 31.25 x 31.25 x 31.25
Level 2 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 160 x 160 x 17 (435200), size (m) 62.5 x 62.5 x 62.5
Level 1 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 80 x 80 x 9 (57600), size (m) 125. x 125. x 125.
Level 0 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 40 x 40 x 5 (8000), size (m) 250. x 250. x 250.
Solution statistics after solve: Full
CONVERGED_FNORM_RELATIVE: Number of SNES iterations = 8, total linear iterations = 29
|X|_2 31108.6   -8.91817e-18 <= u <= 24.6134   -3.12097 <= v <= 3.12097   1.82295e-21 <= c <= 24.6134
Surface statistics: u in [1.223876e+01, 2.461340e+01] mean 2.022494e+01
Global eta range 2.6772e+10 to 9.2273e+12 converged range 2.6772e+10 to 6.96406e+12
Global beta2 range 1e+100 to 0. converged range 1e+100 to 0.
Wall-clock time: 1.761e+02 seconds
Degrees-of-freedom: 6758400
FLOPS: 3.073e+12
L1 misses: 3.198e+10
Intensity: 9.611e+01
Rate: 3.838e+04
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex48cori on a arch-cori-c-opt named nid11247 with 64 processors, by jychang Tue Apr 4 11:28:52 2017
Using Petsc Development GIT revision: v3.7.5-3418-ge372536  GIT Date: 2017-03-30 13:35:15 -0500

                         Max       Max/Min     Avg       Total
Time (sec):           1.772e+02     1.00002   1.772e+02
Objects:              3.200e+02     1.00000   3.200e+02
Flop:                 4.802e+10     1.00000   4.802e+10  3.073e+12
Flop/sec:             2.711e+08     1.00002   2.711e+08  1.735e+10
MPI Messages:         1.167e+04     1.00000   1.167e+04  7.468e+05
MPI Message Lengths:  1.290e+08     1.00000   1.106e+04  8.259e+09
MPI Reductions:       1.067e+03     1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flop
                            and VecAXPY() for complex vectors of length N --> 8N flop

Summary of Stages:   ----- Time ------  ----- Flop -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.7714e+02 100.0%  3.0735e+12 100.0%  7.468e+05 100.0%  1.106e+04      100.0%  1.066e+03  99.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flop: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flop in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event             Count     Time (sec)      Flop                                  --- Global ---       --- Stage ---       Total
                    Max Ratio  Max     Ratio   Max  Ratio  Mess Avg len  Reduct  %T  %F  %M  %L  %R   %T  %F  %M  %L  %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot                8 1.0 8.1756e-03 2.0 1.69e+06 1.0 0.0e+00 0.0e+00 8.0e+00   0   0   0   0   1    0   0   0   0   1  13226
VecMDot             269 1.0 9.3023e-01 2.4 1.21e+08 1.0 0.0e+00 0.0e+00 2.7e+02   0   0   0   0  25    0   0   0   0  25   8314
VecNorm             318 1.0 3.3969e+00 7.9 3.27e+07 1.0 0.0e+00 0.0e+00 3.2e+02   2   0   0   0  30    2   0   0   0  30    616
VecScale            301 1.0 2.1075e-01 1.1 1.46e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   4420
VecCopy             159 1.0 6.9776e-02 2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecSet              604 1.0 1.6005e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
VecAXPY              32 1.0 1.2502e-01 1.2 3.63e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   1856
VecAYPX             888 1.0 9.0479e-02 1.3 4.48e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  31668
VecAXPBYCZ          444 1.0 6.6206e-02 1.4 8.95e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  86557
VecWAXPY              8 1.0 8.3420e-03 1.1 8.45e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0   6481
VecMAXPY            301 1.0 6.1114e-01 1.1 1.46e+08 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  15324
VecPointwiseMult     24 1.0 3.6499e-02 1.1 1.25e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0    220
VecScatterBegin    1316 1.0 6.5118e-01 1.2 0.00e+00 0.0 7.1e+05 4.1e+03 0.0e+00   0   0  94  35   0    0   0  94  35   0      0
VecScatterEnd      1316 1.0 1.4457e+00 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1   0   0   0   0    1   0   0   0   0      0
VecReduceArith       16 1.0 1.8659e-02 1.4 3.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0  11590
VecReduceComm         8 1.0 2.3512e-02 3.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00   0   0   0   0   1    0   0   0   0   1      0
VecNormalize        301 1.0 3.3408e+00 8.4 4.37e+07 1.0 0.0e+00 0.0e+00 3.0e+02   2   0   0   0  28    2   0   0   0  28    836
MatMult             943 1.0 8.1884e+00 1.1 4.23e+09 1.0 4.8e+05 5.1e+03 0.0e+00   4   9  65  30   0    4   9  65  30   0  33047
MatMultAdd          111 1.0 2.6025e-01 1.5 2.99e+07 1.0 2.1e+04 1.3e+03 0.0e+00   0   0   3   0   0    0   0   3   0   0   7346
MatMultTranspose    138 1.0 5.8439e-01 2.8 3.71e+07 1.0 2.6e+04 1.3e+03 0.0e+00   0   0   4   0   0    0   0   4   0   0   4067
MatSolve             37 1.0 1.7436e+00 1.1 8.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00   1   2   0   0   0    1   2   0   0   0  29492
MatSOR              930 1.0 1.6228e+01 1.0 3.81e+09 1.0 0.0e+00 0.0e+00 0.0e+00   9   8   0   0   0    9   8   0   0   0  15014
MatLUFactorSym        1 1.0 2.8519e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   2   0   0   0   0    2   0   0   0   0      0
MatLUFactorNum        8 1.0 8.6130e+01 1.0 3.87e+10 1.0 0.0e+00 0.0e+00 0.0e+00  47  81   0   0   0   47  81   0   0   0  28726
MatCopy               7 1.0 8.7251e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   5   0   0   0   0    5   0   0   0   0      0
MatResidual         111 1.0 8.9457e-01 1.2 4.72e+08 1.0 5.7e+04 4.8e+03 0.0e+00   0   1   8   3   0    0   1   8   3   0  33792
MatAssemblyBegin     41 1.0 1.7271e+00 1.4 0.00e+00 0.0 1.8e+04 2.9e+05 7.8e+01   1   0   2  64   7    1   0   2  64   7      0
MatAssemblyEnd       41 1.0 1.6833e+00 1.1 0.00e+00 0.0 5.2e+03 1.5e+03 5.6e+01   1   0   1   0   5    1   0   1   0   5      0
MatGetRowIJ           1 1.0 6.5138e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatCreateSubMats      8 1.0 2.4041e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01   0   0   0   0   1    0   0   0   0   1      0
MatGetOrdering        1 1.0 1.8698e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatZeroEntries       32 1.0 1.5066e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0    0   0   0   0   0      0
MatRedundantMat       8 1.0 9.5533e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01   5   0   0   0   1    5   0   0   0   1      0
MatMPIConcateSeq      8 1.0 9.0860e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   5   0   0   0   0    5   0   0   0   0      0
DMCoarsen             3 1.0 5.4627e-01 1.0 0.00e+00 0.0 6.1e+03 1.8e+02 6.6e+01   0   0   1   0   6    0   0   1   0   6      0
DMCreateInterp        3 1.0 3.9989e-01 1.0 8.07e+05 1.0 2.9e+03 5.9e+02 7.5e+01   0   0   0   0   7    0   0   0   0   7    129
SNESSolve             1 1.0 1.6741e+02 1.0 4.80e+10 1.0 7.4e+05 1.1e+04 1.0e+03  94 100 100 100  96   95 100 100 100  96  18359
SNESFunctionEval      9 1.0 5.1997e+00 1.0 0.00e+00 0.0 9.2e+03 8.1e+03 0.0e+00   3   0   1   1   0    3   0   1   1   0      0
SNESJacobianEval     32 1.0 2.4191e+01 1.0 0.00e+00 0.0 3.5e+04 1.5e+05 6.4e+01  14   0   5  65   6   14   0   5  65   6      0
SNESLineSearch        8 1.0 5.0492e+00 1.0 9.78e+07 1.0 1.2e+04 9.0e+03 3.2e+01   3   0   2   1   3    3   0   2   1   3   1240
KSPGMRESOrthog      269 1.0 1.4984e+00 1.6 2.42e+08 1.0 0.0e+00 0.0e+00 2.7e+02   1   1   0   0  25    1   1   0   0  25  10323
KSPSetUp             51 1.0 2.2536e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 4.4e+01   0   0   0   0   4    0   0   0   0   4      0
KSPSolve              8 1.0 1.4036e+02 1.0 4.79e+10 1.0 7.2e+05 5.7e+03 9.7e+02  79 100  97  50  91   79 100  97  50  91  21853
PCSetUp               8 1.0 1.0460e+02 1.0 3.87e+10 1.0 5.1e+04 2.6e+04 3.1e+02  58  81   7  16  29   58  81   7  16  29  23659
PCApply              37 1.0 3.7417e+01 1.1 8.89e+09 1.0 6.6e+05 3.9e+03 5.6e+02  21  19  88  31  53   21  19  88  31  53  15205
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type                  Creations   Destructions      Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

Toy Hydrostatic Ice                  1              1         832  0.
Vector                             153            153    41695424  0.
Vector Scatter                      20             20     1036160  0.
Matrix                              33             33   316236640  0.
Distributed Mesh                     8              8       43408  0.
Index Set                           47             47     1375832  0.
IS L to G Mapping                    8              8      569072  0.
Star Forest Bipartite Graph         16             16       13808  0.
Discrete System                      8              8        7432  0.
SNES                                 1              1        1488  0.
SNESLineSearch                       1              1        1040  0.
DMSNES                               4              4        3088  0.
Krylov Solver                        9              9      117080  0.
DMKSP interface                      4              4        2784  0.
Preconditioner                       6              6        6272  0.
Viewer                               1              0           0  0.
========================================================================================================================
Average time to get PetscTime(): 5.00679e-07
Average time for MPI_Barrier(): 6.62804e-06
Average time for zero size MPI_Send(): 0.00011519
#PETSc Option Table entries:
-M 40
-N 40
-P 5
-da_refine 3
-log_view
-pc_type mg
-thi_mat_type aij
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8
Configure options: --with-64-bit-indices=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-memalign=64 --with-mpiexec=srun COPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" CXXOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" PETSC_ARCH=arch-cori-c-opt
-----------------------------------------
Libraries compiled on Mon Apr 3 16:17:18 2017 on cori04
Machine characteristics: Linux-3.12.60-52.63.1.12215.0.PTF.1017941-default-x86_64-with-SuSE-12-x86_64
Using PETSc directory: /global/homes/j/jychang/Software/petsc
Using PETSc arch: arch-cori-c-opt
-----------------------------------------

Using C compiler: cc ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: ftn ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include
-----------------------------------------

Using C linker: cc
Using Fortran linker: ftn
Using libraries: -Wl,-rpath,/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -L/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -lpetsc -ldl
-----------------------------------------