ModuleCmd_Switch.c(179):ERROR:152: Module 'PrgEnv-intel/6.0.3' is currently not loaded
Level 1 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 80 x 80 x 9 (57600), size (m) 125. x 125. x 125.
Level 0 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 40 x 40 x 5 (8000), size (m) 250. x 250. x 250.
Solution statistics after solve: Full
CONVERGED_FNORM_RELATIVE: Number of SNES iterations = 7, total linear iterations = 26
|X|_2 3996.69 -2.34727e-12 <= u <= 24.479 -3.10124 <= v <= 3.10124 1.98e-14 <= c <= 24.479
Surface statistics: u in [1.214710e+01, 2.447899e+01] mean 2.010957e+01
Global eta range 2.96251e+10 to 9.2273e+12 converged range 2.96251e+10 to 2.44973e+12
Global beta2 range 1e+100 to 0. converged range 1e+100 to 0.
Wall-clock time: 4.768e+01 seconds
Degrees-of-freedom: 115200
FLOPS: 1.112e+12
L1 misses: 1.186e+11
Intensity: 9.374e+00
Rate: 2.416e+03
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /global/cscratch1/sd/jychang/Icesheet/./ex48cori on a arch-cori-c-opt named nid00315 with 32 processors, by jychang Tue Apr 4 11:24:31 2017 Using Petsc Development GIT revision: v3.7.5-3418-ge372536 GIT Date: 2017-03-30 13:35:15 -0500 Max Max/Min Avg Total Time (sec): 4.815e+01 1.00000 4.815e+01 Objects: 1.640e+02 1.00000 1.640e+02 Flop: 3.475e+10 1.00000 3.475e+10 1.112e+12 Flop/sec: 7.218e+08 1.00000 7.218e+08 2.310e+10 MPI Messages: 4.193e+03 1.00000 4.193e+03 1.342e+05 MPI Message Lengths: 1.384e+07 1.00000 3.301e+03 4.429e+08 MPI Reductions: 4.440e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 4.8143e+01 100.0% 1.1121e+12 100.0% 1.342e+05 100.0% 3.301e+03 100.0% 4.430e+02 99.8% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 7 1.0 1.2879e-03 6.9 5.04e+04 1.0 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 2 0 0 0 0 2 1252 VecMDot 96 1.0 3.9147e-02 1.5 3.22e+06 1.0 0.0e+00 0.0e+00 9.6e+01 0 0 0 0 22 0 0 0 0 22 2630 VecNorm 125 1.0 1.1578e+00 6.1 9.00e+05 1.0 0.0e+00 0.0e+00 1.2e+02 1 0 0 0 28 1 0 0 0 28 25 VecScale 110 1.0 2.3275e-01 1.0 3.96e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 54 VecCopy 61 1.0 6.8545e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 270 1.0 1.9180e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 14 1.0 8.1632e-02 1.0 1.01e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 40 VecAYPX 264 1.0 2.1043e-03 1.4 1.19e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 18066 VecAXPBYCZ 132 1.0 1.2178e-03 1.4 2.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 62432 VecWAXPY 7 1.0 6.8188e-05 1.4 2.52e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11826 VecMAXPY 110 1.0 2.7920e-02 1.0 3.91e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4481 VecPointwiseMult 7 1.0 1.7210e-02 1.0 3.50e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7 VecScatterBegin 471 1.0 9.1160e-02 1.0 0.00e+00 0.0 1.2e+05 1.8e+03 0.0e+00 0 0 93 51 0 0 0 93 51 0 0 VecScatterEnd 471 1.0 7.3683e-02 2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 
0 VecReduceArith 14 1.0 2.0466e-03 1.3 1.01e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1576 VecReduceComm 7 1.0 1.7121e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 2 0 0 0 0 2 0 VecNormalize 110 1.0 1.2481e+00 4.5 1.19e+06 1.0 0.0e+00 0.0e+00 1.1e+02 2 0 0 0 25 2 0 0 0 25 30 MatMult 301 1.0 2.7188e-01 1.1 1.07e+08 1.0 7.7e+04 1.2e+03 0.0e+00 1 0 57 20 0 1 0 57 20 0 12626 MatMultAdd 33 1.0 4.7640e-02 2.8 7.72e+05 1.0 3.2e+03 3.2e+02 0.0e+00 0 0 2 0 0 0 0 2 0 0 519 MatMultTranspose 41 1.0 2.5323e-02 2.2 9.59e+05 1.0 3.9e+03 3.2e+02 0.0e+00 0 0 3 0 0 0 0 3 0 0 1212 MatSolve 33 1.0 1.2690e+00 1.0 7.17e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 18070 MatSOR 275 1.0 2.5031e-01 1.1 8.93e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11419 MatLUFactorSym 1 1.0 3.7103e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatLUFactorNum 7 1.0 3.4719e+01 1.0 3.38e+10 1.0 0.0e+00 0.0e+00 0.0e+00 71 97 0 0 0 71 97 0 0 0 31178 MatCopy 6 1.0 4.1486e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatResidual 33 1.0 1.4773e-02 1.2 1.19e+07 1.0 8.4e+03 1.2e+03 0.0e+00 0 0 6 2 0 0 0 6 2 0 25733 MatAssemblyBegin 19 1.0 5.0304e-02 1.2 0.00e+00 0.0 4.0e+03 5.0e+04 3.4e+01 0 0 3 46 8 0 0 3 46 8 0 MatAssemblyEnd 19 1.0 2.0647e-01 1.1 0.00e+00 0.0 1.2e+03 3.3e+02 2.4e+01 0 0 1 0 5 0 0 1 0 5 0 MatGetRowIJ 1 1.0 1.3759e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 7 1.0 1.3228e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 2 0 0 0 0 2 0 MatGetOrdering 1 1.0 1.7804e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 14 1.0 8.4143e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatRedundantMat 7 1.0 2.4873e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 1 0 0 0 2 1 0 0 0 2 0 MatMPIConcateSeq 7 1.0 7.8219e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMCoarsen 1 1.0 1.4224e-01 1.0 0.00e+00 0.0 1.0e+03 5.3e+01 2.2e+01 0 0 1 
0 5 0 0 1 0 5 0 DMCreateInterp 1 1.0 7.7232e-02 1.0 2.34e+04 1.0 4.8e+02 1.5e+02 2.5e+01 0 0 0 0 6 0 0 0 0 6 10 SNESSolve 1 1.0 4.1031e+01 1.0 3.48e+10 1.0 1.3e+05 3.3e+03 4.0e+02 85100 99100 90 85100 99100 90 27105 SNESFunctionEval 8 1.0 1.6092e-01 1.0 0.00e+00 0.0 4.1e+03 8.6e+02 0.0e+00 0 0 3 1 0 0 0 3 1 0 0 SNESJacobianEval 14 1.0 3.1566e-01 1.0 0.00e+00 0.0 7.6e+03 2.7e+04 2.8e+01 1 0 6 46 6 1 0 6 46 6 0 SNESLineSearch 7 1.0 1.2888e-01 1.0 2.77e+06 1.0 5.4e+03 9.6e+02 2.8e+01 0 0 4 1 6 0 0 4 1 6 688 KSPGMRESOrthog 96 1.0 1.2377e-01 1.1 6.44e+06 1.0 0.0e+00 0.0e+00 9.6e+01 0 0 0 0 22 0 0 0 0 22 1664 KSPSetUp 29 1.0 1.3257e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 3 0 0 0 0 3 0 KSPSolve 7 1.0 4.0327e+01 1.0 3.48e+10 1.0 1.2e+05 2.2e+03 3.5e+02 84100 92 62 79 84100 92 62 79 27576 PCSetUp 7 1.0 3.6120e+01 1.0 3.38e+10 1.0 8.6e+03 6.2e+03 1.2e+02 74 97 6 12 26 74 97 6 12 26 29969 PCApply 33 1.0 5.0917e+00 1.2 9.13e+08 1.0 1.1e+05 2.0e+03 1.5e+02 10 3 80 48 33 10 3 80 48 33 5741 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Toy Hydrostatic Ice 1 1 832 0. Vector 75 75 1862560 0. Vector Scatter 10 10 64880 0. Matrix 15 15 202617752 0. Distributed Mesh 4 4 21744 0. Index Set 27 27 701176 0. IS L to G Mapping 4 4 28000 0. Star Forest Bipartite Graph 8 8 6960 0. Discrete System 4 4 3720 0. SNES 1 1 1488 0. SNESLineSearch 1 1 1040 0. DMSNES 2 2 1504 0. Krylov Solver 5 5 53208 0. DMKSP interface 2 2 1392 0. Preconditioner 4 4 4192 0. Viewer 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 Average time for MPI_Barrier(): 2.24113e-06 Average time for zero size MPI_Send(): 4.372e-05 #PETSc Option Table entries: -M 40 -N 40 -P 5 -da_refine 1 -log_view -pc_type mg -thi_mat_type aij #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: --with-64-bit-indices=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-memalign=64 --with-mpiexec=srun COPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" CXXOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" PETSC_ARCH=arch-cori-c-opt ----------------------------------------- Libraries compiled on Mon Apr 3 16:17:18 2017 on cori04 Machine characteristics: Linux-3.12.60-52.63.1.12215.0.PTF.1017941-default-x86_64-with-SuSE-12-x86_64 Using PETSc directory: /global/homes/j/jychang/Software/petsc Using PETSc arch: arch-cori-c-opt ----------------------------------------- Using C compiler: cc ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: ftn ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -L/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -lpetsc -ldl ----------------------------------------- Level 2 domain 
size (m) 1e+04 x 1e+04 x 1e+03, num elements 160 x 160 x 17 (435200), size (m) 62.5 x 62.5 x 62.5
Level 1 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 80 x 80 x 9 (57600), size (m) 125. x 125. x 125.
Level 0 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 40 x 40 x 5 (8000), size (m) 250. x 250. x 250.
Solution statistics after solve: Full
CONVERGED_FNORM_RELATIVE: Number of SNES iterations = 8, total linear iterations = 29
|X|_2 11109.6 -3.52227e-16 <= u <= 24.5864 -3.11696 <= v <= 3.11696 4.9996e-19 <= c <= 24.5864
Surface statistics: u in [1.222055e+01, 2.458639e+01] mean 2.020187e+01
Global eta range 2.76911e+10 to 9.2273e+12 converged range 2.76911e+10 to 4.45572e+12
Global beta2 range 1e+100 to 0. converged range 1e+100 to 0.
Wall-clock time: 5.842e+01 seconds
Degrees-of-freedom: 870400
FLOPS: 1.331e+12
L1 misses: 1.452e+11
Intensity: 9.167e+00
Rate: 1.490e+04
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /global/cscratch1/sd/jychang/Icesheet/./ex48cori on a arch-cori-c-opt named nid00315 with 32 processors, by jychang Tue Apr 4 11:25:59 2017 Using Petsc Development GIT revision: v3.7.5-3418-ge372536 GIT Date: 2017-03-30 13:35:15 -0500 Max Max/Min Avg Total Time (sec): 5.860e+01 1.00001 5.860e+01 Objects: 2.420e+02 1.00000 2.420e+02 Flop: 4.158e+10 1.00000 4.158e+10 1.331e+12 Flop/sec: 7.096e+08 1.00001 7.096e+08 2.271e+10 MPI Messages: 7.559e+03 1.00000 7.559e+03 2.419e+05 MPI Message Lengths: 5.098e+07 1.00000 6.744e+03 1.631e+09 MPI Reductions: 7.740e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.8592e+01 100.0% 1.3307e+12 100.0% 2.419e+05 100.0% 6.744e+03 100.0% 7.730e+02 99.9% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 8 1.0 1.3297e-03 2.2 4.35e+05 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 10473 VecMDot 189 1.0 1.0654e-01 2.2 3.08e+07 1.0 0.0e+00 0.0e+00 1.9e+02 0 0 0 0 24 0 0 0 0 24 9252 VecNorm 230 1.0 7.8258e-01 4.6 8.36e+06 1.0 0.0e+00 0.0e+00 2.3e+02 1 0 0 0 30 1 0 0 0 30 342 VecScale 213 1.0 9.5624e-02 1.0 3.72e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1244 VecCopy 114 1.0 7.1056e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 453 1.0 9.9604e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 24 1.0 5.1492e-02 1.0 9.28e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 577 VecAYPX 592 1.0 3.6405e-02 1.2 1.14e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 10017 VecAXPBYCZ 296 1.0 2.2356e-02 1.2 2.28e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 32624 VecWAXPY 8 1.0 1.0059e-03 1.1 2.18e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6922 VecMAXPY 213 1.0 1.3508e-02 1.1 3.73e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 88386 VecPointwiseMult 16 1.0 5.4884e-04 3.2 3.28e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1912 VecScatterBegin 923 1.0 1.0712e-01 1.0 0.00e+00 0.0 2.3e+05 2.8e+03 0.0e+00 0 0 94 39 0 0 0 94 39 0 0 VecScatterEnd 923 1.0 1.9820e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 
0 0 0 0 0 0 0 VecReduceArith 16 1.0 2.4767e-03 1.2 8.70e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 11246 VecReduceComm 8 1.0 2.3155e-03 2.6 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 VecNormalize 213 1.0 7.5179e-01 5.4 1.12e+07 1.0 0.0e+00 0.0e+00 2.1e+02 1 0 0 0 28 1 0 0 0 28 475 MatMult 641 1.0 2.3028e+00 1.0 1.06e+09 1.0 1.6e+05 2.8e+03 0.0e+00 4 3 68 28 0 4 3 68 28 0 14663 MatMultAdd 74 1.0 6.7120e-02 2.0 7.53e+06 1.0 7.1e+03 7.2e+02 0.0e+00 0 0 3 0 0 0 0 3 0 0 3588 MatMultTranspose 92 1.0 5.9264e-02 1.8 9.36e+06 1.0 8.8e+03 7.2e+02 0.0e+00 0 0 4 0 0 0 0 4 0 0 5052 MatSolve 37 1.0 1.4223e+00 1.0 8.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 18076 MatSOR 620 1.0 4.1353e+00 1.0 9.32e+08 1.0 0.0e+00 0.0e+00 0.0e+00 7 2 0 0 0 7 2 0 0 0 7215 MatLUFactorSym 1 1.0 3.3888e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatLUFactorNum 8 1.0 4.0083e+01 1.0 3.87e+10 1.0 0.0e+00 0.0e+00 0.0e+00 68 93 0 0 0 68 93 0 0 0 30863 MatCopy 7 1.0 3.7252e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatResidual 74 1.0 2.4422e-01 1.1 1.18e+08 1.0 1.9e+04 2.7e+03 0.0e+00 0 0 8 3 0 0 0 8 3 0 15429 MatAssemblyBegin 31 1.0 1.2390e-01 1.6 0.00e+00 0.0 6.9e+03 1.4e+05 5.8e+01 0 0 3 60 7 0 0 3 60 8 0 MatAssemblyEnd 31 1.0 2.4030e-01 1.1 0.00e+00 0.0 1.9e+03 8.1e+02 4.0e+01 0 0 1 0 5 0 0 1 0 5 0 MatGetRowIJ 1 1.0 1.6043e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 8 1.0 1.1817e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 1 1.0 6.2355e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 24 1.0 6.0331e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatRedundantMat 8 1.0 2.5527e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 1 0 0 0 0 1 0 MatMPIConcateSeq 8 1.0 7.7490e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMCoarsen 2 1.0 6.7571e-02 1.0 0.00e+00 0.0 2.0e+03 1.1e+02 
4.4e+01 0 0 1 0 6 0 0 1 0 6 0 DMCreateInterp 2 1.0 6.1647e-02 1.0 2.03e+05 1.0 9.6e+02 3.4e+02 5.0e+01 0 0 0 0 6 0 0 0 0 6 106 SNESSolve 1 1.0 5.2859e+01 1.0 4.16e+10 1.0 2.4e+05 6.8e+03 7.3e+02 90100 99100 94 90100 99100 94 25174 SNESFunctionEval 9 1.0 3.9097e-01 1.0 0.00e+00 0.0 4.6e+03 3.2e+03 0.0e+00 1 0 2 1 0 1 0 2 1 0 0 SNESJacobianEval 24 1.0 1.4423e+00 1.0 0.00e+00 0.0 1.3e+04 7.6e+04 4.8e+01 2 0 5 61 6 2 0 5 61 6 0 SNESLineSearch 8 1.0 3.7681e-01 1.0 2.48e+07 1.0 6.1e+03 3.5e+03 3.2e+01 1 0 3 1 4 1 0 3 1 4 2102 KSPGMRESOrthog 189 1.0 1.2580e-01 1.9 6.16e+07 1.0 0.0e+00 0.0e+00 1.9e+02 0 0 0 0 24 0 0 0 0 24 15671 KSPSetUp 42 1.0 2.9766e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01 0 0 0 0 4 0 0 0 0 4 0 KSPSolve 8 1.0 5.0948e+01 1.0 4.16e+10 1.0 2.3e+05 3.7e+03 6.8e+02 87100 95 52 87 87100 95 52 87 26103 PCSetUp 8 1.0 4.1274e+01 1.0 3.87e+10 1.0 1.6e+04 1.5e+04 2.1e+02 70 93 7 15 28 70 93 7 15 28 29974 PCApply 37 1.0 1.0180e+01 1.1 2.80e+09 1.0 2.1e+05 2.8e+03 3.7e+02 17 7 85 35 47 17 7 85 35 47 8816 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Toy Hydrostatic Ice 1 1 832 0. Vector 114 114 11165736 0. Vector Scatter 15 15 301320 0. Matrix 24 24 228593048 0. Distributed Mesh 6 6 32576 0. Index Set 37 37 880872 0. IS L to G Mapping 6 6 162480 0. Star Forest Bipartite Graph 12 12 10384 0. Discrete System 6 6 5576 0. SNES 1 1 1488 0. SNESLineSearch 1 1 1040 0. DMSNES 3 3 2296 0. Krylov Solver 7 7 85144 0. DMKSP interface 3 3 2088 0. Preconditioner 5 5 5232 0. Viewer 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 2.19345e-06 Average time for zero size MPI_Send(): 3.95328e-05 #PETSc Option Table entries: -M 40 -N 40 -P 5 -da_refine 2 -log_view -pc_type mg -thi_mat_type aij #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: --with-64-bit-indices=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-memalign=64 --with-mpiexec=srun COPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" CXXOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" PETSC_ARCH=arch-cori-c-opt ----------------------------------------- Libraries compiled on Mon Apr 3 16:17:18 2017 on cori04 Machine characteristics: Linux-3.12.60-52.63.1.12215.0.PTF.1017941-default-x86_64-with-SuSE-12-x86_64 Using PETSc directory: /global/homes/j/jychang/Software/petsc Using PETSc arch: arch-cori-c-opt ----------------------------------------- Using C compiler: cc ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: ftn ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -L/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -lpetsc -ldl ----------------------------------------- Level 3 
domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 320 x 320 x 33 (3379200), size (m) 31.25 x 31.25 x 31.25
Level 2 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 160 x 160 x 17 (435200), size (m) 62.5 x 62.5 x 62.5
Level 1 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 80 x 80 x 9 (57600), size (m) 125. x 125. x 125.
Level 0 domain size (m) 1e+04 x 1e+04 x 1e+03, num elements 40 x 40 x 5 (8000), size (m) 250. x 250. x 250.
Solution statistics after solve: Full
CONVERGED_FNORM_RELATIVE: Number of SNES iterations = 8, total linear iterations = 29
|X|_2 31108.6 -2.39449e-16 <= u <= 24.6134 -3.12097 <= v <= 3.12097 5.83435e-20 <= c <= 24.6134
Surface statistics: u in [1.223876e+01, 2.461340e+01] mean 2.022494e+01
Global eta range 2.6772e+10 to 9.2273e+12 converged range 2.6772e+10 to 6.96406e+12
Global beta2 range 1e+100 to 0. converged range 1e+100 to 0.
Wall-clock time: 1.303e+02 seconds
Degrees-of-freedom: 6758400
FLOPS: 1.813e+12
L1 misses: 2.236e+11
Intensity: 8.108e+00
Rate: 5.188e+04
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- /global/cscratch1/sd/jychang/Icesheet/./ex48cori on a arch-cori-c-opt named nid00315 with 32 processors, by jychang Tue Apr 4 11:28:21 2017 Using Petsc Development GIT revision: v3.7.5-3418-ge372536 GIT Date: 2017-03-30 13:35:15 -0500 Max Max/Min Avg Total Time (sec): 1.308e+02 1.00000 1.308e+02 Objects: 3.200e+02 1.00000 3.200e+02 Flop: 5.666e+10 1.00000 5.666e+10 1.813e+12 Flop/sec: 4.332e+08 1.00000 4.332e+08 1.386e+10 MPI Messages: 1.042e+04 1.00000 1.042e+04 3.334e+05 MPI Message Lengths: 1.907e+08 1.00000 1.830e+04 6.102e+09 MPI Reductions: 1.067e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flop and VecAXPY() for complex vectors of length N --> 8N flop Summary of Stages: ----- Time ------ ----- Flop ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.3068e+02 99.9% 1.8131e+12 100.0% 3.334e+05 100.0% 1.830e+04 100.0% 1.066e+03 99.9% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flop: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flop in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flop over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 8 1.0 1.3568e-02 2.3 3.38e+06 1.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 7970 VecMDot 269 1.0 5.8322e-01 2.3 2.42e+08 1.0 0.0e+00 0.0e+00 2.7e+02 0 0 0 0 25 0 0 0 0 25 13261 VecNorm 318 1.0 1.1728e+00 2.6 6.54e+07 1.0 0.0e+00 0.0e+00 3.2e+02 1 0 0 0 30 1 0 0 0 30 1784 VecScale 301 1.0 9.5400e-02 1.0 2.91e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9765 VecCopy 159 1.0 7.6090e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 604 1.0 7.3201e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 32 1.0 3.8205e-02 1.1 7.25e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 6074 VecAYPX 888 1.0 3.2668e-01 1.4 8.95e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8771 VecAXPBYCZ 444 1.0 2.0251e-01 1.2 1.79e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 28298 VecWAXPY 8 1.0 1.2123e-02 1.1 1.69e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4460 VecMAXPY 301 1.0 4.5411e-01 1.1 2.93e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 20622 VecPointwiseMult 24 1.0 2.1911e-02 1.0 2.50e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 366 VecScatterBegin 1316 1.0 3.5330e-01 1.1 0.00e+00 0.0 3.1e+05 6.6e+03 0.0e+00 0 0 94 34 0 0 0 94 34 0 0 VecScatterEnd 1316 1.0 9.0839e-01 2.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 
0 0 0 0 0 0 0 0 VecReduceArith 16 1.0 1.2193e-02 1.1 6.76e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 17737 VecReduceComm 8 1.0 1.5723e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 1 0 0 0 0 1 0 VecNormalize 301 1.0 9.4121e-01 4.3 8.73e+07 1.0 0.0e+00 0.0e+00 3.0e+02 0 0 0 0 28 0 0 0 0 28 2969 MatMult 943 1.0 1.8678e+01 1.0 8.46e+09 1.0 2.4e+05 7.5e+03 0.0e+00 14 15 72 30 0 14 15 72 30 0 14488 MatMultAdd 111 1.0 2.1049e-01 1.3 5.97e+07 1.0 1.1e+04 1.9e+03 0.0e+00 0 0 3 0 0 0 0 3 0 0 9082 MatMultTranspose 138 1.0 3.2180e-01 2.1 7.43e+07 1.0 1.3e+04 1.9e+03 0.0e+00 0 0 4 0 0 0 0 4 0 0 7386 MatSolve 37 1.0 1.4498e+00 1.0 8.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 17733 MatSOR 930 1.0 3.1305e+01 1.0 7.69e+09 1.0 0.0e+00 0.0e+00 0.0e+00 24 14 0 0 0 24 14 0 0 0 7859 MatLUFactorSym 1 1.0 3.4306e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 8 1.0 3.9910e+01 1.0 3.87e+10 1.0 0.0e+00 0.0e+00 0.0e+00 30 68 0 0 0 30 68 0 0 0 30997 MatCopy 7 1.0 2.5133e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatResidual 111 1.0 2.1072e+00 1.1 9.45e+08 1.0 2.8e+04 7.2e+03 0.0e+00 2 2 9 3 0 2 2 9 3 0 14346 MatAssemblyBegin 41 1.0 3.9748e-01 1.6 0.00e+00 0.0 9.2e+03 4.3e+05 7.8e+01 0 0 3 65 7 0 0 3 65 7 0 MatAssemblyEnd 41 1.0 8.3059e-01 1.1 0.00e+00 0.0 2.6e+03 2.3e+03 5.6e+01 1 0 1 0 5 1 0 1 0 5 0 MatGetRowIJ 1 1.0 9.3739e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatCreateSubMats 8 1.0 1.7377e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 1 0 0 0 0 1 0 MatGetOrdering 1 1.0 4.0094e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 32 1.0 4.6876e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatRedundantMat 8 1.0 3.1657e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01 0 0 0 0 1 0 0 0 0 1 0 MatMPIConcateSeq 8 1.0 8.2272e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DMCoarsen 3 1.0 7.6960e-02 1.0 0.00e+00 0.0 
3.1e+03 2.6e+02 6.6e+01 0 0 1 0 6 0 0 1 0 6 0 DMCreateInterp 3 1.0 1.7092e-01 1.0 1.61e+06 1.0 1.4e+03 8.7e+02 7.5e+01 0 0 0 0 7 0 0 0 0 7 302 SNESSolve 1 1.0 1.2485e+02 1.0 5.67e+10 1.0 3.3e+05 1.8e+04 1.0e+03 95100100100 96 96100100100 96 14522 SNESFunctionEval 9 1.0 2.5234e+00 1.0 0.00e+00 0.0 4.6e+03 1.2e+04 0.0e+00 2 0 1 1 0 2 0 1 1 0 0 SNESJacobianEval 32 1.0 1.0044e+01 1.0 0.00e+00 0.0 1.7e+04 2.3e+05 6.4e+01 8 0 5 66 6 8 0 5 66 6 0 SNESLineSearch 8 1.0 1.2265e+01 1.0 1.96e+08 1.0 6.1e+03 1.3e+04 3.2e+01 9 0 2 1 3 9 0 2 1 3 511 KSPGMRESOrthog 269 1.0 9.6511e-01 1.5 4.83e+08 1.0 0.0e+00 0.0e+00 2.7e+02 1 1 0 0 25 1 1 0 0 25 16027 KSPSetUp 51 1.0 4.6217e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.4e+01 0 0 0 0 4 0 0 0 0 4 0 KSPSolve 8 1.0 1.0284e+02 1.0 5.65e+10 1.0 3.2e+05 9.3e+03 9.7e+02 79100 96 49 91 79100 96 49 91 17568 PCSetUp 8 1.0 4.2533e+01 1.0 3.87e+10 1.0 2.4e+04 4.3e+04 3.1e+02 32 68 7 17 29 32 68 7 17 29 29097 PCApply 37 1.0 5.9487e+01 1.0 1.70e+10 1.0 2.9e+05 6.4e+03 5.6e+02 45 30 87 30 53 45 30 87 30 53 9171 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Toy Hydrostatic Ice 1 1 832 0. Vector 153 153 82555584 0. Vector Scatter 20 20 2048160 0. Matrix 33 33 433469512 0. Distributed Mesh 8 8 43408 0. Index Set 47 47 1971512 0. IS L to G Mapping 8 8 1100672 0. Star Forest Bipartite Graph 16 16 13808 0. Discrete System 8 8 7432 0. SNES 1 1 1488 0. SNESLineSearch 1 1 1040 0. DMSNES 4 4 3088 0. Krylov Solver 9 9 117080 0. DMKSP interface 4 4 2784 0. Preconditioner 6 6 6272 0. Viewer 1 0 0 0. 
======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 2.19345e-06 Average time for zero size MPI_Send(): 4.00916e-05 #PETSc Option Table entries: -M 40 -N 40 -P 5 -da_refine 3 -log_view -pc_type mg -thi_mat_type aij #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 8 Configure options: --with-64-bit-indices=1 --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-memalign=64 --with-mpiexec=srun COPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" CXXOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512" PETSC_ARCH=arch-cori-c-opt ----------------------------------------- Libraries compiled on Mon Apr 3 16:17:18 2017 on cori04 Machine characteristics: Linux-3.12.60-52.63.1.12215.0.PTF.1017941-default-x86_64-with-SuSE-12-x86_64 Using PETSc directory: /global/homes/j/jychang/Software/petsc Using PETSc arch: arch-cori-c-opt ----------------------------------------- Using C compiler: cc ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: ftn ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/include -I/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/include ----------------------------------------- Using C linker: cc Using Fortran linker: ftn Using libraries: -Wl,-rpath,/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -L/global/homes/j/jychang/Software/petsc/arch-cori-c-opt/lib -lpetsc -ldl -----------------------------------------
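
The three runs above (-da_refine 1, 2, 3) each report an "Intensity" and a "Rate" alongside the PETSc "Flop/sec" total. The log does not define these quantities. The sketch below assumes Intensity = FLOPS / L1 misses, Rate = Degrees-of-freedom / Wall-clock time, and Flop/sec = total Flop / max Time (sec), then checks those guesses against the printed values. All numbers are copied from the logs; the Run class, the field names, and the derived() helper are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """Values copied from one run's output; field names are illustrative."""
    refine: int                 # -da_refine value from the option table
    wall_s: float               # "Wall-clock time" (seconds)
    dofs: int                   # "Degrees-of-freedom"
    flops: float                # "FLOPS"
    l1_misses: float            # "L1 misses"
    reported_intensity: float   # "Intensity"
    reported_rate: float        # "Rate"
    petsc_time_s: float         # PETSc summary "Time (sec)", Max column
    reported_flop_per_s: float  # PETSc summary "Flop/sec", Total column


RUNS = [
    Run(1, 4.768e+01, 115200, 1.112e+12, 1.186e+11, 9.374e+00, 2.416e+03,
        4.815e+01, 2.310e+10),
    Run(2, 5.842e+01, 870400, 1.331e+12, 1.452e+11, 9.167e+00, 1.490e+04,
        5.860e+01, 2.271e+10),
    Run(3, 1.303e+02, 6758400, 1.813e+12, 2.236e+11, 8.108e+00, 5.188e+04,
        1.308e+02, 1.386e+10),
]


def rel_err(computed: float, reported: float) -> float:
    """Relative difference between a recomputed and a reported value."""
    return abs(computed - reported) / abs(reported)


def derived(run: Run) -> dict:
    """Recompute the derived metrics under the assumed definitions."""
    return {
        "intensity": run.flops / run.l1_misses,      # assumed: flop per L1 miss
        "rate": run.dofs / run.wall_s,               # assumed: DOF per second
        "flop_per_s": run.flops / run.petsc_time_s,  # assumed: total flop / max time
    }


if __name__ == "__main__":
    for run in RUNS:
        d = derived(run)
        print(
            f"-da_refine {run.refine}: "
            f"intensity {d['intensity']:.3f} (reported {run.reported_intensity:.3f}), "
            f"rate {d['rate']:.4g} (reported {run.reported_rate:.4g}), "
            f"flop/s {d['flop_per_s']:.4g} (reported {run.reported_flop_per_s:.4g})"
        )
```

Under these assumptions the recomputed values agree with the printed ones to within the four significant digits the log reports, which supports the assumed definitions but does not confirm them. The same arithmetic on the per-event rows (for example SNESSolve in the first run: 32 processes times 3.48e+10 flop over 4.1031e+01 seconds) gives roughly 2.71e+04 Mflop/s, close to the reported 27105, which suggests the scaling in the "Total Mflop/s" legend is effectively 1e-6 rather than the printed "10e-6".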