8 processors

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a arch-linu named n12-68 with 8 processors, by wtay Thu Jan 5 16:34:46 2012
Using Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011

                         Max       Max/Min        Avg      Total
Time (sec):           2.727e+02      1.00834   2.707e+02
Objects:              2.250e+02      1.00000   2.250e+02
Flops:                2.052e+09      1.02285   2.042e+09  1.634e+10
Flops/sec:            7.587e+06      1.03138   7.543e+06  6.035e+07
MPI Messages:         3.520e+02      2.00000   3.080e+02  2.464e+03
MPI Message Lengths:  5.502e+07      2.00000   1.563e+05  3.851e+08
MPI Reductions:       9.400e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.7073e+02 100.0%  1.6337e+10 100.0%  2.464e+03 100.0%  1.563e+05      100.0%  9.390e+02  99.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
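The legend above names the stage API directly. A minimal sketch of how user code would bracket a phase of interest into its own stage, assuming the PETSc 3.2 C interface (the stage name "Solve" and the surrounding program are illustrative, not taken from the run above):

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscLogStage stage;                      /* handle for a user-defined logging stage */

  PetscInitialize(&argc, &argv, NULL, NULL);

  PetscLogStageRegister("Solve", &stage);   /* name appears in the Summary of Stages table */
  PetscLogStagePush(stage);                 /* events below are charged to the "Solve" stage */
  /* ... e.g. KSPSolve() and other work to be profiled ... */
  PetscLogStagePop();                       /* return to the enclosing stage (Main Stage) */

  PetscFinalize();
  return 0;
}

Without any explicit stages, as in the runs shown here, everything is charged to stage 0, the Main Stage.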
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              168 1.0 4.8071e+00  1.5 7.07e+08 1.0 2.4e+03 1.6e+05 0.0e+00  2 34 95 99  0   2 34 95 99  0  1169
MatSolve             135 1.0 2.9020e+00  2.2 5.55e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 27  0  0  0   1 27  0  0  0  1520
MatLUFactorNum        27 1.0 1.5177e+00  1.2 1.60e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  8  0  0  0   1  8  0  0  0   836
MatILUFactorSym       27 1.0 1.6980e+00  1.7 0.00e+00 0.0 0.0e+00 0.0e+00 2.7e+01  0  0  0  0  3   0  0  0  0  3     0
MatConvert             1 1.0 1.0580e-01  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin      55 1.0 1.6970e+00 15.1 0.00e+00 0.0 0.0e+00 0.0e+00 5.6e+01  0  0  0  0  6   0  0  0  0  6     0
MatAssemblyEnd        55 1.0 1.4480e+00  1.2 0.00e+00 0.0 1.1e+02 4.0e+04 6.0e+01  0  0  5  1  6   0  0  5  1  6     0
MatGetRowIJ           29 1.0 2.2173e-05  2.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice      27 1.0 2.7889e+00  1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02  1  0  0  0 11   1  0  0  0 12     0
MatGetOrdering        27 1.0 1.1115e-01  1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.4e+01  0  0  0  0  6   0  0  0  0  6     0
KSPSetup              55 1.0 4.6352e-02  1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              37 1.0 1.1571e+02  1.0 2.05e+09 1.0 2.4e+03 1.6e+05 5.8e+02 43 100 95 99 61  43 100 95 99 61   141
VecDot               168 1.0 2.1703e+00  4.0 1.10e+08 1.0 0.0e+00 0.0e+00 1.7e+02  0  5  0  0 18   0  5  0  0 18   406
VecDotNorm2           84 1.0 1.6903e+00  8.8 1.10e+08 1.0 0.0e+00 0.0e+00 8.4e+01  0  5  0  0  9   0  5  0  0  9   521
VecNorm              121 1.0 3.0106e+00  6.4 7.93e+07 1.0 0.0e+00 0.0e+00 1.2e+02  1  4  0  0 13   1  4  0  0 13   211
VecCopy               74 1.0 3.0349e-01  1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               316 1.0 6.9309e-01  2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ           168 1.0 1.0517e+00  2.0 2.20e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 11  0  0  0   0 11  0  0  0  1675
VecWAXPY             168 1.0 9.2991e-01  1.7 1.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0  5  0  0  0   947
VecAssemblyBegin      74 1.0 2.8515e-01 87.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.2e+02  0  0  0  0 24   0  0  0  0 24     0
VecAssemblyEnd        74 1.0 7.1287e-05  1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      168 1.0 8.8502e-02  2.1 0.00e+00 0.0 2.4e+03 1.6e+05 0.0e+00  0  0 95 99  0   0  0 95 99  0     0
VecScatterEnd        168 1.0 2.1965e+00 21.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCSetUp               55 1.0 3.5193e+01  1.1 1.60e+08 1.0 0.0e+00 0.0e+00 2.0e+02 13  8  0  0 21  13  8  0  0 22    36
PCSetUpOnBlocks      162 1.0 6.4670e+00  1.7 7.15e+08 1.0 0.0e+00 0.0e+00 8.1e+01  2 35  0  0  9   2 35  0  0  9   878
PCApply              205 1.0 7.2195e+01  1.0 5.55e+08 1.0 0.0e+00 0.0e+00 0.0e+00 26 27  0  0  0  26 27  0  0  0    61
------------------------------------------------------------------------------------------------------------------------
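As a sanity check of the Total Mflop/s column against the formula in the legend: KSPSolve accounts for 100% of the flops (%f = 100), so its flop sum is the global total 1.634e+10 from the header; dividing by its max time gives 1e-6 * 1.634e+10 / 1.1571e+02 ≈ 141 Mflop/s, matching the value in its row.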
Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix    66             51   1572900956     0
       Krylov Solver     7              1         1088     0
              Vector    48             15     26400736     0
      Vector Scatter     4              1         1060     0
           Index Set    92             80     31517408     0
      Preconditioner     7              1         1040     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 3.57628e-06
Average time for zero size MPI_Send(): 4.11272e-06
#PETSc Option Table entries:
-20000
-50000
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Sun Nov 27 15:18:15 2011
Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/ --with-blas-lapack-dir=/opt/intel_xe_2011/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.2-p5_mumps_rel COPTFLAGS=-O3 FOPTFLAGS=-O3 --download-mumps=1 --download-parmetis=1 --download-scalapack=1 --download-blacs=1

16 processors

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./a.out on a arch-linu named n12-75 with 16 processors, by wtay Thu Jan 5 15:45:31 2012
Using Petsc Release Version 3.2.0, Patch 5, Sat Oct 29 13:45:54 CDT 2011

                         Max       Max/Min        Avg      Total
Time (sec):           2.312e+02      1.00201   2.308e+02
Objects:              2.250e+02      1.00000   2.250e+02
Flops:                1.018e+09      1.04717   1.013e+09  1.621e+10
Flops/sec:            4.411e+06      1.04927   4.389e+06  7.022e+07
MPI Messages:         3.520e+02      2.00000   3.300e+02  5.280e+03
MPI Message Lengths:  5.502e+07      2.00000   1.563e+05  8.253e+08
MPI Reductions:       9.400e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.3079e+02 100.0%  1.6207e+10 100.0%  5.280e+03 100.0%  1.563e+05      100.0%  9.390e+02  99.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              168 1.0 3.9638e+00  1.9 3.54e+08 1.1 5.0e+03 1.6e+05 0.0e+00  1 35 95 99  0   1 35 95 99  0  1418
MatSolve             135 1.0 1.7009e+00  2.0 2.72e+08 1.1 0.0e+00 0.0e+00 0.0e+00  0 27  0  0  0   0 27  0  0  0  2542
MatLUFactorNum        27 1.0 2.1348e+00  1.6 7.71e+07 1.1 0.0e+00 0.0e+00 0.0e+00  1  8  0  0  0   1  8  0  0  0   574
MatILUFactorSym       27 1.0 9.6214e-01  1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.7e+01  0  0  0  0  3   0  0  0  0  3     0
MatConvert             1 1.0 5.7336e-02  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin      55 1.0 4.2789e+00 132.1 0.00e+00 0.0 0.0e+00 0.0e+00 5.6e+01  1  0  0  0  6   1  0  0  0  6     0
MatAssemblyEnd        55 1.0 1.1419e+00  1.2 0.00e+00 0.0 2.4e+02 4.0e+04 6.0e+01  0  0  5  1  6   0  0  5  1  6     0
MatGetRowIJ           29 1.0 5.6028e-05  2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice      27 1.0 1.4987e+00  1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+02  0  0  0  0 11   0  0  0  0 12     0
MatGetOrdering        27 1.0 8.0589e-02  1.3 0.00e+00 0.0 0.0e+00 0.0e+00 5.4e+01  0  0  0  0  6   0  0  0  0  6     0
KSPSetup              55 1.0 4.1183e-02  1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve              37 1.0 1.6369e+02  1.0 1.02e+09 1.0 5.0e+03 1.6e+05 5.8e+02 71 100 95 99 61  71 100 95 99 61    99
VecDot               168 1.0 1.2964e+00  2.7 5.51e+07 1.0 0.0e+00 0.0e+00 1.7e+02  0  5  0  0 18   0  5  0  0 18   679
VecDotNorm2           84 1.0 9.5916e-01  4.5 5.51e+07 1.0 0.0e+00 0.0e+00 8.4e+01  0  5  0  0  9   0  5  0  0  9   918
VecNorm              121 1.0 2.7318e+00  4.7 3.96e+07 1.0 0.0e+00 0.0e+00 1.2e+02  1  4  0  0 13   1  4  0  0 13   232
VecCopy               74 1.0 1.4244e-01  2.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               316 1.0 3.3811e-01  2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ           168 1.0 6.8925e-01  2.9 1.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 11  0  0  0   0 11  0  0  0  2556
VecWAXPY             168 1.0 6.3240e-01  2.5 5.51e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0  5  0  0  0  1393
VecAssemblyBegin      74 1.0 1.2071e+00 31.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.2e+02  0  0  0  0 24   0  0  0  0 24     0
VecAssemblyEnd        74 1.0 2.5630e-04  1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      168 1.0 9.6959e-02  3.6 0.00e+00 0.0 5.0e+03 1.6e+05 0.0e+00  0  0 95 99  0   0  0 95 99  0     0
VecScatterEnd        168 1.0 1.8129e+00  6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCSetUp               55 1.0 3.1927e+01  1.1 7.71e+07 1.1 0.0e+00 0.0e+00 2.0e+02 13  8  0  0 21  13  8  0  0 22    38
PCSetUpOnBlocks      162 1.0 5.0030e+00  1.7 3.49e+08 1.1 0.0e+00 0.0e+00 8.1e+01  2 34  0  0  9   2 34  0  0  9  1109
PCApply              205 1.0 1.2626e+02  1.0 2.72e+08 1.1 0.0e+00 0.0e+00 0.0e+00 54 27  0  0  0  54 27  0  0  0    34
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix    66             51    763439196     0
       Krylov Solver     7              1         1088     0
              Vector    48             15     13293536     0
      Vector Scatter     4              1         1060     0
           Index Set    92             80     15788768     0
      Preconditioner     7              1         1040     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 1.27792e-05
Average time for zero size MPI_Send(): 1.65701e-05
#PETSc Option Table entries:
-20000
-50000
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Sun Nov 27 15:18:15 2011
Configure options: --with-mpi-dir=/opt/openmpi-1.5.3/ --with-blas-lapack-dir=/opt/intel_xe_2011/mkl/lib/intel64/ --with-debugging=0 --download-hypre=1 --prefix=/home/wtay/Lib/petsc-3.2-p5_mumps_rel COPTFLAGS=-O3 FOPTFLAGS=-O3 --download-mumps=1 --download-parmetis=1 --download-scalapack=1 --download-blacs=1
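Both summaries were produced by the -log_summary entry in the option tables above (-20000 and -50000 are presumably the application's own parameters, not PETSc options). A hypothetical launch line for the 16-processor case, assuming the mpiexec of the Open MPI 1.5.3 installation named in the configure options:

  mpiexec -n 16 ./a.out -20000 -50000 -log_summary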