Strong Scaling Study

/usr/local/u/cekees/BOB/mpirun -np 1 /usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

************************************************************************************************************************
***        WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document                 ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 1 processor, by cekees Thu Apr 11 18:33:08 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           1.239e+03      1.00000   1.239e+03
Objects:              8.000e+01      1.00000   8.000e+01
Flops:                1.987e+12      1.00000   1.987e+12  1.987e+12
Flops/sec:            1.604e+09      1.00000   1.604e+09  1.604e+09
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       1.070e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.2390e+03 100.0%  1.9871e+12 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  1.060e+02  99.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase
      %f - percent flops in this phase
      %M - percent messages in this phase
      %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 1.2379e+03 1.0 1.99e+12 1.0 0.0e+00 0.0e+00 7.2e+01100100  0  0 67 100100  0  0 68  1605
SNESFunctionEval       1 1.0 5.7745e-02 1.0 2.82e+07 1.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  2   0  0  0  0  2   488
SNESJacobianEval       1 1.0 3.9690e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 3.3323e+02 1.0 7.94e+11 1.0 0.0e+00 0.0e+00 0.0e+00 27 40  0  0  0  27 40  0  0  0  2383
VecNorm            10334 1.0 1.7750e+01 1.0 5.30e+10 1.0 0.0e+00 0.0e+00 0.0e+00  1  3  0  0  0   1  3  0  0  0  2985
VecScale           10334 1.0 2.4144e+01 1.0 2.65e+10 1.0 0.0e+00 0.0e+00 0.0e+00  2  1  0  0  0   2  1  0  0  0  1097
VecCopy              334 1.0 1.4562e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               378 1.0 1.1514e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 2.8969e+00 1.0 3.42e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1180
VecMAXPY           10334 1.0 4.7423e+02 1.0 8.45e+11 1.0 0.0e+00 0.0e+00 0.0e+00 38 43  0  0  0  38 43  0  0  0  1783
VecPointwiseMult   10334 1.0 6.2629e+01 1.0 2.65e+10 1.0 0.0e+00 0.0e+00 0.0e+00  5  1  0  0  0   5  1  0  0  0   423
VecScatterBegin        2 1.0 1.0207e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 1.0454e-02 1.0 5.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   490
VecReduceComm          1 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 4.1916e+01 1.0 7.95e+10 1.0 0.0e+00 0.0e+00 0.0e+00  3  4  0  0  0   3  4  0  0  0  1896
MatMult            10333 1.0 3.1976e+02 1.0 2.38e+11 1.0 0.0e+00 0.0e+00 0.0e+00 26 12  0  0  0  26 12  0  0  0   745
MatAssemblyBegin       2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 7.7429e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 7.7890e+02 1.0 1.59e+12 1.0 0.0e+00 0.0e+00 0.0e+00 63 80  0  0  0  63 80  0  0  0  2039
KSPSetUp               1 1.0 3.8952e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  9   0  0  0  0  9     0
KSPSolve               1 1.0 1.2374e+03 1.0 1.99e+12 1.0 0.0e+00 0.0e+00 7.0e+01100100  0  0 65 100100  0  0 66  1606
PCSetUp                1 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 6.2703e+01 1.0 2.65e+10 1.0 0.0e+00 0.0e+00 2.0e+00  5  1  0  0  2   5  1  0  0  2   422
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

              SNES     1              1         1316     0
    SNESLineSearch     1              1          864     0
            DMSNES     1              1          672     0
            Vector    46             46    881809720     0
    Vector Scatter     4              4         2576     0
            Matrix     1              1    194729096     0
  Distributed Mesh     3              3     51278164     0
   Bipartite Graph     6              6         4800     0
         Index Set    10             10     20513272     0
 IS L to G Mapping     3              3     30760176     0
     Krylov Solver     1              1        18368     0
    DMKSP interface    1              1          656     0
    Preconditioner     1              1          832     0
            Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
#PETSc Option Table entries:
-da_grid_x 1601
-da_grid_y 1601
-log_summary
-pc_type jacobi
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Wed Apr 10 18:41:36 2013
Configure options: --with-debugging=0 --with-clanguage=C --with-pic=1 --with-shared-libraries=0 --with-mpi-compilers=1 --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-blas-lapack-dir=/opt/intel/cmkl/10.2.4.032 --download-cmake=1 --download-metis=1 --download-parmetis=1 --download-spooles=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-superlu=1 --download-superlu_dist=1 --download-hypre=1 --PETSC_ARCH=diamond --PETSC_DIR=/usr/local/u/cekees/proteus/externalPackages/petsc-dev --prefix=/usr/local/u/cekees/proteus/diamond
-----------------------------------------
Libraries compiled on Wed Apr 10 18:41:36 2013 on diamond03
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /usr/local/u/cekees/proteus/externalPackages/petsc-dev
Using PETSc arch: diamond
-----------------------------------------
Using C compiler: mpiicc -fPIC -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpiifort -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/include -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/include -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/include -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/include -I/opt/intel/impi/4.0.3.008/intel64/include
-----------------------------------------
Using C linker: mpiicc
Using Fortran linker: mpiifort
Using libraries: -Wl,-rpath,/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -L/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -lpetsc -Wl,-rpath,/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -L/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_4.3 -lsuperlu_dist_3.2 -lHYPRE -Wl,-rpath,/opt/intel/impi/4.0.3.008/intel64/lib -L/opt/intel/impi/4.0.3.008/intel64/lib -Wl,-rpath,/opt/intel/impi/4.0.3/lib64 -L/opt/intel/impi/4.0.3/lib64 -Wl,-rpath,/opt/intel/Compiler/12.1.003/mkl/lib/intel64 -L/opt/intel/Compiler/12.1.003/mkl/lib/intel64 -Wl,-rpath,/usr/local/applic/intel_new/composer_xe_2011_sp1.9.293/compiler/lib/intel64 -L/usr/local/applic/intel_new/composer_xe_2011_sp1.9.293/compiler/lib/intel64 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -Wl,-rpath,/usr/local/u/cekees/proteus/externalPackages/petsc-dev/-Xlinker -lmpigc4 -Wl,-rpath,/opt/intel/mpi-rt/4.0.3 -Wl,-rpath,/opt/intel/cmkl/10.2.4.032 -L/opt/intel/cmkl/10.2.4.032 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lX11 -lparmetis -lmetis -lpthread -lifport -lifcore -lm -lm -lmpigc4 -ldl -lmpi -lmpigf -lmpigi -lpthread -lrt -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------

/usr/local/u/cekees/BOB/mpirun -np 2
/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 2 processors, by cekees Thu Apr 11 18:44:05 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           6.459e+02      1.00000   6.459e+02
Objects:              8.700e+01      1.00000   8.700e+01
Flops:                9.942e+11      1.00125   9.935e+11  1.987e+12
Flops/sec:            1.539e+09      1.00125   1.538e+09  3.077e+09
MPI Messages:         1.034e+04      1.00000   1.034e+04  2.068e+04
MPI Message Lengths:  1.324e+08      1.00000   1.280e+04  2.648e+08
MPI Reductions:       2.047e+04      1.00000

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 6.4588e+02 100.0%  1.9871e+12 100.0%  2.068e+04 100.0%  1.280e+04      100.0%  2.047e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 6.4518e+02 1.0 9.94e+11 1.0 2.1e+04 1.3e+04 2.0e+04100100100100100 100100100100100  3080
SNESFunctionEval       1 1.0 2.8533e-02 1.0 1.41e+07 1.0 2.0e+00 1.3e+04 2.0e+00  0  0  0  0  0   0  0  0  0  0   988
SNESJacobianEval       1 1.0 2.6756e-01 1.0 0.00e+00 0.0 2.0e+00 1.3e+04 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 1.7011e+02 1.0 3.97e+11 1.0 0.0e+00 0.0e+00 1.0e+04 26 40  0  0 49  26 40  0  0 49  4668
VecNorm            10334 1.0 1.3033e+01 1.3 2.65e+10 1.0 0.0e+00 0.0e+00 1.0e+04  2  3  0  0 50   2  3  0  0 50  4065
VecScale           10334 1.0 1.1086e+01 1.0 1.33e+10 1.0 0.0e+00 0.0e+00 0.0e+00  2  1  0  0  0   2  1  0  0  0  2389
VecCopy              334 1.0 6.9249e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 4.1271e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 1.3672e+00 1.0 1.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2501
VecMAXPY           10334 1.0 2.3652e+02 1.0 4.23e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 43  0  0  0  36 43  0  0  0  3574
VecPointwiseMult   10334 1.0 3.2376e+01 1.0 1.33e+10 1.0 0.0e+00 0.0e+00 0.0e+00  5  1  0  0  0   5  1  0  0  0   818
VecScatterBegin    10335 1.0 1.5519e-01 1.1 0.00e+00 0.0 2.1e+04 1.3e+04 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      10335 1.0 1.7190e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 9.7120e-03 1.0 2.56e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   528
VecReduceComm          1 1.0 1.5409e-0346.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 2.4140e+01 1.1 3.98e+10 1.0 0.0e+00 0.0e+00 1.0e+04  3  4  0  0 50   3  4  0  0 50  3292
MatMult            10333 1.0 1.8488e+02 1.0 1.19e+11 1.0 2.1e+04 1.3e+04 0.0e+00 28 12100100  0  28 12100100  0  1289
MatAssemblyBegin       2 1.0 1.0981e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 6.1657e-02 1.0 0.00e+00 0.0 4.0e+00 3.2e+03 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 3.8960e+02 1.0 7.95e+11 1.0 0.0e+00 0.0e+00 1.0e+04 60 80  0  0 49  60 80  0  0 49  4076
KSPSetUp               1 1.0 2.0162e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 6.4488e+02 1.0 9.94e+11 1.0 2.1e+04 1.3e+04 2.0e+04100100100100100 100100100100100  3081
PCSetUp                1 1.0 4.0531e-06 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 3.2416e+01 1.0 1.33e+10 1.0 0.0e+00 0.0e+00 2.0e+00  5  1  0  0  0   5  1  0  0  0   817
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

              SNES     1              1         1316     0
    SNESLineSearch     1              1          864     0
            DMSNES     1              1          672     0
            Vector    48             48    441245800     0
    Vector Scatter     5              5         5300     0
            Matrix     3              3    117956656     0
  Distributed Mesh     3              3     25694184     0
   Bipartite Graph     6              6         4800     0
         Index Set    12             12     10287612     0
 IS L to G Mapping     3              3     15409788     0
     Krylov Solver     1              1        18368     0
    DMKSP interface    1              1          656     0
    Preconditioner     1              1          832     0
            Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 1.43051e-06
Average time for zero size MPI_Send(): 1.70469e-05
-----------------------------------------
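Headline figures like the wall-clock time can be pulled out of a `-log_summary` dump mechanically rather than by eye. The sketch below (not part of the original logs; the sample line is copied from the two-processor run above) parses one summary row into its Max / Max-Min-ratio / Avg fields; the regex assumes only the token order, not the exact column spacing.

```python
import re

# One row copied verbatim from the two-processor summary above.
line = "Time (sec):           6.459e+02      1.00000   6.459e+02"

# -log_summary rows have the shape "<label>: <Max> <Max/Min> <Avg> [<Total>]".
m = re.match(r"Time \(sec\):\s+(\S+)\s+(\S+)\s+(\S+)", line)
max_t, ratio, avg_t = (float(x) for x in m.groups())
print(max_t, ratio, avg_t)  # 645.9 1.0 645.9
```

The same pattern, with the label swapped, works for the Flops, MPI Messages, and MPI Reductions rows.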
/usr/local/u/cekees/BOB/mpirun -np 4 /usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 4 processors, by cekees Thu Apr 11 18:52:22 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           4.898e+02      1.00000   4.898e+02
Objects:              8.700e+01      1.00000   8.700e+01
Flops:                4.974e+11      1.00250   4.968e+11  1.987e+12
Flops/sec:            1.015e+09      1.00250   1.014e+09  4.057e+09
MPI Messages:         2.068e+04      1.00000   2.068e+04  8.274e+04
MPI Message Lengths:  1.325e+08      1.00125   6.401e+03  5.296e+08
MPI Reductions:       2.047e+04      1.00000

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.8982e+02 100.0%  1.9871e+12 100.0%  8.274e+04 100.0%  6.401e+03      100.0%  2.047e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 4.8948e+02 1.0 4.97e+11 1.0 8.3e+04 6.4e+03 2.0e+04100100100100100 100100100100100  4060
SNESFunctionEval       1 1.0 1.5416e-02 1.0 7.06e+06 1.0 8.0e+00 6.4e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0  1829
SNESJacobianEval       1 1.0 1.3525e-01 1.0 0.00e+00 0.0 8.0e+00 6.4e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 1.5644e+02 1.1 1.99e+11 1.0 0.0e+00 0.0e+00 1.0e+04 31 40  0  0 49  31 40  0  0 49  5076
VecNorm            10334 1.0 2.1987e+01 2.5 1.33e+10 1.0 0.0e+00 0.0e+00 1.0e+04  3  3  0  0 50   3  3  0  0 50  2409
VecScale           10334 1.0 7.9417e+00 1.1 6.63e+09 1.0 0.0e+00 0.0e+00 0.0e+00  2  1  0  0  0   2  1  0  0  0  3335
VecCopy              334 1.0 5.9115e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 3.8249e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 1.0724e+00 1.1 8.56e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3188
VecMAXPY           10334 1.0 1.8187e+02 1.1 2.12e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 43  0  0  0  36 43  0  0  0  4648
VecPointwiseMult   10334 1.0 2.7976e+01 1.1 6.63e+09 1.0 0.0e+00 0.0e+00 0.0e+00  5  1  0  0  0   5  1  0  0  0   947
VecScatterBegin    10335 1.0 3.0192e-01 1.3 0.00e+00 0.0 8.3e+04 6.4e+03 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      10335 1.0 2.2017e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 1.0697e-02 1.3 1.28e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   479
VecReduceComm          1 1.0 2.8000e-03139.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 2.9789e+01 1.8 1.99e+10 1.0 0.0e+00 0.0e+00 1.0e+04  4  4  0  0 50   4  4  0  0 50  2668
MatMult            10333 1.0 1.1508e+02 1.1 5.96e+10 1.0 8.3e+04 6.4e+03 0.0e+00 23 12100100  0  23 12100100  0  2070
MatAssemblyBegin       2 1.0 9.8441e-03113.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 3.6127e-02 1.1 0.00e+00 0.0 1.6e+01 1.6e+03 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 3.2716e+02 1.1 3.98e+11 1.0 0.0e+00 0.0e+00 1.0e+04 65 80  0  0 49  65 80  0  0 49  4854
KSPSetUp               1 1.0 1.0361e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 4.8932e+02 1.0 4.97e+11 1.0 8.3e+04 6.4e+03 2.0e+04100100100100100 100100100100100  4061
PCSetUp                1 1.0 6.9141e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 2.8010e+01 1.1 6.63e+09 1.0 0.0e+00 0.0e+00 2.0e+00  5  1  0  0  0   5  1  0  0  0   946
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

              SNES     1              1         1316     0
    SNESLineSearch     1              1          864     0
            DMSNES     1              1          672     0
            Vector    48             48    220810624     0
    Vector Scatter     5              5         5300     0
            Matrix     3              3     59022272     0
  Distributed Mesh     3              3     12878224     0
   Bipartite Graph     6              6         4800     0
         Index Set    12             12      5161228     0
 IS L to G Mapping     3              3      7720212     0
     Krylov Solver     1              1        18368     0
    DMKSP interface    1              1          656     0
    Preconditioner     1              1          832     0
            Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 1.19209e-06
Average time for zero size MPI_Send(): 1.10269e-05
-----------------------------------------
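The "Total" column of the Flops/sec row can be cross-checked against the other summary rows: PETSc computes it as (sum of flops over all processors) / (max time over all processors). The sketch below (added for illustration; the two input numbers are copied from the 4-processor run above) reproduces the reported aggregate rate.

```python
# Totals reported by the 4-processor run above.
total_flops = 1.987e12  # Flops, "Total" column
max_time = 4.898e2      # Time (sec), "Max" column

# Aggregate flop rate, as defined in the summary header.
rate = total_flops / max_time
print(f"{rate:.3e} flops/s")  # 4.057e+09, matching the reported Flops/sec total
```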
/usr/local/u/cekees/BOB/mpirun -np 8 /usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 8 processors, by cekees Thu Apr 11 18:59:19 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           4.090e+02      1.00000   4.090e+02
Objects:              8.700e+01      1.00000   8.700e+01
Flops:                2.490e+11      1.00375   2.484e+11  1.987e+12
Flops/sec:            6.087e+08      1.00375   6.072e+08  4.858e+09
MPI Messages:         3.103e+04      1.50005   2.586e+04  2.068e+05
MPI Message Lengths:  1.655e+08      1.66833   5.120e+03  1.059e+09
MPI Reductions:       2.047e+04      1.00000

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.0904e+02 100.0%  1.9871e+12 100.0%  2.068e+05 100.0%  5.120e+03      100.0%  2.047e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 4.0885e+02 1.0 2.49e+11 1.0 2.1e+05 5.1e+03 2.0e+04100100100100100 100100100100100  4860
SNESFunctionEval       1 1.0 1.0637e-02 1.1 3.53e+06 1.0 2.0e+01 5.1e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0  2651
SNESJacobianEval       1 1.0 7.2171e-02 1.0 0.00e+00 0.0 2.0e+01 5.1e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 1.2672e+02 1.0 9.95e+10 1.0 0.0e+00 0.0e+00 1.0e+04 31 40  0  0 49  31 40  0  0 49  6266
VecNorm            10334 1.0 1.2112e+01 3.3 6.64e+09 1.0 0.0e+00 0.0e+00 1.0e+04  2  3  0  0 50   2  3  0  0 50  4374
VecScale           10334 1.0 7.3663e+00 1.1 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00  2  1  0  0  0   2  1  0  0  0  3596
VecCopy              334 1.0 5.7240e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 3.8571e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 1.0626e+00 1.1 4.28e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  3218
VecMAXPY           10334 1.0 1.5945e+02 1.1 1.06e+11 1.0 0.0e+00 0.0e+00 0.0e+00 38 43  0  0  0  38 43  0  0  0  5302
VecPointwiseMult   10334 1.0 2.5866e+01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00  6  1  0  0  0   6  1  0  0  0  1024
VecScatterBegin    10335 1.0 3.9911e-01 1.8 0.00e+00 0.0 2.1e+05 5.1e+03 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      10335 1.0 2.8134e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 1.5001e-02 1.5 6.42e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   342
VecReduceComm          1 1.0 5.9950e-03318.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 1.9459e+01 1.8 9.96e+09 1.0 0.0e+00 0.0e+00 1.0e+04  4  4  0  0 50   4  4  0  0 50  4084
MatMult            10333 1.0 8.7251e+01 1.0 2.98e+10 1.0 2.1e+05 5.1e+03 0.0e+00 21 12100100  0  21 12100100  0  2730
MatAssemblyBegin       2 1.0 3.9272e-0327.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 2.4849e-02 1.0 0.00e+00 0.0 4.0e+01 1.3e+03 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 2.7419e+02 1.0 1.99e+11 1.0 0.0e+00 0.0e+00 1.0e+04 66 80  0  0 49  66 80  0  0 49  5792
KSPSetUp               1 1.0 5.9462e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 4.0875e+02 1.0 2.49e+11 1.0 2.1e+05 5.1e+03 2.0e+04100100100100100 100100100100100  4861
PCSetUp                1 1.0 3.5048e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 2.5898e+01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 2.0e+00  6  1  0  0  0   6  1  0  0  0  1023
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

                SNES     1              1         1316     0
      SNESLineSearch     1              1          864     0
              DMSNES     1              1          672     0
              Vector    48             48    110586624     0
      Vector Scatter     5              5         5300     0
              Matrix     3              3     29548672     0
    Distributed Mesh     3              3      6462224     0
     Bipartite Graph     6              6         4800     0
           Index Set    12             12      2593228     0
   IS L to G Mapping     3              3      3870612     0
       Krylov Solver     1              1        18368     0
     DMKSP interface     1              1          656     0
      Preconditioner     1              1          832     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 2.00272e-06
Average time for zero size MPI_Send(): 1.0401e-05
#PETSc Option Table entries:
-da_grid_x 1601
-da_grid_y 1601
-log_summary
-pc_type jacobi
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Wed Apr 10 18:41:36 2013
Configure options: --with-debugging=0 --with-clanguage=C --with-pic=1 --with-shared-libraries=0 --with-mpi-compilers=1 --with-cc=mpiicc --with-cxx=mpiicpc --with-fc=mpiifort --with-blas-lapack-dir=/opt/intel/cmkl/10.2.4.032 --download-cmake=1 --download-metis=1 --download-parmetis=1 --download-spooles=1 --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-superlu=1 --download-superlu_dist=1 --download-hypre=1 --PETSC_ARCH=diamond --PETSC_DIR=/usr/local/u/cekees/proteus/externalPackages/petsc-dev --prefix=/usr/local/u/cekees/proteus/diamond
-----------------------------------------
Libraries compiled on Wed Apr 10 18:41:36 2013 on diamond03
Machine characteristics: Linux-2.6.32.59-0.7-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /usr/local/u/cekees/proteus/externalPackages/petsc-dev
Using PETSc arch: diamond
-----------------------------------------
Using C compiler: mpiicc -fPIC -wd1572 -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpiifort -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/include -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/include -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/include -I/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/include -I/opt/intel/impi/4.0.3.008/intel64/include
-----------------------------------------
Using C linker: mpiicc
Using Fortran linker: mpiifort
Using libraries: -Wl,-rpath,/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -L/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -lpetsc -Wl,-rpath,/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -L/usr/local/u/cekees/proteus/externalPackages/petsc-dev/diamond/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lsuperlu_4.3 -lsuperlu_dist_3.2 -lHYPRE -Wl,-rpath,/opt/intel/impi/4.0.3.008/intel64/lib -L/opt/intel/impi/4.0.3.008/intel64/lib -Wl,-rpath,/opt/intel/impi/4.0.3/lib64 -L/opt/intel/impi/4.0.3/lib64 -Wl,-rpath,/opt/intel/Compiler/12.1.003/mkl/lib/intel64 -L/opt/intel/Compiler/12.1.003/mkl/lib/intel64 -Wl,-rpath,/usr/local/applic/intel_new/composer_xe_2011_sp1.9.293/compiler/lib/intel64 -L/usr/local/applic/intel_new/composer_xe_2011_sp1.9.293/compiler/lib/intel64 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -Wl,-rpath,/usr/local/u/cekees/proteus/externalPackages/petsc-dev/-Xlinker -lmpigc4 -Wl,-rpath,/opt/intel/mpi-rt/4.0.3 -Wl,-rpath,/opt/intel/cmkl/10.2.4.032 -L/opt/intel/cmkl/10.2.4.032 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lX11 -lparmetis -lmetis -lpthread -lifport -lifcore -lm -lm -lmpigc4 -ldl -lmpi -lmpigf -lmpigi -lpthread -lrt -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------
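Each run's event table above uses the same fixed -log_summary layout (event name, count, count ratio, max time, time ratio, ...), so timings can be extracted from the raw logs for cross-run comparison. A minimal sketch in Python; the function name and sample line are illustrative, not part of PETSc:

```python
import re

def event_max_time(log_text, event):
    """Pull the 'Max' time (seconds) for one event out of -log_summary text.

    Assumes the usual column order: name, count, count ratio, max time, ...
    Note: when a ratio overflows its column (e.g. '5.9950e-03318.3' above),
    the fields fuse together and this simple pattern will not match.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(event) + r"\s+\d+\s+[\d.]+\s+(\d\.\d+e[+-]\d+)",
        re.MULTILINE,
    )
    match = pattern.search(log_text)
    return float(match.group(1)) if match else None

# For example, against one line of the table above:
line = "SNESSolve              1 1.0 4.0885e+02 1.0 2.49e+11 1.0"
```

Applied to each run's log, this gives the per-event maxima needed for a speedup table without hand-copying numbers.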
/usr/local/u/cekees/BOB/mpirun -np 16 /usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 16 processors, by cekees Thu Apr 11 19:02:49 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           2.014e+02      1.00000   2.014e+02
Objects:              8.700e+01      1.00000   8.700e+01
Flops:                1.246e+11      1.00501   1.242e+11  1.987e+12
Flops/sec:            6.190e+08      1.00501   6.168e+08  9.868e+09
MPI Messages:         4.137e+04      2.00019   3.103e+04  4.964e+05
MPI Message Lengths:  1.323e+08      2.00000   3.200e+03  1.589e+09
MPI Reductions:       2.047e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.0136e+02 100.0%  1.9871e+12 100.0%  4.964e+05 100.0%  3.200e+03      100.0%  2.047e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 2.0125e+02 1.0 1.25e+11 1.0 5.0e+05 3.2e+03 2.0e+04100100100100100 100100100100100  9874
SNESFunctionEval       1 1.0 5.5161e-03 1.1 1.77e+06 1.0 4.8e+01 3.2e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0  5111
SNESJacobianEval       1 1.0 3.5500e-02 1.0 0.00e+00 0.0 4.8e+01 3.2e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 6.3495e+01 1.0 4.98e+10 1.0 0.0e+00 0.0e+00 1.0e+04 31 40  0  0 49  31 40  0  0 49 12506
VecNorm            10334 1.0 4.2022e+00 1.9 3.32e+09 1.0 0.0e+00 0.0e+00 1.0e+04  2  3  0  0 50   2  3  0  0 50 12607
VecScale           10334 1.0 1.9951e+00 1.1 1.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0 13276
VecCopy              334 1.0 2.9505e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 1.5461e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 5.0231e-01 1.1 2.15e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  6807
VecMAXPY           10334 1.0 7.7959e+01 1.0 5.30e+10 1.0 0.0e+00 0.0e+00 0.0e+00 38 43  0  0  0  38 43  0  0  0 10843
VecPointwiseMult   10334 1.0 1.2497e+01 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00  6  1  0  0  0   6  1  0  0  0  2120
VecScatterBegin    10335 1.0 2.6078e-01 1.9 0.00e+00 0.0 5.0e+05 3.2e+03 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      10335 1.0 2.7548e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 1.9505e-02 1.9 3.22e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   263
VecReduceComm          1 1.0 9.5971e-03241.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 6.1634e+00 1.5 4.99e+09 1.0 0.0e+00 0.0e+00 1.0e+04  3  4  0  0 50   3  4  0  0 50 12893
MatMult            10333 1.0 4.2938e+01 1.0 1.49e+10 1.0 5.0e+05 3.2e+03 0.0e+00 21 12100100  0  21 12100100  0  5548
MatAssemblyBegin       2 1.0 2.2342e-03 5.2 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 1.1836e-02 1.0 0.00e+00 0.0 9.6e+01 8.0e+02 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 1.3668e+02 1.0 9.96e+10 1.0 0.0e+00 0.0e+00 1.0e+04 67 80  0  0 49  67 80  0  0 49 11620
KSPSetUp               1 1.0 3.1369e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 2.0118e+02 1.0 1.25e+11 1.0 5.0e+05 3.2e+03 2.0e+04100100100100100 100100100100100  9877
PCSetUp                1 1.0 3.2187e-05 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 1.2521e+01 1.0 1.66e+09 1.0 0.0e+00 0.0e+00 2.0e+00  6  1  0  0  0   6  1  0  0  0  2116
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

                SNES     1              1         1316     0
      SNESLineSearch     1              1          864     0
              DMSNES     1              1          672     0
              Vector    48             48     55402624     0
      Vector Scatter     5              5         5300     0
              Matrix     3              3     14795072     0
    Distributed Mesh     3              3      3246224     0
     Bipartite Graph     6              6         4800     0
           Index Set    12             12      1305228     0
   IS L to G Mapping     3              3      1941012     0
       Krylov Solver     1              1        18368     0
     DMKSP interface     1              1          656     0
      Preconditioner     1              1          832     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 4.19617e-06
Average time for zero size MPI_Send(): 7.31647e-06
#PETSc Option Table entries:
-da_grid_x 1601
-da_grid_y 1601
-log_summary
-pc_type jacobi
#End of PETSc Option Table entries
[Compiler, configure, and library details identical to the first run above.]
-----------------------------------------
/usr/local/u/cekees/BOB/mpirun -np 32 /usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 32 processors, by cekees Thu Apr 11 19:04:37 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           9.912e+01      1.00000   9.912e+01
Objects:              8.700e+01      1.00000   8.700e+01
Flops:                6.248e+10      1.00751   6.210e+10  1.987e+12
Flops/sec:            6.303e+08      1.00752   6.265e+08  2.005e+10
MPI Messages:         4.137e+04      2.00019   3.361e+04  1.076e+06
MPI Message Lengths:  9.923e+07      2.00000   2.462e+03  2.648e+09
MPI Reductions:       2.047e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 9.9117e+01 100.0%  1.9871e+12 100.0%  1.076e+06 100.0%  2.462e+03      100.0%  2.047e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 9.8621e+01 1.0 6.25e+10 1.0 1.1e+06 2.5e+03 2.0e+04 99100100100100  99100100100100 20148
SNESFunctionEval       1 1.0 2.3243e-02 9.6 8.87e+05 1.0 1.0e+02 2.5e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0  1213
SNESJacobianEval       1 1.0 3.4400e-02 1.5 0.00e+00 0.0 1.0e+02 2.5e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 3.3509e+01 1.1 2.50e+10 1.0 0.0e+00 0.0e+00 1.0e+04 33 40  0  0 49  33 40  0  0 49 23697
VecNorm            10334 1.0 3.2441e+00 2.7 1.67e+09 1.0 0.0e+00 0.0e+00 1.0e+04  2  3  0  0 50   2  3  0  0 50 16330
VecScale           10334 1.0 9.3445e-01 1.2 8.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0 28346
VecCopy              334 1.0 1.2666e-01 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 6.7488e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 1.9579e-01 1.2 1.08e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 17464
VecMAXPY           10334 1.0 3.7510e+01 1.1 2.66e+10 1.0 0.0e+00 0.0e+00 0.0e+00 37 43  0  0  0  37 43  0  0  0 22536
VecPointwiseMult   10334 1.0 5.6136e+00 1.4 8.33e+08 1.0 0.0e+00 0.0e+00 0.0e+00  4  1  0  0  0   4  1  0  0  0  4719
VecScatterBegin    10335 1.0 2.0503e-01 1.9 0.00e+00 0.0 1.1e+06 2.5e+03 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      10335 1.0 2.9505e-01 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 8.0973e-0210.7 1.61e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    63
VecReduceComm          1 1.0 1.0539e-012927.3 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 4.1176e+00 2.0 2.50e+09 1.0 0.0e+00 0.0e+00 1.0e+04  3  4  0  0 50   3  4  0  0 50 19299
MatMult            10333 1.0 2.1398e+01 1.1 7.48e+09 1.0 1.1e+06 2.5e+03 0.0e+00 21 12100100  0  21 12100100  0 11133
MatAssemblyBegin       2 1.0 4.2107e-02 3.7 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 1.5741e-02 1.1 0.00e+00 0.0 2.1e+02 6.2e+02 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 6.8207e+01 1.1 4.99e+10 1.0 0.0e+00 0.0e+00 1.0e+04 68 80  0  0 49  68 80  0  0 49 23284
KSPSetUp               1 1.0 8.4319e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 9.8469e+01 1.0 6.25e+10 1.0 1.1e+06 2.5e+03 2.0e+04 99100100100100  99100100100100 20179
PCSetUp                1 1.0 9.2449e-03440.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 5.6340e+00 1.4 8.33e+08 1.0 0.0e+00 0.0e+00 2.0e+00  4  1  0  0  0   4  1  0  0  0  4701
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

                SNES     1              1         1316     0
      SNESLineSearch     1              1          864     0
              DMSNES     1              1          672     0
              Vector    48             48     27810624     0
      Vector Scatter     5              5         5300     0
              Matrix     3              3      7418272     0
    Distributed Mesh     3              3      1638224     0
     Bipartite Graph     6              6         4800     0
           Index Set    12             12       661228     0
   IS L to G Mapping     3              3       976212     0
       Krylov Solver     1              1        18368     0
     DMKSP interface     1              1          656     0
      Preconditioner     1              1          832     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 7.00951e-06
Average time for zero size MPI_Send(): 5.37932e-06
#PETSc Option Table entries:
-da_grid_x 1601
-da_grid_y 1601
-log_summary
-pc_type jacobi
#End of PETSc Option Table entries
[Compiler, configure, and library details identical to the first run above.]
-----------------------------------------
/usr/local/u/cekees/BOB/mpirun -np 64 /usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 -da_grid_x 1601 -da_grid_y 1601 -pc_type jacobi -log_summary

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

/usr/local/u/cekees/proteus/externalPackages/petsc-dev/src/snes/examples/tutorials/ex5 on a diamond named r8i2n9 with 64 processors, by cekees Thu Apr 11 19:05:34 2013
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           4.412e+01      1.00001   4.412e+01
Objects:              8.700e+01      1.00000   8.700e+01
Flops:                3.131e+10      1.01003   3.105e+10  1.987e+12
Flops/sec:            7.097e+08      1.01003   7.037e+08  4.504e+10
MPI Messages:         4.137e+04      2.00019   3.620e+04  2.317e+06
MPI Message Lengths:  6.615e+07      2.00000   1.600e+03  3.707e+09
MPI Reductions:       2.047e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.4121e+01 100.0%  1.9871e+12 100.0%  2.317e+06 100.0%  1.600e+03      100.0%  2.047e+04 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

SNESSolve              1 1.0 4.3974e+01 1.0 3.13e+10 1.0 2.3e+06 1.6e+03 2.0e+04100100100100100 100100100100100 45187
SNESFunctionEval       1 1.0 3.5620e-03 2.7 4.44e+05 1.0 2.2e+02 1.6e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0  7916
SNESJacobianEval       1 1.0 9.3830e-03 1.0 0.00e+00 0.0 2.2e+02 1.6e+03 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot            10000 1.0 1.5358e+01 1.0 1.25e+10 1.0 0.0e+00 0.0e+00 1.0e+04 34 40  0  0 49  34 40  0  0 49 51704
VecNorm            10334 1.0 2.6788e+00 2.8 8.35e+08 1.0 0.0e+00 0.0e+00 1.0e+04  5  3  0  0 50   5  3  0  0 50 19776
VecScale           10334 1.0 2.8255e-01 1.1 4.18e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0 93748
VecCopy              334 1.0 5.9774e-02 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               338 1.0 3.2922e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY              667 1.0 7.4939e-02 1.5 5.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0 45628
VecMAXPY           10334 1.0 1.5223e+01 1.1 1.33e+10 1.0 0.0e+00 0.0e+00 0.0e+00 31 43  0  0  0  31 43  0  0  0 55530
VecPointwiseMult   10334 1.0 2.0379e+00 1.1 4.18e+08 1.0 0.0e+00 0.0e+00 0.0e+00  4  1  0  0  0   4  1  0  0  0 12998
VecScatterBegin    10335 1.0 1.6237e-01 1.7 0.00e+00 0.0 2.3e+06 1.6e+03 0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      10335 1.0 2.1572e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecReduceArith         1 1.0 8.7602e-02 9.1 8.08e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    59
VecReduceComm          1 1.0 8.1008e-022808.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize       10334 1.0 2.9502e+00 2.4 1.25e+09 1.0 0.0e+00 0.0e+00 1.0e+04  6  4  0  0 50   6  4  0  0 50 26935
MatMult            10333 1.0 1.0214e+01 1.0 3.75e+09 1.0 2.3e+06 1.6e+03 0.0e+00 23 12100100  0  23 12100100  0 23324
MatAssemblyBegin       2 1.0 4.7371e-03 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         2 1.0 6.8269e-03 1.1 0.00e+00 0.0 4.5e+02 4.0e+02 8.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog     10000 1.0 2.9235e+01 1.0 2.50e+10 1.0 0.0e+00 0.0e+00 1.0e+04 64 80  0  0 49  64 80  0  0 49 54324
KSPSetUp               1 1.0 3.1440e-03 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 4.3873e+01 1.0 3.13e+10 1.0 2.3e+06 1.6e+03 2.0e+04 99100100100100  99100100100100 45291
PCSetUp                1 1.0 3.4785e-0419.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply            10334 1.0 2.0552e+00 1.1 4.18e+08 1.0 0.0e+00 0.0e+00 2.0e+00  4  1  0  0  0   4  1  0  0  0 12888
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

                SNES     1              1         1316     0
      SNESLineSearch     1              1          864     0
              DMSNES     1              1          672     0
              Vector    48             48     13978624     0
      Vector Scatter     5              5         5300     0
              Matrix     3              3      3721472     0
    Distributed Mesh     3              3       830224     0
     Bipartite Graph     6              6         4800     0
           Index Set    12             12       337228     0
   IS L to G Mapping     3              3       491412     0
       Krylov Solver     1              1        18368     0
     DMKSP interface     1              1          656     0
      Preconditioner     1              1          832     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 1.13964e-05
Average time for zero size MPI_Send(): 0.000143997
#PETSc Option Table entries:
-da_grid_x 1601
-da_grid_y 1601
-log_summary
-pc_type jacobi
#End of PETSc Option Table entries
[Compiler, configure, and library details identical to the first run above.]
-----------------------------------------