$ mpirun -machinefile iwulf-shar.mf -np 2 ./ex3 -m 800 -ksp_type bcgs -pc_type none -log_summary
Time taken for solve: 36.276
Iterations: 626
Converged Reason: 2
ksp_type: bcgs
pc_type: none
Norm of error: 0.00611835
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex3 on a linux-int named node5 with 2 processors, by mk527 Wed Sep 28 13:52:10 2011
Using Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011

                         Max       Max/Min        Avg      Total
Time (sec):           5.101e+01      1.00001   5.101e+01
Objects:              2.200e+01      1.00000   2.200e+01
Flops:                1.122e+10      1.00000   1.122e+10  2.245e+10
Flops/sec:            2.200e+08      1.00001   2.200e+08  4.400e+08
MPI Messages:         6.429e+05      1.00000   6.429e+05  1.286e+06
MPI Message Lengths:  2.638e+09      1.00000   4.103e+03  5.275e+09
MPI Reductions:       2.538e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 5.1011e+01 100.0%  2.2445e+10 100.0%  1.286e+06 100.0%  4.103e+03      100.0%  2.529e+03  99.6%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult             1253 1.0 2.1322e+01 1.0 6.80e+09 1.0 2.5e+03 6.4e+03 0.0e+00 42 61  0  0  0  42 61  0  0  0   638
MatAssemblyBegin       1 1.0 2.5361e-03 8.8 0.00e+00 0.0 6.0e+00 1.7e+04 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 4.6473e-02 1.0 0.00e+00 0.0 4.0e+00 1.6e+03 7.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot              1252 1.0 2.2184e+00 1.0 8.03e+08 1.0 0.0e+00 0.0e+00 1.3e+03  4  7  0  0 49   4  7  0  0 50   724
VecDotNorm2          626 1.0 1.2937e+00 1.0 8.03e+08 1.0 0.0e+00 0.0e+00 6.3e+02  3  7  0  0 25   3  7  0  0 25  1242
VecNorm              629 1.0 3.0768e+00 1.2 4.04e+08 1.0 0.0e+00 0.0e+00 6.3e+02  6  4  0  0 25   6  4  0  0 25   262
VecCopy             1256 1.0 1.4303e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
VecSet                 4 1.0 4.6701e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                2 1.0 1.0126e-02 1.0 1.28e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   253
VecAXPBYCZ          1252 1.0 3.7488e+00 1.0 1.61e+09 1.0 0.0e+00 0.0e+00 0.0e+00  7 14  0  0  0   7 14  0  0  0   857
VecWAXPY            1252 1.0 3.7588e+00 1.0 8.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7  7  0  0  0   7  7  0  0  0   427
VecAssemblyBegin       4 1.0 4.4963e-03 5.3 0.00e+00 0.0 1.2e+01 8.0e+03 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         4 1.0 1.6689e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin     1253 1.0 2.4810e-02 1.0 0.00e+00 0.0 2.5e+03 6.4e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterEnd       1253 1.0 3.8082e-01 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetup               1 1.0 1.2308e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 3.6276e+01 1.0 1.12e+10 1.0 2.5e+03 6.4e+03 2.5e+03 71100  0  0 99  71100  0  0 99   619
PCSetUp                1 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             1254 1.0 1.4293e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory   Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

              Matrix     3              3     64169972     0
                 Vec    12             12     25686656     0
         Vec Scatter     1              1          892     0
           Index Set     2              2         4264     0
       Krylov Solver     1              1          840     0
      Preconditioner     1              1          576     0
              Viewer     2              2         1104     0
========================================================================================================================
Average time to get PetscTime(): 3.09944e-07
Average time for MPI_Barrier(): 1.62125e-06
Average time for zero size MPI_Send(): 7.98702e-06
#PETSc Option Table entries:
-ksp_type bcgs
-log_summary
-m 800
-pc_type none
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Thu Mar 31 16:44:20 2011
Configure options: --prefix=/opt/petsc/3.1-p8-intel-12.0-opt --CFLAGS= -O3 -xHost --FFLAGS= -O3 -xHost --with-shared=1 --with-dynamic=0 --with-debugging=0 --useThreads 0 --with-mpi-shared=1 --with-x11=1 --with-c2html=1 --download-c2html=yes --with-blas-lib="-L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-lapack-lib="-L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-cproto=1 --download-cproto=yes --with-triangle=1 --download-triangle=yes --with-superlu=1 --download-superlu=yes --with-chaco=1 --download-chaco=yes --with-scalapack=1 --download-scalapack=yes --with-blacs=1 --download-blacs=yes --with-zoltan=1 --download-zoltan=yes --with-scotch=0 --with-pastix=0 --with-parmetis=1 --download-parmetis=yes --with-mumps=1 --download-mumps=yes --with-boost=1 --download-boost=yes --with-lgrind=1 --download-lgrind=yes --with-plapack=1 --download-plapack=yes --with-sowing=1 --download-sowing=yes --with-hypre=1 --download-hypre=yes --with-sundials=1 --download-sundials=yes --with-spooles=1 --download-spooles=yes --with-generator=1 --download-generator=yes --with-sprng=1 --download-sprng=yes --with-spai=1 --download-spai=yes --with-superlu_dist=1 --download-superlu_dist=yes --with-umfpack=1 --download-umfpack=yes --with-hdf5=1 --download-hdf5=yes --with-blopex=1 --download-blopex=yes --with-ml=1 --download-ml=yes
-----------------------------------------
Libraries compiled on Thu Mar 31 16:44:20 BST 2011 on master
Machine characteristics: Linux master 2.6.16.60-0.42.5-smp #1 SMP Mon Aug 24 09:41:41 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /opt/petsc/src/petsc-3.1-p8
Using PETSc arch: linux-intel-12-c-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O
Using Fortran compiler: mpif90 -fPIC -O
-----------------------------------------
Using include paths: -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/include -I/opt/petsc/src/petsc-3.1-p8/include -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/include -I/opt/openmpi/1.4.3-intel-12/lib -I/usr/X11R6/include -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -I/opt/petsc/src/petsc-3.1-p8/externalpackages/Boost/
------------------------------------------
Using C linker: mpicc -fPIC -O
Using Fortran linker: mpif90 -fPIC -O
Using libraries: -Wl,-rpath,/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -L/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -lpetsc -Wl,-rpath,/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -L/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -ltriangle -lzoltan -L/usr/X11R6/lib64 -lX11 -lBLOPEX -lHYPRE -lmpi_cxx -Wl,-rpath,/opt/openmpi/1.4.3-intel-12/lib
-Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath,/usr/x86_64-suse-linux/lib -lstdc++ -lchaco -lsuperlu_dist_2.4 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lhdf5_fortran -lhdf5 -lz -lspai -lcmrg -llcg64 -llcg -llfg -lmlfg -lPLAPACK -lsundials_cvode -lsundials_nvecserial -lsundials_nvecparallel -lscalapack -lblacs -lsuperlu_4.0 -lml -lmpi_cxx -lstdc++ -lspooles -lumfpack -lamd -L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -ldl -Wl,-rpath,/opt/openmpi/1.4.3-intel-12/lib -L/opt/openmpi/1.4.3-intel-12/lib -lmpi -lopen-rte -lopen-pal -lnsl -lutil -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -L/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lifport -lifcoremt -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -ldl
------------------------------------------
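
The six lines at the very top of the log (solve time, iterations, converged reason, solver/preconditioner type, error norm) are printed by the example program itself, not by -log_summary; converged reason 2 corresponds to KSP_CONVERGED_RTOL (relative tolerance reached). Below is a rough sketch of how such a report could be produced, assuming a hypothetical helper report_solve(), an exact-solution vector u, and an elapsed time t_solve measured around KSPSolve() (e.g. with PetscGetTime()); this is not the actual source of ex3.

    /* Sketch only: query the KSP after the solve and print summary lines like
       those at the top of this log.  report_solve() is a hypothetical helper;
       u is the known exact solution, and x is overwritten with the error. */
    #include <petscksp.h>

    PetscErrorCode report_solve(KSP ksp, Vec x, Vec u, PetscLogDouble t_solve)
    {
      PetscInt           its;
      KSPConvergedReason reason;
      KSPType            ksptype;
      PCType             pctype;
      PC                 pc;
      PetscReal          errnorm;
      PetscErrorCode     ierr;

      PetscFunctionBegin;
      ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
      ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
      ierr = KSPGetType(ksp, &ksptype);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCGetType(pc, &pctype);CHKERRQ(ierr);
      ierr = VecAXPY(x, -1.0, u);CHKERRQ(ierr);            /* x <- x - u */
      ierr = VecNorm(x, NORM_2, &errnorm);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD,
               "Time taken for solve: %g\nIterations: %D\nConverged Reason: %d\n"
               "ksp_type: %s\npc_type: %s\nNorm of error: %g\n",
               t_solve, its, (int)reason, ksptype, pctype, errnorm);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }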
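
Every event in the profile above is charged to the single default stage ("Main Stage"). As the phase-summary legend notes, a run can be split into user-defined stages with PetscLogStagePush()/PetscLogStagePop(). A minimal sketch, assuming a hypothetical helper solve_with_stage() and that the KSP and vectors are created and assembled elsewhere (again, not code from ex3):

    /* Sketch only: wrap the solve in its own log stage so -log_summary
       reports it separately from matrix/vector setup. */
    #include <petscksp.h>

    PetscErrorCode solve_with_stage(KSP ksp, Vec b, Vec x)
    {
      PetscLogStage  solve_stage;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      ierr = PetscLogStageRegister("Solve Stage", &solve_stage);CHKERRQ(ierr);
      ierr = PetscLogStagePush(solve_stage);CHKERRQ(ierr);  /* events below are charged to "Solve Stage" */
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      ierr = PetscLogStagePop();CHKERRQ(ierr);              /* return to the previous stage */
      PetscFunctionReturn(0);
    }

With such a stage in place, the "Summary of Stages" table would list "Solve Stage" alongside the Main Stage, making it easier to see at a glance that KSPSolve alone accounts for 71% of the total run time in this log.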