$ mpirun -machinefile iwulf-shar.mf -np 4 ./ex3 -m 800 -ksp_type bcgs -pc_type none -log_summary
Time taken for solve: 36.0357
Iterations: 760
Converged Reason: 2
ksp_type: bcgs
pc_type: none
Norm of error: 0.00311053
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex3 on a linux-int named node5 with 4 processors, by mk527 Wed Sep 28 13:57:20 2011
Using Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011

                         Max       Max/Min        Avg      Total
Time (sec):           5.392e+01      1.00002   5.392e+01
Objects:              2.200e+01      1.00000   2.200e+01
Flops:                6.821e+09      1.00286   6.811e+09  2.725e+10
Flops/sec:            1.265e+08      1.00288   1.263e+08  5.053e+08
MPI Messages:         9.640e+05      2.99042   4.835e+05  1.934e+06
MPI Message Lengths:  3.954e+09      2.98504   4.109e+03  7.948e+09
MPI Reductions:       3.074e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 5.3918e+01 100.0%  2.7246e+10 100.0%  1.934e+06 100.0%  4.109e+03      100.0%  3.065e+03  99.7%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase          %F - percent flops in this phase
      %M - percent messages in this phase      %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
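Note that everything in this run is reported under the single default "Main Stage". If finer-grained reporting is wanted, solver and assembly phases can be logged as separate stages using the calls named in the legend above. The sketch below is illustrative and not taken from ex3; it assumes the PETSc 3.1 C API, and the stage name "Solve" is arbitrary.

    /* Minimal sketch (not from ex3): register a named logging stage so the
       solve is reported under its own heading in -log_summary. */
    #include "petscksp.h"

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;
      PetscLogStage  solve_stage;

      PetscInitialize(&argc, &argv, PETSC_NULL, PETSC_NULL);

      ierr = PetscLogStageRegister("Solve", &solve_stage);CHKERRQ(ierr);
      ierr = PetscLogStagePush(solve_stage);CHKERRQ(ierr);
      /* ... KSPSolve() and the other calls to be profiled under this stage ... */
      ierr = PetscLogStagePop();CHKERRQ(ierr);

      ierr = PetscFinalize();   /* -log_summary output is printed here */
      return 0;
    }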
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult             1521 1.0 2.3550e+01  1.0 4.14e+09 1.0 9.1e+03 6.4e+03 0.0e+00 44 61  0  1  0  44 61  0  1  0   701
MatAssemblyBegin       1 1.0 3.6390e-03  9.0 0.00e+00 0.0 1.8e+01 1.7e+04 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 3.7904e-02  1.0 0.00e+00 0.0 1.2e+01 1.6e+03 7.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot              1520 1.0 1.7911e+00  1.0 4.88e+08 1.0 0.0e+00 0.0e+00 1.5e+03  3  7  0  0 49   3  7  0  0 50  1089
VecDotNorm2          760 1.0 8.5852e-01  1.1 4.88e+08 1.0 0.0e+00 0.0e+00 7.6e+02  2  7  0  0 25   2  7  0  0 25  2272
VecNorm              763 1.0 6.9961e+00  4.3 2.45e+08 1.0 0.0e+00 0.0e+00 7.6e+02  7  4  0  0 25   7  4  0  0 25   140
VecCopy             1524 1.0 1.5744e+00  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
VecSet                 4 1.0 4.1602e-03  1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                2 1.0 1.0530e-02  1.2 6.42e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   244
VecAXPBYCZ          1520 1.0 3.6649e+00  1.0 9.75e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7 14  0  0  0   7 14  0  0  0  1064
VecWAXPY            1520 1.0 3.1228e+00  1.0 4.88e+08 1.0 0.0e+00 0.0e+00 0.0e+00  6  7  0  0  0   6  7  0  0  0   625
VecAssemblyBegin       4 1.0 1.8442e-02 16.4 0.00e+00 0.0 6.0e+01 4.9e+03 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         4 1.0 3.8910e-04  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin     1521 1.0 6.7050e-02  1.5 0.00e+00 0.0 9.1e+03 6.4e+03 0.0e+00  0  0  0  1  0   0  0  0  1  0     0
VecScatterEnd       1521 1.0 6.0617e-01  6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
KSPSetup               1 1.0 8.2829e-03  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 3.6036e+01  1.0 6.82e+09 1.0 9.1e+03 6.4e+03 3.0e+03 67 100 0  1 99  67 100 0  1 99   756
PCSetUp                1 1.0 9.5367e-07  0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             1522 1.0 1.5749e+00  1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------
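Each row above is a registered PETSc logging event (MatMult, VecDot, KSPSolve, and so on). Application code can contribute its own rows to this table by registering an event and bracketing the region of interest, as in the users-manual profiling examples. A minimal sketch follows, assuming the PETSc 3.1 logging API; the event name "MyAssembly", the zero class/cookie argument, and user_flops are illustrative placeholders, not anything taken from ex3.

    /* Sketch (placeholder names, not from ex3): add a user-defined event row to
       the -log_summary table. Runs between PetscInitialize() and PetscFinalize(). */
    PetscLogEvent  my_event;
    PetscLogDouble user_flops = 0.0;   /* flops performed in the timed region, if known */
    PetscErrorCode ierr;

    ierr = PetscLogEventRegister("MyAssembly", 0, &my_event);CHKERRQ(ierr);

    ierr = PetscLogEventBegin(my_event, 0, 0, 0, 0);CHKERRQ(ierr);
    /* ... application code segment to monitor ... */
    ierr = PetscLogFlops(user_flops);CHKERRQ(ierr);   /* optional: credit flops to this event */
    ierr = PetscLogEventEnd(my_event, 0, 0, 0, 0);CHKERRQ(ierr);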
Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix     3              3      32089972     0
                 Vec    12             12      12854656     0
         Vec Scatter     1              1           892     0
           Index Set     2              2          4264     0
       Krylov Solver     1              1           840     0
      Preconditioner     1              1           576     0
              Viewer     2              2          1104     0
========================================================================================================================
Average time to get PetscTime(): 3.09944e-07
Average time for MPI_Barrier(): 4.43459e-06
Average time for zero size MPI_Send(): 8.70228e-06
#PETSc Option Table entries:
-ksp_type bcgs
-log_summary
-m 800
-pc_type none
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Thu Mar 31 16:44:20 2011
Configure options: --prefix=/opt/petsc/3.1-p8-intel-12.0-opt --CFLAGS= -O3 -xHost --FFLAGS= -O3 -xHost --with-shared=1 --with-dynamic=0 --with-debugging=0 --useThreads 0 --with-mpi-shared=1 --with-x11=1 --with-c2html=1 --download-c2html=yes --with-blas-lib="-L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-lapack-lib="-L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-cproto=1 --download-cproto=yes --with-triangle=1 --download-triangle=yes --with-superlu=1 --download-superlu=yes --with-chaco=1 --download-chaco=yes --with-scalapack=1 --download-scalapack=yes --with-blacs=1 --download-blacs=yes --with-zoltan=1 --download-zoltan=yes --with-scotch=0 --with-pastix=0 --with-parmetis=1 --download-parmetis=yes --with-mumps=1 --download-mumps=yes --with-boost=1 --download-boost=yes --with-lgrind=1 --download-lgrind=yes --with-plapack=1 --download-plapack=yes --with-sowing=1 --download-sowing=yes --with-hypre=1 --download-hypre=yes --with-sundials=1 --download-sundials=yes --with-spooles=1 --download-spooles=yes --with-generator=1 --download-generator=yes --with-sprng=1 --download-sprng=yes --with-spai=1 --download-spai=yes --with-superlu_dist=1 --download-superlu_dist=yes --with-umfpack=1 --download-umfpack=yes --with-hdf5=1 --download-hdf5=yes --with-blopex=1 --download-blopex=yes --with-ml=1 --download-ml=yes
-----------------------------------------
Libraries compiled on Thu Mar 31 16:44:20 BST 2011 on master
Machine characteristics: Linux master 2.6.16.60-0.42.5-smp #1 SMP Mon Aug 24 09:41:41 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /opt/petsc/src/petsc-3.1-p8
Using PETSc arch: linux-intel-12-c-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O
Using Fortran compiler: mpif90 -fPIC -O
-----------------------------------------
Using include paths: -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/include -I/opt/petsc/src/petsc-3.1-p8/include -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/include -I/opt/openmpi/1.4.3-intel-12/lib -I/usr/X11R6/include -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -I/opt/petsc/src/petsc-3.1-p8/externalpackages/Boost/
------------------------------------------
Using C linker: mpicc -fPIC -O
Using Fortran linker: mpif90 -fPIC -O
Using libraries: -Wl,-rpath,/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -L/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -lpetsc -Wl,-rpath,/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -L/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -ltriangle -lzoltan -L/usr/X11R6/lib64 -lX11 -lBLOPEX -lHYPRE -lmpi_cxx -Wl,-rpath,/opt/openmpi/1.4.3-intel-12/lib
-Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath,/usr/x86_64-suse-linux/lib -lstdc++ -lchaco -lsuperlu_dist_2.4 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lhdf5_fortran -lhdf5 -lz -lspai -lcmrg -llcg64 -llcg -llfg -lmlfg -lPLAPACK -lsundials_cvode -lsundials_nvecserial -lsundials_nvecparallel -lscalapack -lblacs -lsuperlu_4.0 -lml -lmpi_cxx -lstdc++ -lspooles -lumfpack -lamd -L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -ldl -Wl,-rpath,/opt/openmpi/1.4.3-intel-12/lib -L/opt/openmpi/1.4.3-intel-12/lib -lmpi -lopen-rte -lopen-pal -lnsl -lutil -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -L/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lifport -lifcoremt -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -ldl ------------------------------------------