$ mpirun -machinefile iwulf-shar.mf -np 8 ./ex3 -m 800 -ksp_type bcgs -pc_type none -log_summary
Time taken for solve: 30.6097
Iterations: 664
Converged Reason: 2
ksp_type: bcgs
pc_type: none
Norm of error: 0.00599148
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex3 on a linux-int named node5 with 8 processors, by mk527 Wed Sep 28 13:59:24 2011
Using Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011

                         Max       Max/Min        Avg      Total
Time (sec):           4.972e+01      1.00006   4.972e+01
Objects:              2.200e+01      1.00000   2.200e+01
Flops:                2.980e+09      1.00574   2.976e+09  2.381e+10
Flops/sec:            5.994e+07      1.00575   5.986e+07  4.789e+08
MPI Messages:         1.124e+06      6.94912   2.831e+05  2.265e+06
MPI Message Lengths:  4.611e+09      6.92142   4.117e+03  9.325e+09
MPI Reductions:       2.690e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.9716e+01 100.0%  2.3807e+10 100.0%  2.265e+06 100.0%  4.117e+03      100.0%  2.681e+03  99.7%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult             1329 1.0 2.0521e+01 1.1 1.81e+09 1.0 1.9e+04 6.4e+03 0.0e+00 41 61  1  1  0  41 61  1  1  0   703
MatAssemblyBegin       1 1.0 2.0430e-03 3.2 0.00e+00 0.0 4.2e+01 1.7e+04 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         1 1.0 3.8409e-02 1.0 0.00e+00 0.0 2.8e+01 1.6e+03 7.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecDot              1328 1.0 2.3155e+00 1.6 2.13e+08 1.0 0.0e+00 0.0e+00 1.3e+03  3  7  0  0 49   3  7  0  0 50   736
VecDotNorm2          664 1.0 1.3468e+00 2.2 2.13e+08 1.0 0.0e+00 0.0e+00 6.6e+02  2  7  0  0 25   2  7  0  0 25  1265
VecNorm              667 1.0 7.9731e+00 10.2 1.07e+08 1.0 0.0e+00 0.0e+00 6.7e+02 8  4  0  0 25   8  4  0  0 25   107
VecCopy             1332 1.0 1.3961e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
VecSet                 4 1.0 3.6769e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY                2 1.0 9.9850e-03 1.4 3.21e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   257
VecAXPBYCZ          1328 1.0 3.1841e+00 1.0 4.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  6 14  0  0  0   6 14  0  0  0  1070
VecWAXPY            1328 1.0 2.7074e+00 1.0 2.13e+08 1.0 0.0e+00 0.0e+00 0.0e+00  5  7  0  0  0   5  7  0  0  0   629
VecAssemblyBegin       4 1.0 2.3053e-03 1.6 0.00e+00 0.0 2.5e+02 2.8e+03 1.2e+01  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd         4 1.0 2.2078e-04 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin     1329 1.0 1.1816e-01 1.6 0.00e+00 0.0 1.9e+04 6.4e+03 0.0e+00  0  0  1  1  0   0  0  1  1  0     0
VecScatterEnd       1329 1.0 8.5507e-01 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
KSPSetup               1 1.0 7.8249e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 3.0610e+01 1.0 2.98e+09 1.0 1.9e+04 6.4e+03 2.7e+03 62 100 1  1 99  62 100 1  1 99   778
PCSetUp                1 1.0 3.0994e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             1330 1.0 1.3987e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix     3              3     16049972     0
                 Vec    12             12      6438656     0
         Vec Scatter     1              1          892     0
           Index Set     2              2         4264     0
       Krylov Solver     1              1          840     0
      Preconditioner     1              1          576     0
              Viewer     2              2         1104     0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 7.39098e-06
Average time for zero size MPI_Send(): 9.74536e-06
#PETSc Option Table entries:
-ksp_type bcgs
-log_summary
-m 800
-pc_type none
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Thu Mar 31 16:44:20 2011
Configure options: --prefix=/opt/petsc/3.1-p8-intel-12.0-opt --CFLAGS= -O3 -xHost --FFLAGS= -O3 -xHost --with-shared=1 --with-dynamic=0 --with-debugging=0 --useThreads 0 --with-mpi-shared=1 --with-x11=1 --with-c2html=1 --download-c2html=yes --with-blas-lib="-L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-lapack-lib="-L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread" --with-cproto=1 --download-cproto=yes --with-triangle=1 --download-triangle=yes --with-superlu=1 --download-superlu=yes --with-chaco=1 --download-chaco=yes --with-scalapack=1 --download-scalapack=yes --with-blacs=1 --download-blacs=yes --with-zoltan=1 --download-zoltan=yes --with-scotch=0 --with-pastix=0 --with-parmetis=1 --download-parmetis=yes --with-mumps=1 --download-mumps=yes --with-boost=1 --download-boost=yes --with-lgrind=1 --download-lgrind=yes --with-plapack=1 --download-plapack=yes --with-sowing=1 --download-sowing=yes --with-hypre=1 --download-hypre=yes --with-sundials=1 --download-sundials=yes --with-spooles=1 --download-spooles=yes --with-generator=1 --download-generator=yes --with-sprng=1 --download-sprng=yes --with-spai=1 --download-spai=yes --with-superlu_dist=1 --download-superlu_dist=yes --with-umfpack=1 --download-umfpack=yes --with-hdf5=1 --download-hdf5=yes --with-blopex=1 --download-blopex=yes --with-ml=1 --download-ml=yes
-----------------------------------------
Libraries compiled on Thu Mar 31 16:44:20 BST 2011 on master
Machine characteristics: Linux master 2.6.16.60-0.42.5-smp #1 SMP Mon Aug 24 09:41:41 UTC 2009 x86_64 x86_64 x86_64 GNU/Linux
Using PETSc directory: /opt/petsc/src/petsc-3.1-p8
Using PETSc arch: linux-intel-12-c-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O
Using Fortran compiler: mpif90 -fPIC -O
-----------------------------------------
Using include paths: -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/include -I/opt/petsc/src/petsc-3.1-p8/include -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/include -I/opt/openmpi/1.4.3-intel-12/lib -I/usr/X11R6/include -I/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -I/opt/petsc/src/petsc-3.1-p8/externalpackages/Boost/
------------------------------------------
Using C linker: mpicc -fPIC -O
Using Fortran linker: mpif90 -fPIC -O
Using libraries: -Wl,-rpath,/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -L/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -lpetsc -Wl,-rpath,/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -L/opt/petsc/src/petsc-3.1-p8/linux-intel-12-c-opt/lib -ltriangle -lzoltan -L/usr/X11R6/lib64 -lX11 -lBLOPEX -lHYPRE -lmpi_cxx -Wl,-rpath,/opt/openmpi/1.4.3-intel-12/lib -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath,/usr/x86_64-suse-linux/lib -lstdc++ -lchaco -lsuperlu_dist_2.4 -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lhdf5_fortran -lhdf5 -lz -lspai -lcmrg -llcg64 -llcg -llfg -lmlfg -lPLAPACK -lsundials_cvode -lsundials_nvecserial -lsundials_nvecparallel -lscalapack -lblacs -lsuperlu_4.0 -lml -lmpi_cxx -lstdc++ -lspooles -lumfpack -lamd -L/opt/intel-12.0/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -ldl -Wl,-rpath,/opt/openmpi/1.4.3-intel-12/lib -L/opt/openmpi/1.4.3-intel-12/lib -lmpi -lopen-rte -lopen-pal -lnsl -lutil -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/compiler/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/mkl/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -L/opt/intel-12.0/composerxe-2011.2.137/ipp/lib/intel64 -Wl,-rpath,/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel-12.0/composerxe-2011.2.137/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -L/usr/lib64/gcc/x86_64-suse-linux/4.1.2 -Wl,-rpath,/usr/x86_64-suse-linux/lib -L/usr/x86_64-suse-linux/lib -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -lmpi_f90 -lmpi_f77 -lifport -lifcoremt -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -limf -lsvml -lipgo -ldecimal -lgcc_s -lirc -lpthread -lirc_s -ldl
------------------------------------------
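
As a quick sanity check on the event table, the last column follows the stated formula: for KSPSolve, the summed flops (8 processes x ~2.98e+09, i.e. ~2.38e+10) divided by the max time of 3.0610e+01 s gives roughly 778 Mflop/s, matching the reported value. All of this work is attributed to the single default "Main Stage". As the phase-summary notes indicate, a finer breakdown (for example, separating matrix assembly from the Krylov solve) can be obtained by registering user-defined logging stages. The sketch below is illustrative only; the stage names and the elided assembly/solve calls are assumptions, not taken from ex3 itself.

#include <petscksp.h>

/* Minimal sketch: wrap assembly and solve in user-defined logging stages so
   that -log_summary reports them separately instead of lumping everything
   into the Main Stage. Stage names and the elided set-up are illustrative. */
int main(int argc, char **argv)
{
  PetscLogStage  assembly_stage, solve_stage;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, (char *)0, NULL);CHKERRQ(ierr);

  ierr = PetscLogStageRegister("Assembly", &assembly_stage);CHKERRQ(ierr);
  ierr = PetscLogStageRegister("Solve", &solve_stage);CHKERRQ(ierr);

  ierr = PetscLogStagePush(assembly_stage);CHKERRQ(ierr);
  /* ... create and assemble the matrix and right-hand side here ... */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = PetscLogStagePush(solve_stage);CHKERRQ(ierr);
  /* ... KSPCreate / KSPSetOperators / KSPSetFromOptions / KSPSolve here ... */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = PetscFinalize();
  return 0;
}

Run with the same -log_summary option, the summary would then list "Assembly" and "Solve" stage sections in addition to the Main Stage, with per-stage event tables.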