Loading took = 2.29478
Using Cholesky factorization
Trace = 25.5725
Trace calculations took = 78.9249
Total time = 81.2196

************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ComputeTraceParallelKSP on a arch-linux2-c-debug named gcn-13-71.sdsc.edu with 40 processors, by slivkaje Mon Feb 4 11:38:19 2013
Using Petsc Release Version 3.3.0, Patch 5, Sat Dec 1 15:10:41 CST 2012

                         Max       Max/Min        Avg      Total
Time (sec):           8.173e+01      1.00462   8.137e+01
Objects:              5.320e+02      1.00000   5.320e+02
Flops:                0.000e+00      0.00000   0.000e+00  0.000e+00
Flops/sec:            0.000e+00      0.00000   0.000e+00  0.000e+00
Memory:               2.092e+08      1.00000              8.367e+09
MPI Messages:         1.755e+02      1.48101   1.199e+02  4.797e+03
MPI Message Lengths:  6.240e+08     11.55543   5.691e+05  2.730e+09
MPI Reductions:       1.601e+03      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 8.1372e+01 100.0%  0.0000e+00   0.0%  4.797e+03 100.0%  5.691e+05      100.0%  1.600e+03  99.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase            %f - percent flops in this phase
      %M - percent messages in this phase        %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------

      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################

Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatSolve             250 1.0 3.3418e+01  2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 21  0  0  0  0  21  0  0  0  0     0
MatCholFctrSym         1 1.0 4.6231e-01  2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatCholFctrNum         1 1.0 4.4273e+01  1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31  0  0  0  0  31  0  0  0  0     0
MatAssemblyBegin       3 1.0 1.4000e+00  9.9 0.00e+00 0.0 4.7e+03 3.3e+05 6.0e+00  1  0 98 57  0   1  0 98 57  0     0
MatAssemblyEnd         3 1.0 3.0060e-01  1.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.3e+01  0  0  0  0  2   0  0  0  0  2     0
MatGetRow            250 1.0 7.1270e-02  1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLoad                2 1.0 2.2318e+00  1.1 0.00e+00 0.0 7.8e+01 1.5e+07 5.5e+01  3  0  2 43  3   3  0  2 43  3     0
MatTranspose           1 1.0 2.0663e+01 44.7 0.00e+00 0.0 4.7e+03 3.3e+05 2.0e+01 25  0 98 57  1  25  0 98 57  1     0
VecCopy              250 1.0 8.1117e-03  5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               753 1.0 4.9405e-03  1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               1 1.0 2.1458e-06  2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve             250 1.0 3.3491e+01  2.0 0.00e+00 0.0 0.0e+00 0.0e+00 5.0e+02 21  0  0  0 31  21  0  0  0 31     0
PCSetUp                1 1.0 4.4736e+01  1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 31  0  0  0  0  31  0  0  0  0     0
PCApply              250 1.0 3.3419e+01  2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 21  0  0  0  0  21  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

              Viewer     3              2         1440     0
              Matrix    10             10     64470372     0
              Vector   506            506     40884784     0
      Vector Scatter     3              3         1860     0
           Index Set     8              8         5896     0
       Krylov Solver     1              1         1072     0
      Preconditioner     1              1          920     0
========================================================================================================================
Average time to get PetscTime(): 0
Average time for MPI_Barrier(): 1.1158e-05
Average time for zero size MPI_Send(): 5.29885e-06
#PETSc Option Table entries:
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Sun Jan 20 11:25:21 2013
Configure options: --prefix=/opt/petsc/intel/mvapich2/ib --with-fc=mpif90 --with-cc=mpicc -with-mpi=1 --download-pastix=1 --download-ptscotch --with-blas-lib=" -Wl,--start-group /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/libmkl_sequential.a /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm" --with-lapack-lib=" -Wl,--start-group /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/libmkl_sequential.a /opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm" --with-superlu_dist-include=/scratch/triton/src/roll/math/src/petsc/../..//src/build-superlu_intel_mvapich2_ib/include --with-superlu_dist-lib="-L/scratch/triton/src/roll/math/src/petsc/../..//src/build-superlu_intel_mvapich2_ib/lib -lsuperlu" --with-parmetis-dir=/scratch/triton/src/roll/math/src/petsc/../..//src/build-parmetis_intel_mvapich2_ib --with-metis-dir=/scratch/triton/src/roll/math/src/petsc/../..//src/build-parmetis_intel_mvapich2_ib --download-mumps=yes --with-scalapack-dir=/scratch/triton/src/roll/math/src/petsc/../..//src/build-scalapack_intel_mvapich2_ib --download-blacs=yes
-----------------------------------------
Libraries compiled on Sun Jan 20 11:25:21 2013 on gcn-19-26.sdsc.edu
Machine characteristics: Linux-2.6.34.7-1-x86_64-with-redhat-5.6-Final
Using PETSc directory: /scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5
Using PETSc arch: arch-linux2-c-debug
-----------------------------------------
Using C compiler: mpicc -wd1572 -g ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -g ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/arch-linux2-c-debug/include -I/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/include -I/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/include -I/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/arch-linux2-c-debug/include -I/scratch/triton/src/roll/math/src/build-superlu_intel_mvapich2_ib/include -I/scratch/triton/src/roll/math/src/petsc/../..//src/build-parmetis_intel_mvapich2_ib/include -I/opt/mvapich2/intel/ib/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/arch-linux2-c-debug/lib -L/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/arch-linux2-c-debug/lib -lpetsc -lX11 -Wl,-rpath,/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/arch-linux2-c-debug/lib -L/scratch/triton/src/roll/math/src/petsc/petsc-3.3.p5/arch-linux2-c-debug/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -Wl,-rpath,/scratch/triton/src/roll/math/src/petsc/../..//src/build-scalapack_intel_mvapich2_ib/lib -L/scratch/triton/src/roll/math/src/petsc/../..//src/build-scalapack_intel_mvapich2_ib/lib -lscalapack -lblacs -L/scratch/triton/src/roll/math/src/petsc/../..//src/build-superlu_intel_mvapich2_ib/lib -lsuperlu -Wl,-rpath,/scratch/triton/src/roll/math/src/petsc/../..//src/build-parmetis_intel_mvapich2_ib/lib -L/scratch/triton/src/roll/math/src/petsc/../..//src/build-parmetis_intel_mvapich2_ib/lib -lparmetis -lmetis -lpthread -lpastix -lptesmumps -lptscotch -lptscotcherr -Wl,--start-group -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group -lpthread -lm -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm -L/opt/mvapich2/intel/ib/lib -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/gridengine/lib/lx26-amd64 -L/opt/intel/composer_xe_2011_sp1.7.256/debugger/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mpirt/lib/intel64 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -lmpichf90 -Wl,-rpath,/opt/mvapich2/intel/ib/lib -lifport -lifcore -lm -lm -lrt -lm -lrt -lm -lz -lz -ldl -lmpich -lopa -lmpl -lpthread -llimic2 -lrdmacm -libverbs -libumad -lrt -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl
-----------------------------------------
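
To help map the event counts above back to code, the following is a minimal, hypothetical PETSc 3.3 sketch (not the actual ComputeTraceParallelKSP source) of a program whose -log_summary would show a similar mix of events: matrix loading (MatLoad), a single Cholesky factorization triggered by PCSetUp (MatCholFctrSym/MatCholFctrNum), and 250 KSPSolve/MatSolve/PCApply calls. The file name, the solve count, and the trace-of-the-inverse formula are illustrative assumptions only; the profiled run also loads a second matrix and transposes it, which this sketch omits, and error checking (CHKERRQ) is left out for brevity.

/* Hypothetical sketch: estimate trace(A^{-1}) with one Cholesky factorization
 * and repeated direct solves through KSP.  All names and the formula are
 * assumptions for illustration; this is not the profiled program's source. */
#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat          A;
  Vec          x, b;
  KSP          ksp;
  PC           pc;
  PetscViewer  viewer;
  PetscInt     i, n = 250, rstart, rend;    /* 250 solves, matching the log      */
  PetscScalar  trace = 0.0, xi;

  PetscInitialize(&argc, &argv, NULL, NULL);

  /* MatLoad event: read a (symmetric positive definite) matrix from disk */
  PetscViewerBinaryOpen(PETSC_COMM_WORLD, "A.dat", FILE_MODE_READ, &viewer);
  MatCreate(PETSC_COMM_WORLD, &A);
  MatSetType(A, MATAIJ);
  MatLoad(A, viewer);
  PetscViewerDestroy(&viewer);
  MatGetVecs(A, &x, &b);                     /* PETSc 3.3 name; later MatCreateVecs */

  /* PCSetUp / MatCholFctrSym / MatCholFctrNum events: KSPPREONLY + PCCHOLESKY
   * factors once and applies the factorization as a direct solve thereafter.
   * In parallel a sparse-direct package is usually required, e.g. run with
   * -pc_factor_mat_solver_package mumps. */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A, SAME_PRECONDITIONER);
  KSPSetType(ksp, KSPPREONLY);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCCHOLESKY);
  KSPSetFromOptions(ksp);

  /* KSPSolve / MatSolve / PCApply events, 250 of each in the log above */
  VecGetOwnershipRange(b, &rstart, &rend);
  for (i = 0; i < n; i++) {
    VecSet(b, 0.0);
    if (i >= rstart && i < rend) VecSetValue(b, i, 1.0, INSERT_VALUES);
    VecAssemblyBegin(b);
    VecAssemblyEnd(b);
    KSPSolve(ksp, b, x);                     /* x = A^{-1} e_i                  */
    VecDot(x, b, &xi);                       /* x_i = (A^{-1})_{ii}             */
    trace += xi;
  }
  PetscPrintf(PETSC_COMM_WORLD, "Trace = %g\n", (double)PetscRealPart(trace));

  VecDestroy(&x);
  VecDestroy(&b);
  KSPDestroy(&ksp);
  MatDestroy(&A);
  PetscFinalize();
  return 0;
}

Two things the profile itself highlights: the numeric Cholesky factorization (MatCholFctrNum, about 44 s max, 31% of the run) and the 250 back-substitutions (MatSolve, about 33 s max, 21%) dominate the total time, and the MatTranspose max/min ratio of 44.7 points to a significant load imbalance in that step. The run also used the debug build arch-linux2-c-debug, so, as the warning block says, timings should be re-measured with a --with-debugging=no build before drawing performance conclusions.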