************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./femsolcu on a linux-gnu named fuzhou with 1 processor, by xiangze Fri Mar 16 23:42:21 2012
Using Petsc Development HG revision: 2c4589cfcadbae5ef75ecd387a8f0be19121454f  HG Date: Fri Mar 09 18:34:58 2012 -0500

                         Max       Max/Min        Avg      Total
Time (sec):           3.474e+00      1.00000   3.474e+00
Objects:              2.300e+01      1.00000   2.300e+01
Flops:                7.697e+07      1.00000   7.697e+07  7.697e+07
Flops/sec:            2.215e+07      1.00000   2.215e+07  2.215e+07
Memory:               1.192e+07      1.00000              1.192e+07
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       2.000e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 3.4742e+00 100.0%  7.6967e+07 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  1.990e+02  99.5%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult               32 1.0 6.4753e-02 1.0 3.47e+07 1.0 0.0e+00 0.0e+00 0.0e+00  2 45  0  0  0   2 45  0  0  0   536
MatSOR                33 1.0 2.2497e-01 1.0 3.64e+07 1.0 0.0e+00 0.0e+00 0.0e+00  6 47  0  0  0   6 47  0  0  0   162
MatAssemblyBegin       1 1.0 2.6941e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0  0  0  1   0  0  0  0  1     0
MatAssemblyEnd         1 1.0 9.2838e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+01  0  0  0  0  6   0  0  0  0  6     0
MatCUSPCopyTo          2 1.0 6.1390e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
Preallocation          1 1.0 1.5018e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00  0  0  0  0  2   0  0  0  0  2     0
Insert values          1 1.0 1.4820e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0     0
Mat assembly           1 1.0 9.4500e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.3e+01  0  0  0  0  6   0  0  0  0  7     0
VecDot                32 1.0 2.9984e-03 1.0 1.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   351
VecDotNorm2           16 1.0 2.1124e-03 1.0 1.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   499
VecNorm               17 1.0 9.9587e-04 1.0 5.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   562
VecCopy                2 1.0 1.5378e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                11 1.0 2.8086e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPBYCZ            32 1.0 1.5121e-03 1.0 2.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0  1394
VecWAXPY              32 1.0 1.2271e-03 1.0 1.05e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   859
VecAssemblyBegin       2 1.0 1.5020e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00  0  0  0  0  2   0  0  0  0  2     0
VecAssemblyEnd         2 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin       33 1.0 1.0347e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUSPCopyTo         66 1.0 1.9238e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUSPCopyFrom       53 1.0 2.5399e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetUp               1 1.0 4.8208e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.2e+01  0  0  0  0 11   0  0  0  0 11     0
KSPSolve               1 1.0 3.8448e-01 1.0 7.70e+07 1.0 0.0e+00 0.0e+00 1.3e+02 11100  0  0 66  11100  0  0 67   200
PCSetUp                1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply               33 1.0 2.2504e-01 1.0 3.64e+07 1.0 0.0e+00 0.0e+00 0.0e+00  6 47  0  0  0   6 47  0  0  0   162
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix     3              3      6194088     0
              Vector    12             12       610308     0
      Vector Scatter     2              2         1240     0
           Index Set     3              3         2216     0
       Krylov Solver     1              1         1056     0
      Preconditioner     1              1          812     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
#PETSc Option Table entries:
-ksp_atol 1e-6
-ksp_max_it 2000
-ksp_type bcgs
-log_summary g_sor_sum2
-mat_no_inode
-pc_sor_omega 1.2
-pc_type sor
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with single precision PetscScalar and PetscReal
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 4 sizeof(PetscInt) 4
Configure run at: Sat Mar 10 21:49:33 2012
Configure options: --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 --with-cuda=1 --with-cusp=1 --with-thrust=1 --with-precision=single
-----------------------------------------
Libraries compiled on Sat Mar 10 21:49:33 2012 on fuzhou
Machine characteristics: Linux-3.2.0-1-amd64-x86_64-with-debian-wheezy-sid
Using PETSc directory: /usr/src/petsc-dev
Using PETSc arch: linux-gnu
-----------------------------------------

Using C compiler: /usr/src/petsc-dev/linux-gnu/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /usr/src/petsc-dev/linux-gnu/bin/mpif90 -Wall -Wno-unused-variable -Wno-unused-dummy-argument -g ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/usr/src/petsc-dev/linux-gnu/include -I/usr/src/petsc-dev/include -I/usr/src/petsc-dev/include -I/usr/src/petsc-dev/linux-gnu/include -I/usr/local/cuda/include -I/usr/local/cuda/cusp/
-----------------------------------------

Using C linker: /usr/src/petsc-dev/linux-gnu/bin/mpicc
Using Fortran linker: /usr/src/petsc-dev/linux-gnu/bin/mpif90
Using libraries: -Wl,-rpath,/usr/src/petsc-dev/linux-gnu/lib -L/usr/src/petsc-dev/linux-gnu/lib -lpetsc -lX11 -lpthread -Wl,-rpath,/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcufft -lcublas -lcudart -lcusparse -Wl,-rpath,/usr/src/petsc-dev/linux-gnu/lib -L/usr/src/petsc-dev/linux-gnu/lib -lflapack -lfblas -lm -L/usr/lib/gcc/x86_64-linux-gnu/4.4.6 -L/usr/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -lmpichf90 -lgfortran -lm -L/usr/lib/gcc/x86_64-linux-gnu/4.6 -lgfortran -lm -lquadmath -lm -lmpichcxx -lstdc++ -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl
-----------------------------------------
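
Note on the "Total Mflop/s" column: the legend defines it as a scaled ratio of flops to time. The sketch below is not part of the PETSc output; it recomputes that ratio for three events from this log as a sanity check. It assumes a scale factor of 1e-6 (the legend prints "10e-6", but the tabulated values match 1e-6) and relies on this being a 1-processor run, so the sum of flops and the maximum time are simply the per-event values. The names `mflops` and `events` are hypothetical helpers chosen for illustration.

```python
def mflops(flops, seconds):
    """Mflop/s for a single-process run: 1e-6 * flops / time.

    Returns 0.0 when no time was recorded (e.g. the PCSetUp row).
    """
    if seconds <= 0.0:
        return 0.0
    return 1e-6 * flops / seconds


# (flops, time in seconds, Mflop/s as reported) copied from the event table.
events = {
    "MatMult": (3.47e07, 6.4753e-02, 536),
    "MatSOR": (3.64e07, 2.2497e-01, 162),
    "KSPSolve": (7.70e07, 3.8448e-01, 200),
}

for name, (flops, seconds, reported) in events.items():
    computed = mflops(flops, seconds)
    print(f"{name:<10s} computed {computed:7.1f}  reported {reported}")
```

Small differences from the reported column are expected, because the flop counts in the table are themselves rounded to three significant digits.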