************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex32 on a arch-linux2-c-debug named sblnxtesla01 with 1 processor, by 6642 Mon Sep 22 10:02:38 2014
Using Petsc Development GIT revision: v3.5.2-266-g0323fbc  GIT Date: 2014-09-15 22:15:02 -0500

                         Max       Max/Min        Avg      Total
Time (sec):           3.861e+02      1.00000   3.861e+02
Objects:              5.600e+01      1.00000   5.600e+01
Flops:                7.847e+10      1.00000   7.847e+10  7.847e+10
Flops/sec:            2.032e+08      1.00000   2.032e+08  2.032e+08
Memory:               1.698e+09      1.00000              1.698e+09
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00      0.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 3.8614e+02 100.0%  7.8469e+10 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

ThreadCommRunKer       1 1.0 6.9141e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
ThreadCommBarrie       1 1.0 4.0531e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecMDot              250 1.0 1.8905e+00 1.0 2.55e+10 1.0 0.0e+00 0.0e+00 0.0e+00  0 32  0  0  0   0 32  0  0  0 13479
VecNorm              259 1.0 1.0985e-01 1.0 1.75e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0 15915
VecScale             259 1.0 9.1819e-02 1.0 8.74e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  9520
VecCopy                9 1.0 4.6090e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                47 1.0 2.0747e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               17 1.0 8.1256e-02 1.0 1.15e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1412
VecMAXPY             259 1.0 1.0497e+00 1.0 2.76e+10 1.0 0.0e+00 0.0e+00 0.0e+00  0 35  0  0  0   0 35  0  0  0 26314
VecNormalize         259 1.0 2.0568e-01 1.0 2.62e+09 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0 12750
VecCUSPCopyTo        275 1.0 1.2186e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecCUSPCopyFrom      314 1.0 1.2774e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult              258 1.0 4.4579e+01 1.0 1.13e+10 1.0 0.0e+00 0.0e+00 0.0e+00 12 14  0  0  0  12 14  0  0  0   252
MatSolve             259 1.0 4.3305e+01 1.0 1.13e+10 1.0 0.0e+00 0.0e+00 0.0e+00 11 14  0  0  0  11 14  0  0  0   261
MatLUFactorNum         1 1.0 2.3031e+00 1.0 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0    27
MatILUFactorSym        1 1.0 3.9626e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale               1 1.0 2.5440e-02 1.0 2.35e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   923
MatAssemblyBegin       4 1.0 1.5020e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         4 1.0 2.4888e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRow        6750000 1.0 2.9557e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  8  0  0  0  0   8  0  0  0  0     0
MatGetRowIJ            1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.1388e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAXPY                1 1.0 7.7408e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 20  0  0  0  0  20  0  0  0  0     0
MatTranspose           1 1.0 3.9145e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
KSPGMRESOrthog       250 1.0 2.8824e+00 1.0 5.14e+10 1.0 0.0e+00 0.0e+00 0.0e+00  1 66  0  0  0   1 66  0  0  0 17837
KSPSetUp               1 1.0 4.6310e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 1.9987e+02 1.0 7.84e+10 1.0 0.0e+00 0.0e+00 0.0e+00 52 100  0  0  0  52 100  0  0  0   392
PCSetUp                1 1.0 3.8384e+00 1.0 6.21e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0    16
PCApply              259 1.0 4.3309e+01 1.0 1.13e+10 1.0 0.0e+00 0.0e+00 0.0e+00 11 14  0  0  0  11 14  0  0  0   261
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector    38             38    945057520     0
      Vector Scatter     1              1          604     0
              Matrix     4              4   1316530292     0
    Distributed Mesh     1              1         4912     0
Star Forest Bipartite Graph     2              2         1616     0
     Discrete System     1              1          792     0
           Index Set     5              5     40503920     0
   IS L to G Mapping     1              1     13500588     0
       Krylov Solver     1              1        18376     0
      Preconditioner     1              1         1008     0
              Viewer     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
#PETSc Option Table entries:
-M 150
-dm_mat_type aijcusp
-dm_vec_type cusp
-log_summary
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --with-mpi=0 --with-shared-libraries=0 --with-blas-lib=/usr/local/lib64/librefblas.a --with-lapack-lib=/usr/local/lib64/liblapack.a --with-cuda=1 --with-cuda-arch=sm_35 --with-cusp=1 --with-thrust=1 --with-clean=1
-----------------------------------------
Libraries compiled on Tue Sep 16 10:14:47 2014 on sblnxtesla01
Machine characteristics: Linux-2.6.18-348.12.1.el5-x86_64-with-redhat-5.10-Tikanga
Using PETSc directory: /net/home_00/6642/petsc-master
Using PETSc arch: arch-linux2-c-debug
-----------------------------------------

Using C compiler: gcc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: gfortran -Wall -Wno-unused-variable -ffree-line-length-0 -g -O0 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/net/home_00/6642/petsc-master/arch-linux2-c-debug/include -I/net/home_00/6642/petsc-master/include -I/net/home_00/6642/petsc-master/include -I/net/home_00/6642/petsc-master/arch-linux2-c-debug/include -I/usr/local/cuda/include -I/net/home_00/6642/petsc-master/include/mpiuni
-----------------------------------------

Using C linker: gcc
Using Fortran linker: gfortran
Using libraries: -Wl,-rpath,/net/home_00/6642/petsc-master/arch-linux2-c-debug/lib -L/net/home_00/6642/petsc-master/arch-linux2-c-debug/lib -lpetsc -Wl,-rpath,/usr/local/lib64 -L/usr/local/lib64 -llapack -lrefblas -Wl,-rpath,/usr/local/cuda/lib64 -L/usr/local/cuda/lib64 -lcufft -lcublas -lcudart -lcusparse -lX11 -lpthread -lssl -lcrypto -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -lgfortran -lm -lstdc++ -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -lstdc++ -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -L/usr/lib/gcc/x86_64-redhat-linux/4.1.2 -ldl -lgcc_s -ldl
-----------------------------------------
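
A rough cross-check of the %T and Mflop/s columns can be sketched in Python from a handful of rows copied out of the event table. This is not part of the PETSc output; the names EVENTS, TOTAL_TIME, mflops and percent_time are made up for illustration. The legend prints the scale factor as "10e-6", but the tabulated rates appear consistent with 1e-6 * flops / time (for example the KSPSolve row), so the sketch assumes 1e-6. Because the log prints flops to three significant digits, recomputed rates can differ from the printed ones by about a percent.

```python
# Rows copied from the event table above: (event, max time in seconds, max flops).
# Only a few of the more expensive events are listed; this is not the full table.
EVENTS = [
    ("VecMDot",  1.8905e+00, 2.55e+10),
    ("VecMAXPY", 1.0497e+00, 2.76e+10),
    ("MatMult",  4.4579e+01, 1.13e+10),
    ("MatSolve", 4.3305e+01, 1.13e+10),
    ("MatGetRow", 2.9557e+01, 0.0),
    ("MatAXPY",  7.7408e+01, 0.0),
    ("KSPSolve", 1.9987e+02, 7.84e+10),
]

# Main Stage time taken from the "Summary of Stages" line.
TOTAL_TIME = 3.8614e+02


def mflops(flops, seconds):
    """Flop rate in Mflop/s, assuming the factor is 1e-6 (one process, so sum == max)."""
    if seconds <= 0.0:
        return 0.0
    return 1e-6 * flops / seconds


def percent_time(seconds, total=TOTAL_TIME):
    """Share of the total run time, as reported in the %T column."""
    return 100.0 * seconds / total


if __name__ == "__main__":
    # List the events from most to least expensive in wall-clock time.
    for name, seconds, flops in sorted(EVENTS, key=lambda e: e[1], reverse=True):
        print(f"{name:<10s} {seconds:10.4e} s  "
              f"{percent_time(seconds):5.1f} %T  {mflops(flops, seconds):8.0f} Mflop/s")
```

Event times are nested rather than additive: KSPSolve includes the MatMult, MatSolve and vector operations issued inside the solve, so the %T values should not be summed across rows. MatAXPY and MatGetRow report zero flops, so they contribute time but no flop rate.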