Norm of error 5.15748e-05 iterations 861
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./ex2 on a cray-tita named guana with 1 processor, by keita Fri Aug 27 14:16:18 2010
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           4.376e+00      1.00000   4.376e+00
Objects:              1.000e+01      1.00000   1.000e+01
Flops:                5.192e+09      1.00000   5.192e+09  5.192e+09
Flops/sec:            1.186e+09      1.00000   1.186e+09  1.186e+09
Memory:               2.209e+07      1.00000              2.209e+07
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00      0.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 4.2199e+00  96.4%  5.1916e+09 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%
 1:        Assembly: 1.5600e-01   3.6%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------

      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################

Event               Count     Time (sec)      Flops                               --- Global ---   --- Stage ---   Total
                    Max Ratio    Max    Ratio   Max  Ratio  Mess  Avg len Reduct  %T  %F %M %L %R  %T  %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              862 1.0  4.3199e-01 1.0 2.26e+09 1.0 0.0e+00 0.0e+00 0.0e+00 10  43  0  0  0  10  43  0  0  0  5223
VecDot              1722 1.0  3.5199e-01 1.0 9.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00  8  17  0  0  0   8  17  0  0  0  2565
VecNorm              863 1.0  1.5200e-01 1.0 4.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00  3   9  0  0  0   4   9  0  0  0  2977
VecCopy                2 1.0  7.1320e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0  0  0  0   0   0  0  0  0     0
WARNING!!! Minimum time -8.31757e-08 over all processors for VecSet is negative! This happens
 on some machines whose times cannot handle too rapid calls.!
 artificially changing minimum to zero.
VecSet                 3 1.0 -8.3176e-08 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 -0   0  0  0  0  -0   0  0  0  0    -0
VecAXPY             1723 1.0  3.1600e-01 1.0 9.03e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7  17  0  0  0   7  17  0  0  0  2859
VecAYPX              860 1.0  1.6400e-01 1.0 4.51e+08 1.0 0.0e+00 0.0e+00 0.0e+00  4   9  0  0  0   4   9  0  0  0  2749
VecPointwiseMult     862 1.0  1.8000e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  4   4  0  0  0   4   4  0  0  0  1255
VecCUDACopyTo          1 1.0  4.0000e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0  0  0  0   0   0  0  0  0     0
WARNING!!! Minimum time -8.31757e-08 over all processors for KSPSetup is negative! This happens
 on some machines whose times cannot handle too rapid calls.!
 artificially changing minimum to zero.
KSPSetup               1 1.0 -8.3176e-08 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 -0   0  0  0  0  -0   0  0  0  0    -0
KSPSolve               1 1.0  1.6040e+00 1.0 5.19e+09 1.0 0.0e+00 0.0e+00 0.0e+00 37 100  0  0  0  38 100  0  0  0  3234
WARNING!!! Minimum time -8.31757e-08 over all processors for PCSetUp is negative! This happens
 on some machines whose times cannot handle too rapid calls.!
 artificially changing minimum to zero.
PCSetUp                1 1.0 -8.3176e-08 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 -0   0  0  0  0  -0   0  0  0  0    -0
PCApply              862 1.0  1.8800e-01 1.0 2.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00  4   4  0  0  0   4   4  0  0  0  1202

--- Event Stage 1: Assembly

MatAssemblyBegin       1 1.0  3.3173e-08 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0   0  0  0  0   0   0  0  0  0     0
MatAssemblyEnd         1 1.0  2.4000e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1   0  0  0  0  15   0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Matrix     1              1     19925332     0
                 Vec     7              7      2107008     0
       Krylov Solver     1              1         1040     0
      Preconditioner     1              1          736     0

--- Event Stage 1: Assembly

========================================================================================================================
Average time to get PetscTime(): -8.39691e-09
#PETSc Option Table entries:
-cuda_synchronize
-ksp_type cg
-log_summary
-m 512
-mat_type seqaijcuda
-n 512
-pc_type jacobi
-vec_type cuda
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8
Configure run at: Fri Aug 27 12:19:31 2010
Configure options: --known-level1-dcache-size=65536 --known-level1-dcache-linesize=64 --known-level1-dcache-assoc=2 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --FC=gfortran --with-x=0 --with-batch --download-c-blas-lapack=1 --with-dynamic=false --with-shared=false --CXX=CC --CC=gcc --CPP=cpp --with-cpp=cpp --COPTFLAGS --FOPTFLAGS --with-debugging=1 --with-mpi=0 --with-single-library --with-c++-support --with-fc=0 --with-cuda-dir=/opt/cuda/3.1/cuda --with-cusp-dir=/home/users/keita/petsc-dev/cusp --with-thrust-dir=/home/users/keita/petsc-dev//thrust
-----------------------------------------
Libraries compiled on Fri Aug 27 12:19:31 2010 on guana
Machine characteristics: Linux-2.6.27.19-5-default-x86_64-with-SuSE-11-x86_64
Using PETSc directory: /home/users/keita/petsc-dev
Using PETSc arch: cray-titan
-----------------------------------------
Using C compiler: gcc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas ${COPTFLAGS} ${CFLAGS}
-----------------------------------------
Using include paths: -I/home/users/keita/petsc-dev/cray-titan/include
-I/home/users/keita/petsc-dev/include -I/opt/cuda/3.1/cuda/include -I/home/users/keita/petsc-dev/cusp/ -I/home/users/keita/petsc-dev//thrust/ -I/home/users/keita/petsc-dev/include/mpiuni
-----------------------------------------
Using C linker: gcc
Using libraries: -Wl,-rpath,/home/users/keita/petsc-dev/cray-titan/lib -L/home/users/keita/petsc-dev/cray-titan/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetscsys -Wl,-rpath,/opt/cuda/3.1/cuda/lib64 -L/opt/cuda/3.1/cuda/lib64 -lcublas -lcudart -Wl,-rpath,/home/users/keita/petsc-dev/cray-titan/lib -L/home/users/keita/petsc-dev/cray-titan/lib -lf2clapack -lf2cblas -lm -lm -L/opt/cray/pmi/1.0-1.0000.7884.20.2.ss/lib64 -L/opt/cray/portals/2.2.0-1.0301.22039.18.1.ss/lib64 -L/opt/cray/mpt/5.0.2/xt/seastar/mpich2-gnu/lib -L/opt/cray/mpt/5.0.2/xt/seastar/sma/lib64 -L/opt/xt-libsci/10.4.7/gnu/lib/44 -L/opt/xt-libsci/10.4.7/gnu/lib -L/opt/cray/xt-sysroot/3.1.29.securitypatch.20100716/usr/lib64 -L/opt/cray/xt-sysroot/3.1.29.securitypatch.20100716/lib64 -L/opt/cray/xt-sysroot/3.1.29.securitypatch.20100716/usr/lib/alps -L/usr/lib/alps -lgfortran -lsci -lmpichcxx -lmpich -lrt -lsma -lportals -lpmi -lalpslli -lalpsutil -lpthread -lgomp -lstdc++ -lgcc_eh -ldl
-----------------------------------------
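
As a rough cross-check of the summary above, the sketch below recomputes two figures from the log's own legend. The first is the Mflop/s column (flops divided by time, scaled to millions). The legend prints the factor as 10e-6; the sketch reads it as 1e-6, an assumption that appears to match the printed values. The second is the VecAXPY flop count, using the stated convention of 2N flops per call for real vectors. The vector length N = 512 * 512 is also an assumption, inferred from the -m 512 and -n 512 options. The function names are illustrative and are not part of PETSc.

```python
def mflops(total_flops, max_time_sec):
    """Mflop/s as 1e-6 * flops / time (assumed reading of the log's legend)."""
    return 1e-6 * total_flops / max_time_sec


def vecaxpy_flops(n, calls):
    """Flops for `calls` real VecAXPY operations on vectors of length n (2N per call)."""
    return 2 * n * calls


if __name__ == "__main__":
    # Inputs are copied from the event table. The table rounds flops to three
    # significant digits, so agreement with its Mflop/s column is approximate.
    print("MatMult  Mflop/s:", round(mflops(2.26e9, 4.3199e-01)), "(log reports 5223)")
    print("KSPSolve Mflop/s:", round(mflops(5.19e9, 1.6040e+00)), "(log reports 3234)")

    n = 512 * 512  # assumed vector length, from -m 512 and -n 512
    print("VecAXPY flops:", vecaxpy_flops(n, 1723), "(log reports 9.03e+08)")
```

Small differences from the logged Mflop/s values are expected, since the log presumably computes them from unrounded flop counts.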