Norm of error 5.01403e-05 iterations 18133
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

ksp_ksp_ex2 on a LINUX_GNU named ca202.localdomain with 8 processors, by wdn Thu Feb 16 11:23:49 2012
Using Petsc Development HG revision: unknown  HG Date: unknown

                         Max       Max/Min        Avg      Total
Time (sec):           8.286e+02      1.00000   8.286e+02
Objects:              1.800e+01      1.00000   1.800e+01
Flops:                4.987e+12      1.00007   4.987e+12  3.989e+13
Flops/sec:            6.018e+09      1.00007   6.018e+09  4.814e+10
MPI Messages:         3.627e+04      2.00000   3.174e+04  2.539e+05
MPI Message Lengths:  2.902e+09      2.00000   7.999e+04  2.031e+10
MPI Reductions:       5.442e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                          e.g., VecAXPY() for real vectors of length N --> 2N flops
                          and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 8.2267e+02  99.3%  3.9893e+13 100.0%  2.539e+05 100.0%  7.999e+04      100.0%  5.441e+04 100.0%
 1:        Assembly: 5.9268e+00   0.7%  0.0000e+00   0.0%  2.800e+01   0.0%  2.206e+00        0.0%  1.000e+01   0.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %f - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)       Flops                               --- Global ---   --- Stage ---    Total
                   Max Ratio  Max      Ratio    Max   Ratio  Mess   Avg len Reduct  %T %f  %M  %L %R  %T %f  %M  %L %R  Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult            18134 1.0 4.5064e+02 1.5    2.04e+12 1.0 2.5e+05 8.0e+04 0.0e+00 47 41 100 100  0  47 41 100 100  0   36213
VecTDot            36266 1.0 2.8535e+02 2.2    9.07e+11 1.0 0.0e+00 0.0e+00 3.6e+04 23 18   0   0 67  23 18   0   0 67   25419
VecNorm            18135 1.0 1.9033e+01 1.0    4.53e+11 1.0 0.0e+00 0.0e+00 1.8e+04  2  9   0   0 33   2  9   0   0 33  190564
VecCopy                2 1.0 1.2306e-01 1.5    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
VecSet                 5 1.0 1.0178e-01 3.6    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
VecAXPY            36267 1.0 1.0688e+02 1.0    9.07e+11 1.0 0.0e+00 0.0e+00 0.0e+00 13 18   0   0  0  13 18   0   0  0   67864
VecAYPX            18132 1.0 5.3687e+01 1.0    4.53e+11 1.0 0.0e+00 0.0e+00 0.0e+00  6  9   0   0  0   7  9   0   0  0   67547
VecPointwiseMult   18134 1.0 5.4070e+01 1.0    2.27e+11 1.0 0.0e+00 0.0e+00 0.0e+00  7  5   0   0  0   7  5   0   0  0   33538
VecScatterBegin    18134 1.0 3.8343e+02 316.0  0.00e+00 0.0 2.5e+05 8.0e+04 0.0e+00 30  0 100 100  0  31  0 100 100  0       0
VecScatterEnd      18134 1.0 3.8326e+02 137.6  0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 17  0   0   0  0  17  0   0   0  0       0
VecCUSPCopyTo          2 1.0 4.6226e-02 2.8    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
VecCUSPCopyFrom        1 1.0 2.1943e-01 14.2   0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
VecCopyToSome      18133 1.0 1.3713e+00 1.4    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
VecCopyFromSome    18134 1.0 3.8212e+02 409.6  0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 30  0   0   0  0  30  0   0   0  0       0
KSPSetUp               1 1.0 2.3127e-05 1.4    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
KSPSolve               1 1.0 8.1195e+02 1.0    4.99e+12 1.0 2.5e+05 8.0e+04 5.4e+04 98 100 100 100 100 99 100 100 100 100 49130
PCSetUp                1 1.0 2.1458e-06 1.1    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0   0  0   0   0  0       0
PCApply            18134 1.0 5.4644e+01 1.0    2.27e+11 1.0 0.0e+00 0.0e+00 2.0e+00  7  5   0   0  0   7  5   0   0  0   33186

--- Event Stage 1: Assembly

MatAssemblyBegin       1 1.0 2.0614e-01 2318.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00  0  0   0   0  0   3  0   0   0 20       0
MatAssemblyEnd         1 1.0 1.5139e+00 1.0    0.00e+00 0.0 2.8e+01 2.0e+04 8.0e+00  0  0   0   0  0  26  0 100 100 80       0
MatCUSPCopyTo          2 1.0 5.9702e-01 1.0    0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0   0   0  0  10  0   0   0  0       0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions       Memory  Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage

              Matrix     3              3     1900047596     0
              Vector     7              8          92032     0
      Vector Scatter     0              1           1060     0
       Krylov Solver     1              1           1144     0
      Preconditioner     1              1            800     0
              Viewer     1              0              0     0

--- Event Stage 1: Assembly

              Vector     2              1           1512     0
      Vector Scatter     1              0              0     0
           Index Set     2              2           1496     0
========================================================================================================================
Average time to get PetscTime(): 6.19888e-07
Average time for MPI_Barrier(): 4.81606e-06
Average time for zero size MPI_Send(): 1.22488e-05
#PETSc Option Table entries:
-cusp_storage_format dia
-ksp_max_it 100000
-ksp_type cg
-log_summary
-m 10000
-mat_type aijcusp
-n 10000
-options_left
-pc_type jacobi
-vec_type cusp
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure run at: Wed Feb 15 17:49:49 2012
Configure options: --PETSC_DIR=/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev --PETSC_ARCH=LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE --download-cmake=yes --download-txpetscgpu=yes --with-blas-lapack-dir=/opt/intel-12.1/mkl/lib/intel64 --with-cuda=1 --with-cuda-arch=sm_20 --with-cuda-dir=/opt/cudatoolkit-4.1 --with-thrust=1 --with-cusp=1 --with-cusp-dir=/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/externalpackages/cusp --with-mpi=1 --with-mpi-dir=/opt/openmpi-1.5-gnu --with-shared-libraries --with-debugging=0 --with-64-bit-pointers=1 --FOPTFLAGS=-O3 --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --CUDAOPTFLAGS=-O3 --with-large-file-io=1
-----------------------------------------
Libraries compiled on Wed Feb 15 17:49:49 2012 on ca201.localdomain
Machine characteristics: Linux-2.6.32-220.4.1.2chaos.ch5.x86_64-x86_64-with-redhat-6.2-Santiago
Using PETSc directory: /users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev
Using PETSc arch: LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE
-----------------------------------------
Using C compiler: /opt/openmpi-1.5-gnu/bin/mpicc -fPIC -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: /opt/openmpi-1.5-gnu/bin/mpif90 -fPIC -Wall -Wno-unused-variable -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/include -I/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/include -I/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/include -I/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/include -I/opt/cudatoolkit-4.1/include -I/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/include/txpetscgpu/include -I/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/externalpackages/cusp/ -I/opt/openmpi-1.5-gnu/include
-----------------------------------------
Using C linker: /opt/openmpi-1.5-gnu/bin/mpicc
Using Fortran linker: /opt/openmpi-1.5-gnu/bin/mpif90
Using libraries: -Wl,-rpath,/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/lib -L/users/wdn/Projects/Gazebo/atc/test_exec/petsc_dev/build/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/petsc-dev/LINUX_GNU_OPT_OPENMPI_CUDA_41_MKL_LITE/lib -lpetsc -lX11 -lpthread -Wl,-rpath,/opt/cudatoolkit-4.1/lib64 -L/opt/cudatoolkit-4.1/lib64 -lcufft -lcublas -lcudart -lcusparse -Wl,-rpath,/opt/intel-12.1/mkl/lib/intel64 -L/opt/intel-12.1/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_gnu_thread -lmkl_core -lgomp -lpthread -Wl,-rpath,/opt/openmpi-1.5-gnu/lib -L/opt/openmpi-1.5-gnu/lib -Wl,-rpath,/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -Wl,-rpath,/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib/gcc -L/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib/gcc -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -Wl,-rpath,/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib64 -L/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib64 -Wl,-rpath,/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib -L/var/lib/perceus/vnfs/gpgpu/rootfs/usr/lib -ldl -lmpi -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortran -lm -lm -lm -lm -lmpi_cxx -lstdc++ -lmpi_cxx -lstdc++ -ldl -lmpi -lnsl -lutil -lgcc_s -lpthread -ldl
-----------------------------------------
#PETSc Option Table entries:
-cusp_storage_format dia
-ksp_max_it 100000
-ksp_type cg
-log_summary
-m 10000
-mat_type aijcusp
-n 10000
-options_left
-pc_type jacobi
-vec_type cusp
#End of PETSc Option Table entries
There are no unused options.

real 831.34
user 1635.44
sys 19.88
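The Total Mflop/s column in the event table can be sanity-checked against the formula given in the legend. A minimal sketch for the KSPSolve row, assuming (since KSPSolve accounts for essentially 100% of the flops) that the global "Flops:" total 3.989e+13 can stand in for the flop sum over all processors:

```python
# Recompute Total Mflop/s for KSPSolve per the legend:
#   Mflop/s = 1e-6 * (sum of flops over all processors) / (max time over all processors)
total_flops = 3.989e13  # global flop total from the "Flops:" summary line above
max_time = 8.1195e2     # KSPSolve max time (sec) from the event table

mflops = 1e-6 * total_flops / max_time
print(f"{mflops:.0f} Mflop/s")  # agrees with the printed 49130 up to rounding of the inputs
```

The small discrepancy versus multiplying the per-process max (4.99e+12) by 8 processors comes only from the three-digit rounding of the printed values.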