lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 1 processor, by jfe Thu Oct 25 13:24:58 2012 Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 5.358e+00 1.00000 5.358e+00 Objects: 9.500e+01 1.00000 9.500e+01 Flops: 8.620e+09 1.00000 8.620e+09 8.620e+09 Flops/sec: 1.609e+09 1.00000 1.609e+09 1.609e+09 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 1.290e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.3576e+00 100.0% 8.6196e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 1.280e+02 99.2% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 12903 1.0 5.1450e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 96 0 0 0 0 96 0 0 0 0 0 SNESSolve 1 1.0 5.3272e+00 1.0 8.62e+09 1.0 0.0e+00 0.0e+00 1.1e+02 99100 0 0 83 99100 0 0 84 1618 SNESFunctionEval 3 1.0 1.2140e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 2076 SNESJacobianEval 2 1.0 1.5228e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 3.3e+01 3 0 0 0 26 3 0 0 0 26 253 SNESLineSearch 2 1.0 3.8791e-03 1.0 5.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1313 VecDot 2 1.0 7.6056e-05 1.0 1.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2104 VecMDot 2024 1.0 1.5474e+00 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 0.0e+00 29 29 0 0 0 29 29 0 0 0 1616 VecNorm 2095 1.0 3.9504e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecScale 2092 1.0 6.3965e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 1308 VecCopy 2206 1.0 9.9524e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 116 1.0 8.8339e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 5.0702e-03 1.0 1.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2745 VecWAXPY 2 1.0 1.8215e-04 1.0 8.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 439 VecMAXPY 2092 1.0 8.6512e-01 1.0 2.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 31 0 0 0 16 31 0 0 0 3078 VecScatterBegin 47 1.0 3.8862e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 5 1.0 8.1649e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceComm 3 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 1.0624e-01 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 788 MatMult 2092 1.0 2.5192e+00 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00 47 39 0 0 0 47 39 0 0 0 1318 MatAssemblyBegin 3 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.8129e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 1.2968e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 1.4269e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01 0 0 0 0 22 0 0 0 0 22 0 MatFDColorApply 2 1.0 1.3744e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 2.0e+00 3 0 0 0 2 3 0 0 0 2 280 MatFDColorFunc 42 1.0 1.3608e-02 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2593 KSPGMRESOrthog 2024 1.0 2.3621e+00 1.0 5.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 44 58 0 0 0 44 58 0 0 0 2117 KSPSetUp 2 1.0 7.6103e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 8 0 0 0 0 8 0 KSPSolve 2 1.0 5.1620e+00 1.0 8.58e+09 1.0 0.0e+00 0.0e+00 6.8e+01 96 99 0 0 53 96 99 0 0 53 1661 PCSetUp 2 1.0 9.5367e-07 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 9.5067e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 
0 0 0 2 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 50 50 15112128 0 Vector Scatter 3 3 1884 0 Matrix 1 1 10165812 0 Matrix FD Coloring 1 1 724 0 Distributed Mesh 1 1 205328 0 Bipartite Graph 2 2 1400 0 Index Set 27 27 220288 0 IS L to G Mapping 3 3 161716 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 1 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Wed Oct 24 15:17:55 2012 Configure options: --with-x=0 --download-f-blas-lapack=0 --with-blas-lapack-dir=/opt/intel/composerxe/mkl/lib/intel64 --with-mpi=1 --with-mpi-shared=1 --with-mpi=1 --download-mpich=no --with-openmp=1 --with-pthreadclasses=1 --with-debugging=0 --with-gnu-compilers=no --with-vendor-compilers=intel --with-cc=/usr/local/encap/platform_mpi-8.02.01/bin/mpicc --with-cxx=/usr/local/encap/platform_mpi-8.02.01/bin/mpiCC --with-fc=/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 --with-shared-libraries=1 --with-c++-support --with-clanguage=C --COPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --CXXOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --FOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --download-scalapack=1 --download-blacs=1 --with-blacs=1 --download-umfpack=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --download-hypre=1 ----------------------------------------- Libraries compiled on Wed Oct 24 15:17:55 2012 on lagrange.tomato Machine characteristics: Linux-2.6.32-279.11.1.el6.x86_64-x86_64-with-centos-6.3-Final Using PETSc directory: /home/jfe/local/petsc-dev Using PETSc arch: intel-opt-precise-O3 ----------------------------------------- Using C compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc -fPIC -wd1572 -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 -fPIC -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/usr/local/encap/platform_mpi-8.02.01/include ----------------------------------------- Using C linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc Using Fortran linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 Using libraries: -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lpetsc 
-Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lml -Wl,-rpath,/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -L/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -lmpiCC -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lpthread -lsuperlu_dist_3.1 -lparmetis -lmetis -lsuperlu_4.3 -lHYPRE -lmpiCC -lumfpack -lamd -Wl,-rpath,/opt/intel/composerxe/mkl/lib/intel64 -L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lifport -lifcore -lm -lpthread -lm -lmpiCC -lpcmpio -lpcmpi -ldl -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl ----------------------------------------- lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 1 processor, by jfe Thu Oct 25 13:25:01 2012 With 2 threads per MPI_Comm Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 2.839e+00 1.00000 2.839e+00 Objects: 9.500e+01 1.00000 9.500e+01 Flops: 8.620e+09 1.00000 8.620e+09 8.620e+09 Flops/sec: 3.036e+09 1.00000 3.036e+09 3.036e+09 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 1.290e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.8390e+00 100.0% 8.6196e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 1.280e+02 99.2% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. 
Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 12903 1.0 2.6375e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 93 0 0 0 0 93 0 0 0 0 0 SNESSolve 1 1.0 2.8123e+00 1.0 8.62e+09 1.0 0.0e+00 0.0e+00 1.1e+02 99100 0 0 83 99100 0 0 84 3065 SNESFunctionEval 3 1.0 1.1320e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 2226 SNESJacobianEval 2 1.0 1.4213e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 3.3e+01 5 0 0 0 26 5 0 0 0 26 271 SNESLineSearch 2 1.0 2.5022e-03 1.0 5.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2036 VecDot 2 1.0 6.6757e-05 1.0 1.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2397 VecMDot 2024 1.0 7.8013e-01 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 0.0e+00 27 29 0 0 0 27 29 0 0 0 3206 VecNorm 2095 1.0 2.9587e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecScale 2092 1.0 2.0741e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 4035 VecCopy 2206 1.0 6.2586e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 116 1.0 5.7404e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 4.5159e-03 1.0 1.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3082 VecWAXPY 2 1.0 9.5844e-05 1.0 8.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 835 VecMAXPY 2092 1.0 4.2411e-01 1.0 2.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 31 0 0 0 15 31 0 0 0 6278 VecScatterBegin 47 1.0 3.7689e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 5 1.0 4.8897e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceComm 3 1.0 5.0068e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 5.2986e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 1579 MatMult 2092 1.0 1.3193e+00 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00 46 39 0 0 0 46 39 0 0 0 2517 MatAssemblyBegin 3 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.7649e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 8.1682e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 1.3694e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01 0 0 0 0 22 0 0 0 0 22 0 MatFDColorApply 2 1.0 1.2798e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 2.0e+00 5 0 0 0 2 5 0 0 0 2 301 MatFDColorFunc 42 1.0 1.3292e-02 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2654 KSPGMRESOrthog 2024 1.0 1.1832e+00 1.0 5.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 42 58 0 0 0 42 58 0 0 0 4227 KSPSetUp 2 1.0 4.7874e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 8 0 0 0 0 8 0 KSPSolve 2 1.0 2.6620e+00 1.0 8.58e+09 1.0 0.0e+00 0.0e+00 6.8e+01 94 99 0 0 53 94 99 0 0 53 3221 PCSetUp 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 
1.0 6.0513e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 50 50 15112128 0 Vector Scatter 3 3 1884 0 Matrix 1 1 10165812 0 Matrix FD Coloring 1 1 724 0 Distributed Mesh 1 1 205328 0 Bipartite Graph 2 2 1400 0 Index Set 27 27 220288 0 IS L to G Mapping 3 3 161716 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 2 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Wed Oct 24 15:17:55 2012 Configure options: --with-x=0 --download-f-blas-lapack=0 --with-blas-lapack-dir=/opt/intel/composerxe/mkl/lib/intel64 --with-mpi=1 --with-mpi-shared=1 --with-mpi=1 --download-mpich=no --with-openmp=1 --with-pthreadclasses=1 --with-debugging=0 --with-gnu-compilers=no --with-vendor-compilers=intel --with-cc=/usr/local/encap/platform_mpi-8.02.01/bin/mpicc --with-cxx=/usr/local/encap/platform_mpi-8.02.01/bin/mpiCC --with-fc=/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 --with-shared-libraries=1 --with-c++-support --with-clanguage=C --COPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --CXXOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --FOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --download-scalapack=1 --download-blacs=1 --with-blacs=1 --download-umfpack=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --download-hypre=1 ----------------------------------------- Libraries compiled on Wed Oct 24 15:17:55 2012 on lagrange.tomato Machine characteristics: Linux-2.6.32-279.11.1.el6.x86_64-x86_64-with-centos-6.3-Final Using PETSc directory: /home/jfe/local/petsc-dev Using PETSc arch: intel-opt-precise-O3 ----------------------------------------- Using C compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc -fPIC -wd1572 -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 -fPIC -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/usr/local/encap/platform_mpi-8.02.01/include ----------------------------------------- Using C linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc Using Fortran linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 Using libraries: -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib 
-L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lpetsc -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lml -Wl,-rpath,/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -L/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -lmpiCC -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lpthread -lsuperlu_dist_3.1 -lparmetis -lmetis -lsuperlu_4.3 -lHYPRE -lmpiCC -lumfpack -lamd -Wl,-rpath,/opt/intel/composerxe/mkl/lib/intel64 -L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lifport -lifcore -lm -lpthread -lm -lmpiCC -lpcmpio -lpcmpi -ldl -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl ----------------------------------------- lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 1 processor, by jfe Thu Oct 25 13:25:04 2012 With 4 threads per MPI_Comm Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 1.890e+00 1.00000 1.890e+00 Objects: 9.500e+01 1.00000 9.500e+01 Flops: 8.620e+09 1.00000 8.620e+09 8.620e+09 Flops/sec: 4.561e+09 1.00000 4.561e+09 4.561e+09 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 1.290e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.8899e+00 100.0% 8.6196e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 1.280e+02 99.2% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. 
len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 12903 1.0 1.6819e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 89 0 0 0 0 89 0 0 0 0 0 SNESSolve 1 1.0 1.8580e+00 1.0 8.62e+09 1.0 0.0e+00 0.0e+00 1.1e+02 98100 0 0 83 98100 0 0 84 4639 SNESFunctionEval 3 1.0 1.1683e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 2157 SNESJacobianEval 2 1.0 1.4636e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 3.3e+01 8 0 0 0 26 8 0 0 0 26 263 SNESLineSearch 2 1.0 2.1379e-03 1.0 5.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2383 VecDot 2 1.0 6.4135e-05 1.0 1.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2495 VecMDot 2024 1.0 4.5000e-01 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 0.0e+00 24 29 0 0 0 24 29 0 0 0 5557 VecNorm 2095 1.0 2.0440e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecScale 2092 1.0 1.4119e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 5927 VecCopy 2206 1.0 5.4946e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecSet 116 1.0 8.9054e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 3.4974e-03 1.0 1.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3980 VecWAXPY 2 1.0 7.2002e-05 1.0 8.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1111 VecMAXPY 2092 1.0 2.5356e-01 1.0 2.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 13 31 0 0 0 13 31 0 0 0 10501 VecScatterBegin 47 1.0 3.8872e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 5 1.0 1.3513e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecReduceComm 3 1.0 5.2452e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 3.7524e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 2230 MatMult 2092 1.0 8.7699e-01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00 46 39 0 0 0 46 39 0 0 0 3786 MatAssemblyBegin 3 1.0 3.0994e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.8601e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 8.0824e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 1.4081e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01 1 0 0 0 22 1 0 0 0 22 0 MatFDColorApply 2 1.0 1.3183e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 2.0e+00 7 0 0 0 2 7 0 0 0 2 292 MatFDColorFunc 42 1.0 1.3633e-02 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 2588 KSPGMRESOrthog 2024 1.0 6.9162e-01 1.0 5.00e+09 1.0 0.0e+00 0.0e+00 0.0e+00 37 58 0 0 0 37 58 0 0 0 7232 KSPSetUp 2 1.0 3.6216e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 8 0 0 0 0 8 0 KSPSolve 2 1.0 1.6952e+00 1.0 8.58e+09 1.0 0.0e+00 0.0e+00 6.8e+01 90 99 
0 0 53 90 99 0 0 53 5058 PCSetUp 2 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 5.3138e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 50 50 15112128 0 Vector Scatter 3 3 1884 0 Matrix 1 1 10165812 0 Matrix FD Coloring 1 1 724 0 Distributed Mesh 1 1 205328 0 Bipartite Graph 2 2 1400 0 Index Set 27 27 220288 0 IS L to G Mapping 3 3 161716 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 4 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Wed Oct 24 15:17:55 2012 Configure options: --with-x=0 --download-f-blas-lapack=0 --with-blas-lapack-dir=/opt/intel/composerxe/mkl/lib/intel64 --with-mpi=1 --with-mpi-shared=1 --with-mpi=1 --download-mpich=no --with-openmp=1 --with-pthreadclasses=1 --with-debugging=0 --with-gnu-compilers=no --with-vendor-compilers=intel --with-cc=/usr/local/encap/platform_mpi-8.02.01/bin/mpicc --with-cxx=/usr/local/encap/platform_mpi-8.02.01/bin/mpiCC --with-fc=/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 --with-shared-libraries=1 --with-c++-support --with-clanguage=C --COPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --CXXOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --FOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --download-scalapack=1 --download-blacs=1 --with-blacs=1 --download-umfpack=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --download-hypre=1 ----------------------------------------- Libraries compiled on Wed Oct 24 15:17:55 2012 on lagrange.tomato Machine characteristics: Linux-2.6.32-279.11.1.el6.x86_64-x86_64-with-centos-6.3-Final Using PETSc directory: /home/jfe/local/petsc-dev Using PETSc arch: intel-opt-precise-O3 ----------------------------------------- Using C compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc -fPIC -wd1572 -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 -fPIC -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/usr/local/encap/platform_mpi-8.02.01/include ----------------------------------------- Using C linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc Using Fortran linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 
Using libraries: -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lpetsc -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lml -Wl,-rpath,/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -L/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -lmpiCC -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lpthread -lsuperlu_dist_3.1 -lparmetis -lmetis -lsuperlu_4.3 -lHYPRE -lmpiCC -lumfpack -lamd -Wl,-rpath,/opt/intel/composerxe/mkl/lib/intel64 -L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lifport -lifcore -lm -lpthread -lm -lmpiCC -lpcmpio -lpcmpi -ldl -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl ----------------------------------------- lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 1 processor, by jfe Thu Oct 25 13:25:07 2012 With 6 threads per MPI_Comm Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 1.698e+00 1.00000 1.698e+00 Objects: 9.500e+01 1.00000 9.500e+01 Flops: 8.620e+09 1.00000 8.620e+09 8.620e+09 Flops/sec: 5.076e+09 1.00000 5.076e+09 5.076e+09 MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00 MPI Reductions: 1.290e+02 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.6980e+00 100.0% 8.6196e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 1.280e+02 99.2% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. 
Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 12903 1.0 1.4895e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 88 0 0 0 0 88 0 0 0 0 0 SNESSolve 1 1.0 1.6647e+00 1.0 8.62e+09 1.0 0.0e+00 0.0e+00 1.1e+02 98100 0 0 83 98100 0 0 84 5178 SNESFunctionEval 3 1.0 1.1868e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 2 0 0 0 0 2 2123 SNESJacobianEval 2 1.0 1.4850e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 3.3e+01 9 0 0 0 26 9 0 0 0 26 259 SNESLineSearch 2 1.0 2.0700e-03 1.0 5.09e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2461 VecDot 2 1.0 6.9141e-05 1.0 1.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2314 VecMDot 2024 1.0 3.7082e-01 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 0.0e+00 22 29 0 0 0 22 29 0 0 0 6744 VecNorm 2095 1.0 1.8710e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecScale 2092 1.0 1.2559e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 6663 VecCopy 2206 1.0 5.5530e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecSet 116 1.0 9.4790e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 174 1.0 3.6380e-03 1.0 1.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3826 VecWAXPY 2 1.0 5.2929e-05 1.0 8.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1511 VecMAXPY 2092 1.0 2.0497e-01 1.0 2.66e+09 1.0 0.0e+00 0.0e+00 0.0e+00 12 31 0 0 0 12 31 0 0 0 12991 VecScatterBegin 47 1.0 4.0169e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 5 1.0 3.5195e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecReduceComm 3 1.0 7.1526e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 3.4492e-02 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 2426 MatMult 2092 1.0 7.9206e-01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00 47 39 0 0 0 47 39 0 0 0 4192 MatAssemblyBegin 3 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.9066e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 8.1420e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 1.4220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01 1 0 0 0 22 1 0 0 0 22 0 MatFDColorApply 2 1.0 1.3384e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 2.0e+00 8 0 0 0 2 8 0 0 0 2 288 MatFDColorFunc 42 1.0 1.3742e-02 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 2567 KSPGMRESOrthog 2024 1.0 5.6496e-01 1.0 5.00e+09 1.0 0.0e+00 
0.0e+00 0.0e+00 33 58 0 0 0 33 58 0 0 0 8853 KSPSetUp 2 1.0 3.2616e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 8 0 0 0 0 8 0 KSPSolve 2 1.0 1.4782e+00 1.0 8.58e+09 1.0 0.0e+00 0.0e+00 6.8e+01 87 99 0 0 53 87 99 0 0 53 5801 PCSetUp 2 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 5.3416e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 50 50 15112128 0 Vector Scatter 3 3 1884 0 Matrix 1 1 10165812 0 Matrix FD Coloring 1 1 724 0 Distributed Mesh 1 1 205328 0 Bipartite Graph 2 2 1400 0 Index Set 27 27 220288 0 IS L to G Mapping 3 3 161716 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 6 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Wed Oct 24 15:17:55 2012 Configure options: --with-x=0 --download-f-blas-lapack=0 --with-blas-lapack-dir=/opt/intel/composerxe/mkl/lib/intel64 --with-mpi=1 --with-mpi-shared=1 --with-mpi=1 --download-mpich=no --with-openmp=1 --with-pthreadclasses=1 --with-debugging=0 --with-gnu-compilers=no --with-vendor-compilers=intel --with-cc=/usr/local/encap/platform_mpi-8.02.01/bin/mpicc --with-cxx=/usr/local/encap/platform_mpi-8.02.01/bin/mpiCC --with-fc=/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 --with-shared-libraries=1 --with-c++-support --with-clanguage=C --COPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --CXXOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --FOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --download-scalapack=1 --download-blacs=1 --with-blacs=1 --download-umfpack=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --download-hypre=1 ----------------------------------------- Libraries compiled on Wed Oct 24 15:17:55 2012 on lagrange.tomato Machine characteristics: Linux-2.6.32-279.11.1.el6.x86_64-x86_64-with-centos-6.3-Final Using PETSc directory: /home/jfe/local/petsc-dev Using PETSc arch: intel-opt-precise-O3 ----------------------------------------- Using C compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc -fPIC -wd1572 -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 -fPIC -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include 
-I/usr/local/encap/platform_mpi-8.02.01/include ----------------------------------------- Using C linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc Using Fortran linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 Using libraries: -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lpetsc -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lml -Wl,-rpath,/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -L/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -lmpiCC -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lpthread -lsuperlu_dist_3.1 -lparmetis -lmetis -lsuperlu_4.3 -lHYPRE -lmpiCC -lumfpack -lamd -Wl,-rpath,/opt/intel/composerxe/mkl/lib/intel64 -L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lifport -lifcore -lm -lpthread -lm -lmpiCC -lpcmpio -lpcmpi -ldl -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl ----------------------------------------- lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 2 processors, by jfe Thu Oct 25 13:25:11 2012 Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 2.754e+00 1.00001 2.754e+00 Objects: 1.020e+02 1.00000 1.020e+02 Flops: 4.394e+09 1.00000 4.394e+09 8.787e+09 Flops/sec: 1.595e+09 1.00001 1.595e+09 3.190e+09 MPI Messages: 2.146e+03 1.00000 2.146e+03 4.291e+03 MPI Message Lengths: 4.283e+06 1.00000 1.996e+03 8.566e+06 MPI Reductions: 4.320e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 2.7544e+00 100.0% 8.7871e+09 100.0% 4.291e+03 100.0% 1.996e+03 100.0% 4.319e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 10767 1.0 2.4521e+00 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 86 0 0 0 0 86 0 0 0 0 0 SNESSolve 1 1.0 2.7325e+00 1.0 4.39e+09 1.0 4.3e+03 2.0e+03 4.3e+03 99100100100 99 99100100100 99 3216 SNESFunctionEval 3 1.0 5.1808e-04 1.0 1.26e+06 1.0 6.0e+00 2.0e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 4864 SNESJacobianEval 2 1.0 2.2617e-01 1.0 1.93e+07 1.0 9.3e+01 2.0e+03 8.1e+01 8 0 2 2 2 8 0 2 2 2 170 SNESLineSearch 2 1.0 1.8930e-03 1.0 2.71e+06 1.0 8.0e+00 2.0e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 2860 VecDot 2 1.0 6.1035e-05 1.4 8.00e+04 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 2621 VecMDot 2024 1.0 7.1133e-01 1.0 1.25e+09 1.0 0.0e+00 0.0e+00 2.0e+03 26 28 0 0 47 26 28 0 0 47 3516 VecNorm 2095 1.0 3.0765e-02 1.1 8.38e+07 1.0 0.0e+00 0.0e+00 2.1e+03 1 2 0 0 48 1 2 0 0 49 5448 VecScale 2092 1.0 3.4868e-02 1.1 4.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 2400 VecCopy 2206 1.0 3.7831e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecSet 72 1.0 1.0309e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 1.4505e-0152.7 6.96e+06 1.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 96 VecWAXPY 2 1.0 8.9169e-05 1.0 4.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 897 VecMAXPY 2092 1.0 4.2457e-01 1.0 1.33e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 30 0 0 0 15 30 0 0 0 6272 VecScatterBegin 2139 1.0 8.0109e-03 1.1 0.00e+00 0.0 4.3e+03 2.0e+03 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 2139 1.0 1.4959e-0120.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecReduceArith 5 1.0 2.0428e-02 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceComm 3 1.0 1.5418e-02862.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 6.3646e-02 1.0 1.25e+08 1.0 0.0e+00 0.0e+00 2.1e+03 2 3 0 0 48 2 3 0 0 48 3942 MatMult 2092 1.0 1.2438e+00 1.0 1.66e+09 1.0 4.2e+03 2.0e+03 0.0e+00 45 38 98 98 0 45 38 98 98 0 2669 MatAssemblyBegin 3 1.0 5.1618e-04 5.9 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.1307e-03 1.0 0.00e+00 0.0 4.0e+00 2.0e+02 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 6.7925e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 8.9600e-03 1.0 0.00e+00 0.0 4.0e+00 8.0e+02 7.1e+01 0 0 0 0 2 0 0 0 0 2 0 MatFDColorApply 2 1.0 2.1692e-01 1.0 1.93e+07 1.0 8.8e+01 2.1e+03 7.0e+00 8 0 2 2 0 8 0 2 2 0 178 MatFDColorFunc 42 1.0 1.4806e-0125.4 1.76e+07 1.0 8.4e+01 2.0e+03 0.0e+00 3 0 2 2 0 3 0 2 2 0 238 KSPGMRESOrthog 2024 1.0 1.1104e+00 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 2.0e+03 40 57 0 0 47 40 57 0 0 47 4504 KSPSetUp 2 1.0 2.3484e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 2 1.0 2.4835e+00 1.0 4.37e+09 1.0 4.2e+03 2.0e+03 4.2e+03 90 99 97 98 97 90 99 97 98 97 3520 PCSetUp 2 1.0 0.0000e+00 0.0 0.00e+00 0.0 
0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 3.6231e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 52 52 7607552 0 Vector Scatter 4 4 4240 0 Matrix 3 3 5410828 0 Matrix FD Coloring 1 1 3255684 0 Distributed Mesh 1 1 107328 0 Bipartite Graph 2 2 1400 0 Index Set 29 29 123784 0 IS L to G Mapping 3 3 83316 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 Average time for MPI_Barrier(): 2.00272e-06 Average time for zero size MPI_Send(): 2.02656e-06 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 1 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Wed Oct 24 15:17:55 2012 Configure options: --with-x=0 --download-f-blas-lapack=0 --with-blas-lapack-dir=/opt/intel/composerxe/mkl/lib/intel64 --with-mpi=1 --with-mpi-shared=1 --with-mpi=1 --download-mpich=no --with-openmp=1 --with-pthreadclasses=1 --with-debugging=0 --with-gnu-compilers=no --with-vendor-compilers=intel --with-cc=/usr/local/encap/platform_mpi-8.02.01/bin/mpicc --with-cxx=/usr/local/encap/platform_mpi-8.02.01/bin/mpiCC --with-fc=/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 --with-shared-libraries=1 --with-c++-support --with-clanguage=C --COPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --CXXOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --FOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --download-scalapack=1 --download-blacs=1 --with-blacs=1 --download-umfpack=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --download-hypre=1 ----------------------------------------- Libraries compiled on Wed Oct 24 15:17:55 2012 on lagrange.tomato Machine characteristics: Linux-2.6.32-279.11.1.el6.x86_64-x86_64-with-centos-6.3-Final Using PETSc directory: /home/jfe/local/petsc-dev Using PETSc arch: intel-opt-precise-O3 ----------------------------------------- Using C compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc -fPIC -wd1572 -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 -fPIC -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/usr/local/encap/platform_mpi-8.02.01/include ----------------------------------------- Using C linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc Using Fortran linker: 
/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 Using libraries: -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lpetsc -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lml -Wl,-rpath,/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -L/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -lmpiCC -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lpthread -lsuperlu_dist_3.1 -lparmetis -lmetis -lsuperlu_4.3 -lHYPRE -lmpiCC -lumfpack -lamd -Wl,-rpath,/opt/intel/composerxe/mkl/lib/intel64 -L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lifport -lifcore -lm -lpthread -lm -lmpiCC -lpcmpio -lpcmpi -ldl -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl ----------------------------------------- lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 2 processors, by jfe Thu Oct 25 13:25:14 2012 With 2 threads per MPI_Comm Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 1.551e+00 1.00001 1.551e+00 Objects: 1.020e+02 1.00000 1.020e+02 Flops: 4.394e+09 1.00000 4.394e+09 8.787e+09 Flops/sec: 2.832e+09 1.00001 2.832e+09 5.664e+09 MPI Messages: 2.146e+03 1.00000 2.146e+03 4.291e+03 MPI Message Lengths: 4.283e+06 1.00000 1.996e+03 8.566e+06 MPI Reductions: 4.320e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 1.5514e+00 100.0% 8.7871e+09 100.0% 4.291e+03 100.0% 1.996e+03 100.0% 4.319e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. 
Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 10767 1.0 1.2035e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 77 0 0 0 0 77 0 0 0 0 0 SNESSolve 1 1.0 1.5263e+00 1.0 4.39e+09 1.0 4.3e+03 2.0e+03 4.3e+03 98100100100 99 98100100100 99 5757 SNESFunctionEval 3 1.0 5.2309e-04 1.0 1.26e+06 1.0 6.0e+00 2.0e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 4818 SNESJacobianEval 2 1.0 1.3211e-01 1.0 1.93e+07 1.0 9.3e+01 2.0e+03 8.1e+01 9 0 2 2 2 9 0 2 2 2 292 SNESLineSearch 2 1.0 1.3511e-03 1.0 2.71e+06 1.0 8.0e+00 2.0e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 4007 VecDot 2 1.0 5.6028e-05 1.3 8.00e+04 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 2856 VecMDot 2024 1.0 3.6912e-01 1.0 1.25e+09 1.0 0.0e+00 0.0e+00 2.0e+03 24 28 0 0 47 24 28 0 0 47 6775 VecNorm 2095 1.0 7.9822e-02 1.8 8.38e+07 1.0 0.0e+00 0.0e+00 2.1e+03 4 2 0 0 48 4 2 0 0 49 2100 VecScale 2092 1.0 2.1905e-02 1.0 4.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 3820 VecCopy 2206 1.0 2.5198e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 72 1.0 6.7663e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 1.9078e-03 1.0 6.96e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7296 VecWAXPY 2 1.0 4.6968e-05 1.0 4.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1703 VecMAXPY 2092 1.0 2.1488e-01 1.0 1.33e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 30 0 0 0 14 30 0 0 0 12392 VecScatterBegin 2139 1.0 8.3029e-03 1.1 0.00e+00 0.0 4.3e+03 2.0e+03 0.0e+00 1 0100100 0 1 0100100 0 0 VecScatterEnd 2139 1.0 7.6807e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 5 1.0 3.6771e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecReduceComm 3 1.0 1.5521e-04 8.1 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 6.7404e-02 1.0 1.25e+08 1.0 0.0e+00 0.0e+00 2.1e+03 4 3 0 0 48 4 3 0 0 48 3722 MatMult 2092 1.0 7.0932e-01 1.0 1.66e+09 1.0 4.2e+03 2.0e+03 0.0e+00 45 38 98 98 0 45 38 98 98 0 4681 MatAssemblyBegin 3 1.0 1.4081e-0321.4 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.1212e-03 1.0 0.00e+00 0.0 4.0e+00 2.0e+02 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 3.5714e-0285.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatFDColorCreate 1 1.0 2.1954e-02 1.0 0.00e+00 0.0 4.0e+00 8.0e+02 7.1e+01 1 0 0 0 2 1 0 0 0 2 0 MatFDColorApply 2 1.0 1.0983e-01 1.0 1.93e+07 1.0 8.8e+01 2.1e+03 7.0e+00 7 0 2 2 0 7 0 2 2 0 351 MatFDColorFunc 42 1.0 5.9059e-03 1.0 1.76e+07 
1.0 8.4e+01 2.0e+03 0.0e+00 0 0 2 2 0 0 0 2 2 0 5974 KSPGMRESOrthog 2024 1.0 5.7333e-01 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 2.0e+03 37 57 0 0 47 37 57 0 0 47 8724 KSPSetUp 2 1.0 2.3007e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 2 1.0 1.3887e+00 1.0 4.37e+09 1.0 4.2e+03 2.0e+03 4.2e+03 90 99 97 98 97 90 99 97 98 97 6295 PCSetUp 2 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 2.4360e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 52 52 7607552 0 Vector Scatter 4 4 4240 0 Matrix 3 3 5410828 0 Matrix FD Coloring 1 1 3255684 0 Distributed Mesh 1 1 107328 0 Bipartite Graph 2 2 1400 0 Index Set 29 29 123784 0 IS L to G Mapping 3 3 83316 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 9.53674e-08 Average time for MPI_Barrier(): 3.8147e-07 Average time for zero size MPI_Send(): 2.02656e-06 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 2 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure run at: Wed Oct 24 15:17:55 2012 Configure options: --with-x=0 --download-f-blas-lapack=0 --with-blas-lapack-dir=/opt/intel/composerxe/mkl/lib/intel64 --with-mpi=1 --with-mpi-shared=1 --with-mpi=1 --download-mpich=no --with-openmp=1 --with-pthreadclasses=1 --with-debugging=0 --with-gnu-compilers=no --with-vendor-compilers=intel --with-cc=/usr/local/encap/platform_mpi-8.02.01/bin/mpicc --with-cxx=/usr/local/encap/platform_mpi-8.02.01/bin/mpiCC --with-fc=/usr/local/encap/platform_mpi-8.02.01/bin/mpif90 --with-shared-libraries=1 --with-c++-support --with-clanguage=C --COPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --CXXOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --FOPTFLAGS="-fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info" --download-scalapack=1 --download-blacs=1 --with-blacs=1 --download-umfpack=1 --download-parmetis=1 --download-metis=1 --download-superlu=1 --download-superlu_dist=1 --download-mumps=1 --download-ml=1 --download-hypre=1 ----------------------------------------- Libraries compiled on Wed Oct 24 15:17:55 2012 on lagrange.tomato Machine characteristics: Linux-2.6.32-279.11.1.el6.x86_64-x86_64-with-centos-6.3-Final Using PETSc directory: /home/jfe/local/petsc-dev Using PETSc arch: intel-opt-precise-O3 ----------------------------------------- Using C compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc -fPIC -wd1572 -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 -fPIC -fPIC -O3 -xSSE4.2 -fp-model precise -g -debug inline_debug_info -fopenmp ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- 
Using include paths: -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/include -I/home/jfe/local/petsc-dev/intel-opt-precise-O3/include -I/usr/local/encap/platform_mpi-8.02.01/include ----------------------------------------- Using C linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpicc Using Fortran linker: /usr/local/encap/platform_mpi-8.02.01/bin/mpif90 Using libraries: -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lpetsc -Wl,-rpath,/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -L/home/jfe/local/petsc-dev/intel-opt-precise-O3/lib -lcmumps -ldmumps -lsmumps -lzmumps -lmumps_common -lpord -lscalapack -lblacs -lml -Wl,-rpath,/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -L/usr/local/encap/platform_mpi-8.02.01/lib/linux_amd64 -lmpiCC -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/ipp/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -L/opt/intel/composer_xe_2011_sp1.7.256/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -L/opt/intel/composer_xe_2011_sp1.7.256/tbb/lib/intel64/cc4.1.0_libc2.4_kernel2.6.16.21 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.6 -lpthread -lsuperlu_dist_3.1 -lparmetis -lmetis -lsuperlu_4.3 -lHYPRE -lmpiCC -lumfpack -lamd -Wl,-rpath,/opt/intel/composerxe/mkl/lib/intel64 -L/opt/intel/composerxe/mkl/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lifport -lifcore -lm -lpthread -lm -lmpiCC -lpcmpio -lpcmpi -ldl -limf -lsvml -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -ldl ----------------------------------------- lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 2 processors, by jfe Thu Oct 25 13:25:16 2012 With 4 threads per MPI_Comm Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 9.809e-01 1.00002 9.809e-01 Objects: 1.020e+02 1.00000 1.020e+02 Flops: 4.394e+09 1.00000 4.394e+09 8.787e+09 Flops/sec: 4.479e+09 1.00002 4.479e+09 8.958e+09 MPI Messages: 2.146e+03 1.00000 2.146e+03 4.291e+03 MPI Message Lengths: 4.283e+06 1.00000 1.996e+03 8.566e+06 MPI Reductions: 4.320e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 9.8091e-01 100.0% 8.7871e+09 100.0% 4.291e+03 100.0% 1.996e+03 100.0% 4.319e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 10767 1.0 6.5954e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 67 0 0 0 0 67 0 0 0 0 0 SNESSolve 1 1.0 9.5998e-01 1.0 4.39e+09 1.0 4.3e+03 2.0e+03 4.3e+03 98100100100 99 98100100100 99 9153 SNESFunctionEval 3 1.0 5.3716e-04 1.0 1.26e+06 1.0 6.0e+00 2.0e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 4691 SNESJacobianEval 2 1.0 8.5923e-02 1.0 1.93e+07 1.0 9.3e+01 2.0e+03 8.1e+01 9 0 2 2 2 9 0 2 2 2 449 SNESLineSearch 2 1.0 1.0507e-03 1.0 2.71e+06 1.0 8.0e+00 2.0e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 5153 VecDot 2 1.0 5.1975e-05 1.4 8.00e+04 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 3078 VecMDot 2024 1.0 1.9616e-01 1.0 1.25e+09 1.0 0.0e+00 0.0e+00 2.0e+03 20 28 0 0 47 20 28 0 0 47 12748 VecNorm 2095 1.0 4.6436e-02 1.1 8.38e+07 1.0 0.0e+00 0.0e+00 2.1e+03 5 2 0 0 48 5 2 0 0 49 3609 VecScale 2092 1.0 1.7011e-02 1.0 4.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 4919 VecCopy 2206 1.0 2.4582e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSet 72 1.0 6.0916e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 1.6856e-03 1.1 6.96e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 8258 VecWAXPY 2 1.0 2.7180e-05 1.0 4.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2943 VecMAXPY 2092 1.0 1.1181e-01 1.0 1.33e+09 1.0 0.0e+00 0.0e+00 0.0e+00 11 30 0 0 0 11 30 0 0 0 23816 VecScatterBegin 2139 1.0 8.8797e-03 1.2 0.00e+00 0.0 4.3e+03 2.0e+03 0.0e+00 1 0100100 0 1 0100100 0 0 VecScatterEnd 2139 1.0 7.4925e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecReduceArith 5 1.0 3.2301e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecReduceComm 3 1.0 3.4502e-03155.6 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 2092 1.0 6.1668e-02 1.0 1.25e+08 1.0 0.0e+00 0.0e+00 2.1e+03 6 3 0 0 48 6 3 0 0 48 4068 MatMult 2092 1.0 4.4137e-01 1.0 1.66e+09 1.0 4.2e+03 2.0e+03 0.0e+00 45 38 98 98 0 45 38 98 98 0 7523 MatAssemblyBegin 3 1.0 3.2210e-04 3.8 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.1718e-03 1.0 0.00e+00 0.0 4.0e+00 2.0e+02 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 2.2492e-03 6.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatFDColorCreate 1 1.0 9.8159e-03 1.1 0.00e+00 0.0 4.0e+00 8.0e+02 7.1e+01 1 0 0 0 2 1 0 0 0 2 0 MatFDColorApply 2 1.0 7.6740e-02 1.0 1.93e+07 1.0 8.8e+01 2.1e+03 7.0e+00 8 0 2 2 0 8 0 2 2 0 502 MatFDColorFunc 42 1.0 5.8541e-03 1.0 1.76e+07 1.0 8.4e+01 2.0e+03 0.0e+00 1 0 2 2 0 1 0 2 2 0 6027 KSPGMRESOrthog 2024 1.0 3.0326e-01 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 2.0e+03 31 57 0 0 47 31 57 0 0 47 16493 KSPSetUp 2 1.0 2.3293e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 2 1.0 8.4022e-01 1.0 4.37e+09 1.0 4.2e+03 2.0e+03 4.2e+03 86 99 97 98 97 86 99 97 98 97 10405 PCSetUp 2 1.0 1.1921e-06 1.2 
0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 2.4236e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 52 52 7607552 0 Vector Scatter 4 4 4240 0 Matrix 3 3 5410828 0 Matrix FD Coloring 1 1 3255684 0 Distributed Mesh 1 1 107328 0 Bipartite Graph 2 2 1400 0 Index Set 29 29 123784 0 IS L to G Mapping 3 3 83316 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 Average time for MPI_Barrier(): 2.00272e-06 Average time for zero size MPI_Send(): 2.02656e-06 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 4 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 -----------------------------------------
lid velocity = 0.0001, prandtl # = 1, grashof # = 1 Number of SNES iterations = 2 ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./ex19 on a intel-opt-precise-O3 named lagrange.tomato with 2 processors, by jfe Thu Oct 25 13:25:18 2012 With 6 threads per MPI_Comm Using Petsc Development HG revision: f8bbb9afb3f28a97dd47839f1c4674891dd5c594 HG Date: Wed Oct 24 13:40:55 2012 -0400 Max Max/Min Avg Total Time (sec): 9.138e-01 1.00002 9.138e-01 Objects: 1.020e+02 1.00000 1.020e+02 Flops: 4.394e+09 1.00000 4.394e+09 8.787e+09 Flops/sec: 4.808e+09 1.00002 4.808e+09 9.616e+09 MPI Messages: 2.146e+03 1.00000 2.146e+03 4.291e+03 MPI Message Lengths: 4.283e+06 1.00000 1.996e+03 8.566e+06 MPI Reductions: 4.320e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 9.1379e-01 100.0% 8.7871e+09 100.0% 4.291e+03 100.0% 1.996e+03 100.0% 4.319e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %f - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %f %M %L %R %T %f %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage ThreadCommRunKernel 10767 1.0 5.9562e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 59 0 0 0 0 59 0 0 0 0 0 SNESSolve 1 1.0 8.8464e-01 1.0 4.39e+09 1.0 4.3e+03 2.0e+03 4.3e+03 97100100100 99 97100100100 99 9933 SNESFunctionEval 3 1.0 5.4908e-04 1.0 1.26e+06 1.0 6.0e+00 2.0e+03 2.0e+00 0 0 0 0 0 0 0 0 0 0 4590 SNESJacobianEval 2 1.0 1.9532e-01 1.0 1.93e+07 1.0 9.3e+01 2.0e+03 8.1e+01 21 0 2 2 2 21 0 2 2 2 197 SNESLineSearch 2 1.0 9.3198e-04 1.0 2.71e+06 1.0 8.0e+00 2.0e+03 8.0e+00 0 0 0 0 0 0 0 0 0 0 5810 VecDot 2 1.0 4.8876e-05 1.1 8.00e+04 1.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 3274 VecMDot 2024 1.0 1.4255e-01 1.0 1.25e+09 1.0 0.0e+00 0.0e+00 2.0e+03 15 28 0 0 47 15 28 0 0 47 17543 VecNorm 2095 1.0 1.0364e-01 5.9 8.38e+07 1.0 0.0e+00 0.0e+00 2.1e+03 7 2 0 0 48 7 2 0 0 49 1617 VecScale 2092 1.0 7.5700e-03 1.0 4.18e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 11054 VecCopy 2206 1.0 2.5504e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 VecSet 72 1.0 5.2786e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 174 1.0 1.8260e-03 1.0 6.96e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 7623 VecWAXPY 2 1.0 2.2173e-05 1.1 4.00e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 3608 VecMAXPY 2092 1.0 8.3066e-02 1.0 1.33e+09 1.0 0.0e+00 0.0e+00 0.0e+00 9 30 0 0 0 9 30 0 0 0 32055 VecScatterBegin 2139 1.0 8.0633e-03 1.1 0.00e+00 0.0 4.3e+03 2.0e+03 0.0e+00 1 0100100 0 1 0100100 0 0 VecScatterEnd 2139 1.0 8.0688e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecReduceArith 5 1.0 5.6234e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 VecReduceComm 3 1.0 1.9855e-021003.3 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecNormalize 2092 1.0 2.5846e-02 1.0 1.25e+08 1.0 0.0e+00 0.0e+00 2.1e+03 3 3 0 0 48 3 3 0 0 48 9707 MatMult 2092 1.0 3.5138e-01 1.0 1.66e+09 1.0 4.2e+03 2.0e+03 0.0e+00 38 38 98 98 0 38 38 98 98 0 9450 MatAssemblyBegin 3 1.0 5.1332e-04 7.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 3 1.0 2.2070e-03 1.0 0.00e+00 0.0 4.0e+00 2.0e+02 8.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatZeroEntries 2 1.0 8.6918e-02288.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0 MatFDColorCreate 1 1.0 3.3874e-02 1.0 0.00e+00 0.0 4.0e+00 8.0e+02 7.1e+01 4 0 0 0 2 4 0 0 0 2 0 MatFDColorApply 2 1.0 1.6118e-01 1.0 1.93e+07 1.0 8.8e+01 2.1e+03 7.0e+00 18 0 2 2 0 18 0 2 2 0 239 MatFDColorFunc 42 1.0 6.9191e-03 1.2 
1.76e+07 1.0 8.4e+01 2.0e+03 0.0e+00 1 0 2 2 0 1 0 2 2 0 5099 KSPGMRESOrthog 2024 1.0 2.2314e-01 1.0 2.50e+09 1.0 0.0e+00 0.0e+00 2.0e+03 24 57 0 0 47 24 57 0 0 47 22414 KSPSetUp 2 1.0 2.2697e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 2 1.0 6.3165e-01 1.0 4.37e+09 1.0 4.2e+03 2.0e+03 4.2e+03 69 99 97 98 97 69 99 97 98 97 13840 PCSetUp 2 1.0 9.5367e-07 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 2092 1.0 2.5042e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Container 2 2 1112 0 SNES 1 1 1292 0 SNESLineSearch 1 1 848 0 Vector 52 52 7607552 0 Vector Scatter 4 4 4240 0 Matrix 3 3 5410828 0 Matrix FD Coloring 1 1 3255684 0 Distributed Mesh 1 1 107328 0 Bipartite Graph 2 2 1400 0 Index Set 29 29 123784 0 IS L to G Mapping 3 3 83316 0 Krylov Solver 1 1 18304 0 Preconditioner 1 1 768 0 Viewer 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 Average time for MPI_Barrier(): 5.72205e-07 Average time for zero size MPI_Send(): 2.02656e-06 #PETSc Option Table entries: -da_grid_x 100 -da_grid_y 100 -log_summary -mat_no_inode -pc_type none -preload off -threadcomm_nthreads 6 -threadcomm_type openmp #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 -----------------------------------------
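The three summaries above differ only in the -threadcomm_nthreads value recorded in their option tables. As a rough sketch of how the runs could be reproduced: the logs record the options and the use of 2 MPI ranks, but not the launch command itself, so "mpirun -np 2" is assumed here for Platform MPI and ./ex19 is the binary named in each summary header.

# assumed launcher; the option list is copied from the "#PETSc Option Table entries" sections above
for n in 2 4 6; do
  mpirun -np 2 ./ex19 \
    -da_grid_x 100 -da_grid_y 100 \
    -mat_no_inode -pc_type none -preload off \
    -threadcomm_type openmp -threadcomm_nthreads $n \
    -log_summary
done

For reference, the maximum SNESSolve times reported above are 1.5263e+00 s, 9.5998e-01 s, and 8.8464e-01 s for 2, 4, and 6 OpenMP threads per rank, respectively, so most of the reduction comes from moving to 4 threads.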