[petsc-users] HPCToolKit/HPCViewer on OS X
Bhalla, Amneet Pal S
amneetb at live.unc.edu
Thu Jan 14 01:26:57 CST 2016
On Jan 13, 2016, at 9:17 PM, Griffith, Boyce Eugene <boyceg at email.unc.edu<mailto:boyceg at email.unc.edu>> wrote:
I see one hot spot:
Here is with opt build
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
./main2d on a linux-opt named aorta with 1 processor, by amneetb Thu Jan 14 02:24:43 2016
Using Petsc Development GIT revision: v3.6.3-3098-ga3ecda2 GIT Date: 2016-01-13 21:30:26 -0600
Max Max/Min Avg Total
Time (sec): 1.018e+00 1.00000 1.018e+00
Objects: 2.935e+03 1.00000 2.935e+03
Flops: 4.957e+08 1.00000 4.957e+08 4.957e+08
Flops/sec: 4.868e+08 1.00000 4.868e+08 4.868e+08
MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Reductions: 0.000e+00 0.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 1.0183e+00 100.0% 4.9570e+08 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecDot 4 1.0 2.9564e-05 1.0 3.31e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1120
VecDotNorm2 272 1.0 1.4565e-03 1.0 4.25e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2920
VecMDot 624 1.0 8.4300e-03 1.0 5.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 627
VecNorm 565 1.0 3.8033e-03 1.0 4.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1151
VecScale 86 1.0 5.5480e-04 1.0 1.55e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 279
VecCopy 28 1.0 5.2261e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 14567 1.0 1.2443e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
VecAXPY 903 1.0 4.2996e-03 1.0 6.66e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1550
VecAYPX 225 1.0 1.2550e-03 1.0 8.55e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 681
VecAXPBYCZ 42 1.0 1.7118e-04 1.0 3.45e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2014
VecWAXPY 70 1.0 1.9503e-04 1.0 2.98e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1528
VecMAXPY 641 1.0 1.1136e-02 1.0 5.29e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 475
VecSwap 135 1.0 4.5896e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyBegin 745 1.0 4.9477e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecAssemblyEnd 745 1.0 9.2411e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 40831 1.0 3.4502e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
BuildTwoSidedF 738 1.0 2.6712e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatMult 513 1.0 9.1235e-02 1.0 7.75e+07 1.0 0.0e+00 0.0e+00 0.0e+00 9 16 0 0 0 9 16 0 0 0 849
MatSolve 13568 1.0 2.3605e-01 1.0 3.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00 23 70 0 0 0 23 70 0 0 0 1460
MatLUFactorSym 84 1.0 3.7430e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0
MatLUFactorNum 85 1.0 3.9623e-02 1.0 4.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 4 8 0 0 0 4 8 0 0 0 1058
MatILUFactorSym 1 1.0 3.3617e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 4 1.0 2.5511e-04 1.0 2.51e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 984
MatAssemblyBegin 108 1.0 6.3658e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 108 1.0 2.9490e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetRow 33120 1.0 2.0157e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0
MatGetRowIJ 85 1.0 1.2145e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetSubMatrice 4 1.0 8.4379e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatGetOrdering 85 1.0 7.7887e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatAXPY 4 1.0 4.9596e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 5 0 0 0 0 5 0 0 0 0 0
MatPtAP 4 1.0 4.4426e-02 1.0 4.99e+06 1.0 0.0e+00 0.0e+00 0.0e+00 4 1 0 0 0 4 1 0 0 0 112
MatPtAPSymbolic 4 1.0 2.7664e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 3 0 0 0 0 0
MatPtAPNumeric 4 1.0 1.6732e-02 1.0 4.99e+06 1.0 0.0e+00 0.0e+00 0.0e+00 2 1 0 0 0 2 1 0 0 0 298
MatGetSymTrans 4 1.0 3.6621e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPGMRESOrthog 16 1.0 9.7778e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
KSPSetUp 90 1.0 5.7650e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 7.8831e-01 1.0 4.90e+08 1.0 0.0e+00 0.0e+00 0.0e+00 77 99 0 0 0 77 99 0 0 0 622
PCSetUp 90 1.0 9.9725e-02 1.0 4.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 10 8 0 0 0 10 8 0 0 0 420
PCSetUpOnBlocks 112 1.0 8.7547e-02 1.0 4.19e+07 1.0 0.0e+00 0.0e+00 0.0e+00 9 8 0 0 0 9 8 0 0 0 479
PCApply 16 1.0 7.1952e-01 1.0 4.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00 71 99 0 0 0 71 99 0 0 0 680
SNESSolve 1 1.0 7.9225e-01 1.0 4.90e+08 1.0 0.0e+00 0.0e+00 0.0e+00 78 99 0 0 0 78 99 0 0 0 619
SNESFunctionEval 2 1.0 3.2940e-03 1.0 4.68e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 14
SNESJacobianEval 1 1.0 4.7255e-04 1.0 4.26e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 9
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Vector 971 839 15573352 0.
Vector Scatter 290 289 189584 0.
Index Set 1171 823 951928 0.
IS L to G Mapping 110 109 2156656 0.
Application Order 6 6 99952 0.
MatMFFD 1 1 776 0.
Matrix 189 189 24083332 0.
Matrix Null Space 4 4 2432 0.
Krylov Solver 90 90 122720 0.
DMKSP interface 1 1 648 0.
Preconditioner 90 90 89872 0.
SNES 1 1 1328 0.
SNESLineSearch 1 1 984 0.
DMSNES 1 1 664 0.
Distributed Mesh 2 2 9168 0.
Star Forest Bipartite Graph 4 4 3168 0.
Discrete System 2 2 1712 0.
Viewer 1 0 0 0.
========================================================================================================================
Average time to get PetscTime(): 9.53674e-07
#PETSc Option Table entries:
-ib_ksp_converged_reason
-ib_ksp_monitor_true_residual
-ib_snes_type ksponly
-log_summary
-stokes_ib_pc_level_0_sub_pc_factor_nonzeros_along_diagonal
-stokes_ib_pc_level_0_sub_pc_type ilu
-stokes_ib_pc_level_ksp_richardson_self_scale
-stokes_ib_pc_level_ksp_type richardson
-stokes_ib_pc_level_pc_asm_local_type additive
-stokes_ib_pc_level_sub_pc_factor_nonzeros_along_diagonal
-stokes_ib_pc_level_sub_pc_type lu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --CC=mpicc --CXX=mpicxx --FC=mpif90 --with-default-arch=0 --PETSC_ARCH=linux-opt --with-debugging=0 --with-c++-support=1 --with-hypre=1 --download-hypre=1 --with-hdf5=yes --COPTFLAGS=-O3 --CXXOPTFLAGS=-O3 --FOPTFLAGS=-O3
-----------------------------------------
Libraries compiled on Thu Jan 14 01:29:56 2016 on aorta
Machine characteristics: Linux-3.13.0-63-generic-x86_64-with-Ubuntu-14.04-trusty
Using PETSc directory: /not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc
Using PETSc arch: linux-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Qunused-arguments -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -Wall -Wno-unused-variable -ffree-line-length-0 -Wno-unused-dummy-argument -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/linux-opt/include -I/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/include -I/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/include -I/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/linux-opt/include -I/not_backed_up/softwares/MPICH/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/linux-opt/lib -L/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/linux-opt/lib -lpetsc -Wl,-rpath,/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/linux-opt/lib -L/not_backed_up/amneetb/softwares/PETSc-BitBucket/PETSc/linux-opt/lib -lHYPRE -Wl,-rpath,/not_backed_up/softwares/MPICH/lib -L/not_backed_up/softwares/MPICH/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpicxx -lstdc++ -llapack -lblas -lpthread -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lX11 -lm -lmpifort -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpicxx -lstdc++ -Wl,-rpath,/not_backed_up/softwares/MPICH/lib -L/not_backed_up/softwares/MPICH/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/not_backed_up/softwares/MPICH/lib -lmpi -lgcc_s -ldl
-----------------------------------------
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20160114/fc05cbee/attachment-0001.html>
More information about the petsc-users
mailing list