[petsc-users] HPCToolKit/HPCViewer on OS X

Bhalla, Amneet Pal S amneetb at live.unc.edu
Wed Jan 13 23:12:46 CST 2016



On Jan 13, 2016, at 6:22 PM, Matthew Knepley <knepley at gmail.com<mailto:knepley at gmail.com>> wrote:

Can you mail us a -log_summary for a rough cut? Sometimes its hard
to interpret the data avalanche from one of those tools without a simple map.

Does this indicate some hot spots?


************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

./main2d on a darwin-dbg named Amneets-MBP.attlocal.net<http://Amneets-MBP.attlocal.net> with 1 processor, by Taylor Wed Jan 13 21:07:43 2016
Using Petsc Development GIT revision: v3.6.1-2556-g6721a46  GIT Date: 2015-11-16 13:07:08 -0600

                         Max       Max/Min        Avg      Total
Time (sec):           1.039e+01      1.00000   1.039e+01
Objects:              2.834e+03      1.00000   2.834e+03
Flops:                3.552e+08      1.00000   3.552e+08  3.552e+08
Flops/sec:            3.418e+07      1.00000   3.418e+07  3.418e+07
Memory:               3.949e+07      1.00000              3.949e+07
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00      0.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.0391e+01 100.0%  3.5520e+08 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run ./configure                #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################


Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecDot                 4 1.0 9.0525e-04 1.0 3.31e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    37
VecMDot              533 1.0 1.5936e-02 1.0 5.97e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0   375
VecNorm              412 1.0 9.2107e-03 1.0 3.57e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   388
VecScale             331 1.0 5.8195e-01 1.0 1.41e+06 1.0 0.0e+00 0.0e+00 0.0e+00  6  0  0  0  0   6  0  0  0  0     2
VecCopy              116 1.0 1.9983e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet             18362 1.0 1.5249e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
VecAXPY              254 1.0 4.3961e-01 1.0 1.95e+06 1.0 0.0e+00 0.0e+00 0.0e+00  4  1  0  0  0   4  1  0  0  0     4
VecAYPX               92 1.0 2.5167e-03 1.0 2.66e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   106
VecAXPBYCZ            36 1.0 8.6242e-04 1.0 2.94e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   341
VecWAXPY              58 1.0 1.2539e-03 1.0 2.47e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   197
VecMAXPY             638 1.0 2.3439e-02 1.0 7.68e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0   328
VecSwap              111 1.0 1.9721e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   2  0  0  0  0     0
VecAssemblyBegin     607 1.0 3.8150e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAssemblyEnd       607 1.0 8.3705e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin    26434 1.0 3.0096e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   3  0  0  0  0     0
VecNormalize         260 1.0 4.9754e-01 1.0 3.84e+06 1.0 0.0e+00 0.0e+00 0.0e+00  5  1  0  0  0   5  1  0  0  0     8
BuildTwoSidedF       600 1.0 1.8942e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatMult              365 1.0 6.0306e-01 1.0 6.26e+07 1.0 0.0e+00 0.0e+00 0.0e+00  6 18  0  0  0   6 18  0  0  0   104
MatSolve            8775 1.0 6.8506e-01 1.0 2.25e+08 1.0 0.0e+00 0.0e+00 0.0e+00  7 63  0  0  0   7 63  0  0  0   328
MatLUFactorSym        85 1.0 1.0664e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatLUFactorNum        85 1.0 1.2066e-01 1.0 4.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1 12  0  0  0   1 12  0  0  0   350
MatScale               4 1.0 4.0145e-04 1.0 2.51e+05 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   625
MatAssemblyBegin     108 1.0 4.8849e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       108 1.0 9.8455e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetRow          33120 1.0 1.4157e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatGetRowIJ           85 1.0 2.6060e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice       4 1.0 4.2922e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering        85 1.0 3.1230e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAXPY                4 1.0 4.0459e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   4  0  0  0  0     0
MatPtAP                4 1.0 1.1362e-01 1.0 4.99e+06 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0    44
MatPtAPSymbolic        4 1.0 6.4973e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatPtAPNumeric         4 1.0 4.8521e-02 1.0 4.99e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   103
MatGetSymTrans         4 1.0 5.9780e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog       182 1.0 2.0538e-02 1.0 5.11e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   249
KSPSetUp              90 1.0 2.1210e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 9.5567e+00 1.0 3.50e+08 1.0 0.0e+00 0.0e+00 0.0e+00 92 98  0  0  0  92 98  0  0  0    37
PCSetUp               90 1.0 4.0597e-01 1.0 4.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00  4 12  0  0  0   4 12  0  0  0   104
PCSetUpOnBlocks       91 1.0 2.9886e-01 1.0 4.23e+07 1.0 0.0e+00 0.0e+00 0.0e+00  3 12  0  0  0   3 12  0  0  0   141
PCApply               13 1.0 9.0558e+00 1.0 3.49e+08 1.0 0.0e+00 0.0e+00 0.0e+00 87 98  0  0  0  87 98  0  0  0    39
SNESSolve              1 1.0 9.5729e+00 1.0 3.50e+08 1.0 0.0e+00 0.0e+00 0.0e+00 92 98  0  0  0  92 98  0  0  0    37
SNESFunctionEval       2 1.0 1.3347e-02 1.0 4.68e+04 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     4
SNESJacobianEval       1 1.0 2.4613e-03 1.0 4.26e+03 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     2
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

              Vector   870            762     13314200     0.
      Vector Scatter   290            289       189584     0.
           Index Set  1171            823       951096     0.
   IS L to G Mapping   110            109      2156656     0.
   Application Order     6              6        99952     0.
             MatMFFD     1              1          776     0.
              Matrix   189            189     24202324     0.
   Matrix Null Space     4              4         2432     0.
       Krylov Solver    90             90       190080     0.
     DMKSP interface     1              1          648     0.
      Preconditioner    90             90        89128     0.
                SNES     1              1         1328     0.
      SNESLineSearch     1              1          856     0.
              DMSNES     1              1          664     0.
    Distributed Mesh     2              2         9024     0.
Star Forest Bipartite Graph     4              4         3168     0.
     Discrete System     2              2         1696     0.
              Viewer     1              0            0     0.
========================================================================================================================
Average time to get PetscTime(): 4.74e-08
#PETSc Option Table entries:
-ib_ksp_converged_reason
-ib_ksp_monitor_true_residual
-ib_snes_type ksponly
-log_summary
-stokes_ib_pc_level_ksp_richardson_self_scae
-stokes_ib_pc_level_ksp_type gmres
-stokes_ib_pc_level_pc_asm_local_type additive
-stokes_ib_pc_level_pc_asm_type interpolate
-stokes_ib_pc_level_sub_pc_factor_nonzeros_along_diagonal
-stokes_ib_pc_level_sub_pc_type lu
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --CC=mpicc --CXX=mpicxx --FC=mpif90 --PETSC_ARCH=darwin-dbg --with-debugging=1 --with-c++-support=1 --with-hypre=1 --download-hypre=1 --with-hdf5=yes --with-hdf5-dir=/Users/Taylor/Documents/SOFTWARES/HDF5/
-----------------------------------------
Libraries compiled on Mon Nov 16 15:11:21 2015 on d209.math.ucdavis.edu<http://d209.math.ucdavis.edu>
Machine characteristics: Darwin-14.5.0-x86_64-i386-64bit
Using PETSc directory: /Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc
Using PETSc arch: darwin-dbg
-----------------------------------------

Using C compiler: mpicc    -g  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90   -g   ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------

Using include paths: -I/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/darwin-dbg/include -I/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/include -I/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/include -I/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/darwin-dbg/include -I/opt/X11/include -I/Users/Taylor/Documents/SOFTWARES/HDF5/include -I/opt/local/include -I/Users/Taylor/Documents/SOFTWARES/MPICH/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/darwin-dbg/lib -L/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/darwin-dbg/lib -lpetsc -Wl,-rpath,/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/darwin-dbg/lib -L/Users/Taylor/Documents/SOFTWARES/PETSc-BitBucket/PETSc/darwin-dbg/lib -lHYPRE -Wl,-rpath,/Users/Taylor/Documents/SOFTWARES/MPICH/lib -L/Users/Taylor/Documents/SOFTWARES/MPICH/lib -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/6.1.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/6.1.0/lib/darwin -lclang_rt.osx -lmpicxx -lc++ -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.1.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.1.0/lib/darwin -lclang_rt.osx -llapack -lblas -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -lX11 -Wl,-rpath,/Users/Taylor/Documents/SOFTWARES/HDF5/lib -L/Users/Taylor/Documents/SOFTWARES/HDF5/lib -lhdf5_hl -lhdf5 -lssl -lcrypto -lmpifort -lgfortran -Wl,-rpath,/opt/local/lib/gcc49/gcc/x86_64-apple-darwin14/4.9.1 -L/opt/local/lib/gcc49/gcc/x86_64-apple-darwin14/4.9.1 -Wl,-rpath,/opt/local/lib/gcc49 -L/opt/local/lib/gcc49 -lgfortran -lgcc_ext.10.5 -lquadmath -lm -lclang_rt.osx -lmpicxx -lc++ -lclang_rt.osx -Wl,-rpath,/Users/Taylor/Documents/SOFTWARES/MPICH/lib -L/Users/Taylor/Documents/SOFTWARES/MPICH/lib -ldl -lmpi -lpmpi -lSystem -Wl,-rpath,/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.1.0/lib/darwin -L/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../lib/clang/6.1.0/lib/darwin -lclang_rt.osx -ldl
-----------------------------------------


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20160114/d1f67271/attachment-0001.html>


More information about the petsc-users mailing list