<div dir="ltr"><div class="gmail_default" style="font-size:small">Thanks Barry for the reply. When I use the option -log_view, I got the Petsc API function profiling posted as below. But I wonder is there any way for me to get the performance/execution profiling of the functions in the code, like in Xcode profiling, so that I would know which part needs special optimization? I now badly needs your instructions. Thank you very much in advance!<br><br></div><div class="gmail_default" style="font-size:small"><br>---------------------------------------------- PETSc Performance Summary: ----------------------------------------------<br><br>./pStreamer on a arch-darwin-c-opt named <a href="http://rcc-dhcp-249-233.sr.unh.edu">rcc-dhcp-249-233.sr.unh.edu</a> with 4 processors, by GPL Wed Nov 23 18:49:59 2016<br>Using Petsc Release Version 3.5.4, May, 23, 2015 <br><br> Max Max/Min Avg Total <br>Time (sec): 1.026e+02 1.00004 1.026e+02<br>Objects: 3.420e+02 1.00000 3.420e+02<br>Flops: 2.509e+10 1.00200 2.506e+10 1.002e+11<br>Flops/sec: 2.445e+08 1.00202 2.441e+08 9.764e+08<br>MPI Messages: 1.387e+04 1.99971 1.041e+04 4.163e+04<br>MPI Message Lengths: 2.783e+07 1.99971 2.006e+03 8.350e+07<br>MPI Reductions: 1.425e+04 1.00000<br><br>Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)<br> e.g., VecAXPY() for real vectors of length N --> 2N flops<br> and VecAXPY() for complex vectors of length N --> 8N flops<br><br>Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --<br> Avg %Total Avg %Total counts %Total Avg %Total counts %Total <br> 0: Main Stage: 1.0264e+02 100.0% 1.0022e+11 100.0% 4.163e+04 100.0% 2.006e+03 100.0% 1.425e+04 100.0% <br><br>------------------------------------------------------------------------------------------------------------------------<br>See the 'Profiling' chapter of the users' manual for details on interpreting output.<br>Phase summary info:<br> Count: number of times phase was executed<br> Time and Flops: Max - maximum over all processors<br> Ratio - ratio of maximum to minimum over all processors<br> Mess: number of messages sent<br> Avg. len: average message length (bytes)<br> Reduct: number of global reductions<br> Global: entire computation<br> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().<br> %T - percent time in this phase %F - percent flops in this phase<br> %M - percent messages in this phase %L - percent message lengths in this phase<br> %R - percent reductions in this phase<br> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)<br>------------------------------------------------------------------------------------------------------------------------<br>Event Count Time (sec) Flops --- Global --- --- Stage --- Total<br> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s<br>------------------------------------------------------------------------------------------------------------------------<br><br>--- Event Stage 0: Main Stage<br><br>VecDot 6778 1.0 5.0633e+00 1.1 1.70e+09 1.0 0.0e+00 0.0e+00 6.8e+03 5 7 0 0 48 5 7 0 0 48 1345<br>VecDotNorm2 3389 1.0 3.1751e+00 1.3 1.70e+09 1.0 0.0e+00 0.0e+00 3.4e+03 3 7 0 0 24 3 7 0 0 24 2144<br>VecNorm 3571 1.0 2.1366e+00 1.1 8.98e+08 1.0 0.0e+00 0.0e+00 3.6e+03 2 4 0 0 25 2 4 0 0 25 1679<br>VecScale 49 1.0 1.2606e-02 1.2 6.16e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 1952<br>VecCopy 157 1.0 4.5638e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>VecSet 7140 1.0 5.1525e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>VecAXPY 146 1.0 7.0036e-02 1.2 3.97e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2266<br>VecAXPBYCZ 6778 1.0 4.9426e+00 1.2 3.41e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 14 0 0 0 4 14 0 0 0 2755<br>VecWAXPY 6785 1.0 3.9687e+00 1.0 1.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00 4 7 0 0 0 4 7 0 0 0 1716<br>VecScatterBegin 6918 1.0 3.4263e-01 1.7 0.00e+00 0.0 4.2e+04 2.0e+03 0.0e+00 0 0100100 0 0 0100100 0 0<br>VecScatterEnd 6918 1.0 8.6652e+0029.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 4 0 0 0 0 0<br>MatMult 6869 1.0 2.6045e+01 1.5 7.76e+09 1.0 4.1e+04 2.0e+03 0.0e+00 21 31 99 99 0 21 31 99 99 0 1190<br>MatSolve 6960 1.0 1.7961e+01 1.0 7.86e+09 1.0 0.0e+00 0.0e+00 0.0e+00 17 31 0 0 0 17 31 0 0 0 1747<br>MatLUFactorNum 7 1.0 5.3501e-02 1.1 9.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 719<br>MatILUFactorSym 7 1.0 5.1423e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>MatAssemblyBegin 14 1.0 3.7722e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 2.8e+01 0 0 0 0 0 0 0 0 0 0 0<br>MatAssemblyEnd 14 1.0 6.4600e-02 1.0 0.00e+00 0.0 8.4e+01 5.0e+02 5.6e+01 0 0 0 0 0 0 0 0 0 0 0<br>MatGetRowIJ 7 1.0 3.4150e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>MatGetOrdering 7 1.0 5.8947e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0<br>KSPSetUp 14 1.0 3.9437e+00132.6 0.00e+00 0.0 0.0e+00 0.0e+00 8.4e+01 2 0 0 0 1 2 0 0 0 1 0<br>KSPSolve 91 1.0 6.6894e+01 1.2 2.51e+10 1.0 4.1e+04 2.0e+03 1.4e+04 59100 99 99 98 59100 99 99 98 1497<br>PCSetUp 14 1.0 1.1119e-01 1.1 9.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 346<br>PCSetUpOnBlocks 91 1.0 1.1087e-01 1.1 9.62e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 347<br>PCApply 6960 1.0 1.8604e+01 1.0 7.86e+09 1.0 0.0e+00 0.0e+00 0.0e+00 18 31 0 0 0 18 31 0 0 0 1687<br>TSStep 6 1.0 7.0114e+01 1.0 2.01e+10 1.0 3.3e+04 2.0e+03 1.1e+04 68 80 80 80 78 68 80 80 80 78 1146<br>TSFunctionEval 6 1.0 7.0102e+01 1.0 2.01e+10 1.0 3.3e+04 2.0e+03 1.1e+04 68 80 80 80 78 68 80 80 80 78 1146<br>------------------------------------------------------------------------------------------------------------------------<br><br>Memory usage is given in bytes:<br><br>Object Type Creations Destructions Memory Descendants' Mem.<br>Reports information only for process 0.<br><br>--- Event Stage 0: Main Stage<br><br> Vector 212 207 180438384 0<br> Vector Scatter 9 8 1014488 0<br> Matrix 28 28 147732508 0<br> Distributed Mesh 4 3 13776 0<br>Star Forest Bipartite Graph 8 6 4800 0<br> Discrete System 4 3 2376 0<br> Index Set 39 39 8087676 0<br> IS L to G Mapping 2 1 504596 0<br> Krylov Solver 15 15 17456 0<br> Preconditioner 15 15 14208 0<br> TSAdapt 1 1 1208 0<br> TS 1 1 1168 0<br> DMTS 1 1 712 0<br> SNES 1 1 1336 0<br> DMSNES 1 1 672 0<br> Viewer 1 0 0 0<br>========================================================================================================================<br>Average time to get PetscTime(): 3.82e-08<br>Average time for MPI_Barrier(): 6.2797e-05<br>Average time for zero size MPI_Send(): 1.98967e-05<br>#PETSc Option Table entries:<br>-log_view<br>#End of PETSc Option Table entries<br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Oct 19, 2016 at 8:33 PM, Barry Smith <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
To start you should just run the code with the ./configure option --with-debugging=0 with the option -log_view this will give you the high view of where it is spending the time. Feel free to email back the output. From that you can focus on what parts are both taking a lot of time AND running slowly and that gives a good idea of what needs to be optimized.<br>
<span class="HOEnZb"><font color="#888888"><br>
Barry<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> On Oct 19, 2016, at 3:15 PM, Sharp Stone <<a href="mailto:thronesf@gmail.com">thronesf@gmail.com</a>> wrote:<br>
><br>
> Dear all,<br>
><br>
> Now I'm using a Petsc code which needs to be optimized. But after trying, I still don't know how to get the profiling for each of the function for each process. I mean, for each process, how should I know the execution time for each function?<br>
><br>
> Thanks!<br>
><br>
> --<br>
> Best regards,<br>
><br>
> Feng<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><font size="2">Best regards,</font></div><div><font size="2"><br></font></div><font size="2">Feng</font></div></div></div></div>
</div>