[petsc-users] RE : How to get bandwidth peak out of PETSc log ?

HOUSSEN Franck Franck.Houssen at cea.fr
Fri Jun 21 01:45:54 CDT 2013


Hello,

The log I attached was a very small test case (1 iteration): I just wanted to get some information about the PETSc log output (time, flops...). I profiled the code on a "big" test case with Scalasca: on realistic test cases I know I spend 95% of the time in PETSc calls.

Barry,
I have attached 2 logs over 1 and 2 procs running an intermediate test case (not small, but not big either).
From PETSc_1proc.log, I get:
Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 2.6379e+02 100.0%  9.8232e+10 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  2.943e+03 100.0% 
My understanding is that the relevant figure is 9.8232e+10 (flop) / 2.6379e+02 (sec) = 3.72e+08 flop/s = 372 Mflop/s for 1 proc.
From PETSc_2procs.log, I get:
Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 1.6733e+02 100.0%  1.0348e+11 100.0%  3.427e+04 100.0%  2.951e+04      100.0%  1.355e+04 100.0% 
My understanding is that the relevant figure is 1.0348e+11 (flop) / 1.6733e+02 (sec) = 6.18e+08 flop/s = 618 Mflop/s for 2 procs.

Am I correct ?

My understanding is that if the code scaled "well", doubling the number of MPI processes should double the flop rate, which is not the case (I get 618 Mflop/s instead of 2 * 372 Mflop/s): right? wrong?
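To make the comparison explicit (my own arithmetic from the two aggregate rates above):

    speedup = 6.18e+08 flop/s / 3.72e+08 flop/s ~ 1.66   (ideal scaling over 2 procs would give 2.0)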
From this, how can I know whether I am at the bandwidth peak?

If I compare these aggregate rates to the machine characteristic computed by Matt (12.9 GB/s * 1 flop / 12 bytes = 1.1 GF/s), they both fall below it, and I am not sure how to interpret the difference... I am not even sure these numbers can be compared: I guess not?! Did I miss something?
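For reference, my understanding of where the 1.1 GF/s comes from (assuming double precision and a VecAXPY-like kernel, y = y + a*x, which does 2 flops per vector entry while moving 3 values of 8 bytes each):

    (3 * 8 bytes) / (2 flops) = 12 bytes/flop,   and   12.9 GB/s / (12 bytes/flop) ~ 1.1 Gflop/s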

Moreover, I realize that I get:
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
VecAXPY            14402 1.0 5.4951e+00 1.0 6.63e+09 1.0 0.0e+00 0.0e+00 0.0e+00  2  7  0  0  0   2  7  0  0  0  1206 (in PETSc_1proc.log => 1.2 Gflop/s, above the 1.1 Gflop/s computed by Matt)
VecAXPY            14402 1.0 4.3619e+00 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00  3  6  0  0  0   3  6  0  0  0  1520 (in PETSc_2procs.log => 1.5 Gflop/s, above the 1.1 Gflop/s computed by Matt)

My understanding is that I can conclude I am at the bandwidth peak if I rely on the two VecAXPY lines above (1.2 and 1.5 Gflop/s, both above the 1.1 Gflop/s computed by Matt).

But I cannot conclude anything when I rely on the aggregate rates (372 Mflop/s for 1 proc / 618 Mflop/s for 2 procs).
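In case it helps to interpret the aggregate numbers, here is a minimal sketch (my own toy example, not taken from my application; the stage name and the 1D Laplacian are just placeholders) of how the solve can be pushed into its own logging stage, so that -log_summary reports the flop rate of the solve alone rather than of the whole run:

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, b;
  KSP            ksp;
  PetscLogStage  solveStage;
  PetscInt       i, rstart, rend, col[3], n = 100;
  PetscScalar    v[3];
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);

  /* Toy 1D Laplacian, only so the sketch is complete and runnable */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    v[0] = -1.0; v[1] = 2.0; v[2] = -1.0;
    col[0] = i - 1; col[1] = i; col[2] = i + 1;
    if (i == 0)          { ierr = MatSetValues(A, 1, &i, 2, &col[1], &v[1], INSERT_VALUES);CHKERRQ(ierr); }
    else if (i == n - 1) { ierr = MatSetValues(A, 1, &i, 2, col, v, INSERT_VALUES);CHKERRQ(ierr); }
    else                 { ierr = MatSetValues(A, 1, &i, 3, col, v, INSERT_VALUES);CHKERRQ(ierr); }
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = VecCreate(PETSC_COMM_WORLD, &b);CHKERRQ(ierr);
  ierr = VecSetSizes(b, PETSC_DECIDE, n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(b);CHKERRQ(ierr);
  ierr = VecDuplicate(b, &x);CHKERRQ(ierr);
  ierr = VecSet(b, 1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr); /* PETSc 3.4 signature */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

  /* Everything between push and pop is reported as its own stage by -log_summary */
  ierr = PetscLogStageRegister("Solve", &solveStage);CHKERRQ(ierr);
  ierr = PetscLogStagePush(solveStage);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

Running this with -log_summary prints a separate "Solve" stage in the "Summary of Stages" table, with its own time, flop and message counts, which seems easier to compare against the VecAXPY event lines.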

Can somebody help me see this clearly?!

Thanks,

FH
 
________________________________________
From: Barry Smith [bsmith at mcs.anl.gov]
Sent: Thursday, June 20, 2013 23:20
To: HOUSSEN Franck
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] How to get bandwidth peak out of PETSc log ?

   Please send also the -log_summary for 1 process.

   Note that in the run you provided the time spent in PETSc is about 25 percent of the total run. So how the "other" portion of the code scales will make a large difference for the speedup, hence we need to see runs with a different number of processes.

   Barry

On Jun 20, 2013, at 3:24 PM, HOUSSEN Franck <Franck.Houssen at cea.fr> wrote:

> Hello,
>
> I am new to PETSc.
>
> I have written a (MPI) PETSc code to solve an AX=B system: how can I know the bandwidth peak for a given run? The code does not scale as I would expect (doubling the number of MPI processes does not halve the elapsed time). I would like to understand whether this behavior is related to a poor MPI parallelization (which I may be able to improve), or to the fact that the bandwidth limit has been reached (in which case, my understanding is that I cannot do anything to improve either the performance or the scaling). I would like to know what's going on and why!
>
> Concerning the computer, I have tried to estimate the bandwidth peak with the STREAM benchmark (www.streambench.org/index.html). I get this:
> ~>./stream_c.exe
> ...
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           11473.8     0.014110     0.013945     0.015064
> Scale:          11421.2     0.014070     0.014009     0.014096
> Add:            12974.4     0.018537     0.018498     0.018590
> Triad:          12964.6     0.018683     0.018512     0.019277
> As a conclusion, my understanding is that the bandwidth peak of my computer is about 12 200 MB/s (the average of 11 400 and 12 900), which is about 12200 / 1024 = 11.9 GB/s.
>
> Concerning PETSc, I tried to find (without success) a figure to compare to 11.9 GB/s. First, I tried to run "make streams" but it does not compile (I run PETSc 3.4.1 on Ubuntu 12.04). Then, I looked into the PETSc log (using -log_summary and -ksp_view) but I was not able to find an estimate of the bandwidth in it (I get information about time and flops, but not about the amount of data transferred between MPI processes in MB). How can I get the bandwidth peak out of the PETSc log? Is there a specific option for that, or is this not possible?
>
> I have attached the PETSc log (MPI run over 2 processes).
>
> Thanks,
>
> FH
> <PETSc.log>
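
PS: for anyone who wants to cross-check the STREAM numbers quoted above without the full benchmark, here is a minimal triad-style sketch (my own quick test, not the official STREAM code; the array size N and the 24 bytes/entry count are my assumptions, and write-allocate traffic is ignored):

#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>

int main(void)
{
  const size_t N = 20 * 1000 * 1000;   /* ~160 MB per array, well beyond cache */
  double *a = malloc(N * sizeof(double));
  double *b = malloc(N * sizeof(double));
  double *c = malloc(N * sizeof(double));
  double scalar = 3.0, sec, gbytes;
  struct timeval t0, t1;
  size_t i;

  if (!a || !b || !c) { fprintf(stderr, "allocation failed\n"); return 1; }
  for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

  gettimeofday(&t0, NULL);
  for (i = 0; i < N; i++) a[i] = b[i] + scalar * c[i];   /* triad: 2 flops, 3 doubles per entry */
  gettimeofday(&t1, NULL);

  sec    = (t1.tv_sec - t0.tv_sec) + 1e-6 * (t1.tv_usec - t0.tv_usec);
  gbytes = 3.0 * N * sizeof(double) / 1e9;               /* 2 loads + 1 store per entry */
  printf("triad: %.2f GB/s (a[0] = %g)\n", gbytes / sec, a[0]);

  free(a); free(b); free(c);
  return 0;
}

A single pass is enough for a rough figure; the real STREAM code repeats the kernels and reports the best rate.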

-------------- next part --------------
A non-text attachment was scrubbed...
Name: PETSc_1proc.log
Type: text/x-log
Size: 12342 bytes
Desc: PETSc_1proc.log
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20130621/5d72f24f/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: PETSc_2procs.log
Type: text/x-log
Size: 12972 bytes
Desc: PETSc_2procs.log
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20130621/5d72f24f/attachment-0003.bin>

