[petsc-users] How to get bandwidth peak out of PETSc log?

Matthew Knepley knepley at gmail.com
Thu Jun 20 15:38:08 CDT 2013


On Thu, Jun 20, 2013 at 10:24 PM, HOUSSEN Franck <Franck.Houssen at cea.fr> wrote:

>  Hello,
>
> I am new to PETSc.
>
> I have written an MPI PETSc code to solve an AX=B system: how can I find
> the peak bandwidth for a given run? The code does not scale as I would
> expect (doubling the number of MPI processes does not halve the elapsed
> time). I would like to understand whether this behavior is due to a poor
> MPI parallelization (which I may be able to improve), or to the fact that
> the bandwidth limit has been reached (in which case, my understanding is
> that I cannot do anything to improve either the performance or the
> scaling). I would like to know what's going on and why!
>
> Concerning the computer, I tried to estimate the peak bandwidth with the
> STREAM benchmark (www.streambench.org/index.html). I get this:
> ~>./stream_c.exe
> ...
> Function    Best Rate MB/s  Avg time     Min time     Max time
> Copy:           11473.8     0.014110     0.013945     0.015064
> Scale:          11421.2     0.014070     0.014009     0.014096
> Add:            12974.4     0.018537     0.018498     0.018590
> Triad:          12964.6     0.018683     0.018512     0.019277
> My conclusion is that the peak bandwidth of my computer is about
> 12 200 MB/s (the average of 11 400 and 12 900), which is about
> 12200 / 1024 = 11.9 GB/s.
>
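Aside: the heart of that benchmark is just a timed triad loop, so the
number can be reproduced in a few lines of C if the packaged code gives
trouble. A minimal sketch (the array size is illustrative; the real STREAM
code adds repetitions, validation, and the other three kernels; compile
with -O2, and link -lrt on older glibc):

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  int main(void)
  {
    /* ~160 MB per array, so the loop streams from memory, not cache */
    const size_t n = 20 * 1000 * 1000;
    double *a = malloc(n * sizeof(double));
    double *b = malloc(n * sizeof(double));
    double *c = malloc(n * sizeof(double));
    const double scalar = 3.0;
    struct timespec t0, t1;
    double t;
    size_t i;

    for (i = 0; i < n; i++) { b[i] = 1.0; c[i] = 2.0; }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < n; i++) a[i] = b[i] + scalar * c[i];   /* triad */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    t = (t1.tv_sec - t0.tv_sec) + 1e-9 * (t1.tv_nsec - t0.tv_nsec);

    /* three arrays of 8-byte doubles move per iteration: 24 bytes */
    printf("Triad: %.1f MB/s (check: %g)\n",
           3.0 * n * sizeof(double) / t / 1.0e6, a[n / 2]);
    free(a); free(b); free(c);
    return 0;
  }

(Printing a[n/2] keeps the compiler from optimizing the loop away.)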

If this is your maximum bandwidth, it limits the amount of computing you
can do, since you must wait for the operands. For VecAXPY, you read 2
doubles from memory and write 1 double for every 2 flops, meaning you do

  2 flops / 24 bytes = 1 flop / 12 bytes

The peak achievable flop rate for VecAXPY on your machine is therefore
roughly

  12.9 GB/s * 1 flop / 12 bytes = 1.1 GF/s

In your log you are getting 1.5 GF/s (the log sums the rate over your 2
processes), so you are at the bandwidth peak. Another way to see this is
that VecMAXPY gets much better performance: if the performance were
limited by flops, this would not be the case.
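If you would rather measure this directly than infer it from the log,
timing VecAXPY yourself takes only a few lines. A minimal sketch (vector
size, alpha, and repetition count are illustrative; error checking is
omitted) that converts the flop rate to bandwidth using the 12 bytes/flop
figure above:

  #include <petscvec.h>

  int main(int argc, char **argv)
  {
    Vec      x, y;
    PetscInt n = 10 * 1000 * 1000;   /* global size: large enough to be memory bound */
    PetscInt i, reps = 50;
    double   t, rate;

    PetscInitialize(&argc, &argv, NULL, NULL);
    VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, n, &x);
    VecDuplicate(x, &y);
    VecSet(x, 1.0);
    VecSet(y, 2.0);

    MPI_Barrier(PETSC_COMM_WORLD);
    t = MPI_Wtime();
    for (i = 0; i < reps; i++) VecAXPY(y, 3.14, x);   /* y <- y + alpha*x */
    t = MPI_Wtime() - t;

    /* 2 flops per entry; n is the global size, so this rate is summed
       over all processes, like the -log_summary figure */
    rate = 2.0 * n * reps / t;
    PetscPrintf(PETSC_COMM_WORLD, "VecAXPY: %.2f GF/s -> %.2f GB/s at 12 bytes/flop\n",
                rate / 1.0e9, 12.0 * rate / 1.0e9);

    VecDestroy(&x);
    VecDestroy(&y);
    PetscFinalize();
    return 0;
  }

Run it on 1 and then 2 processes: if the GB/s figure stops growing, you
are bandwidth limited.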

    Matt

> Concerning PETSc, I tried to find (without success) a figure to compare to
> 11.9 GB/s. First, I tried to run "make streams" but it doesn't compile (I
> run PETSc 3.4.1 on Ubuntu 12.04). Then, I looked into the PETSc log (using
> -log_summary and -ksp_view) but I was not able to find an estimate of the
> bandwidth in it (I get information about time and flops but not about the
> amount of data transferred between MPI processes in MB). How can I get the
> peak bandwidth out of the PETSc log? Is there a specific option for that,
> or is this not possible?
>
> I have attached the PETSc log (MPI run over 2 processes).
>
> Thanks,
>
> FH
>
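To answer the question above in general: pick a memory-bound operation in
the -log_summary table (VecAXPY is the simplest), read off its flop rate,
and multiply by that operation's bytes-per-flop ratio. Here that gives

  1.5 GF/s * 12 bytes/flop = 18 GB/s (summed over the 2 processes)

which is above the single-process STREAM number. That is consistent with
being bandwidth limited: one process alone typically cannot saturate the
memory system, so two together can draw more in aggregate even as each one
slows down.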



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener