[mpich-discuss] Faster MPI_Attr_get?

Jed Brown jedbrown at mcs.anl.gov
Fri May 11 16:38:15 CDT 2012


On Fri, May 11, 2012 at 4:20 PM, Jeff Hammond <jhammond at alcf.anl.gov> wrote:

> that's probably greatly underestimating the cost of this function
> since i assume in this test every time the function is called, both
> the dcache and icache hit every time.
>

Actually, that's kinda the case I'm interested in. If the threads pull in
enough data to knock the attribute out of L1, chances are that they will
take at least a few microseconds (except in pathological cases that hit
associativity). Suppose I'm doing BLAS level 1 flavor of vector operations
with vectors of length a couple thousand, so just big enough to be out of
L1 if done in serial. But with 16 or 32 threads, the actual work is very
fast (order of 100 cycles) because we have enough L1 and the operations
don't conflict with other cache lines. There are no atomic instructions in
launching a kernel, though there is a write from one thread and a read from
another, so the writer needs to get exclusive access to a cache line and
then the reader needs to get the line back. But that line shuffling doesn't
affect whether MPI's attribute table stays in cache.
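
To put rough numbers on the working-set argument above, here is a minimal
sketch in C. The three-vector operation (something like w = a*x + y), the
use of doubles, and the 32 KiB L1 data cache are assumptions chosen for
illustration, not figures from this thread, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes touched per thread when `nvecs` vectors of `n` doubles are split
 * evenly across `nthreads` threads (chunk size rounded up). */
size_t working_set_bytes(size_t n, size_t nvecs, size_t nthreads)
{
    assert(nthreads > 0);
    size_t chunk = (n + nthreads - 1) / nthreads; /* elements per thread */
    return chunk * nvecs * sizeof(double);
}

/* Nonzero if the per-thread working set fits in an L1 of `l1_bytes`
 * (assumed size; ignores associativity conflicts and other data). */
int fits_in_l1(size_t n, size_t nvecs, size_t nthreads, size_t l1_bytes)
{
    return working_set_bytes(n, nvecs, nthreads) <= l1_bytes;
}
```

Under these assumptions (8-byte doubles), three 2000-element vectors occupy
48000 bytes in serial, which exceeds 32 KiB, but only 3000 bytes per thread
when split across 16 threads.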

I can make an example to either validate or refute the discussion above.


>
> Is "-O2" a suboption to "-pipe" or are you giving the compiler
> conflicting flags?
>

No, MPICH2 slaps its own -O2 on the end of whatever the user asked for.
-pipe is irrelevant for optimization and perhaps not useful any more (it
used to reduce file system traffic by having gcc use pipes instead of
temporary files). The last optimization option is used in any case, so it
doesn't matter.


>
> > MPICH2 Version:     1.5b1
> > MPICH2 Release date: unreleased development copy
> > MPICH2 Device:     ch3:nemesis
> > MPICH2 configure: --prefix=/homes/jedbrown/usr/mpich-intel
> > --enable-shared
> > --enable-error-checking=runtime --enable-error-messages=all
> > --enable-timer-type=clock_gettime CC=icc CXX=icpc --enable-fc=0
> > --enable-f77=0 FC= F77=
> > MPICH2 CC: icc    -O2
> > MPICH2 CXX: icpc   -O2
> > MPICH2 F77: gfortran
> > MPICH2 FC: gfortran
>
> "--enable-error-checking=runtime --enable-error-messages=all" would
> seem to be the kind of thing Dave is talking about that affect
> performance.
>

Right, so should I turn that off? This is a development environment, so I
definitely want those options. I can build a different MPI for profiling,
but the error checking has been irrelevant in other tests. Shouldn't the
overhead of run-time error checking amount to only a few unlikely
conditionals?
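
As a sketch of what "a few unlikely conditionals" could look like, consider
the C fragment below. It is not MPICH2's actual code: `attr_lookup`,
`check_errors`, the table layout, and the error codes are all hypothetical,
and `__builtin_expect` is a GCC extension (also accepted by Clang and icc).

```c
#include <stddef.h>

/* Tell the compiler the condition is expected to be false. */
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

enum { ATTR_OK = 0, ATTR_ERR_ARG = 1 };

/* Hypothetical run-time switch standing in for a "runtime"
 * error-checking mode; 1 enables argument validation. */
int check_errors = 1;

/* Hypothetical lookup in a table of `n` attribute pointers indexed by key.
 * On success, stores the value and whether it was set. */
int attr_lookup(void *const *table, int n, int key, void **value, int *found)
{
    if (check_errors) {
        /* Each check is a compare and a branch hinted as not taken. */
        if (UNLIKELY(table == NULL || value == NULL || found == NULL))
            return ATTR_ERR_ARG;
        if (UNLIKELY(key < 0 || key >= n))
            return ATTR_ERR_ARG;
    }
    *value = table[key];
    *found = (table[key] != NULL);
    return ATTR_OK;
}
```

In a sketch like this, the checked path adds one load of the switch plus a
handful of well-predicted branches when the arguments are valid. Whether a
real implementation is that cheap is exactly what profiling would have to
show.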

I could also build MPICH2 with all error checking turned off, but with
debugging symbols so that I can determine which lines are sucking up the
time?