[mpich2-dev] ROMIO: Interleaving test
Rob Latham
robl at mcs.anl.gov
Wed Sep 1 11:25:31 CDT 2010
On Wed, Sep 01, 2010 at 03:37:30PM +0200, Pascal Deveze wrote:
> There is one test that I do not understand. This test is used
> in the collective read/write to detect if the data are interleaved:
>
> /* are the accesses of different processes interleaved? */
> for (i=1; i<nprocs; i++)
> if ((st_offsets[i] < end_offsets[i-1]) &&
> (st_offsets[i] <= end_offsets[i]))
> interleave_count++;
> /* This is a rudimentary check for interleaving, but should suffice
> for the moment. */
> }
>
> The second member of the if statement (st_offsets[i] <=
> end_offsets[i]) is always verified.
> I think this should be (st_offsets[i-1] <= end_offsets[i]).
That addition happened 6 years ago, but I can't find the original bug
report (it's in the old req system, if someone can find "MPICH2 req
#1174" that might tell us more).
for (i=1; i<nprocs; i++)
- if (st_offsets[i] < end_offsets[i-1]) interleave_count++;
+ if ((st_offsets[i] < end_offsets[i-1]) &&
+ (st_offsets[i] <= end_offsets[i]))
+ interleave_count++;
/* This is a rudimentary check for interleaving, but should suffice
for the moment. */
Ah, here we go. Back in 2004 Jianwei Li found a bug when some
processes had zero elements.
"When counting the "interleave_count", segments with length == 0
should not be counted in even if their starting offsets fall
within previous segment range."
I'm not sure why the check is for "<=" instead of strictly "<",
though. Wish I had a test case attached to this old bug report.
Ok, now I do. Attached, and I'll add this to the repository.
> Do I miss something ?
Yes, but it's not hard to miss this subtle thing: the comment a few
lines earlier sheds some light on this matter:
/* Note: end_offset points to the last byte-offset that will be accessed.
e.g., if start_offset=0 and 100 bytes to be read, end_offset=99*/
So, in the test case I attached, if you run it with four procs, your
st_offsets and end_offsets arrays look like this:
st_offsets[] = {0, 1, 2, 3}
end_offsets[] = {3, 0, 1, 2}
See, if I do a zero-byte write at offset 3, my start is 3 and my end
is actually 2. So st_offsets[i] is not always less than or equal to
end_offsets[i]: specifically, it won't be if the request was for zero
bytes.
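To make the zero-byte case concrete, here is a minimal sketch of the two
versions of the check. It is not the ROMIO source: the function names
count_interleaved_old and count_interleaved are made up for illustration,
and plain long long stands in for ROMIO's offset type.

```c
/* Check as it stood before the 2004 change: a process counts as
 * interleaved whenever it starts before the previous process ends. */
int count_interleaved_old(const long long *st_offsets,
                          const long long *end_offsets, int nprocs)
{
    int i, interleave_count = 0;

    for (i = 1; i < nprocs; i++)
        if (st_offsets[i] < end_offsets[i-1])
            interleave_count++;
    return interleave_count;
}

/* Check after the 2004 change: the second condition is false for a
 * zero-byte request, where end_offsets[i] == st_offsets[i] - 1, so
 * empty requests are never counted. */
int count_interleaved(const long long *st_offsets,
                      const long long *end_offsets, int nprocs)
{
    int i, interleave_count = 0;

    for (i = 1; i < nprocs; i++)
        if ((st_offsets[i] < end_offsets[i-1]) &&
            (st_offsets[i] <= end_offsets[i]))
            interleave_count++;
    return interleave_count;
}
```

With the four-proc arrays above, count_interleaved_old() returns 1
(rank 1 starts at 1, before rank 0 ends at 3), while count_interleaved()
returns 0 because ranks 1 through 3 all have st_offsets[i] > end_offsets[i].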
> And as the interleave_count is always tested with 0, it should be
> possible to break the loop
> after the incrementation of interleave_count.
I suppose we could do something clever like "optimize harder" if the
interleave count is higher... well, we don't do that :>
> In my point of view, the test could be something like:
> /* are the accesses of different processes interleaved? */
> for (i=1; i<nprocs; i++)
> if ((st_offsets[i] < end_offsets[i-1]) &&
> (st_offsets[i-1] <= end_offsets[i])) {
> interleave_count=1;
> break;
> }
> /* This is a rudimentary check for interleaving, but should suffice
> for the moment. */
If I could justify burning a million CPU hours, it would be great to
profile ROMIO on a full rack of Intrepid. I'm sure breaking early
from loops like this helps scalability a little bit when these arrays
are 160k elements long.
I think I will leave the st_offsets[i] <= end_offsets[i] test as is, but
put in a better comment. I will, though, break as soon as we find
something interleaved.
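Roughly, the loop with the early exit could look like the sketch below.
This is only an illustration of the idea, not the committed change: the
function name any_interleaved is hypothetical, and long long again stands
in for ROMIO's offset type.

```c
/* Returns 1 as soon as one pair of neighboring processes is found to
 * interleave, 0 otherwise.
 *
 * end_offsets[i] is the last byte accessed, so a zero-byte request has
 * end_offsets[i] == st_offsets[i] - 1. The second condition therefore
 * skips empty requests even when their start offset falls inside the
 * previous process's range. */
int any_interleaved(const long long *st_offsets,
                    const long long *end_offsets, int nprocs)
{
    int i, interleave_count = 0;

    for (i = 1; i < nprocs; i++) {
        if ((st_offsets[i] < end_offsets[i-1]) &&
            (st_offsets[i] <= end_offsets[i])) {
            interleave_count = 1;
            break;   /* callers only compare the count against 0 */
        }
    }
    return interleave_count;
}
```

Since the count is only ever compared with 0, stopping at the first hit
gives the same answer as counting every interleaved pair.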
Thanks for the report, though. I am extremely happy you are taking
such a close look at ROMIO.
==rob
--
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: interleave.c
Type: text/x-csrc
Size: 881 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich2-dev/attachments/20100901/3a9b3156/attachment-0001.c>