<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">

  <title></title>

</head>

<body bgcolor="#ffffff" text="#000000">

Hi Rob,<br>

<br>

Thanks for your clear explanation. I now understand the second part of
the if statement,<br>
(st_offsets[i] &lt;= end_offsets[i]): it just tests that the length is
not zero.<br>

<br>

The test is not strictly "&lt;", but "&lt;=" because equality means

length=1.<br>
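<br>
As a minimal sketch of this convention (not ROMIO code: the helper names
end_offset_of and is_empty_request are hypothetical, and I assume the end
offset is computed as start + length - 1, as the comment in the source
suggests):<br>
<pre wrap="">
```c
#include <assert.h>

/* Hypothetical helper: inclusive end offset of a request, following the
   convention "end_offset points to the last byte that will be accessed". */
long long end_offset_of(long long start, long long len)
{
    assert(len >= 0);
    return start + len - 1;
}

/* Hypothetical helper: a request is empty exactly when end < start. */
int is_empty_request(long long start, long long end)
{
    return end < start;
}
```
</pre>
With this convention a one-byte request at offset 1 has start == end, and a
zero-byte request at offset 3 has end offset 2, so (st_offsets[i] &lt;=
end_offsets[i]) is false only for empty requests.<br>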

<br>

I still have problems with the following cases:<br>
1) After an element with len=0, the interleave is not detected.<br>
2) With elements of len=1, the interleave is not detected.<br>

<br>

I do not know whether this could really be a problem in practice. Below I
explain<br>
how to reproduce it, and I also give a possible solution.<br>

<br>

I modified your program to test these two cases and added a trace in
common/ad_write_coll.c:<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (!myrank)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (i=0; i&lt;nprocs; i++) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (!myrank) printf("st_offsets[%d]=%lld end_offsets[%d]=%lld\n",
i, (long long) st_offsets[i], i, (long long) end_offsets[i]);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (i=1; i&lt;nprocs; i++)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if ((st_offsets[i] &lt; end_offsets[i-1]) &amp;&amp;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (st_offsets[i] &lt;= end_offsets[i]))<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; interleave_count++;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* This is a rudimentary check for interleaving, but should

suffice<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for the moment. */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (!myrank) printf("%d: interleave_count=%d\n", myrank, interleave_count);<br>

<br>

For case 1:<br>

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; st_offsets[]&nbsp;&nbsp; = {0, 1,2,3}<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; end_offsets[] = {1023, 0, 5, 2}<br>

<br>

For case 2:<br>

&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; st_offsets[]&nbsp;&nbsp; = {1, 1,1,1}<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; end_offsets[] = {1, 1, 1, 1}<br>
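<br>
To show the current check in isolation, here is a small standalone sketch
that applies the existing test to the two pairs of arrays above. The
function name count_interleave_current and the use of long long in place of
ADIO_Offset are my assumptions for illustration only:<br>
<pre wrap="">
```c
#include <assert.h>

/* Offsets from case 1 and case 2 (four processes each). */
const long long case1_st[]  = {0, 1, 2, 3};
const long long case1_end[] = {1023, 0, 5, 2};
const long long case2_st[]  = {1, 1, 1, 1};
const long long case2_end[] = {1, 1, 1, 1};

/* Hypothetical standalone copy of the existing test; long long stands in
   for ADIO_Offset. */
int count_interleave_current(const long long *st, const long long *end,
                             int nprocs)
{
    int i, count = 0;

    assert(nprocs >= 1);
    for (i = 1; i < nprocs; i++)
        if ((st[i] < end[i - 1]) && (st[i] <= end[i]))
            count++;
    return count;
}
```
</pre>
Called with nprocs=4, this returns 0 for both pairs of arrays. In case 1 the
start of P2 is only compared with the end of the empty P1, and in case 2 the
strict "&lt;" rejects equal offsets.<br>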

<br>

================ The test to see the problem ==========<br>

#include "mpi.h"<br>

#include &lt;stdio.h&gt;<br>

<br>

#define LEN 1024<br>

<br>

int main(int argc, char **argv)<br>

{<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_File fh;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_Status status;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_Offset offset;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; int length, nprocs, rank, i;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; char buffer[LEN];<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for(i=0; i&lt;LEN; i++) buffer[i] = i;<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_Init(&amp;argc, &amp;argv);<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_Comm_size(MPI_COMM_WORLD, &amp;nprocs);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_Comm_rank(MPI_COMM_WORLD, &amp;rank);<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_File_open(MPI_COMM_WORLD, argv[1],

MPI_MODE_CREATE|MPI_MODE_RDWR, <br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_INFO_NULL, &amp;fh);<br>

<br>

// Interleaved data is not detected after an element with null length<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /*<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; e.g. P0 ( off_0 = 0,&nbsp;&nbsp;&nbsp; len_0 = LEN )<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; P1 ( off_1 = 1,&nbsp;&nbsp;&nbsp; len_1 = 0 ) ===&gt; null length<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; P2 ( off_2 = 2,&nbsp;&nbsp;&nbsp; len_2 = 4 ) ===&gt; interleaved

not detected<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; P3 ( off_3 = 3,&nbsp;&nbsp;&nbsp; len_3 = 0 )<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .......&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ........<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if ((rank % 2) ==0) length=4;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; else length=0;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (rank == 0) length=LEN;<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; offset = rank;<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_File_write_at_all(fh, offset, buffer, length, MPI_BYTE,

&amp;status);<br>

<br>

// Interleaved data is not detected if only one byte is common (len = 1)<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /*<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; e.g. P0 ( off_0 = 1,&nbsp;&nbsp;&nbsp; len_0 = 1 )<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; P1 ( off_1 = 1,&nbsp;&nbsp;&nbsp; len_1 = 1 ) ===&gt; interleave not

detected<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; P2 ( off_2 = 1,&nbsp;&nbsp;&nbsp; len_2 = 1 ) ===&gt; interleaved

not detected<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; P3 ( off_3 = 1,&nbsp;&nbsp;&nbsp; len_3 = 1 )<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; .......&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ........<br>

&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; length=1;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; offset=1;<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_File_seek(fh, 0, MPI_SEEK_SET);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_File_write_at_all(fh, offset, buffer, length, MPI_BYTE,

&amp;status);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_File_close(&amp;fh);<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; MPI_Finalize();<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; return 0;<br>

}<br>

================ The results =====================<br>

st_offsets[0]=0 end_offsets[0]=1023<br>

st_offsets[1]=1 end_offsets[1]=0<br>

st_offsets[2]=2 end_offsets[2]=5<br>

st_offsets[3]=3 end_offsets[3]=2<br>

0: interleave_count=0<br>

st_offsets[0]=1 end_offsets[0]=1<br>

st_offsets[1]=1 end_offsets[1]=1<br>

st_offsets[2]=1 end_offsets[2]=1<br>

st_offsets[3]=1 end_offsets[3]=1<br>

0: interleave_count=0<br>

<br>

In both cases the interleave is not detected.<br>
To detect case 1, the end offset of the last element with non-zero
length should be<br>
stored and compared with the start offset of element i.<br>
To detect case 2, the first comparison of the if statement should be
"&lt;=", not "&lt;".<br>

<br>

This could be done in the following loop:<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; ADIO_Offset last_end=end_offsets[0];<br>

<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for (i=1; i&lt;nprocs; i++) {<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (st_offsets[i] &lt;= last_end) { // Possible interleave

for offset i<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (st_offsets[i] &lt;= end_offsets[i]) // length is

not null, so there is an interleave<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; interleave_count++;<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; if (st_offsets[i] &lt;= end_offsets[i]) // length is not

null, so change last_end<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; last_end=end_offsets[i];<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; /* This is a rudimentary check for interleaving, but should

suffice<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; for the moment. */<br>

&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; }<br>
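<br>
The same loop can be tried outside ROMIO. The following is only a sketch
under my own assumptions: the function name count_interleave_proposed is
made up, and long long stands in for ADIO_Offset:<br>
<pre wrap="">
```c
#include <assert.h>

/* Offsets from case 1 and case 2, plus a layout with no overlap. */
const long long case1_st[]     = {0, 1, 2, 3};
const long long case1_end[]    = {1023, 0, 5, 2};
const long long case2_st[]     = {1, 1, 1, 1};
const long long case2_end[]    = {1, 1, 1, 1};
const long long disjoint_st[]  = {0, 4, 8, 12};
const long long disjoint_end[] = {3, 7, 11, 15};

/* Hypothetical standalone version of the proposed loop: remember the end
   offset of the last non-empty request and compare against it with "<=". */
int count_interleave_proposed(const long long *st, const long long *end,
                              int nprocs)
{
    int i, count = 0;
    long long last_end;

    assert(nprocs >= 1);
    last_end = end[0];
    for (i = 1; i < nprocs; i++) {
        if (st[i] <= end[i]) {        /* length is not zero */
            if (st[i] <= last_end)    /* starts inside the previous range */
                count++;
            last_end = end[i];
        }
    }
    return count;
}
```
</pre>
With nprocs=4 this returns 1 for the case 1 arrays (P2 is detected), 3 for
the case 2 arrays, and 0 for the non-overlapping layout.<br>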

<br>

Pascal<br>

<br>

Rob Latham wrote:

<blockquote cite="mid:20100901162531.GI23171@mcs.anl.gov" type="cite">

  <pre wrap="">On Wed, Sep 01, 2010 at 03:37:30PM +0200, Pascal Deveze wrote:

  </pre>

  <blockquote type="cite">

    <pre wrap="">There is one test that I do not understand. This test is used

in the collective read/write to detect if the data are interleaved:

      /* are the accesses of different processes interleaved? */

       for (i=1; i&lt;nprocs; i++)

           if ((st_offsets[i] &lt; end_offsets[i-1]) &amp;&amp;

               (st_offsets[i] &lt;= end_offsets[i]))

               interleave_count++;

       /* This is a rudimentary check for interleaving, but should suffice

          for the moment. */

   }

The second member of the if statement (st_offsets[i] &lt;=

end_offsets[i]) is always verified.

I think this should be (st_offsets[i-1] &lt;= end_offsets[i]).

    </pre>

  </blockquote>

  <pre wrap=""><!---->

That addition happened 6 years ago, but I can't find the original bug

report (it's in the old req system, if someone can find "MPICH2 req

#1174" that might tell us more).

        for (i=1; i&lt;nprocs; i++)

-           if (st_offsets[i] &lt; end_offsets[i-1]) interleave_count++;

+           if ((st_offsets[i] &lt; end_offsets[i-1]) &amp;&amp; 

+                (st_offsets[i] &lt;= end_offsets[i]))

+                interleave_count++;

        /* This is a rudimentary check for interleaving, but should suffice

           for the moment. */

ah, here we go. Back in 2004 Jianwei Li found a bug when some

processes had zero elements.  

    "When counting the "interleave_count", segments with length == 0

    should not be counted in even if their starting offsets fall

    within previous segment range."

I'm not sure why the check is for "&lt;=" instead of strictly "&lt;",

though.  Wish I had a test case attached to this old bug report.  

Ok, now I do.  Attached, and I'll add this to the repository. 

  </pre>

  <blockquote type="cite">

    <pre wrap="">Do I miss something ?

    </pre>

  </blockquote>

  <pre wrap=""><!---->

Yes, but it's not hard to miss this subtle thing: the comment a few

lines earlier sheds some light on this matter:

       /* Note: end_offset points to the last byte-offset that will be accessed.

           e.g., if start_offset=0 and 100 bytes to be read, end_offset=99*/

So, in the test case I attached, if you run it with four procs your st_offsets array and end_offsets array look like this:

st_offsets[] = {0, 1,2,3}

end_offsets[] = {3, 0, 1, 2}

See, if i do a zero-byte write at offset 3, my start is 3 and my end

is actually 2.  So, st_offsets[i] is not always less than or equal to

end_offsets[i]. specifically, it won't be if the region was a request

for zero bytes.

  </pre>

  <blockquote type="cite">

    <pre wrap="">And as the interleave_count is always tested with 0, it should be

possible to break the loop

after the incrementation of interleave_count.

    </pre>

  </blockquote>

  <pre wrap=""><!---->

I suppose we could do something clever like "optimize harder" if the

interleave count is higher... well, we don't do that :&gt;

  </pre>

  <blockquote type="cite">

    <pre wrap="">In my point of view, the test could be something like:

      /* are the accesses of different processes interleaved? */

       for (i=1; i&lt;nprocs; i++)

           if ((st_offsets[i] &lt; end_offsets[i-1]) &amp;&amp;

               (st_offsets[i-1] &lt;= end_offsets[i])) {

                         interleave_count=1;

                         break;

           }

       /* This is a rudimentary check for interleaving, but should suffice

          for the moment. */

    </pre>

  </blockquote>

  <pre wrap=""><!---->

If I could justify burning a million cpu hours it would be great to

profile ROMIO on a full rack of Intrepid.  I'm sure breaking early

from loops like this helps scalability a little bit when these arrays

are 160k elements long.

I think I will leave the st_offsets[i] &lt;= end_offsets[i] as is, but

put in a better comment.  I will, though, break as soon as we find

something interleaved.

Thanks for the report, though.  I am extremely happy you are taking

such a close look at ROMIO.  

==rob

  </pre>

</blockquote>

<br>

</body>

</html>