[mpich-discuss] ad_pvfs2_read bug in ADIOI_PVFS2_ReadStrided()

Wei-keng Liao wkliao at ece.northwestern.edu
Tue Jun 2 01:11:54 CDT 2009


Hi Rob,

On Jun 1, 2009, at 11:36 PM, Rob Latham wrote:
> On Mon, Jun 01, 2009 at 06:17:24PM -0500, Wei-keng Liao wrote:
>> This situation has been handled at line 209 where frd_size is
>> calculated.
>> n_filetypes will be increased by 1 so frd_size is kept > 0
>>
>> In fact, this is very similar to our previous patch for data sieving
>> that makes sure the file pointer always points to a valid byte in the
>> fileview. But since pvfs2 driver is not updated with that patch, it
>> is still using the old approach and the old one can automatically
>> move the pointer to the right place in the NEXT I/O call.
>
> Thanks for explaining.  I missed that the Strided code would fix up
> the file pointers in the subsequent call.  However, what about the
> WriteContig code?   That doesn't do any pointer fixing-up if using the
> implicit file pointer (only the _at routines use the explicit file
> pointer).

I don't think WriteContig or ReadContig alone will move the file pointer
to an offset whose first byte lies within the visible range of the
fileview. They just add the number of bytes written/read to the file
pointer. Moving the pointer to that position requires additional work,
such as that done in ADIOI_XXX_WriteStrided/ReadStrided. If the filetype
is non-contiguous, the independent MPI_File_read/write will still have to
call ADIOI_XXX_WriteStrided/ReadStrided, and then WriteContig/ReadContig.

In my earlier data sieving patch for the code in the adio/common directory,
the file pointer always moves to the position described above. But since
ad_pvfs2 implements its own ReadStrided/WriteStrided, it still uses
the old approach. That does not cause any error, though: ROMIO has been
using the old approach without any problem.

The difference between the two file pointer update approaches does not
affect the MPI semantics, either. If a user calls MPI_File_get_position(),
ROMIO will call ADIOI_Get_position(), which updates the pointer to the
offset described above.
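
To make the two pointer-update approaches concrete, here is a minimal
sketch in C. It is not ROMIO code: view_t, to_visible(), advance_old()
and advance_new() are hypothetical names made up for illustration, and
the sketch assumes a simple vector-like filetype with one visible block
of blocklen bytes per extent, starting at displacement disp. The "old"
function leaves the pointer right after the last byte accessed (possibly
at the start of a hole); the "new" function leaves it on the next visible
byte. Both correspond to the same position within the fileview.

```c
#include <assert.h>

/* Hypothetical model of a fileview: each tile of `extent` bytes starts
 * with `blocklen` visible bytes, followed by a hole. Not a ROMIO type. */
typedef struct {
    long long disp;      /* absolute offset where the view starts */
    long long blocklen;  /* visible bytes per tile */
    long long extent;    /* total bytes per tile (visible + hole) */
} view_t;

/* Number of visible bytes that lie before absolute offset fp. */
static long long to_visible(const view_t *v, long long fp)
{
    assert(v->blocklen > 0 && v->extent >= v->blocklen && fp >= v->disp);
    long long rel = fp - v->disp;
    long long tile = rel / v->extent;
    long long within = rel % v->extent;
    if (within > v->blocklen)
        within = v->blocklen;   /* fp is inside the hole */
    return tile * v->blocklen + within;
}

/* Old approach: pointer ends right after the last byte accessed,
 * which may be the first byte of a hole. */
static long long advance_old(const view_t *v, long long fp, long long nbytes)
{
    long long k = to_visible(v, fp) + nbytes;  /* visible bytes consumed */
    if (k == 0)
        return v->disp;
    long long last = k - 1;                    /* index of last byte accessed */
    return v->disp + (last / v->blocklen) * v->extent
                   + (last % v->blocklen) + 1;
}

/* New approach: pointer ends on the next byte that is visible in the view. */
static long long advance_new(const view_t *v, long long fp, long long nbytes)
{
    long long k = to_visible(v, fp) + nbytes;
    return v->disp + (k / v->blocklen) * v->extent + (k % v->blocklen);
}
```

For example, with disp = 0, blocklen = 4 and extent = 10, reading 4 bytes
from offset 0 leaves the pointer at 4 under the old approach and at 10
under the new one. to_visible() returns 4 for both, and a following 2-byte
access ends at offset 12 either way, which is why the I/O results match
even though the raw pointer values differ.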



>
>> I have revised my test code vector_view_read.c to include your test
>> pattern. It ran just fine with my patch. The only difference is the
>> pointer value, fd->fp_ind. But the I/O results are correct. So, it is
>> your call to pick one of the two.
>
> I know, it's kind of a stretch to think of an application that would
> do a noncontiguous read/write of a partial datatype, followed by a
> contig, but I could maybe see one of the high-level I/O libraries
> doing that.  If that's the case, then I think I'm more comfortable
> with fd->fp_ind being at the right place when the function exits:
> there might be more than one place in ROMIO that assumes that
> behavior.

One way to ensure this is to run all the test codes available in the MPICH
and ROMIO test directories. I checked this before, when I prepared the data
sieving patch. ROMIO seems to cover all possibilities, and I am
pretty confident in ROMIO's original approach :)

Wei-keng



>> You can include the attached codes in ROMIO test repository.
>
> Great! thanks for the tests and the good discussion.  Definitely helps
> me work through this code.
>
> ==rob
>
> -- 
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
>


