[mpich-discuss] ROMIO: 2 phase IO method and error handling

Wei-keng Liao wkliao at ece.northwestern.edu
Wed Sep 1 13:36:41 CDT 2010


I don't think there is an easy fix for this problem.

To correctly return the read size for each MPI collective I/O call,
the actual read size of an aggregator must somehow be reported to all
requesting processes that access this aggregator's file domain.
Since each requesting process can have noncontiguous file accesses
within this aggregator's file domain, the fix must determine how much
of the short read overlaps each contiguous access and return the total
size of those overlaps as the true read size.
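
For one aggregator, the per-process overlap computation could look
roughly like the sketch below. This is only an illustration: the
flattened offset/length lists and all the names are mine, not ROMIO's
actual data structures.

    #include <mpi.h>

    /* Sum the bytes of each contiguous piece of one process's request
     * that fall inside the range the aggregator actually read.
     * (off[i], len[i]) describe the pieces; [read_start, read_end) is
     * the region the aggregator's read really covered. */
    static MPI_Offset true_read_size(const MPI_Offset *off,
                                     const MPI_Offset *len, int npieces,
                                     MPI_Offset read_start,
                                     MPI_Offset read_end)
    {
        MPI_Offset total = 0;
        for (int i = 0; i < npieces; i++) {
            MPI_Offset lo = off[i] > read_start ? off[i] : read_start;
            MPI_Offset hi = off[i] + len[i] < read_end
                          ? off[i] + len[i] : read_end;
            if (hi > lo)
                total += hi - lo;   /* part of this piece was satisfied */
        }
        return total;
    }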

Furthermore, if a short read occurred, the contents of the missing part
of the read buffer should not be changed. The local memory copying must
check for that.
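
The unpacking step could clamp each copy at the number of bytes
actually read, something like this (again just a sketch, with made-up
names):

    #include <stddef.h>
    #include <string.h>

    /* Copy one contiguous piece from the aggregator's exchange buffer
     * into the user buffer, but never past the valid_bytes that were
     * actually read, so the missing tail of the user buffer keeps its
     * old contents.  Returns the number of bytes copied. */
    static size_t copy_clamped(char *user_buf, const char *agg_buf,
                               size_t piece_off, size_t piece_len,
                               size_t valid_bytes)
    {
        size_t avail = valid_bytes > piece_off ? valid_bytes - piece_off : 0;
        size_t n = piece_len < avail ? piece_len : avail;
        if (n > 0)
            memcpy(user_buf, agg_buf + piece_off, n);
        return n;   /* bytes n..piece_len-1 are left untouched */
    }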

What makes this problem even more complicated is that
1. each process can request data from multiple aggregators,
2. short reads can happen at any of the aggregators, and
3. a local process's read fileview allows overlapping regions.
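
Rob's extra-allreduce idea below could look roughly like this, assuming
each process has already totaled the bytes it actually received from
all of its aggregators (a minimal sketch, not a real patch):

    #include <mpi.h>

    /* After the exchange phase, tell every rank whether any rank in
     * the communicator saw a short read. */
    static int short_read_somewhere(MPI_Offset requested,
                                    MPI_Offset received, MPI_Comm comm)
    {
        int mine = (received < requested);  /* 1 if I was short */
        int any = 0;
        MPI_Allreduce(&mine, &any, 1, MPI_INT, MPI_LOR, comm);
        return any;   /* nonzero everywhere if anyone was short */
    }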

Wei-keng

On Sep 1, 2010, at 12:40 PM, Rob Latham wrote:

> On Mon, Aug 23, 2010 at 04:35:36PM +0200, Pascal Deveze wrote:
>> I discovered that I can read past the end of the file!
>> 
>> After a look at the ROMIO source code, I see that the "2 phase I/O"
>> read method:
>> 1) does not return the right count value in the status
> ...
>> Does anybody have an idea how to correct this?
> 
> Well, I've got an idea, but it's not great.
> 
> We could report how much data each process actually read, but that
> would return surprising results in this test: rank 0 reads 5 bytes but
> rank 1 reads none. 
> 
> So we have to communicate the fact that we had a short read.  I guess
> another allreduce in the collective I/O path won't be so bad, but I
> still need to think some more about how to react to the fact that one
> aggregator had a short read.
> 
> ==rob
> 
> -- 
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA