[mpich-discuss] ROMIO: 2 phase IO method and error handling
Pascal Deveze
Pascal.Deveze at bull.net
Thu Sep 2 07:54:49 CDT 2010
I have another idea. Before entering the "2 phase method", each process
could test how much data can actually be read (the count). The 2 phase
method could then continue with this recalculated count.
The size of the file can be obtained with a call to fstat(fd->fd_sys,
&statbuf).
statbuf.st_size then contains the size of the file in bytes.
Now, it is possible (not easy for me, but possible) to calculate how
much data can be read by each process according to its own datatype and
its own offset.
The advantage of this method is that it avoids message passing; its
disadvantage is the call to fstat(). It also avoids having to modify
the 2 phase method itself.
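
Pascal's fstat()-based clamp might look like the sketch below. The
function name clamp_count and its arguments are illustrative
assumptions, not actual ROMIO code; only fd->fd_sys and
statbuf.st_size come from the thread.

```c
/* Sketch of the idea above: before the two-phase read, clamp the
 * requested count against the bytes actually present in the file.
 * clamp_count is a hypothetical helper, not a ROMIO routine. */
#include <sys/stat.h>

/* Return how many bytes are available starting at `offset`,
 * never more than the requested `count`. */
static long clamp_count(int sysfd, long offset, long count)
{
    struct stat statbuf;
    if (fstat(sysfd, &statbuf) != 0)
        return count;          /* on error, fall back to the request */
    long avail = (long)statbuf.st_size - offset;
    if (avail < 0)
        avail = 0;
    return (count < avail) ? count : avail;
}
```

Each process would call this with its own starting offset before the
two-phase exchange, so the aggregators never try to serve bytes past
end of file.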
Pascal
Wei-keng Liao wrote:
> I don't think there is an easy fix for this problem.
>
> To correctly return the read size for each MPI collective I/O call,
> the actual read size of an aggregator must be somehow reported to all
> requesting processes that access this aggregator's file domain.
> Since each requesting process can have noncontiguous file access
> to this aggregator's file domain, the fix must find out how much of the
> short read size overlaps each contiguous access and return the total
> size of the overlaps as the true read size.
>
> Furthermore, if a short read occurred, the contents of the missing part
> of the read buffer should not be changed. The local memory copying must
> check for that.
>
> What makes this problem even more complicated is
> 1. each process can request from multiple aggregators,
> 2. short read can happen at all aggregators, and
> 3. local process's read fileview allows overlapping.
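
The overlap bookkeeping Wei-keng describes could be sketched as below:
given that an aggregator only obtained data up to some offset within
its file domain, sum how much of one process's contiguous accesses
falls in the valid region. access_t, valid_bytes, and valid_end are
illustrative names, not ROMIO structures.

```c
/* A process's noncontiguous access to one aggregator's file domain,
 * as a list of contiguous (offset, length) pieces.  Offsets are
 * relative to the start of the domain.  Hypothetical types/names. */
typedef struct { long offset; long len; } access_t;

/* Sum the bytes of each contiguous piece that fall before
 * `valid_end`, the point where the aggregator's short read stopped. */
static long valid_bytes(const access_t *acc, int n, long valid_end)
{
    long total = 0;
    for (int i = 0; i < n; i++) {
        if (acc[i].offset >= valid_end)
            continue;               /* entirely past the short read */
        long end = acc[i].offset + acc[i].len;
        total += (end <= valid_end) ? acc[i].len
                                    : valid_end - acc[i].offset;
    }
    return total;
}
```

Complications 1-3 above mean this computation would have to be repeated
per aggregator and then combined, with care for overlapping fileviews.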
>
> Wei-keng
>
> On Sep 1, 2010, at 12:40 PM, Rob Latham wrote:
>
>
>> On Mon, Aug 23, 2010 at 04:35:36PM +0200, Pascal Deveze wrote:
>>
>>> I discovered that I can read after the end of file !
>>>
>>> After a look in the romio source code, I see that the "2 phase IO"
>>> method for read:
>>> 1) Does not return the right count value in the status
>>>
>> ...
>>
>>> Has anybody an idea on how to correct this ?
>>>
>> well, i've got an idea but it's not great.
>>
>> We could report how much data each process actually read, but that
>> would return surprising results in this test: rank 0 reads 5 bytes but
>> rank 1 reads none.
>>
>> So we have to communicate the fact that we had a short read. I guess
>> another allreduce in the collective I/O path won't be so bad, but I
>> still need to think some more about how to react to the fact that one
>> aggregator had a short read.
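
Rob's extra-allreduce idea might look like the following. In ROMIO the
reduction itself would be one standard call, something like
MPI_Allreduce(&local_eof, &global_eof, 1, MPI_LONG, MPI_MIN, fd->comm);
the sketch below simulates that MPI_MIN reduction over a plain array so
the logic can be checked without running MPI. All surrounding names are
assumptions, not ROMIO code.

```c
/* Simulated MPI_MIN allreduce: each process contributes the end offset
 * of the data it actually obtained (LONG_MAX when it is not an
 * aggregator or saw no short read); the global minimum tells every
 * process where valid data stops, so each can clamp its status count. */
#include <limits.h>

static long global_eof(const long *local_eof, int nprocs)
{
    long g = LONG_MAX;
    for (int i = 0; i < nprocs; i++)
        if (local_eof[i] < g)
            g = local_eof[i];
    return g;
}
```

A global minimum is conservative: it discards data past the earliest
short read even when later file-domain ranges were read in full, but it
guarantees no process reports bytes it never received.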
>>
>> ==rob
>>
>> --
>> Rob Latham
>> Mathematics and Computer Science Division
>> Argonne National Lab, IL USA
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>
>>
>
>