[mpich-discuss] ROMIO: 2 phase IO method and error handling

Rob Latham robl at mcs.anl.gov
Tue Aug 31 13:06:35 CDT 2010


On Mon, Aug 23, 2010 at 04:35:36PM +0200, Pascal Deveze wrote:
> Here is a little program to reproduce it: it writes 5 bytes to a
> file (0,1,2,3,4) and two MPI processes try to read 10 bytes.
> The reads succeed and the 5 bytes read are correct, but:
> 1) The count value is set to 10 (instead of 5, since only 5 valid
> bytes are read).
> 2) The last 5 bytes of the read buffers are overwritten (in this
> case the 99 values are replaced by 0).

Hi Pascal

I don't see any buffer corruption problem with a recent MPICH2
checkout, but I can confirm the get_count problem.

I do see the buffer problem with OpenMPI-1.4, but we have not synced
up ROMIO versions for about two years now.  We must have fixed things
in the meantime.  Which MPI version are you testing?

I have made some very small modifications to your test example to make
it more like other romio tests.  Can you confirm the test still
behaves like your original test?

- have only rank 0 write the file
- introduce a sync/barrier/sync to ensure MPI_File_write and
  MPI_File_read_all are in separate access epochs. 
- first argument (argv[1]) is the file name.
- more precisely specify where the bad elements in the buffer lie.
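
A minimal sketch of the kind of check the last point refers to,
written without MPI so it can be compiled anywhere.  The helper name
report_bad_elements and the SENTINEL macro are hypothetical and are not
taken from the attached end_of_file.c; the values (file contents
0,1,2,3,4 and a fill value of 99) come from the original report.  A
correct short read should leave the first nvalid bytes equal to the
file contents and everything after them untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SENTINEL 99   /* fill value used in the original report */

/*
 * Hypothetical helper, not part of ROMIO or of the attached test.
 * Checks a read buffer of 'len' bytes after a short read that
 * delivered 'nvalid' bytes:
 *   buf[0 .. nvalid)   must hold the file contents 0, 1, 2, ...
 *   buf[nvalid .. len) must still hold SENTINEL.
 * Prints the index of every element that differs and returns how
 * many such elements were found.
 */
static int report_bad_elements(const unsigned char *buf, size_t len,
                               size_t nvalid, int rank)
{
    int nbad = 0;

    assert(nvalid <= len);
    for (size_t i = 0; i < len; i++) {
        unsigned char expected =
            (i < nvalid) ? (unsigned char)i : (unsigned char)SENTINEL;
        if (buf[i] != expected) {
            fprintf(stderr, "rank %d: buf[%zu] = %d, expected %d\n",
                    rank, i, (int)buf[i], (int)expected);
            nbad++;
        }
    }
    return nbad;
}
```

In the MPI test itself, buf would be the buffer passed to
MPI_File_read_all, len would be 10, and nvalid would be the 5 bytes
actually present in the file; the value returned by MPI_Get_count on
the read status can then be compared against nvalid separately.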

Thanks!
==rob

-- 
Rob Latham
Mathematics and Computer Science Division
Argonne National Lab, IL USA
-------------- next part --------------
A non-text attachment was scrubbed...
Name: end_of_file.c
Type: text/x-csrc
Size: 2081 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20100831/73b0c866/attachment.c>

