[mpich-discuss] committed lustre fixes to MPICH2
Rajeev Thakur
thakur at mcs.anl.gov
Thu Mar 4 15:43:24 CST 2010
In MPI-IO, if you create a new empty file and write at offset 1000, what
exists in the hole between offsets 0 and 1000 is undefined. It is not
necessarily 0 as in POSIX.
Rajeev
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Martin Pokorny
> Sent: Thursday, March 04, 2010 9:55 AM
> To: Pascal Deveze
> Cc: Martin Audet; Tom.Wang; mpich-discuss at mcs.anl.gov; LiuYing
> Subject: Re: [mpich-discuss] committed lustre fixes to MPICH2
>
> When I received Rob's patch, I told him that it would be a while before
> I could test it; we've been busily working to get our new system on-line
> over the past couple of weeks, and I thought that there would be no time
> for testing the patch. As luck would have it, one of the problems I was
> encountering led me to the Lustre ADIO driver.
>
> I have lately been running into some problems with my application, which
> relies strongly on noncontiguous, collective writes to a small Lustre
> filesystem. My first response in diagnosing this problem (as far as the
> Lustre ADIO driver is concerned) was to start with a clean
> mpich2-1.2.1.p1 and apply the patch that Rob provided, but that had no
> effect. Eventually, I got around to setting the romio_lustre_ds_in_coll
> hint to "disable", and this appears to have solved the problems.
>
> In the current version of my application, unlike previous versions, the
> union of all filetypes used by the group's processes has holes relative to
> the range of the file's valid offsets. My application uses collective
> writes with explicit offsets beyond the end of the file when writing new
> records. Apparently, the read phase of a read-modify-write cycle (with
> data sieving enabled) for a new record leaves the data in the read
> buffer uninitialized. Subsequently, the write phase puts that
> uninitialized data into the file. I suppose that one could argue that
> reading from beyond the end of a file has an undefined result, and thus
> the observed behavior is acceptable. However, it is my understanding
> that MPI file hints shall not change the semantics of I/O operations,
> and given the result when data sieving is disabled, I'd argue that the
> current behavior is a bug. Fortunately, the fix appears to be quite
> simple; I will be testing this out soon, and will report my results. In
> the meantime, my question to you all is whether the current behavior
> should be considered a bug or not?
>
> --
> Martin