[mpich-discuss] committed lustre fixes to MPICH2
Martin Pokorny
mpokorny at nrao.edu
Thu Mar 4 09:54:44 CST 2010
When I received Rob's patch, I told him that it would be a while before
I could test it; we've been busily working to get our new system on-line
over the past couple of weeks, and I thought that there would be no time
for testing the patch. As luck would have it, one of the problems I was
encountering led me to the Lustre ADIO driver.
I have lately been running into some problems with my application, which
relies heavily on noncontiguous, collective writes to a small Lustre
filesystem. My first step in diagnosing the problem (as far as the
Lustre ADIO driver is concerned) was to start with a clean
mpich2-1.2.1.p1 and apply the patch that Rob provided, but that had no
effect. Eventually, I got around to setting the romio_lustre_ds_in_coll
hint to "disable", and this appears to have solved the problems.
In the current version of my application, unlike previous versions, the
union of all filetypes used by the processes in the group has holes
relative to the range of the file's valid offsets. My application uses collective
writes with explicit offsets beyond the end of the file when writing new
records. Apparently, the read phase of a read-modify-write cycle (with
data sieving enabled) for a new record leaves the data in the read
buffer uninitialized. Subsequently, the write phase puts that
uninitialized data into the file. I suppose that one could argue that
reading from beyond the end of a file has an undefined result, and thus
the observed behavior is acceptable. However, it is my understanding
that MPI file hints shall not change the semantics of I/O operations,
and given the result when data sieving is disabled, I'd argue that the
current behavior is a bug. Fortunately, the fix appears to be quite
simple; I will be testing this out soon, and will report my results. In
the meantime, my question to you all is this: should the current
behavior be considered a bug or not?
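
To make the failure mode concrete, here is a small stand-alone sketch
of a read-modify-write cycle against a window that lies beyond the end
of the file. It uses plain POSIX pread/pwrite rather than ROMIO, so it
is only an analogy for what the driver appears to do, not a copy of
its code. The function names, the 0xAA fill (a reproducible stand-in
for uninitialized memory), and the zero_fill switch are all
assumptions made for illustration; zero-filling is one possible
remedy, not necessarily the fix I have in mind.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Read-modify-write of the window [win_off, win_off + win_len):
 * read the window, overlay "piece" at piece_off inside it, then
 * write the whole window back. If zero_fill is nonzero, bytes that
 * the read did not return (for example, past end of file) are
 * cleared before the write. Returns 0 on success, -1 on error. */
int sieve_write(int fd, off_t win_off, size_t win_len,
                size_t piece_off, const void *piece, size_t piece_len,
                int zero_fill)
{
    if (piece_off > win_len || piece_len > win_len - piece_off)
        return -1;

    unsigned char *buf = malloc(win_len);
    if (buf == NULL)
        return -1;
    /* Stand-in for uninitialized buffer contents. */
    memset(buf, 0xAA, win_len);

    /* Read phase: past end of file this returns fewer bytes than
     * requested (possibly none) and leaves the rest untouched. */
    ssize_t got = pread(fd, buf, win_len, win_off);
    if (got < 0) {
        free(buf);
        return -1;
    }
    if (zero_fill && (size_t)got < win_len)
        memset(buf + got, 0, win_len - (size_t)got);

    /* Modify phase: only the caller's piece is defined data. */
    memcpy(buf + piece_off, piece, piece_len);

    /* Write phase: holes in the window go to the file as well. */
    ssize_t put = pwrite(fd, buf, win_len, win_off);
    free(buf);
    return put == (ssize_t)win_len ? 0 : -1;
}

/* Write a 2-byte piece into an 8-byte window of an empty temporary
 * file and return the first byte of the hole that precedes the
 * piece, or -1 on error. */
int hole_byte_after_write(int zero_fill)
{
    char path[] = "/tmp/sieve_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);

    unsigned char first = 0;
    int rc = sieve_write(fd, 0, 8, 2, "AB", 2, zero_fill);
    if (rc == 0 && pread(fd, &first, 1, 0) != 1)
        rc = -1;
    close(fd);
    return rc == 0 ? (int)first : -1;
}
```

With zero_fill set to 0, the hole in the new record receives whatever
the buffer held before the read (0xAA in this sketch); with zero_fill
set to 1, the hole is written as zeros, so the file contents no longer
depend on stale buffer contents.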
--
Martin