[mpich-discuss] committed lustre fixes to MPICH2

Rajeev Thakur thakur at mcs.anl.gov
Thu Mar 4 15:43:24 CST 2010


In MPI-IO, if you create a new empty file and write at offset 1000, what
exists in the hole between offsets 0 and 1000 is undefined. It is not
necessarily 0 as in POSIX. 
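
For contrast, the POSIX side of this can be checked with a small C
sketch. The function name and the scratch file path are hypothetical,
chosen only for illustration. It writes one byte at offset 1000 of a
new file and counts the nonzero bytes read back from the hole. POSIX
guarantees that count is zero; MPI-IO gives no such guarantee, so an
application that needs zeros there should probably write them itself.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: create a new file at `path`, write one byte at
 * offset 1000, then read back the hole [0, 1000).
 * Returns the number of nonzero bytes found in the hole, or -1 on error.
 * The scratch file is removed before returning. */
int count_nonzero_in_hole(const char *path)
{
    unsigned char hole[1000];
    int nonzero = 0;

    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Write past the end of the empty file, leaving a 1000-byte hole. */
    if (pwrite(fd, "x", 1, 1000) != 1 ||
        pread(fd, hole, sizeof hole, 0) != (ssize_t)sizeof hole) {
        close(fd);
        unlink(path);
        return -1;
    }

    for (size_t i = 0; i < sizeof hole; i++)
        if (hole[i] != 0)
            nonzero++;

    close(fd);
    unlink(path);
    return nonzero;
}
```

Calling count_nonzero_in_hole("hole_demo.tmp") on a POSIX filesystem
should return 0. The same experiment through MPI_File_write_at may
return anything for those bytes.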

Rajeev


> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov 
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Martin Pokorny
> Sent: Thursday, March 04, 2010 9:55 AM
> To: Pascal Deveze
> Cc: Martin Audet; Tom.Wang; mpich-discuss at mcs.anl.gov; LiuYing
> Subject: Re: [mpich-discuss] committed lustre fixes to MPICH2
> 
> When I received Rob's patch, I told him that it would be a while before
> I could test it; we've been busily working to get our new system on-line
> over the past couple of weeks, and I thought that there would be no time
> for testing the patch. As luck would have it, one of the problems I was
> encountering led me to the Lustre ADIO driver.
> 
> I have lately been running into some problems with my application, which
> relies strongly on noncontiguous, collective writes to a small Lustre
> filesystem. My first response in diagnosing this problem (as far as the
> Lustre ADIO driver is concerned) was to start with a clean
> mpich2-1.2.1.p1 and apply the patch that Rob provided, but that had no
> effect. Eventually, I got around to setting the romio_lustre_ds_in_coll
> hint to "disable", and this appears to have solved the problems.
> 
> In the current version of my application, unlike previous versions, the
> union of all filetypes used by the group processes has holes relative to
> the range of the file's valid offsets. My application uses collective
> writes with explicit offsets beyond the end of the file when writing new
> records. Apparently, the read phase of a read-modify-write cycle (with
> data sieving enabled) for a new record leaves the data in the read
> buffer uninitialized. Subsequently, the write phase puts that
> uninitialized data into the file. I suppose that one could argue that
> reading from beyond the end of a file has an undefined result, and thus
> the observed behavior is acceptable. However, it is my understanding
> that MPI file hints shall not change the semantics of I/O operations,
> and given the result when data sieving is disabled, I'd argue that the
> current behavior is a bug. Fortunately, the fix appears to be quite
> simple; I will be testing this out soon, and will report my results. In
> the meanwhile, my question to you all is whether the current behavior
> should be considered a bug or not?
> 
> -- 
> Martin
> 