[mpich-discuss] MPI IO on lustre

burlen burlen.loring at gmail.com
Thu Mar 4 16:09:15 CST 2010


Hi Rob,

Thanks for the updates. We'll be looking into Cray MPT. We're looking 
forward to the new MPICH2 as well.

Burlen

Rob Latham wrote:
> Hi
>
> You've touched upon an issue with a fairly long history.  The short
> answer is that the MPI-IO on Lustre situation is today in much
> better shape than it has ever been.
>
> For a long time there was no optimized MPI-IO driver for Lustre, so
> MPI-IO instead used the general-purpose "unix file system" driver.
> This works well enough, except when doing collective I/O: Lustre's
> locking scheme requires a fairly sophisticated algorithm.
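>
> To make the distinction concrete, a collective write that exercises
> that algorithm looks roughly like the sketch below (the file name,
> block size, and datatype are made up for illustration; each rank
> writes one contiguous block of a shared file):
>
> #include <mpi.h>
> #include <stdlib.h>
>
> int main(int argc, char **argv)
> {
>     int rank, nranks;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &nranks);
>
>     /* each rank contributes one 1M-double block */
>     const int count = 1 << 20;
>     double *buf = malloc(count * sizeof(double));
>     for (int i = 0; i < count; i++)
>         buf[i] = rank;
>
>     MPI_File fh;
>     MPI_File_open(MPI_COMM_WORLD, "output.dat",
>                   MPI_MODE_CREATE | MPI_MODE_WRONLY,
>                   MPI_INFO_NULL, &fh);
>
>     /* the _all (collective) variant is what lets the MPI-IO layer
>      * aggregate and align requests before they reach the file
>      * system; the independent MPI_File_write_at bypasses that */
>     MPI_Offset offset = (MPI_Offset)rank * count * sizeof(double);
>     MPI_File_write_at_all(fh, offset, buf, count, MPI_DOUBLE,
>                           MPI_STATUS_IGNORE);
>
>     MPI_File_close(&fh);
>     free(buf);
>     MPI_Finalize();
>     return 0;
> }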
>
> We now have two good options for MPI-IO on Lustre:  Cray's MPT-3.2 or
> newer has an MPI-IO library with the sophisticated "write to lustre"
> collective I/O algorithm.  
>
> MPICH2 also now has an optimized Lustre driver, though it has taken
> the community a while to work out the bugs.  I think we might be at
> that point now, but I am waiting to hear from more testers.
>
> I fear the authors of the studies you have read have reached the exact
> wrong conclusion.  It is not MPI-IO that has the defect.  Rather,
> Lustre's design makes it difficult to achieve high performance
> parallel I/O.  Fortunately, the task is merely difficult, not
> impossible, and the MPI-IO community has stepped up to the challenge.
>
> The next MPICH2 release will contain the most recent Lustre driver. I
> would like very much to hear your experiences with your simulation and
> the improved driver.   
>
> Thanks
> ==rob
>
> On Wed, Mar 03, 2010 at 12:28:51PM -0800, burlen wrote:
>   
>> Our simulation currently runs on the order of 1E4 processes and is
>> fast approaching 1E5 processes. I/O is a substantial bottleneck.
>>
>> The book "Using MPI-2: Advanced Features of the Message-Passing
>> Interface" makes the case that collective I/O is one of the best
>> options for an HPC application. In contrast, I have read more recent
>> studies showing that MPI-IO performs poorly on the Lustre file
>> system, which is deployed ubiquitously on the systems we use. Some
>> studies even advocate abandoning MPI-IO in favor of single-file
>> direct access via the POSIX API, which is to say MPI-IO delivers no
>> optimization at all.
>>
>> I am very curious about the current state of the Lustre ADIO driver
>> in MPICH2 and its future direction. Obviously, fine tuning of both
>> Lustre parameters and MPI-IO hints for the specific situation is
>> critical. If that tuning is done reasonably well, can we expect high
>> performance from MPI collective I/O on the Lustre file system, now
>> or in the near term?
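>>
>> For example, I assume passing hints at open time would look
>> something like this excerpt (hint names are from the ROMIO
>> documentation; the stripe count, stripe size, and aggregator count
>> are just placeholder values, not a recommendation):
>>
>> MPI_Info info;
>> MPI_Info_create(&info);
>> /* Lustre stripe layout, honored when the file is created */
>> MPI_Info_set(info, "striping_factor", "64");     /* OST count */
>> MPI_Info_set(info, "striping_unit", "4194304");  /* 4 MB stripes */
>> /* collective buffering controls */
>> MPI_Info_set(info, "romio_cb_write", "enable");
>> MPI_Info_set(info, "cb_nodes", "64");            /* aggregators */
>>
>> MPI_File fh;
>> MPI_File_open(MPI_COMM_WORLD, "output.dat",
>>               MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
>> MPI_Info_free(&info);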
>>
>> Thanks
>> Burlen
>>
>>
>> _______________________________________________
>> mpich-discuss mailing list
>> mpich-discuss at mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss
>>     
>
>   



More information about the mpich-discuss mailing list