[mpich2-dev] improved scalability with O_NOATIME flag

Darius Buntinas buntinas at mcs.anl.gov
Tue Dec 14 12:33:21 CST 2010


I think that would be the right way to do it.  Unless __USE_GNU is required to be set for every source file (I don't think this would be the case), you can just define it in ad_open.c wrapped in appropriate ifdefs.
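
A minimal sketch of that approach, for illustration only and not the actual ROMIO change. It assumes configure defines HAVE_O_NOATIME, as proposed in the quoted message. It uses _GNU_SOURCE rather than __USE_GNU, because __USE_GNU is an internal glibc macro that <features.h> sets when _GNU_SOURCE is defined. ADIO_NOATIME_FLAG is a hypothetical name.

```c
/* Sketch for the top of a single source file such as adio/common/ad_open.c.
 * HAVE_O_NOATIME is assumed to come from a configure-time check. */
#ifdef HAVE_O_NOATIME
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1   /* must come before the first system header */
#endif
#endif

#include <fcntl.h>

/* Hypothetical wrapper macro: expands to 0 (no effect on the open flags)
 * wherever O_NOATIME is not visible. */
#ifdef O_NOATIME
#define ADIO_NOATIME_FLAG O_NOATIME
#else
#define ADIO_NOATIME_FLAG 0
#endif
```

With this in place, the rest of the file can OR ADIO_NOATIME_FLAG into its open flags without further ifdefs, and no other source file is affected.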

-d

On Dec 14, 2010, at 12:01 PM, Rob Latham wrote:

> Came across an interesting optimization that should be easy to add to
> ROMIO, but I want to do so in a portable way.
> 
> the O_NOATIME flag does what it says on the tin: don't update atime
> when you open this file.  Now O_RDONLY really does only read -- no
> metadata update needed.
> 
> In a collective open we can set this flag on all processors save one,
> and presumably avoid a metadata storm (think Lustre and its single
> metadata server).
> 
> So, what's the "MPICH way" to make use of a gnu-libc flag?  On my
> laptop it's protected by an "#ifdef __USE_GNU".  Is it ok
> to write a configure-time check for O_NOATIME that defines __USE_GNU?
> But then I have to set __USE_GNU inside adio/common/ad_open.c if
> we HAVE_O_NOATIME.
> 
> ==rob
> 
> ----- Forwarded message from Mark Howison <mark.howison at gmail.com> -----
> 
> Sender: hdf-forum-bounces at hdfgroup.org
> From: Mark Howison <mark.howison at gmail.com>
> Reply-To: HDF Users Discussion List <hdf-forum at hdfgroup.org>
> Subject: Re: [Hdf-forum] round-robin (not parallel) access to single hdf5 file
> Date: Tue, 14 Dec 2010 12:03:05 -0500
> Message-ID: <AANLkTin92tPWF1r3XhvaCxNB1sMpkQxtt0qx1FLaE==u at mail.gmail.com>
> To: HDF Users Discussion List <hdf-forum at hdfgroup.org>
> X-Spam-Status: No, score=-2.1
> 
> On Tue, Dec 14, 2010 at 8:08 AM, Quincey Koziol <koziol at hdfgroup.org> wrote:
>>> If not, there is another optimization that I think was reported in a
>>> paper on PLFS or Adios about passing a flag to the fopen call on each
>>> MPI task that tells it not to update the creation/modification time
>>> except on the root task. This can greatly reduce the load on the
>>> metadata server for a parallel file system.
>> 
>>        Interesting, can you send me a reference for this?
> 
> I'm pretty sure the trick was to use O_NOATIME in the open() call,
> except on task 0. (You can find this on the NICS webpage on I/O best
> practices.)
> 
> I know I came across it in the lit review for our HDF5/Lustre paper,
> but I can't put my fingers on the paper. It might have... I vaguely recall
> a scaling graph showing how this outperformed a regular open() call,
> and I think the text was 1-column wide... I'll keep looking through my
> woefully unorganized pile of PDFs.
> 
> Mark
> 
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> Hdf-forum at hdfgroup.org
> http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
> 
> ----- End forwarded message -----
> 
> -- 
> Rob Latham
> Mathematics and Computer Science Division
> Argonne National Lab, IL USA
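
The idea in the quoted message, setting O_NOATIME on every process except one, might look roughly like the following. This is a sketch under stated assumptions, not ROMIO code: flags_for_rank and open_for_rank are hypothetical helpers, the rank is assumed to come from MPI_Comm_rank, and the file is assumed to exist already (no O_CREAT, so no mode argument). The retry is there because open() fails with EPERM when O_NOATIME is requested by an unprivileged caller that does not own the file.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE 1           /* exposes O_NOATIME on glibc */
#endif
#include <errno.h>
#include <fcntl.h>

#ifndef O_NOATIME
#define O_NOATIME 0             /* not available: degrade to a plain open */
#endif

/* Hypothetical helper: rank 0 keeps normal atime updates, every other
 * rank adds O_NOATIME to the caller's flags. */
static int flags_for_rank(int rank, int flags)
{
    if (rank != 0)
        flags |= O_NOATIME;
    return flags;
}

/* Hypothetical helper: open an existing file, retrying without
 * O_NOATIME if the kernel rejects the flag with EPERM. */
static int open_for_rank(const char *path, int rank, int flags)
{
    int fd = open(path, flags_for_rank(rank, flags));
    if (fd < 0 && errno == EPERM)
        fd = open(path, flags);
    return fd;
}
```

In an MPI program each process would pass its own rank, so only rank 0 would open the file with atime updates enabled.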
