[petsc-dev] [Ideas-team] Seeking OLCF users complaining about poor build times

Satish Balay balay at mcs.anl.gov
Fri Feb 27 10:45:20 CST 2015


On Fri, 27 Feb 2015, Todd Gamblin wrote:

> Barry:
> 
> I remember that ALCF attempted to address this problem at one point or
> another with "tmpicc" compiler wrappers.  As I remember the idea was that
> they stored the compiler's tmp files in some local storage on the login
> node.  I think that was back when ANL's main machine was Intrepid, and I
> don't know where those compilers went on Mira.  Do you remember this?

No idea about the tmpicc wrappers, but the current compilers appear to
use /tmp for their temporary files, and one can change this with the
TMPDIR environment variable.
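
As a minimal sketch of the TMPDIR approach: the helper names below are hypothetical, and a child Python process stands in for the compiler, since it also reads TMPDIR when choosing where to put temporary files. Whether a given vendor compiler honors TMPDIR the same way is an assumption to verify on each system.

```python
import os
import subprocess
import sys
import tempfile


def env_with_tmpdir(tmpdir, base_env=None):
    """Return a copy of the environment with TMPDIR pointing at tmpdir.

    Hypothetical helper: the caller's own environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TMPDIR"] = tmpdir
    return env


def child_tempdir(env):
    """Ask a child process which directory it would use for temp files.

    The child is a Python interpreter standing in for a compiler.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import tempfile; print(tempfile.gettempdir())"],
        env=env,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    local_tmp = tempfile.mkdtemp()  # stand-in for node-local storage
    print(child_tempdir(env_with_tmpdir(local_tmp)))
```

For a real build, the same environment could be passed to the configure or make command via subprocess.run(..., env=env) in place of the stand-in child.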

> 
> In general I'm not sure that just moving the compiler temp files is going
> to cut it.  I think you really want to do the build out of /tmp or some
> other filesystem.  Spack does this automatically for its builds -- on LLNL
> machines I build much faster by just finding the local tmp space and using
> it for all the builds.  Spack is also able to put the entire build out in
> tmp space, because you just tell it the software name, and it handles the
> details of where it is downloaded and expanded.  It's not perfect, because
> it looks at $TMP, $TMPDIR, and some other LLNL-specific places.
> 
> If it turns out that configuring NFS (or in ANL's case, I think it's GPFS)
> to be fast on a set of loaded login nodes is not feasible, it might be
> nice to have some kind of recommendations for build staging.

Yeah - we've been doing our builds in /tmp for many years. [Later,
ANL/MCS standardized this usage by providing /sandbox, which is
slightly different from /tmp.]

It's just that users won't think of doing this. [Also, we usually
default to an in-place build, and using /tmp for the build works well
with a --prefix install.]

Also, I think such usage at LC centers might be prohibited. And some
systems are configured in such a way that /tmp is really not useful
for source builds. [For example, configure tends to do simple runs
that get blocked by the security settings on /tmp.]
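
A rough way to check this before staging a build is sketched below. It assumes a POSIX system with /bin/sh; the function names and the candidate list (TMPDIR, TMP, then /tmp) are hypothetical choices for illustration, not what configure or any site tool actually does. The check writes a tiny script into the directory and tries to run it, which fails on a mount that forbids execution.

```python
import os
import stat
import subprocess
import tempfile


def can_execute_in(directory):
    """Return True if a freshly written script can be run from directory."""
    try:
        fd, path = tempfile.mkstemp(dir=directory, suffix=".sh")
    except OSError:
        return False  # directory missing or not writable
    try:
        with os.fdopen(fd, "w") as script:
            script.write("#!/bin/sh\nexit 0\n")
        os.chmod(path, stat.S_IRWXU)
        return subprocess.run([path]).returncode == 0
    except OSError:
        return False  # e.g. permission denied when execution is blocked
    finally:
        os.unlink(path)


def pick_staging_dir(candidates=None):
    """Return the first candidate directory that allows execution, or None."""
    if candidates is None:
        candidates = [os.environ.get("TMPDIR"), os.environ.get("TMP"), "/tmp"]
    for directory in candidates:
        if directory and os.path.isdir(directory) and can_execute_in(directory):
            return directory
    return None


if __name__ == "__main__":
    print(pick_staging_dir())
```

If this returns None, the build would have to stay on the shared filesystem or use whatever scratch location the site recommends.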

Satish

> 
> -Todd
> 
> 
> 
> On 2/27/15, 8:09 AM, "David E. Bernholdt" <bernholdtde at ornl.gov> wrote:
> 
> >Barry, thanks, this is extremely helpful.  I'll have the OLCF folks
> >contact Nathan if they need any further info or have other experiments
> >to try.
> >
> >On 02/27/2015 11:03 AM, Barry Smith wrote:
> >> 
> >>   Same text also in the attachment.
> >> 
> >>    Barry
> >> 
> >> David,
> >> 
> >>     Nathan Collier has kindly run a test on Titan, Satish on Mira and
> >>Hopper, and Victor on Ranger with a basic optimized build of PETSc (all
> >>C code)
> >> 
> >>     Please find below some configure and make timings from the latest
> >>PETSc master.
> >> 
> >>      The Titan times for both configure and make are unacceptable. For
> >>total build time Titan is 3.5 times slower than Mira and Hopper and at
> >>least 10 times slower than laptops. The "time" results on Titan are
> >>disturbing
> >> 
> >> configure 
> >> real	14m32.169s   (since the user + sys time is much less than real
> >>time, what is it waiting on?)
> >> user	1m51.527s
> >> sys	3m40.734s
> >> 
> >> make
> >> real	15m56.004s
> >> user	8m8.971s
> >> sys	52m42.734s  (why so much?)
> >> 
> >> which I read as either the filesystem or the compiler system (location
> >>of the compilers, license server of the compilers, ...) is really badly
> >>configured.
> >> 
> >>    The Hopper configure time with the default
> >>TMPDIR=/scratch/scratchdirs/balay is unacceptable, but if you actually
> >>use the real /tmp it becomes somewhat reasonable.
> >> 
> >> Feel free to share this information with local experts,
> >> 
> >>   
> >> 
> >> 
> >> I suggest you view the below table in a fixed width font editor like
> >>Emacs or Vi so the columns line up.
> >> 
> >> System                 configure  make     total    compilers  filesystem
> >> 
> >> Titan                  14m32s     15m56s   30m28s   Intel 14   /lustre/atlas1/geo103/proj-shared/
> >>                        41m38s     9m5s     50m43s              /ccs/home/ (no load on login node)
> >>                        13m                                     (no load on a different login node)
> >> 
> >> Mira                   6m59s      1m49s    8m48s    IBM        /gpfs/mira-home/
> >> 
> >> Hopper                 23m17s     1m45s    25m2s               /global/u2/b/balay/petsc.clone, default TMPDIR=/scratch/scratchdirs/balay
> >>                        6m17s      1m39s    7m57s               manually set TMPDIR=/tmp
> >> 
> >> NSF Ranger UT Austin   5m10s      1m28s    6m38s               default, whatever it is
> >> 
> >> Linux laptop           53s        1m13s    2m6s     Gnu        compile and compiler local
> >> 
> >> Apple laptop           1m14s      54s      2m8s     clang      compile and compiler local
> >> 
> >> Linux workstation      1m11s      22s      1m33s    Gnu        compile and compiler local
> >>                        1m37s      29s      2m6s     Gnu        compile directory local; compiler directory remote
> >>                        3m11s      25s      3m36s    Intel 13   compile directory local; compiler directory remote
> >> 
> >> PETSc has about 1000 source files that need compiling
> >> 
> >> The configure is essentially sequential, the make extremely parallel.
> >> 
> >> During configure the source code is on the listed file system, all .o
> >>and executables  are on /tmp
> >> 
> >> During the make the source code and all .o are on the listed file system
> >> 
> >> 
> >>> On Feb 25, 2015, at 11:23 AM, David E. Bernholdt
> >>><bernholdtde at ornl.gov> wrote:
> >>>
> >>> At the kick-off meetings, one of the general complaints I heard
> >>> expressed about the facilities was the slow build times compared to
> >>> personal systems.
> >>>
> >>> If you have this complaint and are an OLCF user, and are willing to
> >>>work
> >>> with us a little to try to understand your experience in more detail,
> >>> please contact me (individually, not reply-all).
> >>>
> >>> This is a facility thing, not an IDEAS thing, so I can't speak for the
> >>> other facilities.  But we've recently received some other similar
> >>> comments, and we're trying to dig into what's happening.
> >>>
> >>> Thanks
> >>> -- 
> >>> David E. Bernholdt                | Email: bernholdtde at ornl.gov
> >>> Oak Ridge National Laboratory     | Phone: +1 865-574-3147
> >>> http://www.csm.ornl.gov/~bernhold | Fax:   +1 865-576-5491
> >>> _______________________________________________
> >>> Ideas-team mailing list
> >>> Ideas-team at lists.mcs.anl.gov
> >>> https://lists.mcs.anl.gov/mailman/listinfo/ideas-team
> >
> >
> >-- 
> >David E. Bernholdt                | Email: bernholdtde at ornl.gov
> >Oak Ridge National Laboratory     | Phone: +1 865-574-3147
> >http://www.csm.ornl.gov/~bernhold | Fax:   +1 865-576-5491
> 
> 
> 



