[petsc-dev] [Ideas-team] Seeking OLCF users complaining about poor build times

Barry Smith bsmith at mcs.anl.gov
Fri Feb 27 11:51:50 CST 2015


> On Feb 27, 2015, at 10:29 AM, Todd Gamblin <tgamblin at llnl.gov> wrote:
> 
> Barry:
> 
> If it turns out that configuring NFS (or in ANL's case, I think it's GPFS)
> to be fast on a set of loaded login nodes is not feasible, it might be
> nice to have some kind of recommendations for build staging.
> 
> -Todd

   Actually the parallel compiles of the 1000+ files on the "regular" filesystems at ANL and LBL are taking less than 2 minutes, so I can't blame the filesystem bandwidth.

  Barry

> 
> 
> 
> On 2/27/15, 8:09 AM, "David E. Bernholdt" <bernholdtde at ornl.gov> wrote:
> 
>> Barry, thanks, this is extremely helpful.  I'll have the OLCF folks
>> contact Nathan if they need any further info or have other experiments
>> to try.
>> 
>> On 02/27/2015 11:03 AM, Barry Smith wrote:
>>> 
>>>  Same text also in the attachment.
>>> 
>>>   Barry
>>> 
>>> David,
>>> 
>>>    Nathan Collier has kindly run a test on Titan, Satish on Mira and
>>> Hopper, and Victor on Ranger with a basic optimized build of PETSc (all
>>> C code).
>>> 
>>>    Please find below some configure and make timings from the latest
>>> PETSc master.
>>> 
>>>     The Titan times for both configure and make are unacceptable. For
>>> total build time Titan is 3.5 times slower than Mira and Hopper and at
>>> least 10 times slower than laptops. The "time" results on Titan are
>>> disturbing:
>>> 
>>> configure 
>>> real	14m32.169s   (since the user + sys time is much less than real
>>> time, what is it waiting on?)
>>> user	1m51.527s
>>> sys	3m40.734s
>>> 
>>> make
>>> real	15m56.004s
>>> user	8m8.971s
>>> sys	52m42.734s  (why so much?)
>>> 
>>> which I read as either the filesystem or the compiler system (location
>>> of the compilers, license server of the compilers, ...) is really badly
>>> configured.
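
To make the reading of the two "time" outputs above concrete, here is a minimal Python sketch that converts the reported values to seconds and compares CPU time (user + sys) with wall-clock time (real). The helper names to_seconds and cpu_to_wall are illustrative only and are not part of any PETSc tooling. For the essentially sequential configure step, a ratio well below 1 means the process mostly waited rather than computed; for the parallel make, the share of sys in the total CPU time shows how much of the work happened in the kernel.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '14m32.169s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time value: {stamp!r}")
    minutes = int(match.group(1) or 0)
    return 60 * minutes + float(match.group(2))


def cpu_to_wall(real: str, user: str, sys: str) -> float:
    """Ratio of CPU time (user + sys) to wall-clock time (real)."""
    return (to_seconds(user) + to_seconds(sys)) / to_seconds(real)


# Titan numbers copied from the message above.
configure_ratio = cpu_to_wall("14m32.169s", "1m51.527s", "3m40.734s")
make_ratio = cpu_to_wall("15m56.004s", "8m8.971s", "52m42.734s")

# Fraction of the make's CPU time that was spent in the kernel (sys).
make_sys_share = to_seconds("52m42.734s") / (
    to_seconds("8m8.971s") + to_seconds("52m42.734s")
)

print(f"configure CPU/wall: {configure_ratio:.2f}")
print(f"make CPU/wall:      {make_ratio:.2f}")
print(f"make sys share:     {make_sys_share:.1%}")
```

With the Titan numbers, configure comes out at roughly 0.38 (so about 62% of the wall-clock time was spent waiting), and sys accounts for roughly 87% of the CPU time used by make.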
>>> 
>>>   The Hopper configure time with the default
>>> TMPDIR=/scratch/scratchdirs/balay is unacceptable, but if you actually
>>> use the real /tmp it becomes somewhat reasonable.
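
As a rough illustration of the TMPDIR comparison described above, the following Python sketch runs a command with TMPDIR overridden and reports the wall-clock time. The function name timed_run is hypothetical, and the probe command only echoes the variable; substituting the actual configure invocation (for example ["./configure"] run from the PETSc source directory) is an assumption about how one would repeat the Hopper experiment, not something taken from the message.

```python
import os
import subprocess
import sys
import time


def timed_run(command, tmpdir):
    """Run `command` with TMPDIR set to `tmpdir`; return (seconds, result)."""
    # Copy the current environment and override only TMPDIR.
    env = dict(os.environ, TMPDIR=tmpdir)
    start = time.perf_counter()
    result = subprocess.run(command, env=env, capture_output=True, text=True)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    # Harmless probe: a child Python process that prints the TMPDIR it sees.
    # Replace `probe` with the real build command to time an actual configure.
    probe = [sys.executable, "-c", "import os; print(os.environ['TMPDIR'])"]
    for candidate in ("/tmp", "/scratch/scratchdirs/balay"):
        elapsed, result = timed_run(probe, candidate)
        print(f"TMPDIR={result.stdout.strip()}  wall={elapsed:.2f}s")
```

Running the same command once per candidate directory, on the same login node and under similar load, keeps the comparison as close as possible to the two Hopper rows in the table.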
>>> 
>>> Feel free to share this information with local experts,
>>> 
>>> 
>>> 
>>> 
>>> I suggest you view the below table in a fixed width font editor like
>>> Emacs or Vi so the columns line up.
>>> 
>>> System                configure time  make time  Total    compilers  filesystem / notes
>>>
>>> Titan                 14m32s          15m56s     30m28s   Intel 14   /lustre/atlas1/geo103/proj-shared/
>>>                       41m38s          9m5s       50m43s              /ccs/home/  (no load on login node)
>>>                       13m                                            (no load on a different login node)
>>>
>>> Mira                  6m59s           1m49s      8m48s    IBM        /gpfs/mira-home/
>>>
>>> Hopper                23m17s          1m45s      25m2s               /global/u2/b/balay/petsc.clone default TMPDIR=/scratch/scratchdirs/balay
>>>                       6m17s           1m39s      7m57s               manually set TMPDIR=/tmp
>>>
>>> NSF Ranger UT Austin  5m10s           1m28s      6m38s               default, whatever it is
>>>
>>> Linux laptop          53s             1m13s      2m6s     Gnu        compile and compiler local
>>>
>>> Apple laptop          1m14s           54s        2m8s     clang      compile and compiler local
>>>
>>> Linux workstation     1m11s           22s        1m33s    Gnu        compile and compiler local
>>>                       1m37s           29s        2m6s     Gnu        compile directory local; compiler directory remote
>>>                       3m11s           25s        3m36s    Intel 13   compile directory local; compiler directory remote
>>>
>>> 
>>> PETSc has about 1000 source files that need compiling
>>> 
>>> The configure is essentially sequential, the make extremely parallel.
>>> 
>>> During configure the source code is on the listed file system, all .o
>>> and executables  are on /tmp
>>> 
>>> During the make the source code and all .o are on the listed file system
>>> 
>>> 
>>>> On Feb 25, 2015, at 11:23 AM, David E. Bernholdt
>>>> <bernholdtde at ornl.gov> wrote:
>>>> 
>>>> At the kick-off meetings, one of the general complaints I heard
>>>> expressed about the facilities was the slow build times compared to
>>>> personal systems.
>>>> 
>>>> If you have this complaint and are an OLCF user, and are willing to work
>>>> with us a little to try to understand your experience in more detail,
>>>> please contact me (individually, not reply-all).
>>>> 
>>>> This is a facility thing, not an IDEAS thing, so I can't speak for the
>>>> other facilities.  But we've recently received some other similar
>>>> comments, and we're trying to dig into what's happening.
>>>> 
>>>> Thanks
>>>> -- 
>>>> David E. Bernholdt                | Email: bernholdtde at ornl.gov
>>>> Oak Ridge National Laboratory     | Phone: +1 865-574-3147
>>>> http://www.csm.ornl.gov/~bernhold | Fax:   +1 865-576-5491
>>>> _______________________________________________
>>>> Ideas-team mailing list
>>>> Ideas-team at lists.mcs.anl.gov
>>>> https://lists.mcs.anl.gov/mailman/listinfo/ideas-team
>> 
>> 
>> -- 
>> David E. Bernholdt                | Email: bernholdtde at ornl.gov
>> Oak Ridge National Laboratory     | Phone: +1 865-574-3147
>> http://www.csm.ornl.gov/~bernhold | Fax:   +1 865-576-5491
>> _______________________________________________
>> Ideas-team mailing list
>> Ideas-team at lists.mcs.anl.gov
>> https://lists.mcs.anl.gov/mailman/listinfo/ideas-team
> 
> 



