[petsc-dev] [Ideas-team] Seeking OLCF users complaining about poor build times
Todd Gamblin
tgamblin at llnl.gov
Fri Feb 27 10:49:34 CST 2015
On 2/27/15, 8:45 AM, "Satish Balay" <balay at mcs.anl.gov> wrote:
>On Fri, 27 Feb 2015, Todd Gamblin wrote:
>
>It's just that users won't think of doing this. [Also, we usually
>default to an in-place build - and using /tmp for the build works well
>with a --prefix install].
Yes exactly. Users are used to running curl -O (if that) in their home
directory and building there. So it is kind of sad that the home
directory is not optimized for compiles.
>Also I think such usage at LC centers might be prohibited. And some
>systems are configured in such a way that /tmp is really not useful
>for source builds. [For example, configure tends to do simple runs that
>get blocked due to security settings on /tmp]
Why would it be prohibited? Especially if you remove the build after it's
done? What's a "simple run" and why do the security settings prohibit it?
I do not think LC (Livermore Computing -- maybe I shouldn't use that
acronym here?) does this.
-Todd
>
>Satish
>
>>
>> -Todd
>>
>>
>>
>> On 2/27/15, 8:09 AM, "David E. Bernholdt" <bernholdtde at ornl.gov> wrote:
>>
>> >Barry, thanks, this is extremely helpful. I'll have the OLCF folks
>> >contact Nathan if they need any further info or have other experiments
>> >to try.
>> >
>> >On 02/27/2015 11:03 AM, Barry Smith wrote:
>> >>
>> >> Same text also in the attachment.
>> >>
>> >> Barry
>> >>
>> >> David,
>> >>
>> >> Nathan Collier has kindly run a test on Titan, Satish on Mira and
>> >> Hopper, and Victor on Ranger with a basic optimized build of PETSc
>> >> (all C code).
>> >>
>> >> Please find below some configure and make timings from the latest
>> >>PETSc master.
>> >>
>> >> The Titan times for both configure and make are unacceptable. For
>> >> total build time Titan is 3.5 times slower than Mira and Hopper and at
>> >> least 10 times slower than laptops. The "time" results on Titan are
>> >> disturbing:
>> >>
>> >> configure
>> >> real 14m32.169s (since the user + sys time is much less than real
>> >>time, what is it waiting on?)
>> >> user 1m51.527s
>> >> sys 3m40.734s
>> >>
>> >> make
>> >> real 15m56.004s
>> >> user 8m8.971s
>> >> sys 52m42.734s (why so much?)
>> >>
>> >> which I read as either the filesystem or the compiler system
>> >> (location of the compilers, license server of the compilers, ...) is
>> >> really badly configured.
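
As a rough way to quantify the question above, the gap between wall-clock time and CPU time can be computed directly from the quoted "time" output. The sketch below is illustrative only: the helper names parse_time and unaccounted are made up for this example and are not part of PETSc or of any tool mentioned in this thread.

```python
import re


def parse_time(text):
    """Convert a `time`-style duration such as '14m32.169s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(match.group(1) or 0)
    return minutes * 60 + float(match.group(2))


def unaccounted(real, user, sys_):
    """Wall-clock seconds not covered by CPU time (user + sys).

    A large positive value means the process mostly waited on something
    other than the CPU. A negative value means total CPU time exceeded
    wall time, which is normal when work runs on several cores at once.
    """
    return parse_time(real) - (parse_time(user) + parse_time(sys_))


# Titan numbers as quoted in this message.
configure_gap = unaccounted("14m32.169s", "1m51.527s", "3m40.734s")
make_gap = unaccounted("15m56.004s", "8m8.971s", "52m42.734s")

print(f"configure: {configure_gap:.0f} s of wall time not spent on the CPU")
print(f"make: CPU time exceeds wall time by {-make_gap:.0f} s")
```

For the Titan configure run this gives roughly 540 seconds, about 9 of the 14.5 minutes, that are not accounted for by user or sys time. For make, user plus sys exceeds real time, as expected for a parallel build, so the notable figure there is the large sys time rather than idle waiting.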
>> >>
>> >> The Hopper configure time with the default
>> >> TMPDIR=/scratch/scratchdirs/balay is unacceptable, but if you
>> >> actually use the real /tmp it becomes somewhat reasonable.
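
One way to repeat this kind of comparison is to run the same command under different TMPDIR settings and record the elapsed time. The following is a minimal sketch, not the procedure used for the numbers in this thread: time_with_tmpdir is a hypothetical helper, and the command is a trivial stand-in so the example runs anywhere. On a real system the command would be the configure invocation and the candidates would be directories such as the site default TMPDIR and /tmp.

```python
import os
import subprocess
import sys
import tempfile
import time


def time_with_tmpdir(command, tmpdir):
    """Run `command` with TMPDIR set to `tmpdir`; return elapsed wall seconds."""
    env = dict(os.environ, TMPDIR=tmpdir)
    start = time.perf_counter()
    subprocess.run(command, env=env, check=True)
    return time.perf_counter() - start


# Stand-in workload (assumption): create and close one temporary file.
# Replace with the real build step, e.g. ["./configure"], when measuring.
command = [sys.executable, "-c",
           "import tempfile; tempfile.TemporaryFile().close()"]

# Candidate temporary directories (assumption): the system default and the
# current working directory, standing in for a local and a shared filesystem.
candidates = [tempfile.gettempdir(), os.getcwd()]

timings = {path: time_with_tmpdir(command, path) for path in candidates}
for path, elapsed in timings.items():
    print(f"TMPDIR={path}: {elapsed:.2f} s")
```

A single run per directory is noisy, especially on a shared login node, so repeating each measurement a few times and comparing the minimum is usually more informative.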
>> >>
>> >> Feel free to share this information with local experts,
>> >>
>> >>
>> >>
>> >>
>> >> I suggest you view the below table in a fixed width font editor like
>> >>Emacs or Vi so the columns line up.
>> >>
>> >>                       configure time  make time  Total   compilers  filesystem
>> >>
>> >> Titan                 14m32s          15m56s     30m28s  Intel 14   /lustre/atlas1/geo103/proj-shared/
>> >>                       41m38s          9m5s       50m43s             /ccs/home/ (no load on login node)
>> >>                       13m                                           (no load on a different login node)
>> >>
>> >> Mira                  6m59s           1m49s      8m48s   IBM        /gpfs/mira-home/
>> >>
>> >> Hopper                23m17s          1m45s      25m2s              /global/u2/b/balay/petsc.clone, default TMPDIR=/scratch/scratchdirs/balay
>> >>                       6m17s           1m39s      7m57s              manually set TMPDIR=/tmp
>> >>
>> >> NSF Ranger UT Austin  5m10s           1m28s      6m38s              default, whatever it is
>> >>
>> >> Linux laptop          53s             1m13s      2m6s    Gnu        compile and compiler local
>> >>
>> >> Apple laptop          1m14s           54s        2m8s    clang      compile and compiler local
>> >>
>> >> Linux workstation     1m11s           22s        1m33s   Gnu        compile and compiler local
>> >>                       1m37s           29s        2m6s    Gnu        compile directory local; compiler directory remote
>> >>                       3m11s           25s        3m36s   Intel 13   compile directory local; compiler directory remote
>> >>
>> >> PETSc has about 1000 source files that need compiling
>> >>
>> >> The configure is essentially sequential, the make extremely parallel.
>> >>
>> >> During configure the source code is on the listed file system, all .o
>> >>and executables are on /tmp
>> >>
>> >> During the make the source code and all .o are on the listed file
>> >> system
>> >>
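
The slowdown factors quoted earlier in this message can be checked against the Total column of the table. The snippet below is a small sketch of that arithmetic; the dictionary labels are shorthand chosen for this example, and the values are the totals from the table converted to seconds.

```python
def seconds(minutes, secs):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + secs


# Total build times (configure + make) taken from the table.
totals = {
    "Titan": seconds(30, 28),        # build on /lustre/atlas1/...
    "Mira": seconds(8, 48),
    "Hopper": seconds(7, 57),        # with TMPDIR=/tmp set manually
    "Ranger": seconds(6, 38),
    "Linux laptop": seconds(2, 6),
    "Apple laptop": seconds(2, 8),
}

titan = totals["Titan"]
ratios = {name: titan / total
          for name, total in totals.items() if name != "Titan"}

for name, ratio in ratios.items():
    print(f"Titan total build time is {ratio:.1f}x that of {name}")
```

With these totals the ratio is about 3.5 against Mira, about 3.8 against Hopper with TMPDIR=/tmp, and over 14 against either laptop, which is consistent with the "3.5 times" and "at least 10 times" figures stated above.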
>> >>
>> >>> On Feb 25, 2015, at 11:23 AM, David E. Bernholdt
>> >>><bernholdtde at ornl.gov> wrote:
>> >>>
>> >>> At the kick-off meetings, one of the general complaints I heard
>> >>> expressed about the facilities was the slow build times compared to
>> >>> personal systems.
>> >>>
>> >>> If you have this complaint and are an OLCF user, and are willing to
>> >>> work with us a little to try to understand your experience in more
>> >>> detail, please contact me (individually, not reply-all).
>> >>>
>> >>> This is a facility thing, not an IDEAS thing, so I can't speak for
>> >>> the other facilities. But we've recently received some other similar
>> >>> comments, and we're trying to dig into what's happening.
>> >>>
>> >>> Thanks
>> >>> --
>> >>> David E. Bernholdt | Email: bernholdtde at ornl.gov
>> >>> Oak Ridge National Laboratory | Phone: +1 865-574-3147
>> >>> http://www.csm.ornl.gov/~bernhold | Fax: +1 865-576-5491
>> >>> _______________________________________________
>> >>> Ideas-team mailing list
>> >>> Ideas-team at lists.mcs.anl.gov
>> >>> https://lists.mcs.anl.gov/mailman/listinfo/ideas-team
>> >
>> >
>>
>>
>>
>