[petsc-users] Configuring PETSc for KNL

Richard Mills richardtmills at gmail.com
Mon Apr 3 13:40:33 CDT 2017


Fixing typo:  Meant to say "Keep in mind that individual KNL cores are much
less powerful than an individual Haswell *core*."

--Richard

On Mon, Apr 3, 2017 at 11:36 AM, Richard Mills <richardtmills at gmail.com>
wrote:

> Hi Justin,
>
> How is the MCDRAM (on-package "high-bandwidth memory") configured for your
> KNL runs?  And if it is in "flat" mode, what are you doing to ensure that
> you use the MCDRAM?  Doing this wrong seems to be one of the most common
> reasons for unexpected poor performance on KNL.
>
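
A minimal sketch of one way to steer allocations into MCDRAM in flat mode, using numactl. It assumes the MCDRAM appears as NUMA node 1 (typical for quadrant cluster mode, but check with `numactl --hardware` on the actual node), a rank count of 1024 as in the runs described in this thread, and an executable named ./ex48 in the current directory. The helper name mcdram_cmd is hypothetical, and it only prints the launch line rather than running it.

```shell
#!/bin/sh
# Assumption: flat memory mode with MCDRAM exposed as NUMA node 1.
# Verify on a compute node with:  numactl --hardware
MCDRAM_NODE=1
NTASKS=1024   # rank count taken from the thread

# Hypothetical helper: print an srun launch line for a given numactl policy.
#   preferred: try MCDRAM first, fall back to DDR when it is full
#   membind:   allocate only from MCDRAM, fail when it is full
mcdram_cmd() {
  echo "srun -n $NTASKS numactl --$1=$MCDRAM_NODE ./ex48"
}

mcdram_cmd preferred
mcdram_cmd membind
```

The "preferred" policy is the more forgiving choice when the working set might exceed the MCDRAM capacity; "membind" makes it obvious when it does. In "cache" mode no such binding is needed.
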
> I'm not that familiar with the environment on Cori, but I think that if
> you are building for KNL, you should add "-xMIC-AVX512" to your compiler
> flags to explicitly instruct the compiler to use the AVX512 instruction
> set.  I usually use something along the lines of
>
>   'COPTFLAGS=-g -O3 -fp-model fast -xMIC-AVX512'
>
> (The "-g" just adds symbols, which make the output from performance
> profiling tools much more useful.)
>
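
As a rough sketch, the suggested flags could be combined with the configure options from the original question as shown here. Note that -xMIC-AVX512 and -fp-model are Intel compiler options, so this assumes the Intel programming environment rather than PrgEnv-cray. Applying the same flags to CXXOPTFLAGS and FOPTFLAGS is an assumption (only COPTFLAGS was given above), and the PETSC_ARCH name is hypothetical. The function only prints the command so it can be reviewed before use.

```shell
#!/bin/sh
# Flags suggested in this thread for KNL (Intel compilers assumed).
KNL_OPT='-g -O3 -fp-model fast -xMIC-AVX512'
# Hypothetical arch name, chosen to keep the KNL build separate.
PETSC_ARCH_NAME='arch-cori-knl-opt'

# Print a configure command for review; it is not executed here.
print_configure() {
  cat <<EOF
./configure --download-fblaslapack \\
  --with-cc=cc --with-clib-autodetect=0 \\
  --with-cxx=CC --with-cxxlib-autodetect=0 \\
  --with-fc=ftn --with-fortranlib-autodetect=0 \\
  --with-debugging=0 --with-mpiexec=srun --with-64-bit-indices=1 \\
  'COPTFLAGS=$KNL_OPT' \\
  'CXXOPTFLAGS=$KNL_OPT' \\
  'FOPTFLAGS=$KNL_OPT' \\
  PETSC_ARCH=$PETSC_ARCH_NAME
EOF
}

print_configure
```
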
> That said, I think that if you are comparing 1024 Haswell cores vs. 1024
> KNL cores (so double the number of Haswell nodes), I'm not surprised that
> the simulations are almost twice as fast using the Haswell nodes.  Keep in
> mind that individual KNL cores are much less powerful than an individual
> Haswell node.  You are also using roughly twice the power footprint (dual
> socket Haswell node should be roughly equivalent to a KNL node, I
> believe).  How do things look when you compare equal numbers of nodes?
>
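
The arithmetic behind the equal-node comparison can be sketched as follows. The node and core counts come from the runs described in this thread; one MPI rank per core is an assumption.

```shell
#!/bin/sh
# Figures from the thread: 1024 cores on 32 Haswell nodes or 16 KNL nodes.
TOTAL_CORES=1024
HASWELL_NODES=32
KNL_NODES=16

haswell_cores_per_node=$((TOTAL_CORES / HASWELL_NODES))
knl_cores_per_node=$((TOTAL_CORES / KNL_NODES))

# Equal-node comparison: hold the node count fixed at the KNL count.
# Assumption: one MPI rank per core on both architectures.
equal_nodes=$KNL_NODES
haswell_ranks=$((equal_nodes * haswell_cores_per_node))
knl_ranks=$((equal_nodes * knl_cores_per_node))

echo "Haswell: $equal_nodes nodes x $haswell_cores_per_node cores/node = $haswell_ranks ranks"
echo "KNL:     $equal_nodes nodes x $knl_cores_per_node cores/node = $knl_ranks ranks"
```

On the same 16 nodes this gives 512 Haswell ranks against 1024 KNL ranks, which is the node-for-node comparison asked about above.
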
> Cheers,
> Richard
>
> On Mon, Apr 3, 2017 at 11:13 AM, Justin Chang <jychang48 at gmail.com> wrote:
>
>> Hi all,
>>
>> On NERSC's Cori I have the following configure options for PETSc:
>>
>> ./configure --download-fblaslapack --with-cc=cc --with-clib-autodetect=0
>> --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn
>> --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-64-bit-indices=1
>> COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-cori-opt
>>
>> Here I swapped the default Intel programming environment for Cray's
>> (e.g., 'module switch PrgEnv-intel/6.0.3 PrgEnv-cray/6.0.3'). I want to
>> document the performance difference between Cori's Haswell and KNL
>> processors.
>>
>> When I run a PETSc example like SNES ex48 on 1024 cores (32 Haswell or
>> 16 KNL nodes), the simulations are almost twice as fast on the Haswell
>> nodes. This leads me to suspect that I am not doing something right for
>> KNL. Does anyone know of "optimal" configure options for running PETSc
>> on KNL?
>>
>> Thanks,
>> Justin
>>
>
>