[petsc-users] Fwd: direct solvers on KNL
Barry Smith
bsmith at mcs.anl.gov
Fri Sep 1 09:57:48 CDT 2017
Since MUMPS and SuperLU don't provide much profiling of their own, you really need to use a profiling tool to understand the performance on that machine. Intel's VTune is probably the way to go.
Barry
> On Sep 1, 2017, at 6:40 AM, Jakub Kruzik <jakub.kruzik at vsb.cz> wrote:
>
> Hi,
>
> I am looking at the single-node performance of MUMPS and SuperLU on a KNL 7230 (on Theta). I am using KSP example ex2 (http://www.mcs.anl.gov/petsc/petsc-current/src/ksp/ksp/examples/tutorials/ex2.c.html) with m x n = 2880 x 2880. The KNL runs in cache and quad modes.
>
> Times in seconds for 24 cores:
> mumps: 279
> superlu: 326
> cg: 116
>
> Times in seconds for 64 cores:
> mumps: 316
> superlu: 410
> cg: 49
>
> The performance for 24 cores is OK - both direct solvers are roughly 3.5 times slower than on 2x E5-2680v3. (According to people from Intel, the single-core performance of KNL is about 3-4 times lower than that of E5-2680v3.) However, strong scalability is really bad: both direct solvers take longer on 64 cores than on 24.
>
> I am using the cray-petsc/3.7.6.0 module. I also tried my own PETSc build with MKL, with MUMPS and SuperLU installed by PETSc configure, but the results are similar.
>
> Please find attached the Theta submission script and the logs for KNL and Haswell.
>
> Why is the performance of direct solvers on a full node so bad?
>
> Best,
> Jakub
> <batch.sub><knl.log><haswell.log>
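
To put numbers on the strong-scaling complaint in the quoted message, here is a small sketch that turns the reported times into speedup and parallel efficiency when going from 24 to 64 cores. The times are copied from the message; the helper name `strong_scaling` and the dictionary layout are only illustrative, not part of PETSc or of the attached logs.

```python
# Wall-clock times in seconds, copied from the quoted message
# (KSP ex2, m x n = 2880 x 2880, single KNL 7230 node).
times = {
    "mumps":   {24: 279.0, 64: 316.0},
    "superlu": {24: 326.0, 64: 410.0},
    "cg":      {24: 116.0, 64: 49.0},
}


def strong_scaling(t_base, t_new, p_base, p_new):
    """Return (speedup, efficiency) for a fixed problem size.

    speedup    = t_base / t_new
    efficiency = speedup / (p_new / p_base), so 1.0 means ideal scaling.
    """
    speedup = t_base / t_new
    ideal = p_new / p_base
    return speedup, speedup / ideal


# Hypothetical summary structure: solver name -> (speedup, efficiency).
results = {
    name: strong_scaling(t[24], t[64], 24, 64) for name, t in times.items()
}

for name, (speedup, efficiency) in results.items():
    print(f"{name:8s} speedup {speedup:.2f}x  efficiency {efficiency:.0%}")
```

The ideal speedup for 24 to 64 cores is 64/24, about 2.67x. With the reported times, CG reaches roughly 2.4x, while both direct solvers come out below 1x, meaning they are slower on the full node than on 24 cores.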