[petsc-users] strange PETSc/KSP GMRES timings for MPI+OMP configuration on KNLs

Barry Smith bsmith at mcs.anl.gov
Mon Jun 19 12:29:28 CDT 2017


    A 1000 by 1000 matrix (sparse, presumably) is way too small for scaling studies. With sparse matrices you want at least 10,000-20,000 unknowns per MPI process. 
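
     For example, a rough petsc4py sketch along these lines (the 1D Laplacian stencil and the 20,000-rows-per-rank figure are just illustrative placeholders) sizes the global problem by the number of ranks instead of fixing it at 1000:

    from petsc4py import PETSc

    comm = PETSc.COMM_WORLD
    rows_per_rank = 20000                  # aim for 10-20k unknowns per MPI process
    n = rows_per_rank * comm.getSize()     # global size grows with the rank count

    # Simple 1D Laplacian, just to have a nontrivial sparse operator to time.
    A = PETSc.Mat().createAIJ([n, n], nnz=3, comm=comm)
    A.setUp()
    rstart, rend = A.getOwnershipRange()
    for i in range(rstart, rend):
        A.setValue(i, i, 2.0)
        if i > 0:
            A.setValue(i, i - 1, -1.0)
        if i < n - 1:
            A.setValue(i, i + 1, -1.0)
    A.assemblyBegin()
    A.assemblyEnd()

    b = A.createVecLeft(); b.set(1.0)      # right-hand side
    x = A.createVecRight()                 # solution vector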

     What is the "Number of steps" in the table? Is that the number of Krylov iterations? If so, that is also problematic, because you are then comparing two things at once: more parallelism, but also much more total work, since many more iterations are performed.
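
     You can print the iteration count directly to check. Continuing the sketch above (GMRES and the command-line options are standard PETSc; the rest is illustrative):

    ksp = PETSc.KSP().create(comm)
    ksp.setOperators(A)
    ksp.setType('gmres')
    ksp.setFromOptions()    # -ksp_monitor, -ksp_rtol, -log_view etc. still apply
    ksp.solve(b, x)

    its = ksp.getIterationNumber()
    PETSc.Sys.Print("GMRES iterations: %d" % its)

     If the two runs need different numbers of iterations, compare the time per iteration (or the -log_view breakdown) rather than the total solve time.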

   Barry

> On Jun 16, 2017, at 7:57 AM, Damian Kaliszan <damian at man.poznan.pl> wrote:
> 
> Hi,
> 
> For several days I've been trying to figure out what is going wrong with
> the timings of my Python app, which solves Ax=b with the KSP (GMRES) solver,
> when running on Intel's KNL 7210/7230.
> 
> I downsized the problem to a 1000x1000 A matrix and a single node, and
> observed the following:
> 
> <int_1.jpg>
> I'm attaching two extreme timings where the configurations differ by only one OMP thread (64 MPI / 1 OMP vs. 64 MPI / 2 OMP),
> Slurm task IDs 23321 vs. 23325.
> 
> Any help will be appreciated....
> 
> Best,
> Damian
> <slurm-23321.out><slurm-23325.out>


