<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jun 19, 2017 at 7:56 AM, Damian Kaliszan <span dir="ltr"><<a href="mailto:damian@man.poznan.pl" target="_blank">damian@man.poznan.pl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>

<br>

Please find attached 2 output files from 64MPI/1 OMP vs 64/2 OMPs examples,<br>

23321 vs 23325 slurm task ids.<br></blockquote><div><br></div><div>This is on 1 KNL? Then aren't you oversubscribing using 2 threads? This produces horrible</div><div>performance, like you see in this log.</div><div><br></div><div>  Matt</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Best,<br>

Damian<br>

<br>

<br>

In a message dated 19 June 2017 (15:39:53), the following was written:<br>

<br>

<br>

On Mon, Jun 19, 2017 at 7:32 AM, Damian Kaliszan <<a href="mailto:damian@man.poznan.pl">damian@man.poznan.pl</a>> wrote:<br>

Hi,<br>

Thank you for the answer and the article.<br>

I use SLURM (srun) for job submission, running the<br>

'srun script.py script_parameters' command inside a batch script, so this is the SPMD model.<br>

What I noticed is that the problems I'm having now didn't happen<br>

before on CPU E5-2697 v3 nodes (28 cores; the best performance I had<br>

was with 14 MPI ranks and 2 OMP threads each per node). The problems started to appear when I moved to KNLs.<br>

The funny thing is that switching OMP on/off (by setting<br>

OMP_NUM_THREADS to 1) doesn't help for all #NODES/#MPI/#OMP<br>

combinations. For example, for 2 nodes and 16 MPIs, the timings are huge<br>

for OMP=1 and 2, and OK for OMP=4.<br>

<br>

Let's narrow this down to MPI_Barrier(). What memory mode is the KNL in? Did you require<br>

the KNL to use only MCDRAM? Please show the MPI_Barrier()/MPI_Send() numbers for the different configurations.<br>

These measure just latency. We could also look at VecScale() to see the memory bandwidth achieved.<br>

<br>

  Thanks,<br>

<br>

    Matt<br>

<br>

Playing with affinity hasn't helped so far.<br>

In other words, at first glance the results look completely random (I can<br>

provide more such examples).<br>
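<br>

One way to check what each rank actually sees is a small stdlib-only Python sketch like the one below, launched through the same srun line as the real script. It is an illustration, not something taken from this thread: the helper names are hypothetical, os.sched_getaffinity is Linux-only, and the sketch assumes SLURM sets SLURM_PROCID for each task.<br>

```python
import os

def visible_cpus():
    """Number of logical CPUs this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):      # Linux only
        return len(os.sched_getaffinity(0))   # 0 means the calling process
    return os.cpu_count() or 1                # fallback: all CPUs on the machine

def requested_threads(environ=os.environ):
    """First value of OMP_NUM_THREADS as an int, or None if unset or unparsable."""
    raw = environ.get("OMP_NUM_THREADS", "")
    first = raw.split(",")[0].strip()
    return int(first) if first.isdigit() else None

def describe(environ=os.environ):
    """One line per rank: rank id, CPUs in the affinity mask, requested OMP threads."""
    cpus = visible_cpus()
    threads = requested_threads(environ)
    rank = environ.get("SLURM_PROCID", "?")   # assumption: set by srun for each task
    if threads is None:
        verdict = "OMP_NUM_THREADS unset, runtime default applies"
    elif threads > cpus:
        verdict = "more threads than CPUs in the mask"
    else:
        verdict = "threads fit in the mask"
    return f"rank {rank}: {cpus} CPUs visible, OMP_NUM_THREADS={threads} ({verdict})"

if __name__ == "__main__":
    print(describe())
```

If a rank reports fewer visible CPUs than requested OMP threads, the threads of that rank are competing for the same hardware, which would be consistent with erratic timings.<br>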

<br>

<br>

<br>

Best,<br>

Damian<br>

<br>

In a message dated 19 June 2017 (14:50:25), the following was written:<br>

<br>

<br>

On Mon, Jun 19, 2017 at 6:42 AM, Damian Kaliszan <<a href="mailto:damian@man.poznan.pl">damian@man.poznan.pl</a>> wrote:<br>

Hi,<br>

<br>

Regarding my previous post:<br>

I looked into both logs, 64MPI/1 OMP vs. 64MPI/2 OMP.<br>

<br>

<br>

What caught my attention is the huge difference in MPI timings in the following places:<br>

<br>

Average time to get PetscTime(): 2.14577e-07<br>

Average time for MPI_Barrier(): 3.9196e-05<br>

Average time for zero size MPI_Send(): 5.45382e-06<br>

<br>

vs.<br>

<br>

Average time to get PetscTime(): 4.05312e-07<br>

Average time for MPI_Barrier(): 0.348399<br>

Average time for zero size MPI_Send(): 0.029937<br>
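<br>

To put the gap in numbers, here is a minimal Python sketch (not part of the original logs) that parses the two excerpts above and prints the ratio of each timing. The names LOG_FIRST, LOG_SECOND, parse_timings and slowdowns are made up for illustration; only the quoted "Average time" lines come from the logs.<br>

```python
import re

# The two excerpts quoted above, in the order they appear in the message.
LOG_FIRST = """\
Average time to get PetscTime(): 2.14577e-07
Average time for MPI_Barrier(): 3.9196e-05
Average time for zero size MPI_Send(): 5.45382e-06
"""
LOG_SECOND = """\
Average time to get PetscTime(): 4.05312e-07
Average time for MPI_Barrier(): 0.348399
Average time for zero size MPI_Send(): 0.029937
"""

# Captures the label (e.g. "MPI_Barrier()") and the time in seconds.
LINE = re.compile(r"^Average time (?:to get|for) (.+?): (\S+)$", re.MULTILINE)

def parse_timings(log_text):
    """Return {label: seconds} for every 'Average time ...' line."""
    return {label: float(value) for label, value in LINE.findall(log_text)}

def slowdowns(baseline, other):
    """Return {label: other / baseline} for every label in the baseline."""
    return {label: other[label] / baseline[label] for label in baseline}

ratios = slowdowns(parse_timings(LOG_FIRST), parse_timings(LOG_SECOND))
for label, ratio in ratios.items():
    print(f"{label}: second excerpt is {ratio:.1f}x the first")
```

With the numbers above, MPI_Barrier() and the zero-size MPI_Send() come out several thousand times slower in the second excerpt, while PetscTime() changes by less than a factor of two.<br>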

<br>

Isn't something wrong with the PETSc library itself?<br>

<br>

 I don't think so. This is a bad interaction between MPI and your threading mechanism. MPI_Barrier() and MPI_Send() are lower<br>

level than PETSc. What threading mode did you choose for MPI? This can have a performance impact.<br>

<br>

Also, the justifications for threading in this context are weak (or non-existent): <a href="http://www.orau.gov/hpcor2015/whitepapers/Exascale_Computing_without_Threads-Barry_Smith.pdf" rel="noreferrer" target="_blank">http://www.orau.gov/hpcor2015/<wbr>whitepapers/Exascale_<wbr>Computing_without_Threads-<wbr>Barry_Smith.pdf</a><br>

<br>

  Thanks,<br>

<br>

    Matt<br>

<br>

<br>

Best,<br>

Damian<br>

<br>

Forwarded message<br>

From: Damian Kaliszan <<a href="mailto:damian@man.poznan.pl">damian@man.poznan.pl</a>><br>

To: PETSc users list <<a href="mailto:petsc-users@mcs.anl.gov">petsc-users@mcs.anl.gov</a>><br>

Date: 16 June 2017, 14:57:10<br>

Subject: [petsc-users] strange PETSc/KSP GMRES timings for MPI+OMP configuration on KNLs<br>

<br>

===8<===============Original message text===============<br>

Hi,<br>

<br>

For several days I've been trying to figure out what is going wrong<br>

with the timings of my Python app, which solves Ax=b with the KSP (GMRES) solver, when running on Intel's KNL 7210/7230.<br>

<br>

I downsized the problem to a 1000x1000 matrix A on a single node and<br>

observed the following:<br>

<br>

<br>

I'm attaching 2 extreme timings where configurations differ only by 1 OMP thread (64MPI/1 OMP vs 64/2 OMPs),<br>

23321 vs 23325 slurm task ids.<br>

<br>

Any help will be appreciated....<br>

<br>

Best,<br>

Damian<br>

<br>

===8<===========End of original message text===========<br>

<br>

<br>

<br>

------------------------------<wbr>-------------------------<br>

Damian Kaliszan<br>

<br>

Poznan Supercomputing and Networking Center<br>

HPC and Data Centres Technologies<br>

ul. Jana Pawła II 10<br>

61-139 Poznan<br>

POLAND<br>

<br>

phone (+48 61) 858 5109<br>

e-mail <a href="mailto:damian@man.poznan.pl">damian@man.poznan.pl</a><br>

www - <a href="http://www.man.poznan.pl/" rel="noreferrer" target="_blank">http://www.man.poznan.pl/</a><br>

------------------------------<wbr>-------------------------<br>

<br>

<br>


<br>

<br>

<br>

--<br>

What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>

-- Norbert Wiener<br>

<br>

<a href="http://www.caam.rice.edu/~mk51/" rel="noreferrer" target="_blank">http://www.caam.rice.edu/~<wbr>mk51/</a><br>

<br>

<br>

<br>


<br>

<br>

<br>

<br>


<br>

<br>

<br>

</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div>What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.<br>-- Norbert Wiener</div><div><br></div><div><a href="http://www.caam.rice.edu/~mk51/" target="_blank">http://www.caam.rice.edu/~mk51/</a><br></div></div></div>

</div></div>