<div dir='auto'>Run two resource sets on one side versus separate nodes.</div><div class="gmail_extra"><br><div class="gmail_quote">On Sep 22, 2019 08:46, "Smith, Barry F." <bsmith@mcs.anl.gov> wrote:<br type="attribution" /><blockquote class="quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><p dir="ltr"><br>

   I'm guessing it would be very difficult to connect this particular performance bug with a decrease in performance for an actual full application since models don't catch this level of detail well (and since you cannot run the application without the bug to see the better performance)?  IBM/Nvidia are not going to care about it if it is just an abstract oddity as opposed to clearly demonstrating a problem for the use of the machine, especially if the machine is an orphan.<br>

<br>

> On Sep 22, 2019, at 8:35 AM, Jed Brown via petsc-dev <petsc-dev@mcs.anl.gov> wrote:<br>

> <br>

> Karl Rupp <rupp@iue.tuwien.ac.at> writes:<br>

> <br>

>>> I wonder if the single-node latency bugs on AC922 are related to these<br>

>>> weird performance results.<br>

>>> <br>

>>> https://docs.google.com/spreadsheets/d/1amFJIbpvs9oJcUc-WntsFHO_C0LE7xFJeor-oElt0LY/edit#gid=0<br>

>>> <br>

>> <br>

>> Thanks for these numbers!<br>

>> Intra-Node > Inter-Node is indeed weird. I haven't observed such an <br>

>> inversion before.<br>

> <br>

> As far as I know, it's been there since the machines were deployed<br>

> despite obviously being a bug.  I know people at LLNL regard it as a<br>

> bug, but it has not been their top priority (presumably at least in part<br>

> because applications have not clearly expressed the impact of latency<br>

> regressions on their science).<br>

<br>

</p>

</blockquote></div><br></div>