[petsc-dev] MatMult on Summit

Smith, Barry F. bsmith at mcs.anl.gov
Sun Sep 22 10:10:29 CDT 2019



> On Sep 22, 2019, at 9:56 AM, Jed Brown <jed at jedbrown.org> wrote:
> 
> Run two resource sets on one side versus separate nodes.

  I don't know what this is supposed to mean. Is it a toy situation where you show the problem is measurable, or a real application run properly at scale where you show the problem has an effect? Facilities care about real applications at scale losing performance, but toys don't mean that much unless you can convince them that it actually affects the real application at scale as well.

   This discussion is probably not important, so we should drop it.


> 
> On Sep 22, 2019 08:46, "Smith, Barry F." <bsmith at mcs.anl.gov> wrote:
> 
>    I'm guessing it would be very difficult to connect this particular performance bug with a decrease in performance for an actual full application, since models don't catch this level of detail well (and since you cannot run the application without the bug to see the better performance)?  IBM/Nvidia are not going to care about it if it is just an abstract oddity as opposed to clearly demonstrating a problem for the use of the machine, especially if the machine is an orphan. 
> 
> > On Sep 22, 2019, at 8:35 AM, Jed Brown via petsc-dev <petsc-dev at mcs.anl.gov> wrote: 
> > 
> > Karl Rupp <rupp at iue.tuwien.ac.at> writes: 
> > 
> >>> I wonder if the single-node latency bugs on AC922 are related to these 
> >>> weird performance results. 
> >>> 
> >>> https://docs.google.com/spreadsheets/d/1amFJIbpvs9oJcUc-WntsFHO_C0LE7xFJeor-oElt0LY/edit#gid=0 
> >>> 
> >> 
> >> Thanks for these numbers! 
> >> Intra-Node > Inter-Node is indeed weird. I haven't observed such an 
> >> inversion before. 
> > 
> > As far as I know, it's been there since the machines were deployed 
> > despite obviously being a bug.  I know people at LLNL regard it as a 
> > bug, but it has not been their top priority (presumably at least in part 
> > because applications have not clearly expressed the impact of latency 
> > regressions on their science). 
> 
> 
> 


