[petsc-users] possible performance issues with PETSc on Cray

Jed Brown jed at jedbrown.org
Fri Apr 11 07:41:48 CDT 2014


Samar Khatiwala <spk at ldeo.columbia.edu> writes:

> Hi Jed,
>
> Thanks for the quick reply. This is very helpful. You may well be right that my matrices are not large enough 
> (~2.5e6 x 2.5e6 and I'm running on 360 cores = 15 nodes x 24 cores/node on this XC-30) and my runs are
> therefore sensitive to network latency. Would this, though, impact other people running jobs on nearby nodes? 
> (I suppose it would if I'm passing too many messages because of the small size of the matrices.)
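
For a rough sense of scale, the figures quoted above can be turned into a per-rank problem size. This is only a back-of-the-envelope sketch: it assumes one MPI rank per core and an even row distribution, neither of which is stated in the message, and the variable names are illustrative.

```python
# Back-of-the-envelope local problem size for the run described above.
n_rows = 2_500_000               # ~2.5e6 x 2.5e6 matrix (from the message)
nodes = 15                       # from the message
cores_per_node = 24              # from the message
ranks = nodes * cores_per_node   # ASSUMPTION: one MPI rank per core

# ASSUMPTION: rows are distributed evenly across ranks.
rows_per_rank = n_rows / ranks
print(f"{ranks} ranks, about {rows_per_rank:.0f} rows per rank")
```

With only a few thousand rows per rank, each process has relatively little local work between communication steps, which would be consistent with the runs being sensitive to network latency.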

It depends on your partition.  The Aries network on XC-30 is a
high-radix low-diameter network.  There should be many routes between
nodes, but the routing algorithm likely does not know which wires to
avoid.  This leads to performance variation, though I think it should
tend to be less extreme than when you obtain disconnected partitions on
Gemini.

The gold standard of reproducible performance is Blue Gene, where the
network is reconfigured to give you an isolated 5D torus.  A Blue Gene
may or may not be available or cost effective (reproducible performance
does not imply high performance/efficiency for a given workload).