[petsc-users] possible performance issues with PETSc on Cray

Samar Khatiwala spk at ldeo.columbia.edu
Fri Apr 11 10:39:40 CDT 2014


Thanks again Jed. This has definitely helped narrow down the possibilities.

Best,

Samar

On Apr 11, 2014, at 8:41 AM, Jed Brown <jed at jedbrown.org> wrote:

> Samar Khatiwala <spk at ldeo.columbia.edu> writes:
> 
>> Hi Jed,
>> 
>> Thanks for the quick reply. This is very helpful. You may well be right that my matrices are not large enough 
>> (~2.5e6 x 2.5e6 and I'm running on 360 cores = 15 nodes x 24 cores/node on this XC-30) and my runs are 
>> therefore sensitive to network latency. Would this, though, impact other people running jobs on nearby nodes? 
>> (I suppose it would if I'm passing too many messages because of the small size of the matrices.)
> 
> It depends on your partition.  The Aries network on XC-30 is a
> high-radix low-diameter network.  There should be many routes between
> nodes, but the routing algorithm likely does not know which wires to
> avoid.  This leads to performance variation, though I think it should
> tend to be less extreme than when you obtain disconnected partitions on
> Gemini.
> 
> The gold standard of reproducible performance is Blue Gene, where the
> network is reconfigured to give you an isolated 5D torus.  A Blue Gene
> may or may not be available or cost effective (reproducible performance
> does not imply high performance/efficiency for a given workload).
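A quick back-of-the-envelope check of the per-process problem size mentioned in the quoted exchange above, assuming an even row distribution across processes:

    2.5e6 rows / 360 processes = roughly 6,900 rows per process

That is below the 10,000-20,000 or so unknowns per process that the PETSc documentation has long suggested as a rough minimum for good strong scaling, so sensitivity to network latency is plausible. One way to check is PETSc's built-in profiling (-log_summary in the releases current at the time, -log_view in later ones), which reports per-event times, message counts, and max/min ratios for events such as MatMult and VecScatterEnd.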


