PETSc runs slower on a shared memory machine than on a cluster

Satish Balay balay at mcs.anl.gov
Fri Feb 2 16:01:49 CST 2007


On Fri, 2 Feb 2007, Satish Balay wrote:

> However with the sequential numerical codes - it primarily depends
> upon the bandwidth between the CPU and the memory. On the SMP box -
> depending upon how the memory subsystem is designed - the effective
> memory bandwidth per cpu could be a small fraction of the peak memory
> bandwidth [when all cpus are used]

> > The shared-memory machine is actually a little faster
> > than the cluster machines in terms of single process
> > runs.

To understand this better - think of comparing the performance in the
following 2 cases:

- run the sequential code when no other job is on the machine.
- run the sequential code when there is another [memory intensive] job
  using the other 15 nodes

In a distributed cluster the performance numbers for both cases will
be the same, because each node has its own memory. On an SMP machine -
the performance of the first run will be much better than the second
one [because of the sharing of memory bandwidth with competing
processors].
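
As a rough illustration of the two cases above, here is a minimal
sketch of an equal-share bandwidth model. The function name
effective_bandwidth and all the GB/s figures are made up for
illustration only, not measurements of any real machine, and real
memory subsystems do not necessarily divide bandwidth this evenly.

```python
def effective_bandwidth(peak_total, per_cpu_limit, active_jobs):
    """Bandwidth one job sees when active_jobs memory-bound jobs
    share peak_total equally (a deliberately simple model)."""
    if active_jobs < 1:
        raise ValueError("active_jobs must be at least 1")
    return min(per_cpu_limit, peak_total / active_jobs)


# Hypothetical numbers in GB/s, chosen only to show the effect.
SMP_PEAK = 16.0   # one memory subsystem shared by all 16 CPUs
CPU_LIMIT = 4.0   # the most a single CPU can draw by itself
NODE_PEAK = 3.0   # each cluster node has its own, private memory

# Case 1: the sequential code runs alone on the machine.
smp_alone = effective_bandwidth(SMP_PEAK, CPU_LIMIT, 1)
cluster_alone = effective_bandwidth(NODE_PEAK, NODE_PEAK, 1)

# Case 2: a memory-intensive job keeps the other 15 CPUs/nodes busy.
# On the SMP box all 16 processes compete for the same memory.
smp_busy = effective_bandwidth(SMP_PEAK, CPU_LIMIT, 16)
# On the cluster the other job uses other nodes' memory, so this
# node still serves a single job.
cluster_busy = effective_bandwidth(NODE_PEAK, NODE_PEAK, 1)

print(f"SMP:     alone {smp_alone:.1f} GB/s, busy {smp_busy:.1f} GB/s")
print(f"cluster: alone {cluster_alone:.1f} GB/s, busy {cluster_busy:.1f} GB/s")
```

With these assumed numbers the SMP box is a little faster than a
cluster node for a single process, yet each process gets only a
fraction of that bandwidth once all 16 CPUs are busy, while the
cluster node's figure does not change.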

Satish



