PETSc runs slower on a shared memory machine than on a cluster

Fri Feb 2 15:47:56 CST 2007

On 2/2/07, Shi Jin <jinzishuai at yahoo.com> wrote:
> I found out that on a shared-memory machine (60GB RAM,
> 16 CPUs), the code runs around 4 times slower than
> on a distributed-memory cluster (4GB RAM, 4 CPUs/node),
> although they yield identical results.

> However, I read the PETSc FAQ and found that "the
> speed of sparse matrix computations is almost totally
> determined by the speed of the memory, not the speed
> of the CPU".

> This makes me wonder whether the poor performance of
> my code on a shared-memory machine is due to the
> competition among different processes for the same memory
> bus. Since the code is still MPI based, a lot of data
> are moving around inside the memory. Is this a
> reasonable explanation of what I observed?

One point is not clear to me.

When you run on your shared-memory machine...

- Are you running your code as a 'sequential' program with a global, shared
memory space?

- Or are you running it through MPI, as a distributed-memory
application using MPI message passing (where shared memory is the
underlying communication 'channel')?
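
If it is the second case, one rough way to probe the memory-bus
explanation is to measure how much copy bandwidth each process gets
as more processes run at the same time. The following is only a
minimal sketch, not a PETSc or MPI tool: it uses plain Python
processes instead of MPI ranks, the function names and buffer sizes
are illustrative assumptions, and the worker processes are only
roughly concurrent. If the per-process figure drops sharply as the
process count grows, that would be consistent with contention for
memory bandwidth.

```python
import multiprocessing as mp
import time


def copy_bandwidth(nbytes=64 * 1024 * 1024, repeats=5):
    """Return the best observed copy bandwidth (bytes/second) in this process."""
    src = bytearray(nbytes)
    dst = bytearray(nbytes)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst[:] = src  # same-size slice assignment: one bulk memory copy
        best = min(best, time.perf_counter() - t0)
    # Each copy reads nbytes and writes nbytes.
    return 2 * nbytes / best


def _worker(args):
    # Helper so Pool.map can pass both parameters as one tuple.
    return copy_bandwidth(*args)


def per_process_bandwidth(nprocs, nbytes=64 * 1024 * 1024, repeats=5):
    """Run the copy test in nprocs worker processes; return one rate per task."""
    with mp.Pool(nprocs) as pool:
        return pool.map(_worker, [(nbytes, repeats)] * nprocs)


if __name__ == "__main__":
    # Process counts are arbitrary; extend up to the number of CPUs.
    for n in (1, 2, 4, 8):
        rates = per_process_bandwidth(n)
        print(f"{n:2d} processes: slowest gets {min(rates) / 1e9:.2f} GB/s")
```

The numbers are only meaningful relative to each other on the same
machine; a compiled benchmark such as STREAM would give more reliable
absolute figures.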

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594