[MPICH] MPICH2_sshm: cpi hangs

Kirill Birukov birk at inbox.ru
Tue Apr 11 05:36:57 CDT 2006


Hello!

I am new to MPICH2 and am trying to get it to work. I have the following
problem:

The simplest example program, 'cpi', hangs when I try to run it on more than
one cluster node, after printing the following output:

# /common/mpich2-1.0.3/bin/mpiexec -l -np 2 examples/cpi

0: Process 0 of 2 is on NEXT
1: Process 1 of 2 is on NODE2

And nothing more...

The same command (even with a larger -np) runs fine if the program being run
is something like true or hostname.
A single process (-np 1) runs fine with both cpi and hostname.
Telling mpiexec not to run a process locally (the -1 option) does not change
the behaviour.
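
A possible first check, offered as an assumption rather than a diagnosis: programs that never communicate (true, hostname) run fine while cpi stalls, so the hang may lie in the first connection between the two processes. One frequent cause of that symptom is a node whose own hostname resolves to a loopback address, which other nodes cannot connect to. The Python sketch below only inspects name resolution on the machine it runs on; the helper names is_loopback and resolve_self are hypothetical, and it does not use or test MPICH2 itself.

```python
import ipaddress
import socket


def is_loopback(addr):
    """Return True if the dotted address (e.g. "127.0.0.1") is a loopback address."""
    return ipaddress.ip_address(addr).is_loopback


def resolve_self():
    """Return (hostname, list of IPv4 addresses the resolver reports for it)."""
    name = socket.gethostname()
    try:
        _, _, addrs = socket.gethostbyname_ex(name)
    except OSError:
        # Resolution failed entirely; report an empty address list.
        addrs = []
    return name, addrs


if __name__ == "__main__":
    name, addrs = resolve_self()
    if not addrs:
        print(f"{name}: hostname does not resolve")
    for addr in addrs:
        if is_loopback(addr):
            note = "loopback: other nodes cannot connect to this address"
        else:
            note = "not loopback"
        print(f"{name} -> {addr} ({note})")
```

Running this on each node (NEXT, NODE1, NODE2) would show whether any of them maps its own name to 127.x.x.x. A clean result does not rule out other causes, such as a firewall between the nodes.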

What could it be?

P.S.
My cluster consists of one central node (called NEXT) and two slave
nodes (called NODE1 and NODE2). All of them are 2-way Xeon machines
interconnected by Fast Ethernet. All run Red Hat Linux 7.3 with kernel
version 2.4.23.
MPICH2 version 1.0.3 was compiled --with-device=sshm to use shared memory
within SMP nodes and sockets as the interconnect between nodes.

Kirill Birukov

R&D Institute Kvant
Moscow, Russia




