[mpich-discuss] fail to run hello world program with MPICH2-1.3a2 on multiple nodes

Manhui Wang wangm9 at cardiff.ac.uk
Fri Jul 9 06:26:10 CDT 2010


Hello,

I have a problem running MPI jobs across multiple nodes with the newly
released MPICH2-1.3a2, in which hydra is the default process manager.

I tested just the simplest hello world program. It works fine on any
single node, but fails when run across multiple nodes.

node6-b:~/testprogram> cat hosts
node6-b
node6-b
node7-b
node7-b

node6-b:~/testprogram> mpiexec -f hosts -n 4 ./hello
node6-b: hello world,length=7,my rank=0
node6-b: hello world,length=7,my rank=1
node7-b: hello world,length=7,my rank=3
node7-b: hello world,length=7,my rank=2
Fatal error in PMPI_Barrier: Other MPI error, error stack:
PMPI_Barrier(476).................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier(82)..................:
MPIC_Sendrecv(161)................:
MPIC_Wait(519)....................:
MPIDI_CH3I_Progress(165)..........:
MPID_nem_mpich2_blocking_recv(880):
MPID_nem_tcp_connpoll(1714).......: Communication error
Fatal error in PMPI_Barrier: Other MPI error, error stack:
PMPI_Barrier(476).................: MPI_Barrier(MPI_COMM_WORLD) failed
MPIR_Barrier(82)..................:
MPIC_Sendrecv(161)................:
MPIC_Wait(519)....................:
MPIDI_CH3I_Progress(165)..........:
MPID_nem_mpich2_blocking_recv(895):
MPID_nem_tcp_connpoll(1714).......: Communication error
APPLICATION TERMINATED WITH THE EXIT STRING: Terminated (signal 15)
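
For reference, the hostnames printed before the failure are consistent with
ranks being placed on the host-file lines in order, so the launch itself
appears to have worked and the error arises only at the barrier. A minimal
sketch of that mapping follows. It assumes one process per listed line, in
file order, wrapping around if -n exceeds the number of lines. The names
host_for_rank and print_placement are hypothetical helpers for illustration,
not part of MPICH2 or hydra.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of MPICH2 or hydra).
 * Returns the host a given rank would be placed on, ASSUMING one process
 * per host-file line, in file order, wrapping around when there are more
 * ranks than lines. Returns NULL if the host list is empty. */
const char *host_for_rank(const char *const hosts[], size_t nhosts, size_t rank)
{
    if (nhosts == 0) {
        return NULL;
    }
    return hosts[rank % nhosts];
}

/* Prints the assumed placement for ranks 0 .. nprocs-1, one per line. */
void print_placement(const char *const hosts[], size_t nhosts, size_t nprocs)
{
    for (size_t rank = 0; rank < nprocs; rank++) {
        const char *host = host_for_rank(hosts, nhosts, rank);
        printf("rank %zu -> %s\n", rank, host ? host : "(no hosts)");
    }
}
```

Called with the four lines of the hosts file above and four processes, this
places ranks 0 and 1 on node6-b and ranks 2 and 3 on node7-b, which matches
the hostnames in the hello world output.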


I built the MPICH2-1.3a2 library with the Intel 11.1/069 compilers on a
64-bit AMD machine:

nice -n +18 ./configure --with-device=ch3:nemesis \
  --prefix=/mympich2-install FC=ifort --enable-f90 F90=ifort --enable-f77 \
  F77=ifort --enable-cc CC=icc --enable-cxx CXX=icc 2>&1 | tee configure.log

nice -n +18 make 2>&1 | tee make.log

nice -n +18 make install 2>&1 | tee install.log


Could you please point out what the problem is? I have attached the
source code.

Thanks
Manhui
-- 
-----------
Manhui  Wang
School of Chemistry, Cardiff University,
Main Building, Park Place,
Cardiff CF10 3AT, UK
Telephone: +44 (0)29208 76637
-------------- next part --------------
A non-text attachment was scrubbed...
Name: hello_world.c
Type: text/x-c++src
Size: 642 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20100709/176da6d4/attachment.cc>

