[mpich-discuss] halt after mpiexec
Gao, Yi
gaoyi.cn at gmail.com
Thu Jan 14 00:00:06 CST 2010
Dear all,
I'm new here and have run into a problem at the very beginning of learning MPI.
Basically,
mpiexec -n i /bin/hostname
works for every i >= 1 I've tested,
but
mpiexec -n i /path-to-example-dir/cpi
fails for every i >= 2.
The details are:
I have 3 machines, all running Ubuntu 9.10 with gcc/g++ 4.4.1.
One has two cores, and the other two have one core each.
(machine names: rome, 2 cores;
julia, 1 core;
meg, 1 core)
On this minimal test bed for learning MPI, I built mpich2-1.2.1
using the default configure options from the installation guide.
Then on "rome", I put the mpd.hosts file in my home directory with the content:
julia
meg
Then I ran
mpdboot -n 3 # works
mpdtrace -l # works, shows the three machine names and port numbers
mpiexec -l -n 3 /bin/hostname # works! shows the three machine names
but
mpiexec -l -n 3 /tmp/gth818n/mpich2-1.2.1/example/cpi # !!!!!!!! it halted there.
Then I tried:
mpiexec -l -n 1 /tmp/gth818n/mpich2-1.2.1/example/cpi # works, runs on rome only and returns the result
But any -n greater than or equal to 2 causes it to halt, or produces
errors such as these (with -n 4):
Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(394).................: Initialization failed
MPID_Init(135)........................: channel initialization failed
MPIDI_CH3_Init(43)....................:
MPID_nem_init(202)....................:
MPIDI_CH3I_Seg_commit(366)............:
MPIU_SHMW_Hnd_deserialize(358)........:
MPIU_SHMW_Seg_open(897)...............:
MPIU_SHMW_Seg_create_attach_templ(671): open failed - No such file or directory
rank 3 in job 12 rome_39209 caused collective abort of all ranks
exit status of rank 3: return code 1
Then I rebuilt mpich2 on rome (because it's SMP) with --with-device=ch3:ssm,
but got the same error.
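One check that might help narrow this down, offered as a sketch and not as a confirmed diagnosis: /bin/hostname never calls MPI_Init, so it only shows that the mpd ring can launch processes, while cpi needs the ranks to actually connect to each other. On Ubuntu, /etc/hosts often maps a machine's own name to 127.0.1.1, and that is a commonly reported source of trouble for MPI setups. The Python script below assumes the three host names from this post and flags any name that resolves to a loopback address; the helper names `is_loopback` and `check_host` are made up for illustration.

```python
import ipaddress
import socket

# Host names taken from the post; adjust for your own cluster.
HOSTS = ["rome", "julia", "meg"]


def is_loopback(address):
    """Return True if the IP address string is in a loopback range (e.g. 127.0.1.1)."""
    return ipaddress.ip_address(address).is_loopback


def check_host(name, resolve=socket.gethostbyname):
    """Resolve a host name and return (name, address, status).

    `resolve` is injectable so the function can be exercised without a network.
    """
    try:
        address = resolve(name)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return (name, None, "unresolved: %s" % exc)
    status = "loopback" if is_loopback(address) else "ok"
    return (name, address, status)


if __name__ == "__main__":
    # Run this on each machine in turn; every name should come back "ok".
    for host in HOSTS:
        print(check_host(host))
```

If any machine reports "loopback" for its own name or for another host's name, the /etc/hosts entries on that machine would be the first thing to look at. If every name resolves to a routable address on all three machines, this particular cause can probably be ruled out.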
Could anyone give me some pointers on where to go from here?
Thanks in advance!
Best,
yi