[mpich-discuss] mpdboot and hostsfile
Kenin Coloma
keninc at gmail.com
Tue Dec 1 17:33:58 CST 2009
With mpich2-1.2.1 (upgraded from mpich2-1.1.1), mpdboot stopped working
for a fairly simple host file.
(on compute06)
mpdboot --totalnum=6 --ncpus=0
host file:
compute07
compute08
compute09
compute10
compute11
mpdboot hangs after trying to launch mpd on compute10:
[kcoloma@compute06 ~]$ /rd_personalization08/kcoloma/mpich_install/bin/mpdboot --totalnum=6 --ncpus=0 --file=/home/kcoloma/mpiHosts.txt --mpd=/rd_personalization08/kcoloma/mpich_install/bin/mpd --verbose
running mpdallexit on compute06
LAUNCHED mpd on compute06 via
RUNNING: mpd on compute06
LAUNCHED mpd on compute07 via compute06
LAUNCHED mpd on compute08 via compute06
LAUNCHED mpd on compute09 via compute06
LAUNCHED mpd on compute10 via compute06
Traceback (most recent call last):
  File "/rd_personalization08/kcoloma/mpich_install/bin/mpdboot", line 476, in ?
    mpdboot()
  File "/rd_personalization08/kcoloma/mpich_install/bin/mpdboot", line 347, in mpdboot
    handle_mpd_output(fd,fd2idx,hostsAndInfo)
  File "/rd_personalization08/kcoloma/mpich_install/bin/mpdboot", line 385, in handle_mpd_output
    for line in fd.readlines():    # handle output from shells that echo stuff
KeyboardInterrupt
It will hang as long as --totalnum > 1.
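
For context, the traceback stops inside fd.readlines(), which reads until
end-of-file. The following is only a minimal sketch of that general behaviour,
written for current Python 3 and independent of the mpd code. The function
name pipe_read_demo and the "child_ready" line are made up for illustration,
and I am not claiming this is the actual cause of the hang.

```python
import os
import select


def pipe_read_demo():
    """Show readline() vs readlines() on a pipe whose write end is still open."""
    read_fd, write_fd = os.pipe()
    # Hypothetical line standing in for whatever a launched child prints.
    os.write(write_fd, b"child_ready\n")
    reader = os.fdopen(read_fd, "r")

    # select() reports the pipe as readable once at least one byte is waiting.
    ready, _, _ = select.select([reader], [], [], 1.0)

    # readline() returns as soon as one complete line is available.
    first = reader.readline()

    # Calling reader.readlines() at this point would block: it reads until
    # end-of-file, and EOF only arrives after every write end is closed.
    os.close(write_fd)
    rest = reader.readlines()  # returns immediately now that EOF is reached
    reader.close()
    return bool(ready), first, rest


if __name__ == "__main__":
    print(pipe_read_demo())
```

If a launched process prints its first line but keeps its output open, a
reader using readlines() waits indefinitely, which would look like the hang
above.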
The mpdboot.py scripts are the same between the two versions of mpich, but the
mpd.py scripts changed to address ticket #905. I've found that rolling back
to the mpich2-1.1.1p1 mpd.py fixes the mpdboot issue I'm having.