[MPICH] mpdboot problem

Tom Crick tc at cs.bath.ac.uk
Sun May 21 13:46:28 CDT 2006


Apologies, it seems that YaST modifies /etc/hosts after you make
changes, and that is where the 127.0.0.2 address comes from.

mpdboot works fine now.
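
In case it helps anyone hitting the same error, here is a rough sketch
of a check that could be run on each node before mpdboot. It is only an
illustration, not part of MPICH2: the function names (is_loopback,
loopback_entries, report) are made up for this example, and it assumes
the problem is the one above, i.e. the node's own hostname being mapped
to a 127.x.x.x address in /etc/hosts.

```python
import ipaddress
import socket


def is_loopback(address):
    """Return True if address is a loopback address (127.0.0.0/8 or ::1)."""
    return ipaddress.ip_address(address).is_loopback


def loopback_entries(hosts_text, hostname):
    """Return the loopback addresses that hosts_text maps hostname to.

    hosts_text is the content of an /etc/hosts style file. A name
    matches either in full or by its first dot-separated label.
    """
    found = []
    for line in hosts_text.splitlines():
        # Drop comments, then split into "address name [alias ...]".
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        address, names = fields[0], fields[1:]
        short_names = [name.split(".")[0] for name in names]
        if hostname not in names and hostname not in short_names:
            continue
        try:
            if is_loopback(address):
                found.append(address)
        except ValueError:
            # Not a parseable IP address; ignore the line.
            pass
    return found


def report(hosts_path="/etc/hosts"):
    """Print what the local hostname resolves to and flag loopback mappings."""
    name = socket.gethostname()
    resolved = socket.gethostbyname(name)
    print(name, "resolves to", resolved)
    if is_loopback(resolved):
        print("warning: hostname resolves to a loopback address")
    with open(hosts_path) as hosts_file:
        for address in loopback_entries(hosts_file.read(), name):
            print(hosts_path, "maps", name, "to", address)
```

Calling report() on a node prints what its hostname resolves to, plus
any loopback line for that hostname in /etc/hosts. If it shows a
127.x.x.x address, other nodes would be told to connect to their own
loopback interface, which would fit the "Connection refused" in
connect_rhs.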

Cheers,

Tom

On Sun, 2006-05-21 at 18:43 +0100, Tom Crick wrote:
> Hello,
> 
> I've been having an issue using mpdboot (MPICH2 1.0.2) on a Beowulf
> cluster of 20 nodes running SuSE 9.2. After following the mpd
> troubleshooting guide in the MPICH2 install doc, I still cannot find
> an obvious reason why I am unable to start mpds on the 20 nodes using
> mpdboot.
> 
> mpdboot gives a message like:
> 
> mpdboot_grendel13_11 (err_exit 415): mpd failed to start correctly on grendel13
>   reason: 11: unable to ping local mpd;
>   invalid msg from mpd :{}:
>   ** mpd may have disappeared, perhaps due to mismatched secretwords
>   ** see msgs logged in syslog and /tmp/mpd2.logfile* on grendel13
>   last printed output from mpd before becoming a daemon: 32838
> 
> mpdboot_grendel13_11 (err_exit 421): contents of mpd logfile in /tmp:
> logfile for mpd with pid 3828
>   grendel13_32838: conn error in connect_rhs: Connection refused
>   grendel13_32838 (connect_rhs 602): failed to connect to rhs at 127.0.0.2 32849
>   grendel13_32838 (enter_ring 513): rhs connect failed
>   grendel13_32838 (run 215): failed to enter ring
> 
> 
> Even if you start an mpd manually on the head node and then on each
> worker node, e.g. "mpd -h <host> -p <port> &", it fails as above. Is
> it something to do with the "failed to connect to rhs at 127.0.0.2
> 32849" message?
> 
> It is possible to ssh from every machine to every other, and running
> "mpdcheck -v -f /etc/mpd.hosts -ssh" from the head node gives no
> errors or problems. Checking the log files on the failing machines
> gives no more information than above, and the secretwords on all
> machines are the same.
> 
> Any ideas for the next step of debugging? Should mpd be run as root?
> 
> Thanks and regards,
> 
> Tom
> 
> 
> 



