[mpich-discuss] Trouble with new installation -- failed to connect to mpd
Dave Goodell
goodell at mcs.anl.gov
Mon Dec 1 07:14:03 CST 2008
Hi Ben,
Please try the MPD troubleshooting steps listed in Appendix A of the
install guide:
http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.0.8-installguide.pdf
In particular, the mpdcheck utility should give you a better clue
about where the problem is.
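
If you want a quick check that is independent of MPD, the sketch below
tests whether one node can open a plain TCP connection to another and get
a reply back. It is only an illustration in the same spirit as a
server/client connectivity test, not a replacement for mpdcheck. It
assumes Python 3 on both nodes, and the function names (serve_once, probe)
are hypothetical helpers invented for this example, not part of MPICH2.

```python
import socket
import threading


def serve_once(host="", port=0):
    """Listen on (host, port), answer one connection, then stop.

    port=0 lets the OS pick a free port; the chosen port is returned
    so it can be passed to probe() on the other node.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen(1)
    actual_port = srv.getsockname()[1]

    def _handle():
        # Accept a single client, acknowledge its message, and shut down.
        conn, _addr = srv.accept()
        with conn:
            data = conn.recv(1024)
            conn.sendall(b"ack:" + data)
        srv.close()

    worker = threading.Thread(target=_handle, daemon=True)
    worker.start()
    return actual_port, worker


def probe(host, port, timeout=5.0):
    """Return True if host:port accepts a connection and acknowledges us."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            s.sendall(b"hello")
            return s.recv(1024) == b"ack:hello"
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, call serve_once() in a Python session on nodeF and note the
port it returns, then call probe("nodeF", port) on nodeE, and repeat with
the roles swapped. If probe() fails in either direction, that would point
toward a network, firewall, or hostname-resolution issue rather than
toward MPICH itself.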
-Dave
On Dec 1, 2008, at 4:11 AM, Benjamin Svetitsky wrote:
> Dear MPI community,
>
> I already have MPICH 1.0.8 running well on a cluster of four Linux
> quad cores. But now I can't get it running on a new cluster. I
> think I installed everything exactly like the first system. But
> when I try to mpdboot as root I get a minimal error message:
>
> [root@nodeE ~]# mpdboot -n 4 -f /root/mpd.hosts
> mpdboot_nodeE (handle_mpd_output 401): failed to connect to mpd on nodeF
>
> The /root/mpd.hosts contains:
> nodeE
> nodeF
> nodeG
> nodeH
>
> Oddly enough, after the failure of mpdboot as above I find:
> [root@nodeE ~]# mpdtrace
> nodeE
> nodeF
>
> If I do mpdallexit and log into nodeF, the result is:
> [root@nodeF ~]# mpdboot -n 4 -f /root/mpd.hosts
> mpdboot_nodeF (handle_mpd_output 392): failed to handshake with mpd on nodeE; recvd output={}
>
> Do I have a network problem or is it an MPICH problem?
>
> Thanks,
> Ben
>
> --
> Prof. Benjamin Svetitsky Phone: +972-3-640 8870
> School of Physics and Astronomy Fax: +972-3-640 7932
> Tel Aviv University E-mail: bqs at julian.tau.ac.il
> 69978 Tel Aviv, Israel WWW: http://julian.tau.ac.il/~bqs