[mpich-discuss] Trouble with new installation -- failed to connect to mpd

Dave Goodell goodell at mcs.anl.gov
Mon Dec 1 07:14:03 CST 2008


Hi Ben,

Please try the MPD troubleshooting steps listed in Appendix A of the
install guide:
http://www.mcs.anl.gov/research/projects/mpich2/documentation/files/mpich2-1.0.8-installguide.pdf

In particular, the mpdcheck utility should give you a better clue  
about where the problem is.
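
As a rough complement to mpdcheck (not a replacement for it, and not how
mpdcheck is implemented), the two things that most often go wrong here,
hostname resolution and plain TCP connectivity between nodes, can be
probed with a few lines of Python. This is only a sketch: the function
names resolve and can_connect are made up for illustration, and the port
you test against is whatever a listener on the other node reports.

```python
import socket
import sys


def resolve(host):
    """Return the IPv4 address the local resolver gives for host, or None."""
    try:
        return socket.gethostbyname(host)
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when lookup fails.
        return None


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Usage (hypothetical script name): python netprobe.py nodeE nodeF nodeG nodeH
    for name in sys.argv[1:]:
        addr = resolve(name)
        print(name, "->", addr if addr is not None else "does not resolve")
```

If a name from mpd.hosts resolves to a loopback address such as 127.0.0.1
on its own node, or resolves differently on different nodes, that is
worth fixing before looking further. can_connect is meant to be run from
one node against a port that something on the other node is known to be
listening on.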

-Dave

On Dec 1, 2008, at 4:11 AM, Benjamin Svetitsky wrote:

> Dear MPI community,
>
> I already have MPICH 1.0.8 running well on a cluster of four Linux  
> quad cores.  But now I can't get it running on a new cluster.  I  
> think I installed everything exactly like the first system.  But  
> when I try to mpdboot as root I get a minimal error message:
>
> [root@nodeE ~]# mpdboot -n 4 -f /root/mpd.hosts
> mpdboot_nodeE (handle_mpd_output 401): failed to connect to mpd on  
> nodeF
>
> The /root/mpd.hosts contains:
> nodeE
> nodeF
> nodeG
> nodeH
>
> Oddly enough, after the failure of mpdboot as above I find:
> [root@nodeE ~]# mpdtrace
> nodeE
> nodeF
>
> If I do mpdallexit and log into nodeF, the result is:
> [root@nodeF ~]# mpdboot -n 4 -f /root/mpd.hosts
> mpdboot_nodeF (handle_mpd_output 392): failed to handshake with mpd  
> on nodeE; recvd output={}
>
> Do I have a network problem or is it an MPICH problem?
>
> Thanks,
> 	Ben
>
> -- 
> Prof. Benjamin Svetitsky         Phone:            +972-3-640 8870
> School of Physics and Astronomy  Fax:              +972-3-640 7932
> Tel Aviv University              E-mail:      bqs at julian.tau.ac.il
> 69978 Tel Aviv, Israel           WWW: http://julian.tau.ac.il/~bqs



