[mpich-discuss] mpdboot error

Cesar Covarrubias cesar at uci.edu
Wed Nov 5 18:31:34 CST 2008


I am seeing something interesting. Below is the output from
mpdcheck -s:

-bash-3.2# mpdcheck -s
server listening at INADDR_ANY on: bduc-sched.nacs.uci.edu 55616
server has conn on <socket._socketobject object at 0xb7e962fc> from ('128.200.15.20', 41804)
server successfully recvd msg from client: hello_from_client_to_server

The nodes, however, are all on a private subnet, connected to a second
network interface on the head node. Each node is in the 192.168.0.0
range. Shouldn't an address in that range be the one reported, instead
of the public IP of the head node?
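
One way to narrow this down is to check what a host name resolves to and
which local address the kernel would use to reach a given peer. The
following is only a minimal Python sketch, not part of mpdcheck or any
other MPICH2 tool. The helper names are made up for illustration, and
the /16 prefix is an assumption, since all that is known here is "the
192.168.0.0 range".

```python
import ipaddress
import socket

# Assumption: the private cluster network is 192.168.0.0/16. Only
# "the 192.168.0.0 range" is known, so adjust the prefix as needed.
PRIVATE_NET = ipaddress.ip_network("192.168.0.0/16")


def on_private_net(ip, net=PRIVATE_NET):
    """Return True if the dotted-quad string ip lies inside net."""
    return ipaddress.ip_address(ip) in net


def resolved_addresses(hostname):
    """Return the IPv4 addresses the local resolver gives for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def local_address_toward(peer_ip, port=9):
    """Return the local IP the kernel would use as source to reach peer_ip.

    Connecting a UDP socket sends no packets; it only makes the kernel
    pick a route, and with it a source address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((peer_ip, port))
        return s.getsockname()[0]
```

For example, on_private_net('128.200.15.20') is False for the address
in the output above. Running resolved_addresses() on the head node's
name from a compute node, and local_address_toward() with a compute
node's address from the head node, should show whether names and routes
point at the private interface or the public one.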

Thanks,
Cesar

On Wed, 2008-11-05 at 18:04 -0600, Rajeev Thakur wrote:
> It is usually because of a problem with the networking configuration on the
> machines. To debug the problem, you can use the mpdcheck utility as
> described in Appendix A.2 of the MPICH2 installation guide.
> 
> Rajeev 
> 
> > -----Original Message-----
> > From: mpich-discuss-bounces at mcs.anl.gov 
> > [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Cesar 
> > Covarrubias
> > Sent: Wednesday, November 05, 2008 5:54 PM
> > To: mpich-discuss at mcs.anl.gov
> > Subject: [mpich-discuss] mpdboot error
> > 
> > Hello,
> > 
> > We are trying to launch a new cluster that uses mpich2, but I'm having
> > trouble when I run mpdboot. I get different errors every time I try to
> > run "mpdboot -n 5 -f mpd.hosts". Here is a sample of the errors:
> > 
> > mpdboot_bduc-sched.nacs.uci.edu (handle_mpd_output 392): failed to
> > handshake with mpd on bduc-i32-3; recvd output={}
> > 
> > mpdboot_bduc-sched.nacs.uci.edu (handle_mpd_output 401): failed to
> > connect to mpd on bduc-i32-2
> > 
> > mpdboot_bduc-sched.nacs.uci.edu (handle_mpd_output 401): failed to
> > connect to mpd on bduc-i32-1
> > 
> > Any thoughts?
> > 
> > Very Respectfully,
> > Cesar Covarrubias
> > UC Irvine
> > 
> > 
> 



