FW: [MPICH] mpdcheck failure
Cavey, Lester
Lester.Cavey at ATK.COM
Fri Jun 8 12:57:04 CDT 2007
We got mpdcheck to work by setting our firewall to 'Disabled'. We
appreciate your help in resolving this problem.
Thanks,
Lester
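
A possible explanation for why the firewall mattered even though ping and ssh worked: a host firewall can allow ICMP echo and port 22 yet reject connections to the arbitrary port that 'mpdcheck -s' picks. On Linux, a rejection sent as an ICMP "host prohibited" message is reported to the client as errno 113 (EHOSTUNREACH, 'No route to host'). That would match the traceback in this thread, although the thread does not show the actual firewall rules. The following is a minimal Python sketch that classifies a single TCP connect attempt; the probe function and its labels are illustrative names, not part of MPICH.

```python
import errno
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect and return a short label for the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all: packets are probably being dropped silently.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        # errno 113 on Linux: no route, or a firewall reject via ICMP.
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return "error: %s" % exc
```

Run from the client machine against the host and port printed by 'mpdcheck -s', a result of "unreachable" alongside a working ping would point toward packet filtering rather than routing or DNS.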
________________________________
From: owner-mpich-discuss at mcs.anl.gov
[mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Cavey, Lester
Sent: Friday, June 08, 2007 8:06 AM
To: mpich-discuss at mcs.anl.gov
Subject: [MPICH] mpdcheck failure
We are still getting the same 'No route to host.' error (see below).
Our IT guy and I listed our machines' names (in lowercase, if that
matters) with the DNS server. /etc/hosts contains each machine's IP
address, full name, and alias. /etc/mpd.hosts contains each machine's
full name (in a separate test, we just listed the aliases, but it made
no difference in the results). DHCP automatically detects our DNS
server (we can access the internet from the machines and browse with
Firefox). We can ssh between machines okay. Any ideas on why we are
still getting the error message?
Thanks,
Lester
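
One way to rule name resolution in or out is to check that every name in the hosts file resolves on every machine. The sketch below is a minimal Python example, assuming the file lists one host per line, optionally followed by a ":ncpus" suffix or other fields; parse_hosts and resolve are hypothetical helper names, not MPICH tools.

```python
import socket


def parse_hosts(text):
    """Return host names from mpd.hosts-style text (assumed one per line)."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            # Keep the first field and strip any ":ncpus" suffix.
            hosts.append(line.split()[0].split(":")[0])
    return hosts


def resolve(host):
    """Return the sorted IPv4 addresses for a name, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(host, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})
```

Calling resolve on each name returned by parse_hosts(open("/etc/mpd.hosts").read()) on both machines shows whether each name maps to the expected address. If every name resolves correctly and the error persists, the cause is more likely elsewhere in the network configuration.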
Thanks. I've asked our IT guy to check on getting our machines' names
listed with our DNS server, in the hope that this will get us closer to
overcoming the error.
Thanks,
Lester
-----Original Message-----
From: Ralph Butler [mailto:rbutler at mtsu.edu]
Sent: Friday, June 01, 2007 1:54 PM
To: Cavey, Lester
Cc: Rajeev Thakur
Subject: Fwd: [MPICH] Failed to ping mpd on client
This part:
socket.error: (113, 'No route to host') is telling you that
VA90DTEST01 cannot obtain info about VA90DTEST02 where the server is
running.
mpdcheck is NOT an MPI program. Run with the -s and -c options, it
acts as an ordinary program that is a server on one host and a client
on the other. But the connect system call is failing because there is
no route info. This means that there are host and/or network config
problems. These problems are not MPI (or mpd) specific. They could
affect any client/server programs.
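
To illustrate the point that this is plain client/server networking, here is a minimal Python sketch of the same kind of test: a server that listens on an OS-chosen port and a client that connects to it. This is not mpdcheck's actual code; start_server, serve_once, and run_client are illustrative names.

```python
import socket
import threading


def serve_once(listener):
    """Accept a single connection, send a greeting, and close it."""
    conn, _addr = listener.accept()
    with conn:
        conn.sendall(b"hello from server\n")


def start_server(bind_addr=""):
    """Listen on a free port chosen by the OS and serve one client."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((bind_addr, 0))  # "" is INADDR_ANY; port 0 means any free port
    listener.listen(1)
    thread = threading.Thread(target=serve_once, args=(listener,), daemon=True)
    thread.start()
    return listener, listener.getsockname()[1]


def run_client(host, port, timeout=5.0):
    """Connect to the server and return the one line it sends."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return sock.makefile("rb").readline()
```

If run_client raises OSError with errno 113 when pointed at the other machine, the failure has been reproduced without any MPI or mpd code involved, which places the problem in the host or network configuration.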
Begin forwarded message:
> From: "Rajeev Thakur" <thakur at mcs.anl.gov>
> Date: June 1, 2007 12:42:55 PM CDT
> To: "'Ralph Butler'" <rbutler at mtsu.edu>
> Subject: FW: [MPICH] Failed to ping mpd on client
>
>
>
> From: Cavey, Lester [mailto:Lester.Cavey at ATK.COM]
> Sent: Friday, June 01, 2007 11:50 AM
> To: Rajeev Thakur
> Subject: RE: [MPICH] Failed to ping mpd on client
>
> The Installation Guide Troubleshooting works okay until these steps
> are reached:
>
> TERMINAL ONE:
>
> [VA90DLINUX02 ~] mpdcheck -s
> server listening at INADDR_ANY on: VA90DLINUX02 54688
>
> TERMINAL TWO:
>
> [root@VA90DTEST01 ~]# mpdcheck -c VA90DLINUX02 54688
> Traceback (most recent call last):
>   File "/usr/local/mpich2/bin/mpdcheck", line 103, in ?
>     sock.connect((argv[argidx+1],int(argv[argidx+2]))) # note double parens
>   File "<string>", line 1, in connect
> socket.error: (113, 'No route to host')
>
> Thanks,
>
> Lester
>
>
>
> From: Rajeev Thakur [mailto:thakur at mcs.anl.gov]
> Sent: Friday, June 01, 2007 10:39 AM
> To: Cavey, Lester; mpich-discuss at mcs.anl.gov
> Subject: RE: [MPICH] Failed to ping mpd on client
>
>
>
> Probably something is not right on the networking setup on the two
> machines. To debug the problem, you can use the mpdcheck utility as
> described in the installation guide.
>
>
>
> Rajeev
>
>
>
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of Cavey, Lester
> Sent: Friday, June 01, 2007 8:39 AM
> To: mpich-discuss at mcs.anl.gov
> Subject: [MPICH] Failed to ping mpd on client
>
> I can run mpd on machine1 alone, and on machine2 alone, but I get
> an error when I have mpd running on machine2 and I enter the
> command 'mpdboot -n 2 -f /etc/mpd.hosts' on machine1. The error
> message is 'Failed to ping mpd on machine2.' I can ping machine1
> from machine2, and I can ping machine2 from machine1. I have the
> machines listed in /etc/hosts (which also lists their IP addresses
> and aliases) and in /etc/mpd.hosts. I am using Red Hat Enterprise
> 5 on each of the i386 machines. Any ideas on what is causing the
> error?
>
>
>
> Thanks,
>
> Lester