<br><br><div><span class="gmail_quote">On 7/25/07, <b class="gmail_sendername">Reuti</b> <<a href="mailto:reuti@staff.uni-marburg.de">reuti@staff.uni-marburg.de</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Hi,<br><br>what about removing the <a href="http://127.0.0.2">127.0.0.2</a> entry from /etc/hosts and giving<br>server2 a sensible address therein. Maybe the started mpd on server4<br>tries to connect to the sender, i.e.
<a href="http://127.0.0.2">127.0.0.2</a> which will fail.</blockquote><div><br><br>I removed this line, but I don't see any change :S </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
-- Reuti<br><br>On 25.07.2007 at 02:12, Jorge Gonzalez wrote:<br><br>><br>><br>> On 7/24/07, Anthony Chan <<a href="mailto:chan@mcs.anl.gov">chan@mcs.anl.gov</a>> wrote:<br>><br>> Did you try using "mpdcheck" to check if other network settings are
<br>> OK as<br>> described in the MPICH2 user's guide?<br>><br>> Hi, thanks for the answer,<br>> and sorry for the long mail :P<br>><br>> I ran the check on the "server2" machine; the .mpd.hosts file
<br>> contains:<br>> server4<br>> server2<br>><br>> This is the output:<br>><br>> administrador@server2:~> mpdcheck<br>> *** first ipaddr for this host (via server2) is: <a href="http://127.0.0.2">
127.0.0.2</a><br>><br>> administrador@server2:~> mpdcheck -pc<br>> --- print results of: gethostbyname_ex(gethostname())<br>> ('server2', [], ['<a href="http://127.0.0.2">127.0.0.2</a>'])<br>
> --- try to run /bin/hostname<br>> server2<br>> --- try to run uname -a<br>> Linux server2 2.6.18.2-34-default #1 SMP Mon Nov 27 11:46:27 UTC<br>> 2006 x86_64 x86_64 x86_64 GNU/Linux<br>> --- try to print /etc/hosts
<br>> #<br>> # hosts This file describes a number of hostname-to-address<br>> # mappings for the TCP/IP subsystem. It is mostly<br>> # used at boot time, when no name servers are running.
<br>> # On small systems, this file can be used instead of a<br>> # "named" name server.<br>> # Syntax:<br>> #<br>> # IP-Address Full-Qualified-Hostname Short-Hostname
<br>> #<br>><br>> <a href="http://127.0.0.2">127.0.0.2</a> server2<br>> <a href="http://127.0.0.1">127.0.0.1</a> localhost<br>> XXX.XXX.123.25 server4<br>> XXX.XXX.122.50 server1<br>><br>
> # special IPv6 addresses<br>> ::1 localhost ipv6-localhost ipv6-loopback<br>><br>> fe00::0 ipv6-localnet<br>><br>> ff00::0 ipv6-mcastprefix<br>> ff02::1 ipv6-allnodes
<br>> ff02::2 ipv6-allrouters<br>> ff02::3 ipv6-allhosts<br>> <a href="http://127.0.0.2">127.0.0.2</a> server2 server2<br>><br>> --- try to print /etc/resolv.conf<br>> ### BEGIN INFO
<br>> #<br>> # Modified_by: dhcpcd<br>> # Backup: /etc/resolv.conf.saved.by.dhcpcd.eth0<br>> # Process: dhcpcd<br>> # Process_id: 3847<br>> # Script: /sbin/modify_resolvconf<br>> # Saveto:
<br>> # Info: This is a temporary resolv.conf created by service<br>> dhcpcd.<br>> # The previous file has been saved and will be<br>> restored later.<br>> #<br>> # If you don't like your
resolv.conf to be changed, you<br>> # can set MODIFY_{RESOLV,NAMED}_CONF_DYNAMICALLY=no.<br>> This<br>> # variables are placed in /etc/sysconfig/network/config.<br>> #<br>> # You can also configure service dhcpcd not to modify
<br>> it.<br>> #<br>> # If you don't like dhcpcd to change your nameserver<br>> # settings<br>> # then either set DHCLIENT_MODIFY_RESOLV_CONF=no<br>> # in /etc/sysconfig/network/dhcp, or
<br>> # set MODIFY_RESOLV_CONF_DYNAMICALLY=no in<br>> # /etc/sysconfig/network/config or (manually) use dhcpcd<br>> # with -R. If you only want to keep your searchlist,<br>
> set<br>> # DHCLIENT_KEEP_SEARCHLIST=yes in /etc/sysconfig/<br>> network/dhcp or<br>> # (manually) use the -K option.<br>> #<br>> ### END INFO<br>> search XXX.XXX.160.17 XXX.XXX.18
XXX.XXX.160.22 XXX.XXX.160.23<br>> nameserver XXX.XXX.160.17<br>> nameserver XXX.XXX.160.18<br>> nameserver XXX.XXX.160.22<br>> nameserver XXX.XXX.160.23<br>> --- try to run /sbin/ifconfig -a<br>> eth0 Link encap:Ethernet HWaddr 00:18:8B:1E:1F:D6
<br>> inet addr:XXX.XXX.123.136 Bcast:XXX.XXX.123.255 Mask:<br>> <a href="http://255.255.254.0">255.255.254.0</a><br>> inet6 addr: fe80::218:8bff:fe1e:1fd6/64 Scope:Link<br>> UP BROADCAST NOTRAILERS RUNNING MULTICAST MTU:1500
<br>> Metric:1<br>> RX packets:18300 errors:3 dropped:0 overruns:0 frame:4<br>> TX packets:890 errors:0 dropped:0 overruns:0 carrier:0<br>> collisions:0 txqueuelen:1000<br>> RX bytes:2217723 (
2.1 Mb) TX bytes:134240 (131.0 Kb)<br>> Interrupt:169<br>><br>> lo Link encap:Local Loopback<br>> inet addr:<a href="http://127.0.0.1">127.0.0.1</a> Mask: <a href="http://255.0.0.0">
255.0.0.0</a><br>> inet6 addr: ::1/128 Scope:Host<br>> UP LOOPBACK RUNNING MTU:16436 Metric:1<br>> RX packets:166 errors:0 dropped:0 overruns:0 frame:0<br>> TX packets:166 errors:0 dropped:0 overruns:0 carrier:0
<br>> collisions:0 txqueuelen:0<br>> RX bytes:20311 (19.8 Kb) TX bytes:20311 (19.8 Kb)<br>><br>> sit0 Link encap:IPv6-in-IPv4<br>> NOARP MTU:1480 Metric:1<br>> RX packets:0 errors:0 dropped:0 overruns:0 frame:0
<br>> TX packets:0 errors:0 dropped:0 overruns:0 carrier:0<br>> collisions:0 txqueuelen:0<br>> RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)<br>><br>> --- try to print /etc/nsswitch.conf
<br>> #<br>> # /etc/nsswitch.conf<br>> #<br>> # An example Name Service Switch config file. This file should be<br>> # sorted with the most-used services at the beginning.<br>> #<br>> # The entry '[NOTFOUND=return]' means that the search for an
<br>> # entry should stop if the search in the previous entry turned<br>> # up nothing. Note that if the search failed due to some other reason<br>> # (like no NIS server responding) then the search continues with the
<br>> # next entry.<br>> #<br>> # Legal entries are:<br>> #<br>> # compat Use compatibility setup<br>> # nisplus Use NIS+ (NIS version 3)<br>> # nis Use NIS (NIS version 2), also
<br>> called YP<br>> # dns Use DNS (Domain Name Service)<br>> # files Use the local files<br>> # [NOTFOUND=return] Stop searching if not found so far
<br>> #<br>> # For more information, please read the nsswitch.conf.5 manual page.<br>> #<br>><br>> # passwd: files nis<br>> # shadow: files nis<br>> # group: files nis<br>><br>> passwd: compat<br>
> group: compat<br>><br>> hosts: files dns<br>> networks: files dns<br>><br>> services: files<br>> protocols: files<br>> rpc: files<br>> ethers: files
<br>> netmasks: files<br>> netgroup: files nis<br>> publickey: files<br>><br>> bootparams: files<br>> automount: files nis<br>> aliases: files<br>><br>><br>>
administrador@server2:~> mpdcheck -f .mpd.hosts -ssh -v<br>> obtaining hostname via gethostname and getfqdn<br>> gethostname gives server2<br>> getfqdn gives server2<br>> checking out unqualified hostname; make sure is not "localhost", etc.
<br>> checking out qualified hostname; make sure is not "localhost", etc.<br>> obtain IP addrs via qualified and unqualified hostnames; make sure<br>> other than <a href="http://127.0.0.1">127.0.0.1</a>
<br>> gethostbyname_ex: ('server2', [], ['<a href="http://127.0.0.2">127.0.0.2</a>'])<br>> *** first ipaddr for this host (via server2) is: <a href="http://127.0.0.2">127.0.0.2</a><br>> gethostbyname_ex: ('server2', [], ['
<a href="http://127.0.0.2">127.0.0.2</a>'])<br>> checking that IP addrs resolve to same host<br>> now do some gethostbyaddr and gethostbyname_ex for machines in<br>> hosts file<br>> checking gethostbyXXX for unqualified server4
<br>> gethostbyname_ex: ('server4', [], ['XXX.XXX.123.25'])<br>> checking gethostbyXXX for qualified server4<br>> gethostbyname_ex: ('server4', [], ['XXX.XXX.123.25'])<br>> checking gethostbyXXX for unqualified server2
<br>> gethostbyname_ex: ('server2', [], ['<a href="http://127.0.0.2">127.0.0.2</a>'])<br>> checking gethostbyXXX for qualified server2<br>> gethostbyname_ex: ('server2', [], ['<a href="http://127.0.0.2">
127.0.0.2</a>'])<br>> trying: ssh server4 -x -n /bin/echo hello<br>> trying: ssh server2 -x -n /bin/echo hello<br>> starting server: /usr/local/bin/mpdcheck.py -s<br>> starting client: ssh server4 -x -n /usr/local/bin/mpdcheck.py -c
<br>> server2 25734<br>> ** timed out waiting for client on server4 to produce output<br>> client on server4 failed to access the server<br>><br>> After that I tried<br>> administrador@server2:~> ssh server4 -x -n /bin/echo helloJorge
<br>><br>> and the output was<br>> helloJorge<br>><br>><br>> administrador@server2:~> mpdboot -f .mpd.hosts -n 2<br>> mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd<br>> on server4
<br>> administrador@server2:~> mpdboot -f .mpd.hosts -n 2 -v<br>> running mpdallexit on server2<br>> LAUNCHED mpd on server2 via<br>> RUNNING: mpd on server2<br>> LAUNCHED mpd on server4 via server2<br>
> mpdboot_server2 (handle_mpd_output 383): failed to connect to mpd<br>> on server4<br>><br>><br>> I don't know why the access failed :S<br>><br>> Thanks<br>><br>> On Mon, 23 Jul 2007, Jorge Gonzalez wrote:
<br>><br>> > Hi all<br>> ><br>> > I'm configuring a cluster of two PCs using Suse 10.2 x64,<br>> Mpich2-1.0.5p4,<br>> > OpenSSH_4.4p1<br>> ><br>> > I have successfully configured the ssh server on each machine.
<br>> > I have also configured the ssh clients, so these commands work:<br>> > ssh server1 (without password)<br>> > ssh server2 (without password)<br>> ><br>> > However, when I try to bring up a ring of these two machines with
<br>> the command<br>> > mpdboot -n 2 -f .mpd.hosts<br>> ><br>> > I get the following message:<br>> > mpdboot_server1 (handle_mpd_output 383): failed to connect to mpd<br>> on server2
<br>> ><br>> > Can somebody tell me what I am doing wrong?<br>> ><br>> > The file .mpd.hosts contains the following two lines:<br>> > server1<br>> > server2<br>> ><br>> > I have already read these:
<br>> > <a href="http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/">http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/</a><br>> 2006/08/msg00009.html<br>> > <a href="http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/">
http://www-unix.mcs.anl.gov/web-mail-archive/lists/mpich-discuss/</a><br>> 2006/04/msg00037.html<br>> ><br>> ><br>> > Thanks for all<br>> ><br>> > --<br>> > Jorge Andres Gonzalez<br>
> > jag2kn (at) <a href="http://gmail.com">gmail.com</a><br>> > jagonalezce (at) <a href="http://unal.edu.co">unal.edu.co</a><br>> > Universidad Nacional de Colombia<br>> > Cel: 301 217 78 60<br>> > Linux Counter 345082
<br>> > Bogotá - Colombia - Sur América<br>> ><br>><br>><br>><br>> --<br>> JAG<br>> jag2kn (at) <a href="http://gmail.com">gmail.com</a><br>> Cel: 301 217 78 60<br>> Linux Counter 345082
<br>> Bogotá - Colombia - Sur América<br><br></blockquote></div><br><br clear="all"><br>-- <br>JAG<br>jag2kn (at) <a href="http://gmail.com">gmail.com</a><br>Cel: 301 217 78 60<br>Linux Counter 345082<br>
Bogotá - Colombia - Sur América
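<br><br>For reference, the check that mpdcheck reports in the quoted output (the result of gethostbyname_ex(gethostname())) can be repeated by hand after editing /etc/hosts. The following is only a minimal sketch in Python, not part of MPICH2; the helper names is_loopback and resolved_addresses are made up for illustration. It resolves the local hostname and flags any address in the loopback range, such as the 127.0.0.2 shown above, which a remote mpd could not connect back to.

```python
import ipaddress
import socket


def is_loopback(addr):
    """Return True if addr (an IP address string) lies in a loopback range."""
    return ipaddress.ip_address(addr).is_loopback


def resolved_addresses(hostname=None):
    """Resolve hostname (default: this machine's own name) to its IPv4 addresses.

    Uses the same resolver call whose result mpdcheck prints.
    """
    if hostname is None:
        hostname = socket.gethostname()
    _name, _aliases, addrs = socket.gethostbyname_ex(hostname)
    return addrs


if __name__ == "__main__":
    host = socket.gethostname()
    try:
        addrs = resolved_addresses(host)
    except OSError as exc:  # covers socket.gaierror and socket.herror
        print(f"could not resolve {host}: {exc}")
    else:
        for addr in addrs:
            # A loopback address here means other machines cannot reach
            # this host through the name it advertises.
            status = "LOOPBACK (unreachable from other hosts)" if is_loopback(addr) else "ok"
            print(f"{host} -> {addr}: {status}")
```

Running it on each machine listed in .mpd.hosts should show a routable address for that machine's own name; if a 127.x.x.x address still appears, the name is still being resolved from a loopback entry somewhere (the quoted /etc/hosts lists server2 with 127.0.0.2 twice).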