[MPICH] Problem setting up a ring

Brett Gordon brgordon at gmail.com
Tue Apr 3 22:34:19 CDT 2007


Hello,

I have successfully installed mpich2-1.0.5 on two Linux boxes. Both
pass the standard single-host test of running the 'cpi' example
program.

However, I'm running into two (probably related) problems:

1) When I try to run mpd as a server and client on the same computer
(as on page 31 of the install documentation), I get the following:

brgordon at veritas:~> mpdcheck -s
server listening at INADDR_ANY on: veritas 23761
brgordon at veritas:~> mpdcheck -c veritas 23761
veritas_23761 (recv_dict_msg 549):recv_dict_msg: errmsg=:invalid literal for int(): hello_fr:
  mpdtb:
    /home/brgordon/mpich2-install/bin/mpdlib.py,  549,  recv_dict_msg
    /home/brgordon/mpich2-install/bin/mpdlib.py,  989,  handle_ring_listener_connection
    /home/brgordon/mpich2-install/bin/mpdlib.py,  743,  handle_active_streams
    /home/brgordon/mpich2-install/bin/mpd,  286,  runmainloop
    /home/brgordon/mpich2-install/bin/mpd,  255,  run
    /home/brgordon/mpich2-install/bin/mpd,  1470,  ?

veritas_23761 (handle_ring_listener_connection 993): INVALID msg from new connection :('128.2.93.142', 16587): msg=:{}:
Traceback (most recent call last):
  File "/home/brgordon/mpich2-install/bin/mpdcheck", line 105, in ?
    msg = sock.recv(64)
socket.error: (104, 'Connection reset by peer')
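
For comparison, here is a minimal sketch of the same kind of server/client round trip done with plain Python sockets, independent of mpd. It is not how mpdcheck is implemented; the function name check_round_trip and the "ack:" reply format are made up for illustration. It opens a listener on an OS-chosen port (bound to INADDR_ANY, like the mpdcheck server above), connects to it through the given host name, and returns whatever the client receives. If this works for the machine's own host name but mpdcheck does not, the problem is probably not basic TCP connectivity.

```python
import socket
import threading


def check_round_trip(host="localhost", timeout=5.0):
    """Listen on an ephemeral port, connect to it via `host`,
    and return the reply the client receives.

    Hypothetical helper for illustration; not part of MPICH or mpd.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("", 0))            # INADDR_ANY, port chosen by the OS
    server.listen(1)
    server.settimeout(timeout)
    port = server.getsockname()[1]

    def serve_once():
        # Accept a single connection and echo back what was received.
        try:
            conn, _addr = server.accept()
        except OSError:
            return                  # timed out or socket was closed
        with conn:
            data = conn.recv(64)
            conn.sendall(b"ack:" + data)

    worker = threading.Thread(target=serve_once, daemon=True)
    worker.start()
    try:
        with socket.create_connection((host, port), timeout=timeout) as client:
            client.sendall(b"hello")
            reply = client.recv(64)
    finally:
        worker.join(timeout)
        server.close()
    return reply


if __name__ == "__main__":
    # Try the machine's own host name, as in 'mpdcheck -c veritas <port>'.
    print(check_round_trip(socket.gethostname()))
```

Calling check_round_trip("veritas") on veritas itself would exercise the same name lookup that the mpdcheck client relies on.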

2) I also can't get a ring to work. I have set up ssh to work without
using passwords ('ssh veritas date' works fine). The workaround for
mpdboot on page 9 of the install doc does not work for me, nor does
running 'mpdcheck -f mpd.hosts -ssh'.

When I try to run mpdboot, I get:

[brgordon at elaine ~]$ mpdboot -n 2 -f mpd.hosts
mpdboot_elaine.tepper.cmu.edu (handle_mpd_output 383): failed to connect to mpd on veritas
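
One thing that can be checked separately from mpdboot is how each host named in mpd.hosts resolves on the machine where mpdboot is run. The following is a rough sketch under stated assumptions: the helper name resolve_hosts is hypothetical, and the parsing assumes one host per line, optionally followed by ":ncpus" or extra fields, with "#" starting a comment. It flags names that do not resolve or that resolve to a loopback address, either of which could plausibly keep one machine from reaching an mpd on another.

```python
import socket


def resolve_hosts(lines):
    """Map each host named in `lines` to (address, note).

    Hypothetical helper; the host-file format handled here
    (host[:ncpus] [extra fields], '#' comments) is an assumption.
    """
    results = {}
    for line in lines:
        entry = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not entry:
            continue
        host = entry.split()[0].split(":", 1)[0]
        try:
            addr = socket.gethostbyname(host)
        except OSError as exc:
            results[host] = (None, "does not resolve: %s" % exc)
            continue
        # A 127.x.x.x address is only reachable from the machine itself.
        note = "loopback" if addr.startswith("127.") else "ok"
        results[host] = (addr, note)
    return results


if __name__ == "__main__":
    with open("mpd.hosts") as hosts_file:
        for name, (addr, note) in resolve_hosts(hosts_file).items():
            print(name, addr, note)
```

Running it on both elaine and veritas and comparing the addresses would show whether the two machines agree on each other's names.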


I feel like I'm getting close to having this working, so I would
greatly appreciate any help. Please let me know if there is more
information I can provide.

Thanks,
Brett



