[mpich-discuss] mpdboot fails

Matt Thiffault matt.thiffault at gmail.com
Mon Oct 25 10:55:40 CDT 2010


So, I have a number of machines I'm trying to set up with MPICH2, but I'm
starting off with just 2. One is behind a NAT router and the other has a
public IP address, but is protected by an iptables firewall.

I can ssh back and forth between them, and I have key-based authentication
set up, so no password is needed once an ssh-agent is running and has been
given the passphrase for the RSA key.

I have MPICH_PORT_RANGE and MPIEXEC_PORT_RANGE set in /etc/profile on both
systems to 10000:10100 and both firewalls allow traffic through on those
ports (I've tested this with netcat).
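For anyone who wants to repeat the netcat test across the whole range, here
is a rough Python sketch of the same idea. The function names are made up for
illustration and are not part of MPICH2. Note that a connection only succeeds
if something is actually listening on the remote port, so a listener (netcat,
for example) has to be running on the far side for each port being checked.

```python
import socket


def parse_port_range(spec):
    """Split a 'low:high' string such as '10000:10100' into two ints."""
    low, high = (int(part) for part in spec.split(":"))
    if low > high:
        raise ValueError("low port exceeds high port: " + spec)
    return low, high


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, filtered, timed out, or the name did not resolve.
        return False


def reachable_ports(host, spec, timeout=2.0):
    """List the ports in 'low:high' that accept a TCP connection on host."""
    low, high = parse_port_range(spec)
    return [port for port in range(low, high + 1)
            if port_is_reachable(host, port, timeout)]
```

With a listener on the remote side, something like
reachable_ports("halo.mthiffau.ca", "10000:10100") would return the ports
that got through the firewall.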

mpdboot fails with the following output:
mthiffau@foehammer ~ (255)% mpdboot -n 2 -v -f mpd.hosts
running mpdallexit on foehammer
LAUNCHED mpd on foehammer  via
RUNNING: mpd on foehammer
LAUNCHED mpd on halo.mthiffau.ca  via  foehammer
mpdboot_foehammer (handle_mpd_output 415): failed to connect to mpd on halo.mthiffau.ca

So, mpd starts locally and on the other machine, but afterwards
communication seems to break down. Does ssh between the machines have to be
completely passwordless? Is there something else I have to do to ensure the
correct port range gets used?

I suspect it might be the latter, as even with the environment variables
set, mpdcheck -s starts listening on random ports that are outside the range
specified.
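A small Python sketch of the check I have in mind: it reads the two variables
from the environment of whatever process runs it and tests whether an observed
listening port falls inside the range. The helper names are hypothetical, and
this says nothing about how mpd itself picks ports. It only shows whether the
variables are visible to a given process and whether a port lies within them.

```python
import os


def configured_range(var_name, environ=None):
    """Return (low, high) parsed from a 'low:high' variable, or None if unset."""
    environ = os.environ if environ is None else environ
    spec = environ.get(var_name)
    if spec is None:
        return None
    low, high = (int(part) for part in spec.split(":"))
    return low, high


def port_in_range(port, bounds):
    """Return True if port lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= port <= high


if __name__ == "__main__":
    # Print what this process actually sees for each variable.
    for name in ("MPICH_PORT_RANGE", "MPIEXEC_PORT_RANGE"):
        print(name, configured_range(name))
```

Running it both in a login shell and through a non-interactive ssh command
shows whether the settings from /etc/profile reach both kinds of session.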

Any help would be appreciated, thanks a bunch.

Matt Thiffault