[mpich-discuss] Hydra

Pavan Balaji balaji at mcs.anl.gov
Wed Dec 15 11:44:27 CST 2010


On 12/15/2010 11:09 AM, Mark Beauharnois wrote:
> On ‘minserv1’ we execute:
>
> [minserv1 ~] $mpdboot --totalnum=2 --ncpus=8 --ifhn=minserv1-gig -f
> ${HOME}/mpd.hosts

MPD is the old process manager. With Hydra, you don't need to run mpdboot at all.
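
As a rough sketch of what the Hydra launch could look like: the file name hosts.hydra, the second host name minserv2-gig, and the program ./app are placeholders assumed for illustration, not taken from your setup. The count of 8 per host is carried over from your --ncpus=8.

```shell
# Hydra host file: one "hostname:process-count" entry per line.
# minserv2-gig is an assumed name; only minserv1-gig appears in the mpdboot line.
cat > hosts.hydra <<'EOF'
minserv1-gig:8
minserv2-gig:8
EOF

# With Hydra there is no separate boot step; mpiexec reads the host file directly.
# ./app is a placeholder for your MPI program, so the launch is left commented out:
# mpiexec -f hosts.hydra -n 16 ./app
```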

> [proxy:0:1@minserv2] HYDU_sock_connect (./utils/sock/sock.c:138): unable
> to get host address (Success)
>
> [proxy:0:1@minserv2] main (./pm/pmiserv/pmip.c:208): unable to connect
> to server minserv1 at port 55501 (check for firewalls!)
>
> We’re wondering why ‘minserv2’ appears to be trying to connect to
> ‘minserv1’ (using the non-gigabit interface) when we’ve started mpd
> with the specification to explicitly use the gigabit interface?

The host names specified in the host file will be used for all MPI 
communication. However, the process manager control traffic still goes 
over the default interface; in your case that's the non-gigabit 
interface. This is not performance critical, but requires the default 
interface to be functional.

As a short term work-around, can you make sure that the default network 
interface can also be used for communication (for the control traffic)?
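
One quick way to check this, as a sketch: the "unable to get host address" error suggests that minserv2 cannot resolve the name minserv1, so a name-lookup test run on each node would narrow it down. The helper name `resolves` is made up for this example, and the host names are taken from your error output.

```shell
#!/bin/sh
# Succeeds if the given host name resolves on this node.
# getent goes through the system's name service switch (/etc/hosts, DNS, ...).
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Default (non-gigabit) host names, assumed from the error messages above.
for h in minserv1 minserv2; do
    if resolves "$h"; then
        echo "$h: resolves"
    else
        echo "$h: cannot resolve"
    fi
done
```

If a name fails to resolve on one of the nodes, adding it to /etc/hosts there (or fixing DNS) should be enough for the control traffic; this only checks name lookup, not firewalls.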

I'll try to fix this for the upcoming release.

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

