[mpich-discuss] Hydra

Mark Beauharnois mark at asrc.cestm.albany.edu
Wed Dec 15 12:12:04 CST 2010


Pavan, thanks very much for the clarification!

-----Original Message-----
From: Pavan Balaji [mailto:balaji at mcs.anl.gov] 
Sent: Wednesday, December 15, 2010 12:44 PM
To: mpich-discuss at mcs.anl.gov
Cc: Mark Beauharnois
Subject: Re: [mpich-discuss] Hydra


On 12/15/2010 11:09 AM, Mark Beauharnois wrote:
> On 'minserv1' we execute:
>
> [minserv1 ~] $ mpdboot --totalnum=2 --ncpus=8 --ifhn=minserv1-gig -f
> ${HOME}/mpd.hosts

MPD is the old process manager. With Hydra that's not needed.
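
For illustration only, a minimal sketch of what a Hydra launch could look
like without mpdboot. The host name 'minserv2-gig', the slot counts, the
host file location, and the program name './my_mpi_app' are assumptions
made for this example, not details from the original report. The script
only writes the host file and prints the command, so it can be inspected
before anything is run.

```shell
#!/bin/sh
# Hypothetical Hydra host file: one "host:slots" entry per line.
# 'minserv2-gig' and the slot counts are assumed for illustration.
hostfile="${TMPDIR:-/tmp}/hydra.hosts"
cat > "$hostfile" <<'EOF'
minserv1-gig:8
minserv2-gig:8
EOF

# With Hydra, mpiexec reads the host file directly; no daemon ring has
# to be booted first. './my_mpi_app' is a placeholder program name.
launch_cmd="mpiexec -f $hostfile -n 16 ./my_mpi_app"

# Print the command rather than running it.
echo "$launch_cmd"
```
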

> [proxy:0:1@minserv2] HYDU_sock_connect (./utils/sock/sock.c:138): unable
> to get host address (Success)
>
> [proxy:0:1@minserv2] main (./pm/pmiserv/pmip.c:208): unable to connect
> to server minserv1 at port 55501 (check for firewalls!)
>
> We're confused that 'minserv2' appears to be trying to connect to
> 'minserv1' (using the non-gigabit interface) when we've started mpd
> with the specification to explicitly use the gigabit interface?

The host names specified in the host file will be used for all MPI 
communication. However, the process manager control traffic still goes 
over the default interface; in your case that's the non-gigabit 
interface. This is not performance critical, but requires the default 
interface to be functional.

As a short-term workaround, can you make sure that the default network 
interface can also be used for communication (for the control traffic)?
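
One rough way to check this is to confirm, on each node, that the default
host names resolve at all, since the "unable to get host address" error
points at name resolution. The sketch below assumes 'getent' is available
on the nodes; 'check_host' is a hypothetical helper written for this
example and is not part of MPICH or Hydra.

```shell
#!/bin/sh
# Hypothetical helper: report whether a host name resolves on this node.
# Assumes 'getent' is installed; returns non-zero when the lookup fails.
check_host() {
    if getent hosts "$1" >/dev/null 2>&1; then
        echo "$1: resolves"
    else
        echo "$1: does NOT resolve"
        return 1
    fi
}

# Run on every node (for example on both minserv1 and minserv2).
status=0
for h in minserv1 minserv2; do
    check_host "$h" || status=1
done
echo "overall status: $status"
```

A name that fails here would need an entry in DNS or /etc/hosts before the
control traffic can use the default interface. Resolution alone does not
rule out a firewall blocking the port.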

I'll try to fix this for the upcoming release.

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji



