[mpich-discuss] Hydra
Pavan Balaji
balaji at mcs.anl.gov
Wed Dec 15 11:44:27 CST 2010
On 12/15/2010 11:09 AM, Mark Beauharnois wrote:
> On ‘minserv1’ we execute:
>
> [minserv1 ~] $ mpdboot --totalnum=2 --ncpus=8 --ifhn=minserv1-gig -f
> ${HOME}/mpd.hosts
MPD is the old process manager. With Hydra, the mpdboot step is not needed.
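As a rough sketch of what a Hydra launch can look like without mpdboot: the host file below reuses minserv1-gig from the question, while minserv2-gig, the process counts, the host file path, and ./my_mpi_app are assumptions made up for illustration. On builds where Hydra is not the default process manager, the launcher may be installed as mpiexec.hydra instead.

```shell
# Hypothetical Hydra host file: one "hostname:process-count" entry per line.
# minserv2-gig is an assumed name, mirroring minserv1-gig from the question.
hostfile="${TMPDIR:-/tmp}/hydra.hosts"
cat > "$hostfile" <<'EOF'
minserv1-gig:8
minserv2-gig:8
EOF

# Hydra's mpiexec starts its own proxies on the listed hosts, so there is
# no separate daemon ring to boot. ./my_mpi_app is a placeholder binary;
# the launch is skipped if either it or mpiexec is missing.
if command -v mpiexec >/dev/null 2>&1 && [ -x ./my_mpi_app ]; then
    mpiexec -f "$hostfile" -n 16 ./my_mpi_app
fi
```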
> [proxy:0:1@minserv2] HYDU_sock_connect (./utils/sock/sock.c:138): unable
> to get host address (Success)
>
> [proxy:0:1@minserv2] main (./pm/pmiserv/pmip.c:208): unable to connect
> to server minserv1 at port 55501 (check for firewalls!)
>
> We’re confused that ‘minserv2’ appears to be trying to connect to
> ‘minserv1’ (using the non-gigabit interface) when we’ve started mpd
> with the specification to explicitly use the gigabit interface?
The host names specified in the host file will be used for all MPI
communication. However, the process manager control traffic still goes
over the default interface; in your case that's the non-gigabit
interface. This traffic is not performance-critical, but it requires the
default interface to be functional.
As a short-term workaround, can you make sure that the default network
interface can also be used for communication (for the control traffic)?
I'll try to fix this for the upcoming release.
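One way to check the workaround, sketched under the assumption that the "unable to get host address" error means minserv2 cannot resolve the name minserv1: run a lookup on minserv2 through the system resolver. The helper names resolves and check_host are made up for this example, and getent is assumed to be available (it is on typical Linux systems).

```shell
# Succeeds if the name resolves through the system resolver
# (/etc/hosts, DNS, and so on); prints nothing.
resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Prints a one-line verdict for the given host name.
check_host() {
    if resolves "$1"; then
        echo "$1: resolves"
    else
        echo "$1: does NOT resolve"
    fi
}

# Run on minserv2, using the name the proxy tried to reach.
check_host minserv1
```

If the name does not resolve, adding an entry for it on the default interface (for example in /etc/hosts) is one possible fix; a firewall blocking the port would be a separate thing to check.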
-- Pavan
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji