[mpich-discuss] machinefile error

SULLIVAN David (AREVA) David.Sullivan at areva.com
Wed Jul 21 14:44:54 CDT 2010


I have done that, and I have also configured ssh to log in without a
password. I have no idea why it caches the key each time.

-----Original Message-----
From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Dave Goodell
Sent: Wednesday, July 21, 2010 3:41 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] machinefile error

Try sshing to each node once before trying your mpiexec again.  Either
that, or find some other way to put the host key fingerprints for your
cluster nodes into the known_hosts file.
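
A minimal sketch of that second option, under some assumptions that are
not confirmed anywhere in this thread: the machinefile lists one host per
line, optionally followed by a ":N" process count, with "#" starting a
comment; the function names are made up for illustration; and the cluster
network is trusted, since ssh-keyscan accepts whatever key a host presents
without verifying it. It uses "ssh-keygen -F" to check for an existing
entry and "ssh-keyscan" to fetch a missing key.

```shell
#!/bin/sh
# Sketch: add missing host keys for every node in a Hydra machinefile.
# Assumed file format (not taken from the thread): one host per line,
# optional ":N" process-count suffix, "#" comments.

# Print the bare host names found in a machinefile, one per line.
machinefile_hosts() {
    sed -e 's/#.*//' -e 's/^[[:space:]]*//' -e '/^$/d' \
        -e 's/[[:space:]:].*//' "$1"
}

# Append keys for hosts that have no known_hosts entry yet.
# Only sensible on a trusted network: the fetched keys are not verified.
add_missing_keys() {
    machinefile=$1
    known_hosts=${2:-$HOME/.ssh/known_hosts}
    for host in $(machinefile_hosts "$machinefile"); do
        if ! ssh-keygen -F "$host" -f "$known_hosts" >/dev/null 2>&1; then
            ssh-keyscan -t rsa "$host" >>"$known_hosts" 2>/dev/null
        fi
    done
}

# Usage ("nodes" is the file name passed to mpiexec.hydra -f in this thread):
#   add_missing_keys nodes
```

Run it as the same user that launches mpiexec, on the node where mpiexec
is started, so the keys land in the known_hosts file that ssh will
actually consult.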

Hydra docs are here:
http://wiki.mcs.anl.gov/mpich2/index.php/Using_the_Hydra_Process_Manager

-Dave

On Jul 21, 2010, at 2:24 PM CDT, SULLIVAN David (AREVA) wrote:

> Ok. Attempting to use Hydra yields the following error:
> 
> [mcnp5_1-4 at node1 ~]$ mpiexec.hydra -f nodes -n 12 mcnp5.mpi i=aw04 o=mpi.te
> The authenticity of host 'node1 (155.55.55.5)' can't be established.
> RSA key fingerprint is 28:bd:96:b9:2c:3f:05:cc:ca:67:f6:35:70:77:17:06.
> Are you sure you want to continue connecting (yes/no)? [proxy at node2] HYDU_sock_connect (./utils/sock/sock.c:141): [proxy at node3] connect error (No route to host)
> [proxy at node2] main (./pm/pmiserv/pmi_proxy.c:108): unable to connect to the main server
> HYDU_sock_connect (./utils/sock/sock.c:141): connect error (No route to host)
> [proxy at node3] main (./pm/pmiserv/pmi_proxy.c:108): unable to connect to the main server
> 
> Host key verification failed.
> [mcnp5_1-4 at node1 ~]$
> 
> There is very little in the user's guide on Hydra, so I am at your 
> mercy for any information.
> 
> Thanks for the help-
> 
> Dave
> 
> -----Original Message-----
> From: mpich-discuss-bounces at mcs.anl.gov 
> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Pavan Balaji
> Sent: Wednesday, July 21, 2010 1:23 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] machinefile error
> 
> 
> On 07/20/2010 10:56 AM, SULLIVAN David (AREVA) wrote:
>> If I explicitly call mpiexec.hydra I get all sorts of problems, but I
>> think it has to do with how the nodes are set up. On advice from
>> debianclusters.org I entered the command mpd --daemon --ncpus=4. Next
>> to see where this is running I enter mpdtrace -l. The response is
>> node1_12345 (127.0.0.1). That ip address doesn't look like it will
>> work.
> 
> What problems are you having with Hydra? There are usage instructions 
> on MPD in the MPICH2 user's guide (on our website), but I suggest you 
> try Hydra first.
> 
>  -- Pavan
> 
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
> _______________________________________________
> mpich-discuss mailing list
> mpich-discuss at mcs.anl.gov
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss