[mpich-discuss] Modifying ssh calls for MPICH2

Reuti reuti at staff.uni-marburg.de
Tue Jan 17 16:02:39 CST 2012


Am 17.01.2012 um 21:46 schrieb Kekatpure, Rohan Deodatta (-EXP):

> Dear MPICH-discuss moderators,
> 
> I am new to MPICH and have installed it to be used in conjunction with PETSc. I successfully compiled and installed MPICH2 (version 1.4.1p1) and ran the one-process and multi-process examples. However, when I run (a successfully built) PETSc, MPICH makes ssh calls to my local machine. For example it will call:
> 
> ssh s918992.sandia.gov  (s918992 is the name of my machine)
> 
> At that point, I get the error,
> 
> "ssh: Could not resolve hostname s198992.sandia.gov: nodename nor servname provided, or not known"
> 
> I am guessing that this is Sandia Firewall issue. However, if I invoke ssh with 
> 
> ssh s918992.local

There are two things to note:

1) if it's only local, it shouldn't use ssh at all but should fork instead

2a) most likely it's the result of a name mismatch: it tries to make an ssh connection because, when it compares the names, it thinks it's an external machine

Which distribution are you on? Somewhere you need to adjust the local domain name, e.g. in /etc/hosts, so the machine is known as s918992.sandia.gov instead of s918992.local
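As a sketch (the IP address below is only a placeholder, not your machine's real address), the /etc/hosts entry could look like:

```
# map the fully qualified name to the machine's address
10.0.0.1   s918992.sandia.gov   s918992
```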

2b) OTOH: you could also create a hostfile for mpirun which includes s918992.local, and it should work too.
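A minimal sketch of that (the file name hosts.txt is just an example):

```
# hosts.txt - one host per line, as Hydra expects
s918992.local
```

and then start your program with `mpirun -f hosts.txt -np 2 ./mpihello`.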

An additional point: it looks like there is no DNS entry resolving s918992.sandia.gov to a valid IP address. But once you set up the correct name, it should work anyway.
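To check quickly whether a given name resolves at all (independent of MPICH), a small getaddrinfo test can help. This is only an illustration; the program and its helper function are my own, not part of MPICH:

```c
/* resolvecheck.c - report whether a hostname resolves; this is the
   same lookup that fails with "nodename nor servname provided". */
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* returns 1 if the name resolves, 0 otherwise */
int resolves(const char *host)
{
    struct addrinfo hints, *res;
    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return 0;
    freeaddrinfo(res);
    return 1;
}

int main(int argc, char **argv)
{
    const char *host = (argc > 1) ? argv[1] : "localhost";
    printf("%s %s\n", host,
           resolves(host) ? "resolves" : "does not resolve");
    return 0;
}
```

Run it as `./resolvecheck s918992.sandia.gov` and again with `s918992.local` to see which of the two names your resolver actually knows.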

-- Reuti

PS: Instead of PETSc, maybe check it first with an mpihello. If that works fine, then perhaps something in PETSc needs to be adjusted. As the default mpihello finishes too quickly to be observed, you can include a loop:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
   int node;
   int i, j;
   volatile float f;   /* volatile so the busy loop isn't optimized away */

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &node);

   printf("Hello World from Node %d.\n", node);

   /* busy loop to keep the processes alive long enough to observe them */
   for (j = 0; j <= 100000; j++)
       for (i = 0; i <= 100000; i++)
           f = i * 2.718281828f * i + i + i * 3.141592654f;

   MPI_Finalize();
   return 0;
}

Then issue:

ps -e f

(note: f without the leading dash) and it should show something like:

26213 pts/0    Ss     0:00  |       \_ -bash
26289 pts/0    S+     0:00  |           \_ mpirun -np 2 ./mpihello
26290 pts/0    S+     0:00  |               \_ /home/reuti/local/mpich2-1.4/bin/hydra_pmi_proxy --control-port pcfoobar:35625 --demux
26291 pts/0    R+     0:06  |                   \_ ./mpihello
26292 pts/0    R+     0:06  |                   \_ ./mpihello


> then I am able to go through the usual ssh process. 
> 
> My question is: is there a way to configure MPICH2 such that it will invoke ssh with "ssh s918992.local" instead of "ssh s918992.sandia.gov" ? If you feel there is a better way to handle this, please feel free to let me know the better way. I am a newbie to Unix as well as MPICH and PETSc. I hope you will be patient with me.
> 
> Thanks,
> Rohan
> 
> 
> _______________________________________________
> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
> To manage subscription options or unsubscribe:
> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss


