[MPICH] MPICH Connection Problem

Natarajan, Senthil senthil at pitt.edu
Thu Mar 15 13:49:55 CDT 2007


Yes, it does. Here are the relevant lines from my mpirun script:

AUTOMOUNTFIX="sed -e s@/tmp_mnt/@/@g"
DEFAULT_DEVICE=ch_p4
RSHCOMMAND="ssh"
SYNCLOC=/bin/sync
CC="cc"
COMM=
GLOBUSDIR=@GLOBUSDIR@
CLINKER="cc"
prefix=/u/mpich1.2.4
bindir=/u/mpich1.2.4/bin
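
Since the run output quoted below still shows rsh being tried, one thing worth ruling out is that a different mpirun is being picked up from PATH than the one these lines came from. A minimal sketch, assuming the install prefix shown above; the helper name mpirun_under is hypothetical and not part of MPICH:

```shell
# Hypothetical helper: succeed only if the mpirun found on PATH
# is the one that lives directly under the given bin directory.
mpirun_under() {
    found=$(command -v mpirun 2>/dev/null) || found=""
    if [ -z "$found" ]; then
        echo "mpirun not found on PATH"
        return 1
    fi
    case "$found" in
        "$1"/mpirun) echo "ok: $found" ;;
        *) echo "mismatch: PATH gives $found, expected $1/mpirun"; return 1 ;;
    esac
}
```

It could be called as `mpirun_under /u/mpich1.2.4/bin`. A mismatch would only suggest that another installation is being run; it does not by itself explain the rsh fallback.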

-----Original Message-----
From: Rajeev Thakur [mailto:thakur at mcs.anl.gov] 
Sent: Thursday, March 15, 2007 2:45 PM
To: Natarajan, Senthil; 'Jan Wagner'
Cc: mpich-discuss at mcs.anl.gov; ashton at mcs.anl.gov
Subject: RE: [MPICH] MPICH Connection Problem

Your mpirun script file should contain the following line if it has been
configured to use ssh.
RSHCOMMAND="ssh" 

Does it?

Rajeev
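
One way to check this without opening the script in an editor is to read the value straight out of it. The following is only a sketch: the function name rsh_setting is hypothetical, and it assumes the assignment sits on a line of its own in the form shown above.

```shell
# Hypothetical helper: print the value assigned to RSHCOMMAND in the
# given script, with surrounding double quotes and trailing blanks removed.
# If the variable is assigned more than once, the last assignment is shown.
rsh_setting() {
    sed -n 's/^RSHCOMMAND="\{0,1\}\([^"]*\)"\{0,1\}[[:space:]]*$/\1/p' "$1" | tail -n 1
}
```

For example, `rsh_setting "$(command -v mpirun)"` would inspect whichever mpirun the shell resolves first. Empty output would mean no such line was found in that script.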

> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov 
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of 
> Natarajan, Senthil
> Sent: Thursday, March 15, 2007 12:51 PM
> To: Jan Wagner
> Cc: mpich-discuss at mcs.anl.gov; ashton at mcs.anl.gov
> Subject: RE: [MPICH] MPICH Connection Problem
> 
> Hi Jan,
> Thanks for the info.
> I tried configuring with -rsh=RSHCOMMAND, then with -rsh=ssh, and with the
> environment variables P4_RSHCOMMAND and RSHCOMMAND set.
> 
> I tried all the combinations you suggested, but nothing seems to work;
> I am still getting the connection refused problem.
> 
> mpirun -v -np 2 -machinefile machines tspRunOneBranch randomOut10.txt
> running /home/condor-nobody/teststuff/tspRunOneBranch on 2 LINUX ch_p4
> processors
> Created /home/condor-nobody/teststuff/PI20605
> connect to address xxx.xx.xxx.95: Connection refused
> Trying krb4 rsh...
> connect to address xxx.xx.xxx.95: Connection refused
> trying normal rsh (/usr/bin/rsh)
> machine2: Connection refused
> p0_20689:  p4_error: Timeout in making connection to remote process on
> machine2: 0
> 
> The problem is that it is not even contacting the other machine (I am
> watching the network activity on the other machine), yet it reports
> connection refused for that machine. I am not sure why it is not using
> ssh, even though I configured with that option, compiled and installed,
> and also tried setting the environment variables above.
> 
> I have iptables on, but I am not seeing any dropped connections between
> the two machines in the system log.
> 
> Thanks,
> Senthil
> 
> -----Original Message-----
> From: Jan Wagner [mailto:jwagner at kurp.hut.fi] 
> Sent: Thursday, March 15, 2007 12:37 PM
> To: Natarajan, Senthil
> Cc: mpich-discuss at mcs.anl.gov; ashton at mcs.anl.gov
> Subject: Re: [MPICH] MPICH Connection Problem
> 
> Hi,
> 
> On Thu, 15 Mar 2007, Natarajan, Senthil wrote:
> > I am using MPICH1.2.4 on Linux. I installed with the option -rsh=ssh.
> >
> > After a successful install, I am trying to run a simple MPI job on
> > the two machines.
> >
> > I have generated the key pair (ssh-keygen) and copied it to the other
> > machine, and I can ssh between the machines without a password.
> >
> > I am trying to run a simple MPI job, but without even trying to connect
> > to the other machine, it complains about connection refused.
> 
> Just a thought, but if you check with e.g. 'ps ax' what processes are
> started, you could see with what ssh parameters mpich tries to execute
> the remote programs.
> 
> But ok, at least from your mpich output it looks like it is still trying
> to use old rsh instead of ssh.
> 
> Try setting
> $ export P4_RSHCOMMAND=ssh
> $ export RSHCOMMAND=ssh
> 
> and do the mpirun again. Then it should use ssh. If not, try configuring
> and compiling again, this time with -rsh=RSHCOMMAND.
> 
> (The 1.2.5 and 1.2.7 configure/compile is a bit strange: when I too
> compiled with -rsh=ssh a few days ago, it did not want to use ssh.
> Compiling with -rsh=RSHCOMMAND complained near the end of the compile
> that I should not use this "old" option, but use -rsh=ssh instead. But
> the complained-about RSHCOMMAND works, in contrast to the "new" option.
> Odd. Well, go figure... ;-) )
> 
> Oh, and also note that you'd probably need to stop iptables/ipchains, or
> configure them properly, as with ssh mpich tries to set up some ssh port
> forwarding / port tunneling. Otherwise connections will time out.
> 
>   - Jan
> 
> 
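
The checks suggested in this thread (passwordless ssh to every node before launching, with the rsh command pointed at ssh) can be sketched as a small script. This is only an illustration: the helper name check_hosts is hypothetical, it assumes a machinefile with one host per line, optionally in host:ncpus form, and it relies on the OpenSSH options BatchMode and ConnectTimeout.

```shell
# Hypothetical helper: try a non-interactive ssh login to every host
# listed in the machinefile given as $1 and report the result per host.
# Returns non-zero if any host could not be reached without a password.
check_hosts() {
    rc=0
    while IFS= read -r line; do
        host=${line%%:*}                      # drop an optional ":ncpus" suffix
        if [ -z "$host" ]; then continue; fi  # skip blank lines
        case "$host" in \#*) continue ;; esac # skip comment lines
        # BatchMode=yes makes ssh fail instead of prompting for a password;
        # stdin is redirected so ssh does not swallow the rest of the file.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "ok $host"
        else
            echo "FAILED $host"
            rc=1
        fi
    done < "$1"
    return $rc
}
```

With the names used in this thread, a run might look like `export P4_RSHCOMMAND=ssh; check_hosts machines && mpirun -np 2 -machinefile machines tspRunOneBranch randomOut10.txt`. If a host is reported as FAILED, the problem lies with ssh or the firewall rather than with MPICH.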



