[MPICH] MPICH Connection Problem
Rajeev Thakur
thakur at mcs.anl.gov
Thu Mar 15 13:45:00 CDT 2007
Your mpirun script file should contain the following line if it has been
configured to use ssh.
RSHCOMMAND="ssh"
Does it?
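
One quick way to check is sketched below. The helper name check_rshcommand is made up for illustration, and the pattern assumes the assignment starts at the beginning of a line (possibly indented), which may not hold for every install.

```shell
# Hypothetical helper: print each RSHCOMMAND assignment found in an
# mpirun script, prefixed with its line number.
check_rshcommand() {
    script="$1"
    if [ ! -r "$script" ]; then
        echo "cannot read: $script" >&2
        return 2
    fi
    # -n adds the line number; the pattern allows leading whitespace.
    grep -n '^[[:space:]]*RSHCOMMAND=' "$script"
}
```

It could be called as check_rshcommand "$(command -v mpirun)". If it prints nothing, or prints a value other than "ssh", the script was probably not configured for ssh.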
Rajeev
> -----Original Message-----
> From: owner-mpich-discuss at mcs.anl.gov
> [mailto:owner-mpich-discuss at mcs.anl.gov] On Behalf Of
> Natarajan, Senthil
> Sent: Thursday, March 15, 2007 12:51 PM
> To: Jan Wagner
> Cc: mpich-discuss at mcs.anl.gov; ashton at mcs.anl.gov
> Subject: RE: [MPICH] MPICH Connection Problem
>
> Hi Jan,
> Thanks for the info.
> I tried configuring with -rsh=RSHCOMMAND, then with -rsh=ssh, and also
> setting the environment variables P4_RSHCOMMAND and RSHCOMMAND.
>
> I tried all the combinations you suggested, but nothing works; I still
> get the connection refused problem.
>
> mpirun -v -np 2 -machinefile machines tspRunOneBranch randomOut10.txt
> running /home/condor-nobody/teststuff/tspRunOneBranch on 2 LINUX ch_p4 processors
> Created /home/condor-nobody/teststuff/PI20605
> connect to address xxx.xx.xxx.95: Connection refused
> Trying krb4 rsh...
> connect to address xxx.xx.xxx.95: Connection refused
> trying normal rsh (/usr/bin/rsh)
> machine2: Connection refused
> p0_20689: p4_error: Timeout in making connection to remote process on machine2: 0
>
> The problem is that it is not even contacting the other machine (I am
> watching the network activity on the other machine), yet it reports
> connection refused for that machine. I am not sure why it is not using
> ssh, even though I configured with that option, compiled, and installed,
> and also tried setting ssh through the environment variables above.
>
> I have iptables on, but I do not see any dropped connections between
> the two machines in the system log.
>
> Thanks,
> Senthil
>
> -----Original Message-----
> From: Jan Wagner [mailto:jwagner at kurp.hut.fi]
> Sent: Thursday, March 15, 2007 12:37 PM
> To: Natarajan, Senthil
> Cc: mpich-discuss at mcs.anl.gov; ashton at mcs.anl.gov
> Subject: Re: [MPICH] MPICH Connection Problem
>
> Hi,
>
> On Thu, 15 Mar 2007, Natarajan, Senthil wrote:
> > I am using MPICH 1.2.4 on Linux. I installed with the option -rsh=ssh.
> >
> > After a successful install, I am trying to run a simple MPI job on
> > two machines.
> >
> > I have generated the key pair (ssh-keygen) and copied it to the other
> > machine, and I can ssh between the machines without a password.
> >
> > I am trying to run a simple MPI job, but without even trying to
> > connect to the other machine, it complains about connection refused.
>
> Just a thought, but if you check which processes are started, e.g. with
> 'ps ax', you can see with what ssh parameters mpich tries to execute
> the remote programs.
>
> But OK, at least from your mpich output it looks like it is still
> trying to use the old rsh instead of ssh.
>
> Try setting
> $ export P4_RSHCOMMAND=ssh
> $ export RSHCOMMAND=ssh
>
> and run mpirun again. Then it should use ssh. If not, try configuring
> and compiling again, this time with -rsh=RSHCOMMAND.
>
> (The 1.2.5 and 1.2.7 configure/compile is a bit strange: when I too
> compiled with -rsh=ssh a few days ago, it did not want to use ssh.
> Compiling with -rsh=RSHCOMMAND complained near the end of the compile
> that I should not use this "old" option, but use -rsh=ssh instead. But
> the complained-about RSHCOMMAND works, in contrast to the "new" option!
> Odd. Well, go figure... ;-) )
>
> Oh, and also note that you'd probably need to stop iptables/ipchains,
> or configure them properly, as with ssh mpich tries to set up some ssh
> port forwarding / port tunneling. Otherwise connections will time out.
>
> - Jan
>
>
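
Jan's two suggestions in the quoted message (set the environment variables, then look at 'ps ax') can be sketched roughly as follows. The function name filter_remote_shells is hypothetical, and the pattern is only a guess at how rsh/ssh show up in the process list on a given system.

```shell
# Ask the ch_p4 device to use ssh, as suggested in the quoted message.
export P4_RSHCOMMAND=ssh
export RSHCOMMAND=ssh

# Hypothetical helper: keep only lines of `ps ax` output whose command
# contains rsh or ssh as a standalone word or path component
# (so sshd itself is not matched).
filter_remote_shells() {
    grep -E '(^|[[:space:]/])(rsh|ssh)([[:space:]]|$)'
}
```

While mpirun is starting up, running ps ax | filter_remote_shells in a second terminal should show whether /usr/bin/rsh or ssh is being launched, and with which arguments.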