[Swift-devel] ssh:pbs to beagle

Mihael Hategan hategan at mcs.anl.gov
Thu Apr 28 12:22:42 CDT 2011


I think you could omit the passphrase from auth.defaults, and you would be prompted for it instead.
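
For anyone adapting the auth.defaults example quoted below, here is a rough Python sketch of a sanity check for that file. It is not part of Swift or CoG. The function names are hypothetical, and it assumes the `host.property=value` layout shown in the example. It also treats trailing `# ...` text as an annotation, the way the example uses it, which may not match how Swift itself reads the file.

```python
import os
import stat
from pathlib import Path


def parse_auth_defaults(text):
    """Parse 'host.property=value' lines into {host: {property: value}}."""
    hosts = {}
    for raw in text.splitlines():
        # Assumption: anything after '#' is an annotation, as in the quoted example.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # The property name is the last dotted field; the rest is the host name.
        host, _, prop = key.strip().rpartition(".")
        hosts.setdefault(host, {})[prop] = value.strip()
    return hosts


def check_auth_defaults(path):
    """Return a list of human-readable warnings for the given file."""
    warnings = []
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:  # any group or other permission bit is set
        warnings.append(f"{path} has mode {mode:o}; expected 600")
    for host, props in parse_auth_defaults(Path(path).read_text()).items():
        if props.get("type") == "key" and "key" not in props:
            warnings.append(f"{host}: type=key but no key file given")
        if "passphrase" in props:
            warnings.append(f"{host}: passphrase stored in plain text")
    return warnings
```

Calling `check_auth_defaults(os.path.expanduser("~/.ssh/auth.defaults"))` returns an empty list when the file is mode 600 and stores no passphrases.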

On Thu, 2011-04-28 at 12:00 -0500, Michael Wilde wrote:
> OK. Was there a cookbook on the ssh settings? Did you set up a $HOME/.ssh/auth.defaults per the user guide?
> 
> Here is an auth.defaults example. I'm not sure it's 100% correct, but it could serve as a base for you:
> 
> xlogin1.pads.ci.uchicago.edu.type=password
> xlogin1.pads.ci.uchicago.edu.username=wilde
> 
> login.pads.ci.uchicago.edu.type=key
> login.pads.ci.uchicago.edu.username=wilde
> login.pads.ci.uchicago.edu.key=/home/wilde/.ssh/swift_rsa
> login.pads.ci.uchicago.edu.passphrase=yourpassphrasehere # MAKE SURE mode=600!!!
> 
> login1.pads.ci.uchicago.edu.type=key
> login1.pads.ci.uchicago.edu.username=wilde
> login1.pads.ci.uchicago.edu.key=/home/wilde/.ssh/swift_rsa
> login1.pads.ci.uchicago.edu.passphrase=yourpassphrasehere # MAKE SURE mode=600!!!
> 
> login.mcs.anl.gov.type=key
> login.mcs.anl.gov.username=wilde
> login.mcs.anl.gov.key=/home/wilde/.ssh/swift_rsa
> login.mcs.anl.gov.passphrase=yourpassphrasehere # MAKE SURE mode=600!!!
> 
> - Mike
> 
> ----- Original Message -----
> > It does look like an ssh problem. I am getting the same stderr and log
> > messages on trying to communicate from Bridled to Communicado.
> > 
> > Ketan
> > 
> > On Apr 28, 2011, at 11:19 AM, Michael Wilde wrote:
> > 
> > > Have you already run a simple hello-world Swift test from
> > > communicado to bridled to make sure you have ssh configured
> > > correctly? I would do that first.
> > >
> > > I'm not sure whether an ssh problem explains what you show below.
> > >
> > > - Mike
> > >
> > > ----- Original Message -----
> > >> Thanks, I made the change. However, I am now getting the following
> > >> on my stderr:
> > >>
> > >>
> > >> ===========
> > >> [ketan at bridled catsn.works]$ swift -config cf -tc.file tc -sites.file beagle-coaster-ssh-pbs.xml catsn.swift -n=1
> > >> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified locally)
> > >>
> > >> RunID: 20110428-1022-n9s0k0e0
> > >> Progress:
> > >> [ketan]
> > >> Progress: Initializing site shared directory:1
> > >> [ketan] Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> Progress: Initializing site shared directory:1
> > >> ========
> > >>
> > >> And from the log it seems some network transmission has failed:
> > >>
> > >> 2011-04-28 10:22:45,261-0500 INFO TransportProtocolCommon Sending SSH_MSG_SERVICE_REQUEST
> > >> 2011-04-28 10:22:45,264-0500 INFO TransportProtocolCommon Received SSH_MSG_SERVICE_ACCEPT
> > >> 2011-04-28 10:24:27,626-0500 INFO TransportProtocolCommon The Transport Protocol thread failed
> > >> java.io.IOException: The socket is EOF
> > >>     at com.sshtools.j2ssh.transport.TransportProtocolInputStream.readBufferedData(TransportProtocolInputStream.java:183)
> > >>     at com.sshtools.j2ssh.transport.TransportProtocolInputStream.readMessage(TransportProtocolInputStream.java:226)
> > >>     at com.sshtools.j2ssh.transport.TransportProtocolCommon.processMessages(TransportProtocolCommon.java:1440)
> > >>     at com.sshtools.j2ssh.transport.TransportProtocolCommon.startBinaryPacketProtocol(TransportProtocolCommon.java:1034)
> > >>     at com.sshtools.j2ssh.transport.TransportProtocolCommon.run(TransportProtocolCommon.java:393)
> > >>     at java.lang.Thread.run(Thread.java:662)
> > >>
> > >>
> > >> Any clues?
> > >> Ketan
> > >>
> > >>
> > >> On Apr 28, 2011, at 10:20 AM, Michael Wilde wrote:
> > >>
> > >>> The pool name in your sites file is pads-remote-pbs-coasters-ssh,
> > >>> but you used pbs in your tc.data.
> > >>>
> > >>> - Mike
> > >>>
> > >>> ----- Original Message -----
> > >>>> Hello,
> > >>>>
> > >>>> Some context:
> > >>>> I am trying to submit a big run on Beagle using Swift + coasters.
> > >>>> However, a previous run is already underway on Beagle, so there
> > >>>> are two difficulties with starting a new run from its login node:
> > >>>>
> > >>>> 1. Running another swift from the same jvm will result in chaos
> > >>>> in the logs (as far as I know; please correct me if this is no
> > >>>> longer the case).
> > >>>>
> > >>>> 2. The login node is already under load because of my previous
> > >>>> big run, which is still running.
> > >>>>
> > >>>> /context
> > >>>>
> > >>>> So I am now trying to submit this big run from a remote host
> > >>>> (bridled). I know this has been done on PADS using the coaster
> > >>>> provider with ssh:pbs. I tried a similar approach on a trial
> > >>>> Swift script, but I am getting an error.
> > >>>>
> > >>>> Following is the error message:
> > >>>>
> > >>>> [ketan at bridled catsn.works]$ swift -config cf -tc.file tc -sites.file beagle-coaster-ssh-pbs.xml catsn.swift -n=1
> > >>>> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified locally)
> > >>>>
> > >>>> RunID: 20110428-1002-c8rvqhe6
> > >>>> Progress:
> > >>>> The application "cat" is not available in your tc.data catalog
> > >>>> Caused by:
> > >>>> org.globus.cog.karajan.scheduler.NoSuchResourceException
> > >>>> Final status: Failed:1
> > >>>> The following errors have occurred:
> > >>>> 1. The application "cat" is not available in your tc.data catalog
> > >>>>
> > >>>>
> > >>>> Attached are my .swift, sites.xml and tc.data files.
> > >>>>
> > >>>> Could someone tell me whether what I am doing is doable and, if
> > >>>> so, how I can correctly configure my sites and tc setup?
> > >>>>
> > >>>> Thanks.
> > >>>> Ketan
> > >>>>
> > >>>
> > >>> --
> > >>> Michael Wilde
> > >>> Computation Institute, University of Chicago
> > >>> Mathematics and Computer Science Division
> > >>> Argonne National Laboratory
> > >>>
> 
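
The original "cat is not available in your tc.data catalog" error in this thread came from a site name in tc.data that did not match any pool name in the sites file. A rough Python sketch of a check for that kind of mismatch follows. The attached files are not shown in the thread, so the layouts are assumptions: pools are taken to be `<pool handle="...">` elements in the sites XML, and tc.data is taken to be whitespace-separated with the site in the first column and the application name in the second. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def pool_handles(sites_xml_text):
    """Collect the handle attribute of every <pool> element, ignoring XML namespaces."""
    root = ET.fromstring(sites_xml_text)
    return {
        el.get("handle")
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "pool" and el.get("handle")
    }


def tc_sites(tc_text):
    """Map each site named in column 1 of tc.data to the applications listed for it."""
    sites = {}
    for raw in tc_text.splitlines():
        fields = raw.split("#", 1)[0].split()  # skip comments and blank lines
        if len(fields) >= 2:
            sites.setdefault(fields[0], []).append(fields[1])
    return sites


def unmatched_tc_sites(sites_xml_text, tc_text):
    """Return the tc.data sites (with their applications) that have no matching pool."""
    handles = pool_handles(sites_xml_text)
    return {site: apps for site, apps in tc_sites(tc_text).items() if site not in handles}
```

With a sites file that defines only the pool pads-remote-pbs-coasters-ssh and a tc.data line whose first column is pbs, `unmatched_tc_sites` reports pbs together with the applications listed under it, which is the situation described above.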




