[Swift-devel] ssh:pbs to beagle

Michael Wilde wilde at mcs.anl.gov
Thu Apr 28 11:19:15 CDT 2011


Have you already run a simple hello-world Swift test from communicado to bridled to make sure that ssh is configured correctly? I would do that first.

I'm not sure whether an ssh problem explains what you show below.
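
One quick way to rule out plain ssh trouble before involving Swift is a non-interactive login check along the lines of the sketch below. It assumes the OpenSSH `ssh` client is on the PATH of the submit host; the function names and the `beagle.example.org` host are placeholders, not names from this thread. Swift's ssh provider is a separate Java implementation (the j2ssh classes in the stack trace), so a pass here only rules out basic connectivity and key problems.

```python
import subprocess


def ssh_check_command(host, user=None, timeout=10):
    """Build an ssh command that logs in, runs /bin/true, and exits."""
    target = f"{user}@{host}" if user else host
    return [
        "ssh",
        "-o", "BatchMode=yes",              # fail instead of prompting for a password
        "-o", f"ConnectTimeout={timeout}",  # give up if the host does not answer
        target,
        "/bin/true",
    ]


def can_ssh(host, user=None, timeout=10):
    """Return (ok, stderr_text) for a non-interactive login attempt."""
    result = subprocess.run(
        ssh_check_command(host, user, timeout),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0, result.stderr.strip()


# Example (placeholder host, not run here):
#   ok, err = can_ssh("beagle.example.org", user="ketan")
```

If `ok` comes back false, the stderr text usually says whether the problem is the key, the host name, or the network.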

- Mike

----- Original Message -----
> Thanks, I made the change. However, now, I am getting the following on
> my stderr
> 
> 
> ===========
> [ketan at bridled catsn.works]$ swift -config cf -tc.file tc -sites.file beagle-coaster-ssh-pbs.xml catsn.swift -n=1
> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified locally)
> 
> RunID: 20110428-1022-n9s0k0e0
> Progress:
> [ketan]
> Progress: Initializing site shared directory:1
> [ketan] Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> Progress: Initializing site shared directory:1
> ========
> 
> And from the log it seems some network transmission has failed:
> 
> 2011-04-28 10:22:45,261-0500 INFO TransportProtocolCommon Sending SSH_MSG_SERVICE_REQUEST
> 2011-04-28 10:22:45,264-0500 INFO TransportProtocolCommon Received SSH_MSG_SERVICE_ACCEPT
> 2011-04-28 10:24:27,626-0500 INFO TransportProtocolCommon The Transport Protocol thread failed
> java.io.IOException: The socket is EOF
>     at com.sshtools.j2ssh.transport.TransportProtocolInputStream.readBufferedData(TransportProtocolInputStream.java:183)
>     at com.sshtools.j2ssh.transport.TransportProtocolInputStream.readMessage(TransportProtocolInputStream.java:226)
>     at com.sshtools.j2ssh.transport.TransportProtocolCommon.processMessages(TransportProtocolCommon.java:1440)
>     at com.sshtools.j2ssh.transport.TransportProtocolCommon.startBinaryPacketProtocol(TransportProtocolCommon.java:1034)
>     at com.sshtools.j2ssh.transport.TransportProtocolCommon.run(TransportProtocolCommon.java:393)
>     at java.lang.Thread.run(Thread.java:662)
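
For what it's worth, the timestamps in the log excerpt above can be subtracted to see how long the connection stayed open after the service request was accepted. A minimal sketch, with the two strings copied from the log and the helper name made up for illustration:

```python
from datetime import datetime

# Timestamp layout used in the log lines above: date, time with
# comma-separated milliseconds, then a numeric UTC offset.
LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S,%f%z"


def parse_log_time(stamp):
    """Parse one timestamp such as '2011-04-28 10:22:45,264-0500'."""
    return datetime.strptime(stamp, LOG_TIME_FORMAT)


accepted = parse_log_time("2011-04-28 10:22:45,264-0500")  # SSH_MSG_SERVICE_ACCEPT
failed = parse_log_time("2011-04-28 10:24:27,626-0500")    # transport thread failed

gap_seconds = (failed - accepted).total_seconds()
print(f"{gap_seconds:.3f} s between SERVICE_ACCEPT and the EOF")
```

That comes to roughly 102 seconds, so the connection was up for a while before the socket hit EOF; the log alone does not say why.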
> 
> 
> Any clues?
> Ketan
> 
> 
> On Apr 28, 2011, at 10:20 AM, Michael Wilde wrote:
> 
> > The pool name in your sites file is pads-remote-pbs-coasters-ssh but
> > you used pbs in your tc.data.
> >
> > - Mike
> >
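
That kind of mismatch can be caught before a run by comparing the site names in the two files. A minimal sketch, assuming the sites file declares each site as `<pool handle="...">`, that the first whitespace-separated column of each tc.data line is the site name, and that `#` starts a comment line; the sample contents are illustrative stand-ins, not the attached files, and the function names are made up.

```python
import xml.etree.ElementTree as ET


def pool_handles(sites_xml_text):
    """Collect the handle attribute of every <pool> element."""
    root = ET.fromstring(sites_xml_text)
    return {
        el.get("handle")
        for el in root.iter()
        # Ignore any XML namespace prefix on the tag name.
        if el.tag.rsplit("}", 1)[-1] == "pool" and el.get("handle")
    }


def tc_sites(tc_data_text):
    """Collect the site column (first field) of every tc.data entry."""
    sites = set()
    for line in tc_data_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            sites.add(line.split()[0])
    return sites


def unknown_tc_sites(sites_xml_text, tc_data_text):
    """Sites named in tc.data that have no matching pool in the sites file."""
    return sorted(tc_sites(tc_data_text) - pool_handles(sites_xml_text))


# Illustrative inputs only (not the files attached to this thread).
SAMPLE_SITES = """<config>
  <pool handle="pads-remote-pbs-coasters-ssh"/>
</config>"""
SAMPLE_TC = "pbs  cat  /bin/cat  INSTALLED  INTEL32::LINUX  null\n"

print(unknown_tc_sites(SAMPLE_SITES, SAMPLE_TC))
```

Any name this reports is a tc.data site that Swift cannot match to a pool, which is one way to end up with the "not available in your tc.data catalog" error.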
> > ----- Original Message -----
> >> Hello,
> >>
> >> Some context:
> >> I am trying to submit a big run on Beagle using swift + coasters.
> >> However, a previous run is already underway on Beagle. So, there are
> >> two difficulties with starting a new run from its login node:
> >>
> >> 1. Running another swift from the same jvm will result in chaos in the
> >> logs (as far as I know; please correct me if this is no longer the
> >> case).
> >>
> >> 2. The login node is already under load from my previous big run,
> >> which is still running.
> >>
> >> /context
> >>
> >> So, I am now trying to submit this big run from a remote host
> >> (bridled). I know this has been done on PADS using ssh:pbs with the
> >> coaster provider. I tried a similar approach on a trial swift script
> >> but am getting an error.
> >>
> >> Following is the error message:
> >>
> >> [ketan at bridled catsn.works]$ swift -config cf -tc.file tc -sites.file beagle-coaster-ssh-pbs.xml catsn.swift -n=1
> >> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified locally)
> >>
> >> RunID: 20110428-1002-c8rvqhe6
> >> Progress:
> >> The application "cat" is not available in your tc.data catalog
> >> Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException
> >> Final status: Failed:1
> >> The following errors have occurred:
> >> 1. The application "cat" is not available in your tc.data catalog
> >>
> >>
> >> Attached are my .swift, sites.xml and tc.data files.
> >>
> >> Could someone indicate whether what I am doing is doable and, if so,
> >> how I can correctly configure my sites and tc setup?
> >>
> >> Thanks.
> >> Ketan
> >>
> >> _______________________________________________
> >> Swift-devel mailing list
> >> Swift-devel at ci.uchicago.edu
> >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
> >

-- 
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory



