[Swift-devel] ssh test case on pads/beagle

Michael Wilde wilde at mcs.anl.gov
Thu Aug 11 08:57:36 CDT 2011


Mihael, I've never seen sites.xml entries show up in the log. Are they supposed to appear now? They are not in the log Alberto attached, nor have I seen them in any other log yet.

Can we log all the files mentioned in the command line report (the first line of the log) right at the front, along with the source text? I.e., the script, tc, sites, and config files? Ideally also the values of all the swift.properties settings, and auth.defaults with suitable masking? A 0.94 feature?

>> 2011-08-11 08:28:03,762-0500 DEBUG Loader arguments: [001-catsn-ssh.swift, -tc.file, tc.template.data, -sites.file, sites.template.xml, -config, cf] 
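
To make the "suitable masking" idea concrete, here is a minimal sketch in Python (not Swift's own code) of how auth.defaults text could be masked before it is written to a log. The property suffixes treated as secret and the sample passphrase line are assumptions for illustration; the thread only says that a passphrase line exists.

```python
# Property-name suffixes treated as secret. This list is an assumption for
# illustration; the thread only mentions that a passphrase line exists.
SECRET_SUFFIXES = ("password", "passphrase")

def mask_auth_defaults(text: str) -> str:
    """Return text with the values of secret-looking properties replaced by ****."""
    masked = []
    for line in text.splitlines():
        key, sep, _value = line.partition("=")
        if sep and key.strip().lower().endswith(SECRET_SUFFIXES):
            masked.append(f"{key}=****")
        else:
            masked.append(line)
    return "\n".join(masked)

# Sample modelled on the auth.defaults quoted in this thread; the passphrase
# line and its value are made up.
sample = """login1.pads.ci.uchicago.edu.type=key
login1.pads.ci.uchicago.edu.username=achavez
login1.pads.ci.uchicago.edu.key=/home/Alberto/.ssh/identity
login1.pads.ci.uchicago.edu.passphrase=not-a-real-passphrase"""

print(mask_auth_defaults(sample))
```

Only the value is hidden, so the log would still show which hosts have a passphrase configured.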

Alberto, stop by and we can try to debug this in person, as ssh requires a fair bit of correct configuration to work. 
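
One piece of that configuration already came up earlier in this thread: the ssh provider rejected a key with "Unsupported passphrase algorithm: AES-128-CBC". As a rough sketch, assuming the key is in the traditional PEM format that carries a DEK-Info header, the cipher it declares can be read like this. The function name and the sample header are made up for illustration; this is not part of Swift or CoG.

```python
def pem_cipher(pem_text: str):
    """Return the cipher named in a traditional PEM key's DEK-Info header, or None."""
    for line in pem_text.splitlines():
        if line.startswith("DEK-Info:"):
            # The header looks like "DEK-Info: AES-128-CBC,<hex IV>".
            return line.split(":", 1)[1].split(",", 1)[0].strip()
    return None

# Fabricated header for illustration only; this is not a real key.
sample = """-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: AES-128-CBC,0123456789ABCDEF0123456789ABCDEF

(base64 key material omitted)
-----END RSA PRIVATE KEY-----"""

print(pem_cipher(sample))
```

A key encrypted with 3DES, as suggested earlier in the thread, would normally declare DES-EDE3-CBC here, and an unencrypted key has no DEK-Info line at all.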

We need to look at the tc.template.data, sites.template.xml, and cf files.
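
As a starting point for the tc file, here is a minimal Python sketch that flags entries without the six fields quoted from the transformation catalog documentation later in this thread (site, transformation name, executable path, installation status, platform, profile entries). It splits on whitespace and skips lines starting with "#"; both are simplifying assumptions, and a profile entry containing spaces would be miscounted.

```python
# Field names follow the six-field layout quoted from the transformation
# catalog documentation in this thread.
TC_FIELDS = ("site", "transformation", "path", "status", "platform", "profiles")

def check_tc(text: str):
    """Return (line_number, field_count) for each entry without exactly six fields."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and (assumed) comment lines
        fields = stripped.split()
        if len(fields) != len(TC_FIELDS):
            problems.append((number, len(fields)))
    return problems

# First line copied from the tc file posted in this thread (five fields);
# the second uses the documented six-field layout.
sample = """ssh echo /bin/echo null null
ssh cat /bin/cat INSTALLED INTEL32::LINUX null"""

print(check_tc(sample))
```

Exit code 127 is what a POSIX shell reports when a command cannot be found, so a malformed entry is only one possible cause. This check does not verify that the paths exist on the remote host.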

- Mike 


----- Original Message ----- 


From: "Alberto Chavez" <alberto_chavez at live.com> 
To: "Mihael Hategan" <hategan at mcs.anl.gov> 
Cc: "Swift Devel" <swift-devel at ci.uchicago.edu> 
Sent: Thursday, August 11, 2011 8:31:53 AM 
Subject: Re: [Swift-devel] ssh test case on pads/beagle 


Sure, attached are the output of stdout and stderror, and the log generated by swift. 



> Subject: RE: [Swift-devel] ssh test case on pads/beagle 
> From: hategan at mcs.anl.gov 
> To: alberto_chavez at live.com 
> CC: jonmon at mcs.anl.gov; swift-devel at ci.uchicago.edu 
> Date: Thu, 11 Aug 2011 00:18:07 -0700 
> 
> Can you post (a link to) the entire log file? Since it contains both the 
> tc.data and sites.xml and the error, it's probably always better to post 
> it than individual snippets. 
> 
> On Thu, 2011-08-11 at 01:17 -0500, Alberto Chavez wrote: 
> > Sure: 
> > 
> > <config> 
> > <pool handle="ssh"> 
> > <execution provider="ssh" url="steamroller" jobmanager="ssh"/> 
> > <filesystem provider="ssh" url="steamroller" /> 
> > <profile key="jobThrottle" namespace="karajan">0</profile> 
> > <workdirectory>/home/achavez/swiftwork</workdirectory> 
> > </pool> 
> > </config> 
> > 
> > 
> > ______________________________________________________________________ 
> > To: alberto_chavez at live.com 
> > From: jonmon at mcs.anl.gov 
> > CC: hategan at mcs.anl.gov; swift-devel at ci.uchicago.edu 
> > Subject: Re: [Swift-devel] ssh test case on pads/beagle 
> > Date: Wed, 10 Aug 2011 23:54:24 -0500 
> > 
> > Could you post the sites file? 
> > 
> > ----- Reply message ----- 
> > From: "Alberto Chavez" <alberto_chavez at live.com> 
> > Date: Wed, Aug 10, 2011 7:16 pm 
> > Subject: [Swift-devel] ssh test case on pads/beagle 
> > To: <jonmon at mcs.anl.gov> 
> > Cc: "Mihael Hategan" <hategan at mcs.anl.gov>, "Swift Devel" 
> > <swift-devel at ci.uchicago.edu> 
> > 
> > 
> > 
> > Exit code "127" normally means that a command could not be found. 
> > Are you sure that all those paths to apps exist? 
> > > Yes, I double-checked, and those are the right paths to the apps. 
> > 
> > 
> > Also, I am not sure if this is a problem but shouldn't there be a 
> > third column in the app file? Like 
> > "ssh echo /bin/echo null null null" 
> > 
> > 
> > 
> > 
> > Looking at the documentation for the transformation catalog, the 
> > structure should be: 
> > 
> > site, transformation name, executable path, installation status, 
> > platform, and profile entries. 
> > 
> > 
> > 
> > 
> > 
> > The installation status and platform fields are not used. Set them 
> > to INSTALLED and INTEL32::LINUX respectively. 
> > 
> > The profiles field should be set to null if no profile entries are to 
> > be specified, or should contain the profile entries separated by 
> > semicolons. 
> > 
> > 
> > but even when I switch the columns to INSTALLED and INTEL32::LINUX and 
> > keep the profiles field set to null, I'm still getting the same exit 
> > code. 
> > 
> > 
> > On Aug 10, 2011, at 6:41 PM, Alberto Chavez wrote: 
> > 
> > I changed my ssh-key, and they worked on the MCS machines 
> > because the authorized_keys file has not been updated yet on 
> > the CI Machines. 
> > I created a new ssh-key using: 
> > ssh-keygen -t rsa -b 2048 
> > exactly as the MCS site suggested, 
> > On the other hand, I still have a problem, I am getting the 
> > following error: 
> > 
> > 
> > Swift svn swift-r4978 cog-r3226 
> > 
> > 
> > RunID: 20110810-1819-1cdo2o62 
> > Progress: time: Wed, 10 Aug 2011 18:19:42 -0500 
> > Exception in cat: 
> > Arguments: [data.txt] 
> > Host: ssh 
> > Directory: 
> > 001-catsn-ssh-20110810-1819-1cdo2o62/jobs/9/cat-9jd0g9ek 
> > - - - 
> > Caused by: null 
> > Caused by: 
> > org.globus.cog.abstraction.impl.common.execution.JobException: 
> > Job failed with an exit code of 127 
> > Final status: time: Wed, 10 Aug 2011 18:20:00 -0500 
> > Failed:10 
> > The following errors have occurred: 
> > 1. Job failed with an exit code of 127 (10 times) 
> > 
> > 
> > 
> > 
> > These are the contents of the log: 
> > 
> > 
> > Execution completed with errors 
> > 
> > 
> > 2011-08-10 18:19:43,251-0500 INFO ConnectionProtocol Freeing 
> > channel 0 [Unnamed Channel] 
> > 2011-08-10 18:19:43,263-0500 INFO Exec Exit code 127 
> > 2011-08-10 18:19:43,269-0500 INFO ConnectionProtocol Freeing 
> > channel 0 [Unnamed Channel] 
> > 2011-08-10 18:19:43,277-0500 DEBUG vdl:execute2 
> > APPLICATION_EXCEPTION jobid=cat-9jd0g9ek - Application 
> > exception: null 
> > Caused by: 
> > org.globus.cog.abstraction.impl.common.execution.JobException: 
> > Job failed with an exit code of 127 
> > 2011-08-10 18:19:43,280-0500 INFO vdl:execute END_FAILURE 
> > thread=0-5-3-1 tr=cat 
> > 2011-08-10 18:19:43,281-0500 INFO vdl:execute Exception in 
> > cat: 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:250) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:254) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:27) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) 
> > at 
> > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) 
> > at 
> > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) 
> > at 
> > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) 
> > at java.util.concurrent.Executors 
> > $RunnableAdapter.call(Executors.java:471) 
> > at java.util.concurrent.FutureTask 
> > $Sync.innerRun(FutureTask.java:334) 
> > at 
> > java.util.concurrent.FutureTask.run(FutureTask.java:166) 
> > at 
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
> > at java.util.concurrent.ThreadPoolExecutor 
> > $Worker.run(ThreadPoolExecutor.java:603) 
> > at java.lang.Thread.run(Thread.java:636) 
> > 2011-08-10 18:20:00,332-0500 INFO ExecutionContext Detailed 
> > exception: 
> > 
> > 
> > Execution completed with errors 
> > 
> > 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:250) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.fail(FlowNode.java:254) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.GenerateErrorNode.post(GenerateErrorNode.java:27) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) 
> > at 
> > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) 
> > at 
> > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) 
> > at 
> > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) 
> > at 
> > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) 
> > at java.util.concurrent.Executors 
> > $RunnableAdapter.call(Executors.java:471) 
> > at java.util.concurrent.FutureTask 
> > $Sync.innerRun(FutureTask.java:334) 
> > at 
> > java.util.concurrent.FutureTask.run(FutureTask.java:166) 
> > at 
> > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) 
> > at java.util.concurrent.ThreadPoolExecutor 
> > $Worker.run(ThreadPoolExecutor.java:603) 
> > at java.lang.Thread.run(Thread.java:636) 
> > 
> > I believe that the problem lies in the TC file because when 
> > I run a much simpler SwiftScript like: 
> > 
> > 
> > int i = 9; 
> > trace(i); 
> > 
> > 
> > I get the following output: 
> > 
> > 
> > swift traceme.swift -tc.file tc.template.data 
> > -sites.file sites.template.xml -config cf 
> > Swift svn swift-r4978 cog-r3226 
> > 
> > 
> > RunID: 20110810-1832-buktjj3d 
> > Progress: time: Wed, 10 Aug 2011 18:32:30 -0500 
> > SwiftScript trace: 9.0 
> > Final status: time: Wed, 10 Aug 2011 18:32:30 -0500 
> > 
> > 
> > but as soon as I start using the commands listed in the TC file, 
> > I get the "exit code 127" 
> > 
> > 
> > My tc file reads: 
> > 
> > 
> > ssh echo /bin/echo null null 
> > ssh cat /bin/cat null null 
> > ssh ls /bin/ls null null 
> > ssh grep /bin/grep null null 
> > ssh sort /bin/sort null null 
> > ssh paste /bin/paste null null 
> > ssh wc /usr/bin/wc null null 
> > 
> > 
> > I am working on the login node of the MCS machine trying to 
> > ssh via Swift to steamroller. 
> > 
> > 
> > > Subject: Re: [Swift-devel] ssh test case on pads/beagle 
> > > From: hategan at mcs.anl.gov 
> > > To: alberto_chavez at live.com 
> > > CC: swift-devel at ci.uchicago.edu 
> > > Date: Tue, 9 Aug 2011 11:57:06 -0700 
> > > 
> > > Hmm: Unsupported passphrase algorithm: AES-128-CBC 
> > > 
> > > I'll try to see how that can be fixed. In the meantime, can you 
> > > generate a new key pair with 3DES encryption instead and use that? 
> > > 
> > > On Tue, 2011-08-09 at 13:43 -0500, Alberto Chavez wrote: 
> > > > Hello, 
> > > > 
> > > > 
> > > > I am trying to run a simpler case than the ssh-pbs-coaster test case, and 
> > > > I'm still getting the same error. 
> > > > Now I am running only the ssh test case 
> > > > (/tests/providers/ssh/001-catsn-ssh.swift) 
> > > > 
> > > > 
> > > > The command line is: 
> > > > swift -config cf -tc.file tc.template.data -sites.file 
> > > > sites.template.xml 001-catsn-ssh.swift 
> > > > 
> > > > 
> > > > The output: 
> > > > Swift svn swift-r4861 (swift modified locally) cog-r3183 
> > > > 
> > > > 
> > > > RunID: 20110809-1336-ohte788a 
> > > > Progress: time: Tue, 09 Aug 2011 13:36:42 -0500 
> > > > Exception in cat: 
> > > > Arguments: [data.txt] 
> > > > Host: ssh 
> > > > Directory: 
> > 001-catsn-ssh-20110809-1336-ohte788a/jobs/m/cat-mq74h7ek 
> > > > - - - 
> > > > 
> > > > 
> > > > Caused by: null 
> > > > Caused by: 
> > > > org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid private key or passphrase 
> > > > Caused by: 
> > > > com.sshtools.j2ssh.transport.publickey.InvalidSshKeyException: Can't read key due to cryptography problems: java.security.NoSuchAlgorithmException: Unsupported passphrase algorithm: AES-128-CBC 
> > > > Progress: time: Tue, 09 Aug 2011 13:36:43 -0500 Selecting site:8 Submitting:1 Failed:1 
> > > > Exception in cat: 
> > > > Arguments: [data.txt] 
> > > > Host: ssh 
> > > > Directory: 
> > 001-catsn-ssh-20110809-1336-ohte788a/jobs/n/cat-nq74h7ek 
> > > > - - - 
> > > > 
> > > > 
> > > > Caused by: null 
> > > > Caused by: 
> > > > org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid private key or passphrase 
> > > > Caused by: 
> > > > com.sshtools.j2ssh.transport.publickey.InvalidSshKeyException: Can't read key due to cryptography problems: java.security.NoSuchAlgorithmException: Unsupported passphrase algorithm: AES-128-CBC 
> > > > Progress: time: Tue, 09 Aug 2011 13:36:44 -0500 Selecting site:7 Submitting:1 Failed:2 
> > > > Exception in cat: 
> > > > Arguments: [data.txt] 
> > > > Host: ssh 
> > > > Directory: 
> > 001-catsn-ssh-20110809-1336-ohte788a/jobs/o/cat-oq74h7ek 
> > > > - - - 
> > > > 
> > > > 
> > > > Caused by: null 
> > > > Caused by: 
> > > > org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid private key or passphrase 
> > > > Caused by: 
> > > > com.sshtools.j2ssh.transport.publickey.InvalidSshKeyException: Can't read key due to cryptography problems: java.security.NoSuchAlgorithmException: Unsupported passphrase algorithm: AES-128-CBC 
> > > > "error_log.log" 105L, 5770C 
> > > > 
> > > > 
> > > > My auth.defaults reads: 
> > > > 
> > > > 
> > > > login1.beagle.ci.uchicago.edu.type=key 
> > > > login1.beagle.ci.uchicago.edu.username=achavez 
> > > > login1.beagle.ci.uchicago.edu.key=/home/Alberto/.ssh/identity 
> > > > 
> > > > 
> > > > login1.pads.ci.uchicago.edu.type=key 
> > > > login1.pads.ci.uchicago.edu.username=achavez 
> > > > login1.pads.ci.uchicago.edu.key=/home/Alberto/.ssh/identity 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > and it has been set to 600. I omitted the passphrase line, but it is 
> > > > there, and the passphrase is right because I just verified it in two 
> > > > ways: 
> > > > 1) by logging in to pads and beagle without providing a password 
> > > > 2) by "changing" the password, where the "new" password is the same as the 
> > > > "old" one. 
> > > > 
> > > > sites.template.xml: 
> > > > 
> > > > <config> 
> > > > <pool handle="ssh"> 
> > > > <execution provider="ssh" url="login1.pads.ci.uchicago.edu" 
> > > > jobmanager="ssh"/> 
> > > > <filesystem provider="ssh" url="login1.pads.ci.uchicago.edu" /> 
> > > > <profile key="jobThrottle" namespace="karajan">0</profile> 
> > > > <workdirectory>/home/achavez/swiftwork</workdirectory> 
> > > > </pool> 
> > > > </config> 
> > > > 
> > > > 
> > > > config file: 
> > > > 
> > > > wrapperlog.always.transfer=true 
> > > > sitedir.keep=true 
> > > > execution.retries=0 
> > > > lazy.errors=true 
> > > > status.mode=provider 
> > > > use.provider.staging=true 
> > > > provider.staging.pin.swiftfiles=false 
> > > > foreach.max.threads=10 
> > > > provenance.log=true 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > I also tried a simpler SwiftScript: 
> > > > 
> > > > 
> > > > type filemsg; 
> > > > 
> > > > 
> > > > app (filemsg output) hello(string s) 
> > > > { 
> > > > echo s stdout=@filename(output); 
> > > > } 
> > > > 
> > > > 
> > > > filemsg myfile<"dogcatdinosaur.out">; 
> > > > myfile = hello("dog,cat,dinosaur"); 
> > > > 
> > > > 
> > > > and I get the following output: 
> > > > 
> > > > 
> > > > Swift svn swift-r4861 (swift modified locally) cog-r3183 
> > > > 
> > > > 
> > > > RunID: 20110809-1343-2es2hel2 
> > > > Progress: time: Tue, 09 Aug 2011 13:43:25 -0500 
> > > > Exception in echo: 
> > > > Arguments: [dog,cat,dinosaur] 
> > > > Host: ssh 
> > > > Directory: 
> > hello_swift-20110809-1343-2es2hel2/jobs/0/echo-0oldh7ek 
> > > > - - - 
> > > > 
> > > > 
> > > > Caused by: null 
> > > > Caused by: 
> > > > org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid private key or passphrase 
> > > > Caused by: 
> > > > com.sshtools.j2ssh.transport.publickey.InvalidSshKeyException: Can't read key due to cryptography problems: java.security.NoSuchAlgorithmException: Unsupported passphrase algorithm: AES-128-CBC 
> > > > Final status: time: Tue, 09 Aug 2011 13:43:26 -0500 Failed:1 
> > > > The following errors have occurred: 
> > > > 1. Can't read key due to cryptography problems: java.security.NoSuchAlgorithmException: Unsupported passphrase algorithm: AES-128-CBC 
> > > > 
> > > > 
> > > > 
> > > > 
> > > > any thoughts on this? 
> > > > _______________________________________________ 
> > > > Swift-devel mailing list 
> > > > Swift-devel at ci.uchicago.edu 
> > > 
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel 
> > > 
> > > 
> > 
> > 
> > 
> > 
> 
> 


-- 
Michael Wilde 
Computation Institute, University of Chicago 
Mathematics and Computer Science Division 
Argonne National Laboratory 



