From wozniak at mcs.anl.gov Tue Jan 3 09:50:33 2012
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Tue, 3 Jan 2012 09:50:33 -0600 (Central Standard Time)
Subject: [Swift-devel] provider.staging.pin.swiftfiles feature
In-Reply-To:
References:
Message-ID:

Some notes about this are at:
http://www.ci.uchicago.edu/wiki/bin/view/SWFT/CoastersPinned

I should move this to the new site and bring it into the guide if it is
still useful.

The point of this is to avoid recopying small Swift system files into
local storage for each job. If you set that property and use Coasters
with provider staging, it should just work. The example on that page
shows that _swiftwrap.staging is copied into local (RAM?) disk and
"pinned": worker.pl remembers that it is there and does not transfer it
again.

    Justin

On Fri, 23 Dec 2011, Ketan Maheshwari wrote:

> I was looking into the provider staging trying to address the issue of
> having files accessed directly at remote sites skipping provider staging (
> https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=676) .
>
> I see the option provider.staging.pin.swiftfiles of swift.properties
> appears into vdl-int.k as well as worker.pl but couldn't quite got at what
> is the functionality.
>
> Could someone indicate this.

--
Justin M Wozniak

From jonmon at mcs.anl.gov Thu Jan 5 11:35:13 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 5 Jan 2012 11:35:13 -0600
Subject: [Swift-devel] ssh template
Message-ID: <9F0FDFFF-A999-4059-AB14-7B99AEB028DE@mcs.anl.gov>

Mike and Mihael,
Yesterday we were talking about a command line execution provider. How
would that look in the sites file? For instance, how would I start a
coaster service in a sites file with the command line provider? Would it
be where ssh2 is the ssh command line provider.

If it was something like this I can then create an ssh master channel in
my scripts and then Swift can just follow and use the master channel when
starting coasters.
Is this how the command line provider will work? How do you both imagine
it working?

From hategan at mcs.anl.gov Thu Jan 5 13:29:14 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 05 Jan 2012 11:29:14 -0800
Subject: [Swift-devel] ssh template
In-Reply-To: <9F0FDFFF-A999-4059-AB14-7B99AEB028DE@mcs.anl.gov>
References: <9F0FDFFF-A999-4059-AB14-7B99AEB028DE@mcs.anl.gov>
Message-ID: <1325791754.19513.2.camel@blabla>

On Thu, 2012-01-05 at 11:35 -0600, Jonathan Monette wrote:
> Mike and Mihael,
> Yesterday we were talking about a command line execution provider.
> How would that look in the sites file? For instance how would I start
> a coaster service in the a sites file with the command line provider?
> Would it be url="url"> where ssh2 is the the ssh command line provider.

Yes, that looks right.

>
> If it was something like this I can then create an ssh master channel
> in my scripts and then Swift can just follow and use the master
> channel when starting coasters. Is this how the command line provider
> will work? How do you both imagine it working?

Pretty much. You might not have to if the OS has an agent started and
handles connections already. The same thing would happen as if you typed
"ssh url" on the command line.

From jonmon at mcs.anl.gov Thu Jan 5 13:34:14 2012
From: jonmon at mcs.anl.gov (jonmon at mcs.anl.gov)
Date: Thu, 5 Jan 2012 19:34:14 +0000
Subject: [Swift-devel] ssh template
Message-ID: <694698302-1325792056-cardhu_decombobulator_blackberry.rim.net-2014298853-@b15.c3.bise6.blackberry>

Ok thanks. Just wanted to make sure that what I was thinking and what
you were planning were on the same page.

------Original Message------
From: Mihael Hategan
To: Jonathan Monette
Cc: Michael Wilde
Cc: Swift Devel
Subject: Re: ssh template
Sent: Jan 5, 2012 1:29 PM

On Thu, 2012-01-05 at 11:35 -0600, Jonathan Monette wrote:
> Mike and Mihael,
> Yesterday we were talking about a command line execution provider.
> How would that look in the sites file? For instance how would I start
> a coaster service in the a sites file with the command line provider?
> Would it be url="url"> where ssh2 is the the ssh command line provider.

Yes, that looks right.

>
> If it was something like this I can then create an ssh master channel
> in my scripts and then Swift can just follow and use the master
> channel when starting coasters. Is this how the command line provider
> will work? How do you both imagine it working?

Pretty much. You might not have to if the OS has an agent started and
handles connections already. The same thing would happen as if you typed
"ssh url" on the command line.

From hockyg at uchicago.edu Thu Jan 5 16:05:43 2012
From: hockyg at uchicago.edu (Glen Hocky)
Date: Thu, 5 Jan 2012 17:05:43 -0500
Subject: [Swift-devel] failure on stage out [PADS Support #17814]
In-Reply-To:
References: <8B221B56-80BC-44AA-922A-4AAB29839ED5@ci.uchicago.edu> <1221311096.115198.1325795468969.JavaMail.root@zimbra.anl.gov>
Message-ID:

Mike,
Ti fixed the problem I was having with PBS staging out on pads.
This reminded me of a small suggestion I had, so I'm copying it to swift
devel.

Right now job submission scripts are placed in ~/.globus/scripts.
This eventually makes it very hard to look in ~/.globus/scripts and find
a specific submit file if there is ever an error.

I would suggest putting submit scripts in ~/.globus/scripts/DATE or
~/.globus/scripts/YEAR/MONTH/DAY

A secondary benefit is that it makes it very easy to clean up this folder
by just tarring up each month separately.

-Glen

From wilde at mcs.anl.gov Thu Jan 5 16:12:08 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 5 Jan 2012 16:12:08 -0600 (CST)
Subject: [Swift-devel] failure on stage out [PADS Support #17814]
In-Reply-To:
Message-ID: <188835291.115780.1325801528269.JavaMail.root@zimbra.anl.gov>

I like that idea. Maybe we should have the location of the scripts and
logs be settable from a swift.config property and/or a
provider.properties setting.
I've never liked these files going under .globus, since that almost
guarantees that they are unreadable by Swift support people, although I
see the desire for keeping these private by default.

I'll open a bugzilla ticket for this.

Thanks, Glen.

- Mike

----- Original Message -----
> From: "Glen Hocky"
> To: "Mike Wilde", "swift-devel"
> Sent: Thursday, January 5, 2012 4:05:43 PM
> Subject: Re: failure on stage out [PADS Support #17814]
> Mike,
> Ti fixed the problem I was having with PBS staging out on pads
> This reminded me of a small suggestion I had, so I'm copying it to
> swift devel.
>
> Right now job submission scripts are placed in ~/.globus/scripts
> This eventually makes it very hard to look in ~/.globus/scripts and
> find a specific submit file if there is ever an error
>
> I would suggest putting submit scripts in ~/.globus/scripts/DATE or
> ~/.globus/scripts/YEAR/MONTH/DAY
>
> A secondary benefit is it makes it very easy to clean up this folder
> by just tarring up each month separately
>
> -Glen

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Fri Jan 6 11:20:17 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 6 Jan 2012 11:20:17 -0600 (CST)
Subject: [Swift-devel] Fwd: [pads-users] Switching for SoftEnv to Modules
In-Reply-To:
Message-ID: <2049326938.118070.1325870417986.JavaMail.root@zimbra.anl.gov>

This would be a good time to ensure that we have up-to-date Swift modules
and softenv packages installed on all CI and Argonne machines:

- CI login hosts and Swift lab systems (bridled, communicado)
- PADS
- Beagle
- MCS login hosts and compute servers
- Fusion
- Intrepid, Surveyor, Challenger, Eureka, Gadzooks

Mike

----- Forwarded Message -----
From: "Forum for PADS user discussions"
To: pads-users at ci.uchicago.edu
Sent: Friday, January 6, 2012 10:26:03 AM
Subject: [pads-users] Switching for SoftEnv to Modules

Because Beagle requires the use
of Modules and because FutureGrid (of which CI runs a testbed) and
TeraGrid XD have settled on using Modules, we thought it would be
beneficial for all the software management environments to be
consistent. So we've decided to migrate PADS, and the rest of the CI,
away from SoftEnv to Modules.

We'll be doing the migration in stages. The first stage will be to make
SoftEnv use the same installations as Modules. This will help us
identify software packages that still need to be installed the Modules
way as well as test out the software that's currently there. During this
time we highly encourage users to try out and test Modules as much as
possible. We're shooting to have this phase done within 2 weeks.

The second stage will be to remove all the extraneous SoftEnv software
installations to clean up the software tree and flush out any remaining
un-migrated software. We're shooting to have this phase done by the
February maintenance day.

The third stage will be to turn off SoftEnv and make Modules the default
on PADS. We're shooting for this to be done by the March maintenance day
to ensure everyone has adequate time to fully test the software tree.

I've been using Modules exclusively for some months and several other
users have been using it for several weeks, so I feel confident the
basics are all there and working. If you'd like to start using Modules
now and get a jump start, you can do the following:

1) Comment out everything in your ~/.soft file. Don't remove the file,
   just comment everything out.

2a) For bash/zsh users, add the following to your ~/.bashrc or ~/.zshrc,
    respectively:

    . /soft/Modules/etc/modules.sh

2b) For csh/tcsh users, add the following to your ~/.cshrc or ~/.tcshrc,
    respectively:

    source /soft/Modules/etc/modules.csh

That's it!
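As a rough illustration of step 2a, the sketch below wraps the
". /soft/Modules/etc/modules.sh" line in a small guard, so that the same
~/.bashrc does not print an error on a host where that path is missing.
The function name init_modules and its optional path argument are
assumptions made for this example; they are not part of Modules or of
the PADS setup described above.

```shell
# Hypothetical helper for ~/.bashrc: source the Modules init script only
# when it is present and readable. The default path is the one given in
# the announcement; the optional first argument overrides it.
init_modules() {
    init_script="${1:-/soft/Modules/etc/modules.sh}"
    if [ -r "$init_script" ]; then
        # Sourcing defines the "module" command in the current shell.
        . "$init_script"
    else
        echo "Modules init script not found: $init_script" >&2
        return 1
    fi
}
```

In ~/.bashrc this would be followed by a plain call to init_modules, with
the error message silenced or kept according to taste.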
I'll be working on Modules documentation over the next month, but here
are some basic commands:

- module list: Show all currently loaded modules in your environment
- module avail: Show all available modules for use
- module load: Load the specified module (and any dependencies) into
  your environment
- module unload: Unload the specified module from your environment
- module purge: Reset your environment to a pristine setting
- module show: Show what environment variables the specified module sets

You can also create a ~/.modulerc file so that certain modules will get
loaded at initialization, which is useful for having modules loaded for
submitted jobs:

#%Module1.0
module load openmpi
module load hdf5
#EOF

As always, if you have any questions or concerns, please email
pads-support at ci.uchicago.edu.
_______________________________________________
pads-users mailing list
pads-users at ci.uchicago.edu
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/pads-users

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Mon Jan 9 10:29:56 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 9 Jan 2012 10:29:56 -0600 (CST)
Subject: [Swift-devel] CFP: CloudFlow 2012 Workshop - Submission Deadline Extended to Jan 16th, 2012
In-Reply-To:
Message-ID: <949735761.123385.1326126596714.JavaMail.root@zimbra.anl.gov>

First International Workshop on Workflow Models, Systems, Services and
Applications in the Cloud (CloudFlow) 2012

To be held in conjunction with the 26th IEEE International Parallel &
Distributed Processing Symposium (IPDPS) 2012, Shanghai, China,
May 21-25, 2012.

http://www.cloud-uestc.cn/cloudflow/home.html

Overview

Cloud computing is gaining tremendous momentum in both academia and
industry, and more and more people are migrating their data and
applications into the Cloud.
We have observed wide adoption of the MapReduce computing model and the
open source Hadoop system for large scale distributed data processing,
and a variety of ad hoc mashup techniques that weave together Web
applications. However, these are just first steps towards managing
complex task and data dependencies in the Cloud, as there are more
challenging issues such as large parameter space exploration, data
partitioning and distribution, scheduling and optimization, smart
reruns, and provenance tracking associated with workflow execution.

Cloud needs structured and mature workflow technologies to handle such
issues, and vice versa, as Cloud offers unprecedented scalability to
workflow systems, and could potentially change the way we perceive and
conduct research and experiments. The scale and complexity of the
science and data analytics problems that can be handled can be greatly
increased on the Cloud, and the on-demand nature of resource allocation
on the Cloud will also help improve resource utilization and user
experience.

As Cloud computing provides a paradigm-shifting utility-oriented
computing model in terms of the unprecedented size of datacenter-level
resource pool and the on-demand resource provisioning mechanism, there
are lots of challenges in bringing Cloud and workflows together. We need
high level languages and computing models for large scale workflow
specification; we need to adapt existing workflow architectures into the
Cloud, and integrate workflow systems with Cloud infrastructure and
resources; we also need to leverage Cloud data storage technologies to
efficiently distribute data over a large number of nodes and explore
data locality during computation etc.

We organize the CloudFlow workshop as a venue for the workflow and Cloud
communities to define models and paradigms, present their
state-of-the-art work, share their thoughts and experiences, and explore
new directions in realizing workflows in the Cloud.
Topics:

We welcome the submission of original work related to the topics listed
below, which include (in the context of Cloud):

- Models and Languages for Large Scale Workflow Specification
- Workflow Architecture and Framework
- Large Scale Workflow Systems
- Service Workflow
- Workflow Composition and Orchestration
- Workflow Migration into the Cloud
- Workflow Scheduling and Optimization
- Cloud Middleware in Support of Workflow
- Virtualized Environment
- Workflow Applications and Case Studies
- Performance and Scalability Analysis
- Peta-Scale Data Processing
- Event Processing and Messaging
- Real-Time Analytics
- Provenance

Paper Submission

Authors are invited to submit papers with unpublished, original work.
The papers should not exceed 10 single-spaced double-column pages using
10-point size font on 8.5x11 inch pages (IEEE conference style),
including figures, tables, and references. Paper submission should be
done via the online CMT system, Microsoft's Academic Conference
Management Service ( https://cmt.research.microsoft.com/CF2012 ) by
midnight January 16th, 2012 Pacific Time. The final format should be in
PDF. Proceedings of the workshop will be published by the IEEE Digital
Library and distributed at the conference. Selected excellent work may
be eligible for additional post-conference publication as journal
articles or book chapters. Submission implies the willingness of at
least one of the authors to register and present the paper.

Important Dates

Paper submission: January 16th, 2012
Acceptance notification: February 15th, 2012
Final paper due: Feb 26th, 2012

Organization

Workshop Chairs:

Dr. Yong Zhao
University of Electronic Science and Technology of China, China
yongzh04 at gmail.com

Dr. Cui Lin
California State University, Fresno, USA
clin at csufresno.edu

Dr. Shiyong Lu
Wayne State University, USA
shiyong at wayne.edu

Program Chairs:

Dr. Wenhong Tian
University of Electronic Science and Technology of China, China

Dr. Ruini Xue
Tsinghua University, China

Steering Committee

- Daniel S. Katz, University of Chicago, U.S.A.
- Mike Wilde, University of Chicago, U.S.A.
- Ewa Deelman, University of Southern California, U.S.A.
- Tevfik Kosar, University at Buffalo, U.S.A.
- Ilkay Altintas, San Diego Supercomputer Center, U.S.A.
- Ioan Raicu, Illinois Institute of Technology, U.S.A.
- Yogesh Simmhan, University of Southern California, U.S.A.
- Ian Taylor, Cardiff University, U.K.
- Weimin Zheng, Tsinghua University, China
- Hai Jin, Huazhong University of Science and Engineering, China
- Wanchun Dou, Nanjing University, China
- Hui Zhang, National Science and Technology Infrastructure, China

Program Committee

- Shawn Bowers, Gonzaga University, U.S.A.
- Douglas Thain, University of Notre Dame, U.S.A.
- Ian Gorton, Pacific Northwest National Laboratory, U.S.A.
- Artem Chebotko, University of Texas at Pan American, U.S.A.
- Weisong Shi, Wayne State University, U.S.A.
- Paolo Missier, Newcastle University, U.K.
- Wei Tan, IBM T. J. Watson Research Center, U.S.A.
- Jianwu Wang, San Diego Supercomputer Center, U.S.A.
- Ping Yang, Binghamton University, U.S.A.
- Jian Guo, Harvard University, U.S.A.
- Liqiang Wang, University of Wyoming, U.S.A.
- Paul Groth, VU University Amsterdam, the Netherlands
- Zhiming Zhao, University of Amsterdam, the Netherlands
- Marta Mattoso, Federal University of Rio de Janeiro, Brazil
- Mostafa Ezziyyani, Abdelmalek Essaâdi University, Morocco
- Wenhong Tian, University of Electronic Science and Technology of China, China
- Ruini Xue, Tsinghua University, China
- Jian Cao, Shanghai Jiaotong University, China
- Jianxun Liu, Hunan University of Science and Technology, China
- Song Zhang, Chinese Academy of Sciences, China
- Hua Hu, Hangzhou Dianzi University, China

From hategan at mcs.anl.gov Thu Jan 12 00:21:33 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 11 Jan 2012 22:21:33 -0800
Subject: [Swift-devel] command line ssh provider...
Message-ID: <1326349293.15245.3.camel@blabla>

... is in trunk (cog r3347). I was able to start coasters with it. The
provider is called "ssh-cl". It is ssh, so ~/.ssh/config and agents will
apply. Please test.

Mihael

From jonmon at mcs.anl.gov Thu Jan 12 13:29:10 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 12 Jan 2012 13:29:10 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326349293.15245.3.camel@blabla>
References: <1326349293.15245.3.camel@blabla>
Message-ID: <65E856C5-BB3D-4DE8-B168-AF44401E2BFD@mcs.anl.gov>

Mike,
You mentioned that you were able to use ssh command line provider using
catsn this morning. Was it using agents? Mihael did you test using an
agent? How do I specify for it to use an agent if available? I can do a
simple hostname test from communicado to bridled but it asks for my
password instead of using the agent I have set up.

On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote:

> ... is in trunk (cog r3347). I was able to start coasters with it. The
> provider is called "ssh-cl". It is ssh, so ~/.ssh/config and agents will
> apply. Please test.
>
> Mihael
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From hategan at mcs.anl.gov Thu Jan 12 13:33:17 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 12 Jan 2012 11:33:17 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <65E856C5-BB3D-4DE8-B168-AF44401E2BFD@mcs.anl.gov>
References: <1326349293.15245.3.camel@blabla> <65E856C5-BB3D-4DE8-B168-AF44401E2BFD@mcs.anl.gov>
Message-ID: <1326396797.17869.1.camel@blabla>

On Thu, 2012-01-12 at 13:29 -0600, Jonathan Monette wrote:
> Mike,
> You mentioned that you were able to use ssh command line provider
> using catsn this morning. Was it using agents? Mihael did you test
> using an agent? How do I specify for it to use an agent if available?
> I can do a simple hostname test from communicado to bridled but it
> asks for my password instead of using the agent I have set up.

Yes, it was using an agent.

I'll add a flag to enable ssh debugging so that we can see exactly
what's happening. You can probably hack
...sshcl.JobSubmissionTaskHandler to add a bunch of "-v" arguments to
ssh in the meantime.

From jonmon at mcs.anl.gov Thu Jan 12 13:34:20 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 12 Jan 2012 13:34:20 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326396797.17869.1.camel@blabla>
References: <1326349293.15245.3.camel@blabla> <65E856C5-BB3D-4DE8-B168-AF44401E2BFD@mcs.anl.gov> <1326396797.17869.1.camel@blabla>
Message-ID: <0863FB2A-9A33-4D62-9EFA-5FB14E1AEC9B@mcs.anl.gov>

Thanks. I'll try that in the meantime.

On Jan 12, 2012, at 1:33 PM, Mihael Hategan wrote:

>
> On Thu, 2012-01-12 at 13:29 -0600, Jonathan Monette wrote:
>> Mike,
>> You mentioned that you were able to use ssh command line provider
>> using catsn this morning. Was it using agents? Mihael did you test
>> using an agent? How do I specify for it to use an agent if available?
>> I can do a simple hostname test from communicado to bridled but it
>> asks for my password instead of using the agent I have set up.
>
> Yes, it was using an agent.
>
> I'll add a flag to enable ssh debugging so that we can see exactly
> what's happening. You can probably
> hack ...sshcl.JobSubmissionTaskHandler to add a bunch of "-v" arguments
> to ssh in the mean time.
>
>

From hategan at mcs.anl.gov Thu Jan 12 13:40:54 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 12 Jan 2012 11:40:54 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <0863FB2A-9A33-4D62-9EFA-5FB14E1AEC9B@mcs.anl.gov>
References: <1326349293.15245.3.camel@blabla> <65E856C5-BB3D-4DE8-B168-AF44401E2BFD@mcs.anl.gov> <1326396797.17869.1.camel@blabla> <0863FB2A-9A33-4D62-9EFA-5FB14E1AEC9B@mcs.anl.gov>
Message-ID: <1326397254.18012.0.camel@blabla>

To actually get the stdout you should make the job fail somehow, maybe
by typing the wrong password.

On Thu, 2012-01-12 at 13:34 -0600, Jonathan Monette wrote:
> Thanks. I'll try that in the meantime.
>
> On Jan 12, 2012, at 1:33 PM, Mihael Hategan wrote:
>
> >
> > On Thu, 2012-01-12 at 13:29 -0600, Jonathan Monette wrote:
> >> Mike,
> >> You mentioned that you were able to use ssh command line provider
> >> using catsn this morning. Was it using agents? Mihael did you test
> >> using an agent? How do I specify for it to use an agent if available?
> >> I can do a simple hostname test from communicado to bridled but it
> >> asks for my password instead of using the agent I have set up.
> >
> > Yes, it was using an agent.
> >
> > I'll add a flag to enable ssh debugging so that we can see exactly
> > what's happening. You can probably
> > hack ...sshcl.JobSubmissionTaskHandler to add a bunch of "-v" arguments
> > to ssh in the mean time.
> >
> >
>

From wilde at mcs.anl.gov Thu Jan 12 13:45:41 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 12 Jan 2012 13:45:41 -0600 (CST)
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <65E856C5-BB3D-4DE8-B168-AF44401E2BFD@mcs.anl.gov>
Message-ID: <1221228734.137542.1326397541461.JavaMail.root@zimbra.anl.gov>

ssh-cl worked for me going from communicado to both login.ci and
bridled.

I *assumed* it used my agent because I did not get a password prompt
from the swift run, and I don't get a password prompt when running the
ssh command line.

It failed when I tried to use coasters with either provider staging (to
login.mcs) or localhost/shared workdir (to login.ci).
The command line and stdout/err for the coaster/local-workdir case are
below. The logs are on ci net under ~wilde/swift/lab. The config and
sites files were:

com$ cat cf
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=0
lazy.errors=false
status.mode=provider
use.provider.staging=false
provider.staging.pin.swiftfiles=false

com$ cat sshcl.xml
/home/wilde/swiftwork
com$

com$ cat sshclcoast.xml
8
1
1
1
.01
10000
/home/wilde/swiftwork
com$

- Mike

com$ which swift
~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift
com$ pwd
/home/wilde/swift/lab
com$ swift -tc.file tc -sites.file sshcl.xml -config cf catsn.swift -n=1
Swift trunk swift-r5498 cog-r3347

RunID: 20120112-1343-a7mk2zyc
Progress: time: Thu, 12 Jan 2012 13:43:04 -0600
Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished successfully:1
com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf catsn.swift -n=1
Swift trunk swift-r5498 cog-r3347

RunID: 20120112-1343-ql7sn3f7
Progress: time: Thu, 12 Jan 2012 13:43:20 -0600
Failed to transfer wrapper log for job cat-ihhm6jlk
EXCEPTION Exception in cat:
Arguments: [data.txt]
Host: localhost
Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk
stderr.txt:

stdout.txt:

----

Caused by: null
Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not submit job
Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not start coaster service
Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Task ended before registration was received.
STDOUT: Failed to download bootstrap jar from http://communicado.ci.uchicago.edu:45621

STDERR: This machine accepts SSH public key and One Time Password (OTP) logins only.
If you do not have a public key set up, you will be prompted for a password.
This is *not* your CI password, but the One Time Password generated from your
OTP token. Do not type your CI password, it will not work.
If you do not have a token or public key, you will not be able to login.

See http://www.ci.uchicago.edu/faq for more information.

Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 1
Execution failed:
Job failed with an exit code of 1
com$

----- Original Message -----
> From: "Jonathan Monette"
> To: "Mihael Hategan"
> Cc: "Swift Devel", "Michael Wilde"
> Sent: Thursday, January 12, 2012 1:29:10 PM
> Subject: Re: [Swift-devel] command line ssh provider...
> Mike,
> You mentioned that you were able to use ssh command line provider
> using catsn this morning. Was it using agents? Mihael did you test
> using an agent? How do I specify for it to use an agent if available?
> I can do a simple hostname test from communicado to bridled but it
> asks for my password instead of using the agent I have set up.
>
>
> On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote:
>
> > ... is in trunk (cog r3347). I was able to start coasters with it.
> > The
> > provider is called "ssh-cl". It is ssh, so ~/.ssh/config and agents
> > will
> > apply. Please test.
> >
> > Mihael
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Thu Jan 12 20:34:36 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 12 Jan 2012 18:34:36 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1221228734.137542.1326397541461.JavaMail.root@zimbra.anl.gov>
References: <1221228734.137542.1326397541461.JavaMail.root@zimbra.anl.gov>
Message-ID: <1326422076.26278.0.camel@blabla>

Can't test it right now because UCDavis decided to firewall stuff, but I
do get the bootstrap script to start and it gets to the wget part.
So the question is, do you get a bootstrap log?

On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote:
> ssh-cl worked for me going from communicado to both login.ci and bridled.
>
> I *assumed* it used my agent because I did not get a password prompt from the swift run. And I dont get a password prompt when running the ssh command line.
>
> It failed when I tried to use coasters with either provider staging (to login.mcs) or localhost/shared workdir (to login.ci).
>
> The command line and stdout/err for the coaster/local-workdir case is below. The logs are on ci net under ~wilde/swift/lab. Config and sites file was:
>
> com$ cat cf
> wrapperlog.always.transfer=true
> sitedir.keep=true
> execution.retries=0
> lazy.errors=false
> status.mode=provider
> use.provider.staging=false
> provider.staging.pin.swiftfiles=false
>
> com$ cat sshcl.xml
>
> /home/wilde/swiftwork
>
> com$
>
> com$ cat sshclcoast.xml
>
> 8
> 1
> 1
> 1
> .01
> 10000
>
> /home/wilde/swiftwork
>
> com$
>
> - Mike
>
> com$ which swift
> ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift
> com$ pwd
> /home/wilde/swift/lab
> com$ swift -tc.file tc -sites.file sshcl.xml -config cf catsn.swift -n=1
> Swift trunk swift-r5498 cog-r3347
>
> RunID: 20120112-1343-a7mk2zyc
> Progress: time: Thu, 12 Jan 2012 13:43:04 -0600
> Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished successfully:1
> com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf catsn.swift -n=1
> Swift trunk swift-r5498 cog-r3347
>
> RunID: 20120112-1343-ql7sn3f7
> Progress: time: Thu, 12 Jan 2012 13:43:20 -0600
> Failed to transfer wrapper log for job cat-ihhm6jlk
> EXCEPTION Exception in cat:
> Arguments: [data.txt]
> Host: localhost
> Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk
> stderr.txt:
>
> stdout.txt:
>
> ----
>
> Caused by: null
> Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not submit job
> Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not start coaster service
> Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Task ended before registration was received.
> STDOUT: Failed to download bootstrap jar from http://communicado.ci.uchicago.edu:45621
>
> STDERR: This machine accepts SSH public key and One Time Password (OTP) logins only.
> If you do not have a public key set up, you will be prompted for a password.
> This is *not* your CI password, but the One Time Password generated from your
> OTP token. Do not type your CI password, it will not work. If you do not
> have a token or public key, you will not be able to login.
>
> See http://www.ci.uchicago.edu/faq for more information.
>
> Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 1
> Execution failed:
> Job failed with an exit code of 1
> com$
>
>
> ----- Original Message -----
> > From: "Jonathan Monette"
> > To: "Mihael Hategan"
> > Cc: "Swift Devel", "Michael Wilde"
> > Sent: Thursday, January 12, 2012 1:29:10 PM
> > Subject: Re: [Swift-devel] command line ssh provider...
> > Mike,
> > You mentioned that you were able to use ssh command line provider
> > using catsn this morning. Was it using agents? Mihael did you test
> > using an agent? How do I specify for it to use an agent if available?
> > I can do a simple hostname test from communicado to bridled but it
> > asks for my password instead of using the agent I have set up.
> >
> >
> > On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote:
> >
> > > ... is in trunk (cog r3347). I was able to start coasters with it.
> > > The
> > > provider is called "ssh-cl". It is ssh, so ~/.ssh/config and agents
> > > will
> > > apply. Please test.
> > >
> > > Mihael
> > >
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From wilde at mcs.anl.gov Thu Jan 12 21:19:36 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 12 Jan 2012 21:19:36 -0600 (CST)
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326422076.26278.0.camel@blabla>
Message-ID: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov>

The bootstrap log shows this:

com$ cat ~/coaster-bootstrap-1460623968.log
using plain mode
BS: http://communicado.ci.uchicago.edu:45621
Failed to download bootstrap jar from http://communicado.ci.uchicago.edu:45621
com$

- Mike

----- Original Message -----
> From: "Mihael Hategan"
> To: "Michael Wilde"
> Cc: "Jonathan Monette", "Swift Devel"
> Sent: Thursday, January 12, 2012 8:34:36 PM
> Subject: Re: [Swift-devel] command line ssh provider...
> Can't test it right now because UCDavis decided to firewall stuff, but I
> do get the bootstrap script to start and it gets to the wget part.
>
> So the question is, do you get a bootstrap log?
>
> On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote:
> > ssh-cl worked for me going from communicado to both login.ci and
> > bridled.
> >
> > I *assumed* it used my agent because I did not get a password prompt
> > from the swift run. And I dont get a password prompt when running
> > the ssh command line.
> >
> > It failed when I tried to use coasters with either provider staging
> > (to login.mcs) or localhost/shared workdir (to login.ci).
> >
> > The command line and stdout/err for the coaster/local-workdir case
> > is below. The logs are on ci net under ~wilde/swift/lab. Config and
> > sites file was:
> >
> > com$ cat cf
> > wrapperlog.always.transfer=true
> > sitedir.keep=true
> > execution.retries=0
> > lazy.errors=false
> > status.mode=provider
> > use.provider.staging=false
> > provider.staging.pin.swiftfiles=false
> >
> > com$ cat sshcl.xml
> >
> > /home/wilde/swiftwork
> >
> > com$
> >
> > com$ cat sshclcoast.xml
> >
> > jobmanager="ssh-cl:local"/>
> >
> > 8
> > 1
> > 1
> > 1
> > .01
> > 10000
> >
> > /home/wilde/swiftwork
> >
> > com$
> >
> > - Mike
> >
> > com$ which swift
> > ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift
> > com$ pwd
> > /home/wilde/swift/lab
> > com$ swift -tc.file tc -sites.file sshcl.xml -config cf catsn.swift
> > -n=1
> > Swift trunk swift-r5498 cog-r3347
> >
> > RunID: 20120112-1343-a7mk2zyc
> > Progress: time: Thu, 12 Jan 2012 13:43:04 -0600
> > Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished
> > successfully:1
> > com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf
> > catsn.swift -n=1
> > Swift trunk swift-r5498 cog-r3347
> >
> > RunID: 20120112-1343-ql7sn3f7
> > Progress: time: Thu, 12 Jan 2012 13:43:20 -0600
> > Failed to transfer wrapper log for job cat-ihhm6jlk
> > EXCEPTION Exception in cat:
> > Arguments: [data.txt]
> > Host: localhost
> > Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk
> > stderr.txt:
> >
> > stdout.txt:
> >
> > ----
> >
> > Caused by: null
> > Caused by:
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
> > Could not submit job
> > Caused by:
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
> > Could not start coaster service
> > Caused by:
> > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException:
> > Task ended before registration was received.
> > STDOUT: Failed to download bootstrap jar from > > http://communicado.ci.uchicago.edu:45621 > > > > STDERR: This machine accepts SSH public key and One Time Password > > (OTP) logins only. > > If you do not have a public key set up, you will be prompted for a > > password. > > This is *not* your CI password, but the One Time Password generated > > from your > > OTP token. Do not type your CI password, it will not work. If you do > > not > > have a token or public key, you will not be able to login. > > > > See http://www.ci.uchicago.edu/faq for more information. > > > > Caused by: > > org.globus.cog.abstraction.impl.common.execution.JobException: Job > > failed with an exit code of 1 > > Execution failed: > > Job failed with an exit code of 1 > > com$ > > > > > > ----- Original Message ----- > > > From: "Jonathan Monette" > > > To: "Mihael Hategan" > > > Cc: "Swift Devel" , "Michael Wilde" > > > > > > Sent: Thursday, January 12, 2012 1:29:10 PM > > > Subject: Re: [Swift-devel] command line ssh provider... > > > Mike, > > > You mentioned that you were able to use ssh command line provider > > > using catsn this morning. Was it using agents? Mihael did you test > > > using an agent? How do I specify for it to use an agent if > > > available? > > > I can do a simple hostname test from communicado to bridled but it > > > asks for my password instead of using the agent I have set up. > > > > > > > > > On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote: > > > > > > > ... is in trunk (cog r3347). I was able to start coasters with > > > > it. > > > > The > > > > provider is called "ssh-cl". It is ssh, so ~/.ssh/config and > > > > agents > > > > will > > > > apply. Please test. 
> > > >
> > > > Mihael
> > > >
> > > > _______________________________________________
> > > > Swift-devel mailing list
> > > > Swift-devel at ci.uchicago.edu
> > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Thu Jan 12 21:23:58 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 12 Jan 2012 21:23:58 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov>
References: <1326422076.26278.0.camel@blabla> <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov>
Message-ID:

This would be because of the firewall at communicado. You probably need to set
GLOBUS_TCP_PORT_RANGE=50000,51000

On Thu, Jan 12, 2012 at 9:19 PM, Michael Wilde wrote:

> The bootstrap log shows this:
>
> com$ cat ~/coaster-bootstrap-1460623968.log
> using plain mode
> BS: http://communicado.ci.uchicago.edu:45621
> Failed to download bootstrap jar from
> http://communicado.ci.uchicago.edu:45621
> com$
>
> - Mike

--
Ketan

From jonmon at mcs.anl.gov Thu Jan 12 21:24:54 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 12 Jan 2012 21:24:54 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov>
References: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov>
Message-ID: <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov>

I am getting a different problem. The provider does not seem to be using an agent.

Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu and then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh -A jonmon at bridled.ci.uchicago.edu in the terminal and none of them require a password.

However, if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A jonmon at communicado.ci.uchicago.edu, then start a Swift run that does a simple hostname call on bridled.ci.uchicago.edu, I am prompted for my CI password every time.

I am more than certain that this is a configuration issue so I ask for suggestions. My next step is to completely undo all my ssh keys in the authorized key files and start fresh with new keys and passphrases that are not in my macbook keychain.
I do not really want to revert my ssh configuration back to nothing, but this seems to be my only alternative. Any suggestions?

On Jan 12, 2012, at 9:19 PM, Michael Wilde wrote:

> The bootstrap log shows this:
>
> com$ cat ~/coaster-bootstrap-1460623968.log
> using plain mode
> BS: http://communicado.ci.uchicago.edu:45621
> Failed to download bootstrap jar from http://communicado.ci.uchicago.edu:45621
> com$
>
> - Mike

From ketancmaheshwari at gmail.com Thu Jan 12 21:42:41 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 12 Jan 2012 21:42:41 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov>
References: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov> <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov>
Message-ID:

Jon,

Do you have an auth.defaults file set up in ~/.ssh/? It has the following structure:

.type=key
.username=
.key=/path/to/key
.passphrase=

The file permissions should be 600.

Regards,
Ketan

On Thu, Jan 12, 2012 at 9:24 PM, Jonathan Monette wrote:

> I am getting a different problem. The provider does not seem to be using
> an agent.
>
> Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu and
> then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh -A
> jonmon at bridled.ci.uchicago.edu in the terminal and none of them require a
> password.
>
> However if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A
> jonmon at communicado.ci.uchicago.edu, then start a Swift run that does a
> simple hostname call on bridled.ci.uchicago.edu I am prompted for my ci
> password every time.
>
> I am more than certain that this is a configuration issue so I ask for
> suggestions. My next step is to completely undo all my ssh keys in the
> authorized key files and start fresh with new keys and passphrases that are
> not in my macbook keychain. I do not really want to basically revert back
> to nothing regarding ssh configuration but this seems to be my only
> alternative. Any suggestions?
--
Ketan

From jonmon at mcs.anl.gov Thu Jan 12 21:44:19 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 12 Jan 2012 21:44:19 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To:
References: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov> <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov>
Message-ID: <1CED5457-8158-4470-89E6-D3700EFC0C18@mcs.anl.gov>

I didn't think the ssh command line provider would be using auth.defaults. I thought the command line provider was created so that an auth.defaults would not be necessary. Am I wrong in thinking that, Mihael?

On Jan 12, 2012, at 9:42 PM, Ketan Maheshwari wrote:

> Jon,
>
> Do you have auth.defaults file in ~/.ssh/ set? It has the following structure:
>
> .type=key
> .username=
> .key=/path/to/key
> .passphrase=
>
> file perm should be 600
>
> Regards,
> Ketan
>
> On Thu, Jan 12, 2012 at 9:24 PM, Jonathan Monette wrote:
> I am getting a different problem. The provider does not seem to be using an agent.
>
> Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu and then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh -A jonmon at bridled.ci.uchicago.edu in the terminal and none of them require a password.
>
> However if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A jonmon at communicado.ci.uchicago.edu, then start a Swift run that does a simple hostname call on bridled.ci.uchicago.edu I am prompted for my ci password every time.
>
> I am more than certain that this is a configuration issue so I ask for suggestions. My next step is to completely undo all my ssh keys in the authorized key files and start fresh with new keys and passphrases that are not in my macbook keychain. I do not really want to basically revert back to nothing regarding ssh configuration but this seems to be my only alternative. Any suggestions?

From wilde at mcs.anl.gov Thu Jan 12 21:45:48 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 12 Jan 2012 21:45:48 -0600 (CST)
Subject: [Swift-devel] command line ssh provider...
In-Reply-To:
Message-ID: <713012293.138743.1326426347999.JavaMail.root@zimbra.anl.gov>

I thought that this new ssh-cl provider should not use auth.defaults. Jon, I would wait for ideas from Mihael before dismantling your ssh configuration.

- Mike

----- Original Message -----
> From: "Ketan Maheshwari"
> To: "Jonathan Monette"
> Cc: "Michael Wilde" , "Swift Devel"
> Sent: Thursday, January 12, 2012 9:42:41 PM
> Subject: Re: [Swift-devel] command line ssh provider...
> Jon,
>
> Do you have auth.defaults file in ~/.ssh/ set? It has the following
> structure:
>
> .type=key
> .username=
> .key=/path/to/key
> .passphrase=
>
> file perm should be 600
>
> Regards,
> Ketan
>
> On Thu, Jan 12, 2012 at 9:24 PM, Jonathan Monette < jonmon at mcs.anl.gov >
> wrote:
>
> I am getting a different problem. The provider does not seem to be
> using an agent.
>
> Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu and
> then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh -A
> jonmon at bridled.ci.uchicago.edu in the terminal and none of them
> require a password.
> > However if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A > jonmon at communicado.ci.uchicago.edu , then start a Swift run that does > a simple hostname call on bridled.ci.uchicago.edu I am prompted for my > ci password every time. > > I am more than certain that this is a configuration issue so I ask for > suggestions. My next step is to completely undo all my ssh keys in the > authorized key files and start fresh with new keys and passphrases > that are not in my macbook keychain. I do not really want to basically > revert back to nothing regarding ssh configuration but this seems to > be my only alternative. Any suggestions? > > > > > On Jan 12, 2012, at 9:19 PM, Michael Wilde wrote: > > > The boostrap log shows this: > > > > com$ cat ~/coaster-bootstrap-1460623968.log > > using plain mode > > BS: http://communicado.ci.uchicago.edu:45621 > > Failed to download bootstrap jar from > > http://communicado.ci.uchicago.edu:45621 > > com$ > > > > - Mike > > > > ----- Original Message ----- > >> From: "Mihael Hategan" < hategan at mcs.anl.gov > > >> To: "Michael Wilde" < wilde at mcs.anl.gov > > >> Cc: "Jonathan Monette" < jonmon at mcs.anl.gov >, "Swift Devel" < > >> swift-devel at ci.uchicago.edu > > >> Sent: Thursday, January 12, 2012 8:34:36 PM > >> Subject: Re: [Swift-devel] command line ssh provider... > >> Can't test it right now because UCDavis decided to firewall stuff, > >> but > >> I > >> do get the bootstrap script to start and it gets to the wget part. > >> > >> So the question is, do you get a bootstrap log? > >> > >> On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote: > >>> ssh-cl worked for me going from communicado to both login.ci and > >>> bridled. > >>> > >>> I *assumed* it used my agent because I did not get a password > >>> prompt > >>> from the swift run. And I dont get a password prompt when running > >>> the ssh command line. 
> >>> > >>> It failed when I tried to use coasters with either provider > >>> staging > >>> (to login.mcs) or localhost/shared workdir (to login.ci ). > >>> > >>> The command line and stdout/err for the coaster/local-workdir case > >>> is below. The logs are on ci net under ~wilde/swift/lab. Config > >>> and > >>> sites file was: > >>> > >>> com$ cat cf > >>> wrapperlog.always.transfer=true > >>> sitedir.keep=true > >>> execution.retries=0 > >>> lazy.errors=false > >>> status.mode=provider > >>> use.provider.staging=false > >>> provider.staging.pin.swiftfiles=false > >>> > >>> com$ cat sshcl.xml > >>> > >>> > >>> > >>> > >>> /home/wilde/swiftwork > >>> > >>> > >>> com$ > >>> > >>> com$ cat sshclcoast.xml > >>> > >>> > >>> >>> jobmanager="ssh-cl:local"/> > >>> > >>> 8 > >>> 1 > >>> 1 > >>> 1 > >>> .01 > >>> 10000 > >>> > >>> > >>> /home/wilde/swiftwork > >>> > >>> > >>> > >>> com$ > >>> > >>> > >>> > >>> - Mike > >>> > >>> com$ which swift > >>> ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift > >>> com$ pwd > >>> /home/wilde/swift/lab > >>> com$ swift -tc.file tc -sites.file sshcl.xml -config cf > >>> catsn.swift > >>> -n=1 > >>> Swift trunk swift-r5498 cog-r3347 > >>> > >>> RunID: 20120112-1343-a7mk2zyc > >>> Progress: time: Thu, 12 Jan 2012 13:43:04 -0600 > >>> Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished > >>> successfully:1 > >>> com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf > >>> catsn.swift -n=1 > >>> Swift trunk swift-r5498 cog-r3347 > >>> > >>> RunID: 20120112-1343-ql7sn3f7 > >>> Progress: time: Thu, 12 Jan 2012 13:43:20 -0600 > >>> Failed to transfer wrapper log for job cat-ihhm6jlk > >>> EXCEPTION Exception in cat: > >>> Arguments: [data.txt] > >>> Host: localhost > >>> Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk > >>> stderr.txt: > >>> > >>> stdout.txt: > >>> > >>> ---- > >>> > >>> Caused by: null > >>> Caused by: > >>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > >>> Could 
not submit job > >>> Caused by: > >>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > >>> Could not start coaster service > >>> Caused by: > >>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > >>> Task ended before registration was received. > >>> STDOUT: Failed to download bootstrap jar from > >>> http://communicado.ci.uchicago.edu:45621 > >>> > >>> STDERR: This machine accepts SSH public key and One Time Password > >>> (OTP) logins only. > >>> If you do not have a public key set up, you will be prompted for a > >>> password. > >>> This is *not* your CI password, but the One Time Password > >>> generated > >>> from your > >>> OTP token. Do not type your CI password, it will not work. If you > >>> do > >>> not > >>> have a token or public key, you will not be able to login. > >>> > >>> See http://www.ci.uchicago.edu/faq for more information. > >>> > >>> Caused by: > >>> org.globus.cog.abstraction.impl.common.execution.JobException: Job > >>> failed with an exit code of 1 > >>> Execution failed: > >>> Job failed with an exit code of 1 > >>> com$ > >>> > >>> > >>> ----- Original Message ----- > >>>> From: "Jonathan Monette" < jonmon at mcs.anl.gov > > >>>> To: "Mihael Hategan" < hategan at mcs.anl.gov > > >>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Michael > >>>> Wilde" > >>>> < wilde at mcs.anl.gov > > >>>> Sent: Thursday, January 12, 2012 1:29:10 PM > >>>> Subject: Re: [Swift-devel] command line ssh provider... > >>>> Mike, > >>>> You mentioned that you were able to use ssh command line provider > >>>> using catsn this morning. Was it using agents? Mihael did you > >>>> test > >>>> using an agent? How do I specify for it to use an agent if > >>>> available? > >>>> I can do a simple hostname test from communicado to bridled but > >>>> it > >>>> asks for my password instead of using the agent I have set up. > >>>> > >>>> > >>>> On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote: > >>>> > >>>>> ... 
is in trunk (cog r3347). I was able to start coasters with > >>>>> it. > >>>>> The > >>>>> provider is called "ssh-cl". It is ssh, so ~/.ssh/config and > >>>>> agents > >>>>> will > >>>>> apply. Please test. > >>>>> > >>>>> Mihael > >>>>> > >>>>> _______________________________________________ > >>>>> Swift-devel mailing list > >>>>> Swift-devel at ci.uchicago.edu > >>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >>> > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > -- > Ketan -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From jonmon at mcs.anl.gov Thu Jan 12 21:47:16 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Thu, 12 Jan 2012 21:47:16 -0600 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <713012293.138743.1326426347999.JavaMail.root@zimbra.anl.gov> References: <713012293.138743.1326426347999.JavaMail.root@zimbra.anl.gov> Message-ID: That is my plan. On Jan 12, 2012, at 9:45 PM, Michael Wilde wrote: > I thought that this new ssh-cl provider should not use auth.defaults. > > Jon, I would wait for ideas from Mihael before dismantling your ssh configuration. > > - Mike > > ----- Original Message ----- >> From: "Ketan Maheshwari" >> To: "Jonathan Monette" >> Cc: "Michael Wilde" , "Swift Devel" >> Sent: Thursday, January 12, 2012 9:42:41 PM >> Subject: Re: [Swift-devel] command line ssh provider... >> Jon, >> >> >> Do you have auth.defaults file in ~/.ssh/ set? 
It has the following >> structure: >> >> >> >> .type=key >> .username= >> .key=/path/to/key >> .passphrase= >> >> >> file perm should be 600 >> >> Regards, >> Ketan >> >> >> On Thu, Jan 12, 2012 at 9:24 PM, Jonathan Monette < jonmon at mcs.anl.gov >>> wrote: >> >> >> I am getting a different problem. The provider does not seem to be >> using an agent. >> >> Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu and >> then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh -A >> jonmon at bridled.ci.uchicago.edu in the terminal and none of them >> require a password. >> >> However if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A >> jonmon at communicado.ci.uchicago.edu , then start a Swift run that does >> a simple hostname call on bridled.ci.uchicago.edu I am prompted for my >> ci password every time. >> >> I am more than certain that this is a configuration issue so I ask for >> suggestions. My next step is to completely undo all my ssh keys in the >> authorized key files and start fresh with new keys and passphrases >> that are not in my macbook keychain. I do not really want to basically >> revert back to nothing regarding ssh configuration but this seems to >> be my only alternative. Any suggestions? >> >> >> >> >> On Jan 12, 2012, at 9:19 PM, Michael Wilde wrote: >> >>> The boostrap log shows this: >>> >>> com$ cat ~/coaster-bootstrap-1460623968.log >>> using plain mode >>> BS: http://communicado.ci.uchicago.edu:45621 >>> Failed to download bootstrap jar from >>> http://communicado.ci.uchicago.edu:45621 >>> com$ >>> >>> - Mike >>> >>> ----- Original Message ----- >>>> From: "Mihael Hategan" < hategan at mcs.anl.gov > >>>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>>> Cc: "Jonathan Monette" < jonmon at mcs.anl.gov >, "Swift Devel" < >>>> swift-devel at ci.uchicago.edu > >>>> Sent: Thursday, January 12, 2012 8:34:36 PM >>>> Subject: Re: [Swift-devel] command line ssh provider... 
>>>> Can't test it right now because UCDavis decided to firewall stuff, >>>> but >>>> I >>>> do get the bootstrap script to start and it gets to the wget part. >>>> >>>> So the question is, do you get a bootstrap log? >>>> >>>> On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote: >>>>> ssh-cl worked for me going from communicado to both login.ci and >>>>> bridled. >>>>> >>>>> I *assumed* it used my agent because I did not get a password >>>>> prompt >>>>> from the swift run. And I dont get a password prompt when running >>>>> the ssh command line. >>>>> >>>>> It failed when I tried to use coasters with either provider >>>>> staging >>>>> (to login.mcs) or localhost/shared workdir (to login.ci ). >>>>> >>>>> The command line and stdout/err for the coaster/local-workdir case >>>>> is below. The logs are on ci net under ~wilde/swift/lab. Config >>>>> and >>>>> sites file was: >>>>> >>>>> com$ cat cf >>>>> wrapperlog.always.transfer=true >>>>> sitedir.keep=true >>>>> execution.retries=0 >>>>> lazy.errors=false >>>>> status.mode=provider >>>>> use.provider.staging=false >>>>> provider.staging.pin.swiftfiles=false >>>>> >>>>> com$ cat sshcl.xml >>>>> >>>>> >>>>> >>>>> >>>>> /home/wilde/swiftwork >>>>> >>>>> >>>>> com$ >>>>> >>>>> com$ cat sshclcoast.xml >>>>> >>>>> >>>>> >>>> jobmanager="ssh-cl:local"/> >>>>> >>>>> 8 >>>>> 1 >>>>> 1 >>>>> 1 >>>>> .01 >>>>> 10000 >>>>> >>>>> >>>>> /home/wilde/swiftwork >>>>> >>>>> >>>>> >>>>> com$ >>>>> >>>>> >>>>> >>>>> - Mike >>>>> >>>>> com$ which swift >>>>> ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift >>>>> com$ pwd >>>>> /home/wilde/swift/lab >>>>> com$ swift -tc.file tc -sites.file sshcl.xml -config cf >>>>> catsn.swift >>>>> -n=1 >>>>> Swift trunk swift-r5498 cog-r3347 >>>>> >>>>> RunID: 20120112-1343-a7mk2zyc >>>>> Progress: time: Thu, 12 Jan 2012 13:43:04 -0600 >>>>> Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished >>>>> successfully:1 >>>>> com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf 
>>>>> catsn.swift -n=1 >>>>> Swift trunk swift-r5498 cog-r3347 >>>>> >>>>> RunID: 20120112-1343-ql7sn3f7 >>>>> Progress: time: Thu, 12 Jan 2012 13:43:20 -0600 >>>>> Failed to transfer wrapper log for job cat-ihhm6jlk >>>>> EXCEPTION Exception in cat: >>>>> Arguments: [data.txt] >>>>> Host: localhost >>>>> Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk >>>>> stderr.txt: >>>>> >>>>> stdout.txt: >>>>> >>>>> ---- >>>>> >>>>> Caused by: null >>>>> Caused by: >>>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: >>>>> Could not submit job >>>>> Caused by: >>>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: >>>>> Could not start coaster service >>>>> Caused by: >>>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: >>>>> Task ended before registration was received. >>>>> STDOUT: Failed to download bootstrap jar from >>>>> http://communicado.ci.uchicago.edu:45621 >>>>> >>>>> STDERR: This machine accepts SSH public key and One Time Password >>>>> (OTP) logins only. >>>>> If you do not have a public key set up, you will be prompted for a >>>>> password. >>>>> This is *not* your CI password, but the One Time Password >>>>> generated >>>>> from your >>>>> OTP token. Do not type your CI password, it will not work. If you >>>>> do >>>>> not >>>>> have a token or public key, you will not be able to login. >>>>> >>>>> See http://www.ci.uchicago.edu/faq for more information. 
>>>>> >>>>> Caused by: >>>>> org.globus.cog.abstraction.impl.common.execution.JobException: Job >>>>> failed with an exit code of 1 >>>>> Execution failed: >>>>> Job failed with an exit code of 1 >>>>> com$ >>>>> >>>>> >>>>> ----- Original Message ----- >>>>>> From: "Jonathan Monette" < jonmon at mcs.anl.gov > >>>>>> To: "Mihael Hategan" < hategan at mcs.anl.gov > >>>>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Michael >>>>>> Wilde" >>>>>> < wilde at mcs.anl.gov > >>>>>> Sent: Thursday, January 12, 2012 1:29:10 PM >>>>>> Subject: Re: [Swift-devel] command line ssh provider... >>>>>> Mike, >>>>>> You mentioned that you were able to use ssh command line provider >>>>>> using catsn this morning. Was it using agents? Mihael did you >>>>>> test >>>>>> using an agent? How do I specify for it to use an agent if >>>>>> available? >>>>>> I can do a simple hostname test from communicado to bridled but >>>>>> it >>>>>> asks for my password instead of using the agent I have set up. >>>>>> >>>>>> >>>>>> On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote: >>>>>> >>>>>>> ... is in trunk (cog r3347). I was able to start coasters with >>>>>>> it. >>>>>>> The >>>>>>> provider is called "ssh-cl". It is ssh, so ~/.ssh/config and >>>>>>> agents >>>>>>> will >>>>>>> apply. Please test. 
>>>>>>> >>>>>>> Mihael >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Swift-devel mailing list >>>>>>> Swift-devel at ci.uchicago.edu >>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>> >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> >> >> >> -- >> Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > From wilde at mcs.anl.gov Thu Jan 12 21:49:21 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 12 Jan 2012 21:49:21 -0600 (CST) Subject: [Swift-devel] command line ssh provider... In-Reply-To: Message-ID: <1005731577.138745.1326426561224.JavaMail.root@zimbra.anl.gov> Thanks, Ketan - yes, setting that env var gets me further. Now I realize that I also need a valid proxy for automatic coasters' use of GSI. I've created a proxy on both the client and the server, and set GLOBUS_TCP_PORT_RANGE and GLOBUS_TCP_SOURCE_RANGE on both sides. I also needed to set X509_CERT_DIR and CADIR on both sides because my default environment is getting an expired CA CRL somewhere. Now I'm getting a GSI failure further down, I think when the service tries to reach the client. I'm going to try again, being careful to create the proxies on both sides from a valid recent OSG release. - Mike ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Michael Wilde" > Cc: "Mihael Hategan" , "Swift Devel" > Sent: Thursday, January 12, 2012 9:23:58 PM > Subject: Re: [Swift-devel] command line ssh provider... > This would be because of firewall at communicado.
Probably need to set > GLOBUS_TCP_PORT_RANGE=50000,51000 > > > On Thu, Jan 12, 2012 at 9:19 PM, Michael Wilde < wilde at mcs.anl.gov > > wrote: > > > The boostrap log shows this: > > com$ cat ~/coaster-bootstrap-1460623968.log > using plain mode > BS: http://communicado.ci.uchicago.edu:45621 > > Failed to download bootstrap jar from > http://communicado.ci.uchicago.edu:45621 > com$ > > - Mike > > > > > ----- Original Message ----- > > From: "Mihael Hategan" < hategan at mcs.anl.gov > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > Cc: "Jonathan Monette" < jonmon at mcs.anl.gov >, "Swift Devel" < > > swift-devel at ci.uchicago.edu > > > Sent: Thursday, January 12, 2012 8:34:36 PM > > Subject: Re: [Swift-devel] command line ssh provider... > > Can't test it right now because UCDavis decided to firewall stuff, > > but > > I > > do get the bootstrap script to start and it gets to the wget part. > > > > So the question is, do you get a bootstrap log? > > > > On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote: > > > ssh-cl worked for me going from communicado to both login.ci and > > > bridled. > > > > > > I *assumed* it used my agent because I did not get a password > > > prompt > > > from the swift run. And I dont get a password prompt when running > > > the ssh command line. > > > > > > It failed when I tried to use coasters with either provider > > > staging > > > (to login.mcs) or localhost/shared workdir (to login.ci ). > > > > > > The command line and stdout/err for the coaster/local-workdir case > > > is below. The logs are on ci net under ~wilde/swift/lab. 
Config > > > and > > > sites file was: > > > > > > com$ cat cf > > > wrapperlog.always.transfer=true > > > sitedir.keep=true > > > execution.retries=0 > > > lazy.errors=false > > > status.mode=provider > > > use.provider.staging=false > > > provider.staging.pin.swiftfiles=false > > > > > > com$ cat sshcl.xml > > > > > > > > > > > > > > > /home/wilde/swiftwork > > > > > > > > > com$ > > > > > > com$ cat sshclcoast.xml > > > > > > > > > > > jobmanager="ssh-cl:local"/> > > > > > > 8 > > > 1 > > > 1 > > > 1 > > > .01 > > > 10000 > > > > > > > > > /home/wilde/swiftwork > > > > > > > > > > > > com$ > > > > > > > > > > > > - Mike > > > > > > com$ which swift > > > ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift > > > com$ pwd > > > /home/wilde/swift/lab > > > com$ swift -tc.file tc -sites.file sshcl.xml -config cf > > > catsn.swift > > > -n=1 > > > Swift trunk swift-r5498 cog-r3347 > > > > > > RunID: 20120112-1343-a7mk2zyc > > > Progress: time: Thu, 12 Jan 2012 13:43:04 -0600 > > > Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished > > > successfully:1 > > > com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf > > > catsn.swift -n=1 > > > Swift trunk swift-r5498 cog-r3347 > > > > > > RunID: 20120112-1343-ql7sn3f7 > > > Progress: time: Thu, 12 Jan 2012 13:43:20 -0600 > > > Failed to transfer wrapper log for job cat-ihhm6jlk > > > EXCEPTION Exception in cat: > > > Arguments: [data.txt] > > > Host: localhost > > > Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk > > > stderr.txt: > > > > > > stdout.txt: > > > > > > ---- > > > > > > Caused by: null > > > Caused by: > > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > > > Could not submit job > > > Caused by: > > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > > > Could not start coaster service > > > Caused by: > > > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > > > Task ended before registration was received. 
> > > STDOUT: Failed to download bootstrap jar from > > > http://communicado.ci.uchicago.edu:45621 > > > > > > STDERR: This machine accepts SSH public key and One Time Password > > > (OTP) logins only. > > > If you do not have a public key set up, you will be prompted for a > > > password. > > > This is *not* your CI password, but the One Time Password > > > generated > > > from your > > > OTP token. Do not type your CI password, it will not work. If you > > > do > > > not > > > have a token or public key, you will not be able to login. > > > > > > See http://www.ci.uchicago.edu/faq for more information. > > > > > > Caused by: > > > org.globus.cog.abstraction.impl.common.execution.JobException: Job > > > failed with an exit code of 1 > > > Execution failed: > > > Job failed with an exit code of 1 > > > com$ > > > > > > > > > ----- Original Message ----- > > > > From: "Jonathan Monette" < jonmon at mcs.anl.gov > > > > > To: "Mihael Hategan" < hategan at mcs.anl.gov > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Michael > > > > Wilde" > > > > < wilde at mcs.anl.gov > > > > > Sent: Thursday, January 12, 2012 1:29:10 PM > > > > Subject: Re: [Swift-devel] command line ssh provider... > > > > Mike, > > > > You mentioned that you were able to use ssh command line > > > > provider > > > > using catsn this morning. Was it using agents? Mihael did you > > > > test > > > > using an agent? How do I specify for it to use an agent if > > > > available? > > > > I can do a simple hostname test from communicado to bridled but > > > > it > > > > asks for my password instead of using the agent I have set up. > > > > > > > > > > > > On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote: > > > > > > > > > ... is in trunk (cog r3347). I was able to start coasters with > > > > > it. > > > > > The > > > > > provider is called "ssh-cl". It is ssh, so ~/.ssh/config and > > > > > agents > > > > > will > > > > > apply. Please test. 
> > > > > > > > > > Mihael > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > -- > Ketan -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From benc at hawaga.org.uk Fri Jan 13 03:00:07 2012 From: benc at hawaga.org.uk (Ben Clifford) Date: Fri, 13 Jan 2012 09:00:07 +0000 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov> References: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov> <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov> Message-ID: One guess, based only on reading this thread, is that the SSH_AUTH_SOCK environment variable from your login session (which tells the 'ssh' command-line program how to get back to the agent that it should use) is not getting passed all the way through swift and ssh-cl to the ssh command executed in there. I didn't look at the code, though, or try to determine the truth of this in any way. On Jan 13, 2012, at 3:24 AM, Jonathan Monette wrote: > I am getting a different problem. The provider does not seem to be using an agent. > > Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu and then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh -A jonmon at bridled.ci.uchicago.edu in the terminal and none of them require a password.
> > However if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A jonmon at communicado.ci.uchicago.edu, then start a Swift run that does a simple hostname call on bridled.ci.uchicago.edu I am prompted for my ci password every time. > > I am more than certain that this is a configuration issue so I ask for suggestions. My next step is to completely undo all my ssh keys in the authorized key files and start fresh with new keys and passphrases that are not in my macbook keychain. I do not really want to basically revert back to nothing regarding ssh configuration but this seems to be my only alternative. Any suggestions? > > On Jan 12, 2012, at 9:19 PM, Michael Wilde wrote: > >> The boostrap log shows this: >> >> com$ cat ~/coaster-bootstrap-1460623968.log >> using plain mode >> BS: http://communicado.ci.uchicago.edu:45621 >> Failed to download bootstrap jar from http://communicado.ci.uchicago.edu:45621 >> com$ >> >> - Mike >> >> ----- Original Message ----- >>> From: "Mihael Hategan" >>> To: "Michael Wilde" >>> Cc: "Jonathan Monette" , "Swift Devel" >>> Sent: Thursday, January 12, 2012 8:34:36 PM >>> Subject: Re: [Swift-devel] command line ssh provider... >>> Can't test it right now because UCDavis decided to firewall stuff, but >>> I >>> do get the bootstrap script to start and it gets to the wget part. >>> >>> So the question is, do you get a bootstrap log? >>> >>> On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote: >>>> ssh-cl worked for me going from communicado to both login.ci and >>>> bridled. >>>> >>>> I *assumed* it used my agent because I did not get a password prompt >>>> from the swift run. And I dont get a password prompt when running >>>> the ssh command line. >>>> >>>> It failed when I tried to use coasters with either provider staging >>>> (to login.mcs) or localhost/shared workdir (to login.ci). >>>> >>>> The command line and stdout/err for the coaster/local-workdir case >>>> is below. The logs are on ci net under ~wilde/swift/lab. 
Config and >>>> sites file was: >>>> >>>> com$ cat cf >>>> wrapperlog.always.transfer=true >>>> sitedir.keep=true >>>> execution.retries=0 >>>> lazy.errors=false >>>> status.mode=provider >>>> use.provider.staging=false >>>> provider.staging.pin.swiftfiles=false >>>> >>>> com$ cat sshcl.xml >>>> >>>> >>>> >>>> >>>> /home/wilde/swiftwork >>>> >>>> >>>> com$ >>>> >>>> com$ cat sshclcoast.xml >>>> >>>> >>>> >>> jobmanager="ssh-cl:local"/> >>>> >>>> 8 >>>> 1 >>>> 1 >>>> 1 >>>> .01 >>>> 10000 >>>> >>>> >>>> /home/wilde/swiftwork >>>> >>>> >>>> >>>> com$ >>>> >>>> >>>> >>>> - Mike >>>> >>>> com$ which swift >>>> ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift >>>> com$ pwd >>>> /home/wilde/swift/lab >>>> com$ swift -tc.file tc -sites.file sshcl.xml -config cf catsn.swift >>>> -n=1 >>>> Swift trunk swift-r5498 cog-r3347 >>>> >>>> RunID: 20120112-1343-a7mk2zyc >>>> Progress: time: Thu, 12 Jan 2012 13:43:04 -0600 >>>> Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished >>>> successfully:1 >>>> com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf >>>> catsn.swift -n=1 >>>> Swift trunk swift-r5498 cog-r3347 >>>> >>>> RunID: 20120112-1343-ql7sn3f7 >>>> Progress: time: Thu, 12 Jan 2012 13:43:20 -0600 >>>> Failed to transfer wrapper log for job cat-ihhm6jlk >>>> EXCEPTION Exception in cat: >>>> Arguments: [data.txt] >>>> Host: localhost >>>> Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk >>>> stderr.txt: >>>> >>>> stdout.txt: >>>> >>>> ---- >>>> >>>> Caused by: null >>>> Caused by: >>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: >>>> Could not submit job >>>> Caused by: >>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: >>>> Could not start coaster service >>>> Caused by: >>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: >>>> Task ended before registration was received. 
>>>> STDOUT: Failed to download bootstrap jar from >>>> http://communicado.ci.uchicago.edu:45621 >>>> >>>> STDERR: This machine accepts SSH public key and One Time Password >>>> (OTP) logins only. >>>> If you do not have a public key set up, you will be prompted for a >>>> password. >>>> This is *not* your CI password, but the One Time Password generated >>>> from your >>>> OTP token. Do not type your CI password, it will not work. If you do >>>> not >>>> have a token or public key, you will not be able to login. >>>> >>>> See http://www.ci.uchicago.edu/faq for more information. >>>> >>>> Caused by: >>>> org.globus.cog.abstraction.impl.common.execution.JobException: Job >>>> failed with an exit code of 1 >>>> Execution failed: >>>> Job failed with an exit code of 1 >>>> com$ >>>> >>>> >>>> ----- Original Message ----- >>>>> From: "Jonathan Monette" >>>>> To: "Mihael Hategan" >>>>> Cc: "Swift Devel" , "Michael Wilde" >>>>> >>>>> Sent: Thursday, January 12, 2012 1:29:10 PM >>>>> Subject: Re: [Swift-devel] command line ssh provider... >>>>> Mike, >>>>> You mentioned that you were able to use ssh command line provider >>>>> using catsn this morning. Was it using agents? Mihael did you test >>>>> using an agent? How do I specify for it to use an agent if >>>>> available? >>>>> I can do a simple hostname test from communicado to bridled but it >>>>> asks for my password instead of using the agent I have set up. >>>>> >>>>> >>>>> On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote: >>>>> >>>>>> ... is in trunk (cog r3347). I was able to start coasters with >>>>>> it. >>>>>> The >>>>>> provider is called "ssh-cl". It is ssh, so ~/.ssh/config and >>>>>> agents >>>>>> will >>>>>> apply. Please test. 
>>>>>> >>>>>> Mihael >>>>>> >>>>>> _______________________________________________ >>>>>> Swift-devel mailing list >>>>>> Swift-devel at ci.uchicago.edu >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>> >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > From hategan at mcs.anl.gov Fri Jan 13 03:12:44 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 13 Jan 2012 01:12:44 -0800 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <1CED5457-8158-4470-89E6-D3700EFC0C18@mcs.anl.gov> References: <1300975460.138708.1326424776491.JavaMail.root@zimbra.anl.gov> <137BC8A4-71EE-4E93-AF87-07E542C0AB45@mcs.anl.gov> <1CED5457-8158-4470-89E6-D3700EFC0C18@mcs.anl.gov> Message-ID: <1326445964.30161.1.camel@blabla> On Thu, 2012-01-12 at 21:44 -0600, Jonathan Monette wrote: > I didn't think the ssh commandline provider would be using > auth.defaults. I thought the command line provider was created so > that an auth.defaults would not be necessary. Am I wrong in thinking > that Mihael? You are correct. The command line ssh provider does not use auth.defaults. In fact, I'm thinking of changing the java ssh provider to use .ssh/config, given that it does mostly the same thing. But I digress. We need to troubleshoot your problem. Anything you can do with ssh on the command line you should be able to do with this new provider, given that it is ssh on the command line. From hategan at mcs.anl.gov Fri Jan 13 03:13:29 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 13 Jan 2012 01:13:29 -0800 Subject: [Swift-devel] command line ssh provider... 
In-Reply-To: <713012293.138743.1326426347999.JavaMail.root@zimbra.anl.gov> References: <713012293.138743.1326426347999.JavaMail.root@zimbra.anl.gov> Message-ID: <1326446009.30161.2.camel@blabla> On Thu, 2012-01-12 at 21:45 -0600, Michael Wilde wrote: > I thought that this new ssh-cl provider should not use auth.defaults. > > Jon, I would wait for ideas from Mihael before dismantling your ssh configuration. Yes. Don't dismantle just yet. And if you do, make a backup please. From wilde at mcs.anl.gov Fri Jan 13 06:47:08 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 13 Jan 2012 06:47:08 -0600 (CST) Subject: [Swift-devel] command line ssh provider... In-Reply-To: Message-ID: <1891742880.139077.1326458828682.JavaMail.root@zimbra.anl.gov> I ssh to communicado from my mac using the following command: ssh -A -t login.ci.uchicago.edu ssh -A -t communicado.ci.uchicago.edu then I get the following ssh env vars, and the basic ssh-cl provider seems to work: com$ env | grep -i ssh SSH_CLIENT=128.135.125.155 47429 22 SSH_TTY=/dev/pts/0 SSH_AUTH_SOCK=/tmp/ssh-iGZFq22173/agent.22173 SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass CVS_RSH=cvs-ssh SSH_CONNECTION=128.135.125.155 47429 128.135.125.17 22 com$ export | grep -i ssh declare -x CVS_RSH="cvs-ssh" declare -x SSH_ASKPASS="/usr/libexec/openssh/gnome-ssh-askpass" declare -x SSH_AUTH_SOCK="/tmp/ssh-iGZFq22173/agent.22173" declare -x SSH_CLIENT="128.135.125.155 47429 22" declare -x SSH_CONNECTION="128.135.125.155 47429 128.135.125.17 22" declare -x SSH_TTY="/dev/pts/0" com$ (I still have problems, unrelated I think, with getting coasters to work with ssh-cl). - Mike ----- Original Message ----- > From: "Ben Clifford" > To: "Jonathan Monette" > Cc: "Michael Wilde" , "Swift Devel" > Sent: Friday, January 13, 2012 3:00:07 AM > Subject: Re: [Swift-devel] command line ssh provider... 
> one guess, based only on reading this thread, is that the SSH_AGENT > environment variable from your login session (which tells the 'ssh' > commandline program how to get back to the agent that it should use) > is not getting passed all the way through swift and ssh-ci to the ssh > command executed in there. I didn't look at the code, though, or try > to determine the truth of this in any way. > > On Jan 13, 2012, at 3:24 AM, Jonathan Monette wrote: > > > I am getting a different problem. The provider does not seem to be > > using an agent. > > > > Starting from my macbook I can ssh -A jonmon at login.ci.uchicago.edu > > and then do ssh -A jonmon at communicado.ci.uchicago.edu and then ssh > > -A jonmon at bridled.ci.uchicago.edu in the terminal and none of them > > require a password. > > > > However if I ssh -A jonmon at login.ci.uchicago.edu and then ssh -A > > jonmon at communicado.ci.uchicago.edu, then start a Swift run that does > > a simple hostname call on bridled.ci.uchicago.edu I am prompted for > > my ci password every time. > > > > I am more than certain that this is a configuration issue so I ask > > for suggestions. My next step is to completely undo all my ssh keys > > in the authorized key files and start fresh with new keys and > > passphrases that are not in my macbook keychain. I do not really > > want to basically revert back to nothing regarding ssh configuration > > but this seems to be my only alternative. Any suggestions? 
> > > > On Jan 12, 2012, at 9:19 PM, Michael Wilde wrote: > > > >> The boostrap log shows this: > >> > >> com$ cat ~/coaster-bootstrap-1460623968.log > >> using plain mode > >> BS: http://communicado.ci.uchicago.edu:45621 > >> Failed to download bootstrap jar from > >> http://communicado.ci.uchicago.edu:45621 > >> com$ > >> > >> - Mike > >> > >> ----- Original Message ----- > >>> From: "Mihael Hategan" > >>> To: "Michael Wilde" > >>> Cc: "Jonathan Monette" , "Swift Devel" > >>> > >>> Sent: Thursday, January 12, 2012 8:34:36 PM > >>> Subject: Re: [Swift-devel] command line ssh provider... > >>> Can't test it right now because UCDavis decided to firewall stuff, > >>> but > >>> I > >>> do get the bootstrap script to start and it gets to the wget part. > >>> > >>> So the question is, do you get a bootstrap log? > >>> > >>> On Thu, 2012-01-12 at 13:45 -0600, Michael Wilde wrote: > >>>> ssh-cl worked for me going from communicado to both login.ci and > >>>> bridled. > >>>> > >>>> I *assumed* it used my agent because I did not get a password > >>>> prompt > >>>> from the swift run. And I dont get a password prompt when running > >>>> the ssh command line. > >>>> > >>>> It failed when I tried to use coasters with either provider > >>>> staging > >>>> (to login.mcs) or localhost/shared workdir (to login.ci). > >>>> > >>>> The command line and stdout/err for the coaster/local-workdir > >>>> case > >>>> is below. The logs are on ci net under ~wilde/swift/lab. 
Config > >>>> and > >>>> sites file was: > >>>> > >>>> com$ cat cf > >>>> wrapperlog.always.transfer=true > >>>> sitedir.keep=true > >>>> execution.retries=0 > >>>> lazy.errors=false > >>>> status.mode=provider > >>>> use.provider.staging=false > >>>> provider.staging.pin.swiftfiles=false > >>>> > >>>> com$ cat sshcl.xml > >>>> > >>>> > >>>> > >>>> > >>>> /home/wilde/swiftwork > >>>> > >>>> > >>>> com$ > >>>> > >>>> com$ cat sshclcoast.xml > >>>> > >>>> > >>>> >>>> jobmanager="ssh-cl:local"/> > >>>> > >>>> 8 > >>>> 1 > >>>> 1 > >>>> 1 > >>>> .01 > >>>> 10000 > >>>> > >>>> > >>>> /home/wilde/swiftwork > >>>> > >>>> > >>>> > >>>> com$ > >>>> > >>>> > >>>> > >>>> - Mike > >>>> > >>>> com$ which swift > >>>> ~/swift/src/trunk/cog/modules/swift/dist/swift-svn/bin/swift > >>>> com$ pwd > >>>> /home/wilde/swift/lab > >>>> com$ swift -tc.file tc -sites.file sshcl.xml -config cf > >>>> catsn.swift > >>>> -n=1 > >>>> Swift trunk swift-r5498 cog-r3347 > >>>> > >>>> RunID: 20120112-1343-a7mk2zyc > >>>> Progress: time: Thu, 12 Jan 2012 13:43:04 -0600 > >>>> Final status: Thu, 12 Jan 2012 13:43:04 -0600 Finished > >>>> successfully:1 > >>>> com$ swift -tc.file tc -sites.file sshclcoast.xml -config cf > >>>> catsn.swift -n=1 > >>>> Swift trunk swift-r5498 cog-r3347 > >>>> > >>>> RunID: 20120112-1343-ql7sn3f7 > >>>> Progress: time: Thu, 12 Jan 2012 13:43:20 -0600 > >>>> Failed to transfer wrapper log for job cat-ihhm6jlk > >>>> EXCEPTION Exception in cat: > >>>> Arguments: [data.txt] > >>>> Host: localhost > >>>> Directory: catsn-20120112-1343-ql7sn3f7/jobs/i/cat-ihhm6jlk > >>>> stderr.txt: > >>>> > >>>> stdout.txt: > >>>> > >>>> ---- > >>>> > >>>> Caused by: null > >>>> Caused by: > >>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > >>>> Could not submit job > >>>> Caused by: > >>>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > >>>> Could not start coaster service > >>>> Caused by: > >>>> 
org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: > >>>> Task ended before registration was received. > >>>> STDOUT: Failed to download bootstrap jar from > >>>> http://communicado.ci.uchicago.edu:45621 > >>>> > >>>> STDERR: This machine accepts SSH public key and One Time Password > >>>> (OTP) logins only. > >>>> If you do not have a public key set up, you will be prompted for > >>>> a > >>>> password. > >>>> This is *not* your CI password, but the One Time Password > >>>> generated > >>>> from your > >>>> OTP token. Do not type your CI password, it will not work. If you > >>>> do > >>>> not > >>>> have a token or public key, you will not be able to login. > >>>> > >>>> See http://www.ci.uchicago.edu/faq for more information. > >>>> > >>>> Caused by: > >>>> org.globus.cog.abstraction.impl.common.execution.JobException: > >>>> Job > >>>> failed with an exit code of 1 > >>>> Execution failed: > >>>> Job failed with an exit code of 1 > >>>> com$ > >>>> > >>>> > >>>> ----- Original Message ----- > >>>>> From: "Jonathan Monette" > >>>>> To: "Mihael Hategan" > >>>>> Cc: "Swift Devel" , "Michael Wilde" > >>>>> > >>>>> Sent: Thursday, January 12, 2012 1:29:10 PM > >>>>> Subject: Re: [Swift-devel] command line ssh provider... > >>>>> Mike, > >>>>> You mentioned that you were able to use ssh command line > >>>>> provider > >>>>> using catsn this morning. Was it using agents? Mihael did you > >>>>> test > >>>>> using an agent? How do I specify for it to use an agent if > >>>>> available? > >>>>> I can do a simple hostname test from communicado to bridled but > >>>>> it > >>>>> asks for my password instead of using the agent I have set up. > >>>>> > >>>>> > >>>>> On Jan 12, 2012, at 12:21 AM, Mihael Hategan wrote: > >>>>> > >>>>>> ... is in trunk (cog r3347). I was able to start coasters with > >>>>>> it. > >>>>>> The > >>>>>> provider is called "ssh-cl". It is ssh, so ~/.ssh/config and > >>>>>> agents > >>>>>> will > >>>>>> apply. Please test. 
> >>>>>>
> >>>>>> Mihael
> >>>>>>
> >>>>>> _______________________________________________
> >>>>>> Swift-devel mailing list
> >>>>>> Swift-devel at ci.uchicago.edu
> >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >>>>
> >>
> >> --
> >> Michael Wilde
> >> Computation Institute, University of Chicago
> >> Mathematics and Computer Science Division
> >> Argonne National Laboratory
> >>
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From benc at hawaga.org.uk Fri Jan 13 08:26:59 2012
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 13 Jan 2012 14:26:59 +0000
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1891742880.139077.1326458828682.JavaMail.root@zimbra.anl.gov>
References: <1891742880.139077.1326458828682.JavaMail.root@zimbra.anl.gov>
Message-ID:

SSH_AUTH_SOCK is the variable I intended to refer to. But if that's working for you, then my suggestion probably isn't the problem...
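
One way to test the SSH_AUTH_SOCK hypothesis is to check, in the same shell that launches Swift, whether the variable is set and still points at a live socket. The sketch below is only an illustration: agent_socket_state is a made-up helper name, not part of Swift or OpenSSH, and it inspects the environment of the current shell only, not what the ssh-cl provider hands to the ssh process it starts.

```shell
#!/bin/sh
# Report whether the environment points at a usable ssh-agent socket.
# agent_socket_state is a hypothetical helper, not part of Swift or OpenSSH.
agent_socket_state() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "unset"      # no agent advertised in this environment
    elif [ -S "$SSH_AUTH_SOCK" ]; then
        echo "socket"     # the path exists and is a socket
    else
        echo "stale"      # variable is set, but the socket is gone
    fi
}

agent_socket_state
```

If this prints "socket", running ssh-add -l from the same shell should list the forwarded identities. A result of "unset" or "stale" would be consistent with the password prompts described in this thread, although other causes are possible.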

From wilde at mcs.anl.gov Fri Jan 13 18:00:01 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 13 Jan 2012 18:00:01 -0600 (CST)
Subject: [Swift-devel] command line ssh provider...
In-Reply-To:
Message-ID: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>

Latest update on this:

I was trying to get coasters to work with jobmanager ssh-cl:local, communicado to bridled. Its close now.

You need to:

- make sure you set the right hostname in sites.xml :)

- create valid x509 proxies on both sides
  -- I sourced /opt/osg/setup.sh and then ran grid-proxy-init manually
  -- also I *think* you need to source this in your .bashrc or equiv so that the remote side gets the right CADIR in its env

- set GLOBUS_TCP_PORT_RANGE=50000,51000
  -- Mihael says this should get exported from client
  -- I added it to .bashrc to be sure it was set on both sides

Once I had done that, coasters booted OK.

Then I hit a suspected problem in the local provider: it was not accepting jobs, but seemed set up OK. Mihael is investigating.

It's very likely that this *will* work end to end, e.g. from communicado to PADS using ssh-cl:pbs.

Another good test is to access e.g. surveyor and intrepid using an OTP via ssh-cl.

- Mike

----- Original Message -----
> From: "Ben Clifford"
> To: "Michael Wilde"
> Cc: "Swift Devel" , "Jonathan Monette"
> Sent: Friday, January 13, 2012 8:26:59 AM
> Subject: Re: [Swift-devel] command line ssh provider...
> SSH_AUTH_SOCK is the variable I intended to refer to. But if that's
> working for you, then my suggestion probably isn't the problem...

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Fri Jan 13 18:09:18 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 13 Jan 2012 16:09:18 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
Message-ID: <1326499758.1063.1.camel@blabla>

On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote:
> Another good test is to access eg surveyor, and intrepid using an OTP via ssh-cl.

A word of caution there: if the ssh client asks for the password on the command line (instead of through ssh-askpass or some other gui), things won't work very well. It might be possible to add some detection for that in the provider, but that's not a high priority given that there is a workaround (askpass).
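
Two environment settings come up in this exchange: the GLOBUS_TCP_PORT_RANGE=50000,51000 value from the checklist, and the askpass workaround for password prompts. A rough pre-flight check for both might look like the sketch below. The helper names valid_port_range and askpass_ready are hypothetical and belong to neither Swift, CoG nor OpenSSH. The askpass test also rests on an assumption: that the ssh client consults SSH_ASKPASS only when DISPLAY is set and no terminal is attached, as older OpenSSH releases do.

```shell
#!/bin/sh
# Pre-flight checks for two environment settings mentioned in this thread.
# Both helper names are hypothetical; neither is part of Swift, CoG or OpenSSH.

# Succeeds if $1 looks like "min,max" with 1 <= min <= max <= 65535,
# the form used for GLOBUS_TCP_PORT_RANGE=50000,51000.
valid_port_range() {
    case "$1" in
        ""|*[!0-9,]*|*,*,*|,*|*,) return 1 ;;   # empty, junk, or misplaced commas
    esac
    case "$1" in
        *,*) ;;                                 # exactly one comma remains possible
        *) return 1 ;;                          # a single number is not a range
    esac
    lo=${1%,*}
    hi=${1#*,}
    [ "$lo" -ge 1 ] && [ "$hi" -le 65535 ] && [ "$lo" -le "$hi" ]
}

# Succeeds if SSH_ASKPASS names an executable file and DISPLAY is set
# (assumption: ssh only uses askpass under those conditions).
askpass_ready() {
    [ -n "${SSH_ASKPASS:-}" ] && [ -x "$SSH_ASKPASS" ] && [ -n "${DISPLAY:-}" ]
}

if valid_port_range "${GLOBUS_TCP_PORT_RANGE:-}"; then
    echo "port range ok: $GLOBUS_TCP_PORT_RANGE"
else
    echo "GLOBUS_TCP_PORT_RANGE missing or malformed"
fi

if askpass_ready; then
    echo "askpass available: $SSH_ASKPASS"
else
    echo "no usable askpass; a terminal password prompt is possible"
fi
```

Since the checklist sets the port range on both sides, the same script would need to run on the client and on the remote host. It only reports what the current shell sees and does not prove that Swift or the coaster service inherits the same values.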

From iraicu at cs.iit.edu Sat Jan 14 08:10:30 2012
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sat, 14 Jan 2012 08:10:30 -0600
Subject: [Swift-devel] CFP: ACM HPDC 2012, abstracts due January 16th, 2012
Message-ID: <4F118CD6.9090905@cs.iit.edu>

**** CALL FOR PAPERS ****

The 21st International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'12)
Delft University of Technology, Delft, the Netherlands
June 18-22, 2012
http://www.hpdc.org/2012

The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) is the premier annual conference on the design, the implementation, the evaluation, and the use of parallel and distributed systems for high-end computing. HPDC'12 will take place in Delft, the Netherlands, a historical, picturesque city that is less than one hour away from Amsterdam-Schiphol airport. The conference will be held on June 20-22 (Wednesday to Friday), with affiliated workshops taking place on June 18-19 (Monday and Tuesday).

**** SUBMISSION DEADLINES ****

Abstracts: 16 January 2012
Papers: 23 January 2012 (No extensions!)

**** HPDC'12 GENERAL CHAIR ****

Dick Epema, Delft University of Technology, Delft, the Netherlands

**** HPDC'12 PROGRAM CO-CHAIRS ****

Thilo Kielmann, Vrije Universiteit, Amsterdam, the Netherlands
Matei Ripeanu, The University of British Columbia, Vancouver, Canada

**** HPDC'12 WORKSHOPS CHAIR ****

Alexandru Iosup, Delft University of Technology, Delft, the Netherlands

**** SCOPE AND TOPICS ****

Submissions are welcomed on all forms of high-performance parallel and distributed computing, including but not limited to clusters, clouds, grids, utility computing, data-intensive computing, and massively multicore systems. Submissions that explore solutions to estimate and reduce the energy footprint of such systems are particularly encouraged. All papers will be evaluated for their originality, potential impact, correctness, quality of presentation, appropriate presentation of related work, and relevance to the conference, with a strong preference for rigorous results obtained in operational parallel and distributed systems.

The topics of interest of the conference include, but are not limited to, the following, in the context of high-performance parallel and distributed computing:

- Systems, networks, and architectures for high-end computing
- Massively multicore systems
- Virtualization of machines, networks, and storage
- Programming languages and environments
- I/O, storage systems, and data management
- Resource management, energy and cost minimizations
- Performance modeling and analysis
- Fault tolerance, reliability, and availability
- Data-intensive computing
- Applications of parallel and distributed computing

**** PAPER SUBMISSION GUIDELINES ****

Authors are invited to submit technical papers of at most 12 pages in PDF format, including figures and references. Papers should be formatted in the ACM Proceedings Style and submitted via the conference web site. No changes to the margins, spacing, or font sizes as specified by the style file are allowed. Accepted papers will appear in the conference proceedings, and will be incorporated into the ACM Digital Library. A limited number of papers will be accepted as posters.

Papers must be self-contained and provide the technical substance required for the program committee to evaluate their contributions. Submitted papers must be original work that has not appeared in and is not under consideration for another conference or a journal. See the ACM Prior Publication Policy for more details.

**** IMPORTANT DATES ****

Abstracts Due: 16 January 2012
Papers Due: 23 January 2012 (No extensions!)
Reviews Released to Authors: 8 March 2012
Author Rebuttals Due: 12 March 2012
Author Notifications: 19 March 2012
Final Papers Due: 16 April 2012
Conference Dates: 18-22 June 2012

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================

From iraicu at cs.iit.edu Sat Jan 14 12:00:43 2012
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sat, 14 Jan 2012 12:00:43 -0600
Subject: [Swift-devel] CFP: IEEE eScience 2012 in Chicago IL USA
Message-ID: <4F11C2CB.1070809@cs.iit.edu>

Call for Papers
8th IEEE International Conference on eScience
October 8-12, 2012
Chicago, IL, USA

Researchers in all disciplines are increasingly adopting digital tools, techniques and practices, often in communities and projects that span disciplines, laboratories, organizations, and national boundaries. The eScience 2012 conference is designed to bring together leading international and interdisciplinary research communities, developers, and users of eScience applications and enabling IT technologies. The conference serves as a forum to present the results of the latest applications research and product/tool developments and to highlight related activities from around the world.
Also, we are now entering the second decade of eScience and the 2012 conference gives an opportunity to take stock of what has been achieved so far and look forward to the challenges and opportunities the next decade will bring. A special emphasis of the 2012 conference is on advances in the application of technology in a particular discipline. Accordingly, significant advances in applications science and technology will be considered as important as the development of new technologies themselves. Further, we welcome contributions in educational activities under any of these disciplines. As a result, the conference will be structured around two e-Science tracks: * *eScience Algorithms and Applications* o eScience application areas, including: + Physical sciences + Biomedical sciences + Social sciences and humanities o Data-oriented approaches and applications o Compute-oriented approaches and applications o Extreme scale approaches and applications * *Cyberinfrastructure to support eScience* o Novel hardware o Novel uses of production infrastructure o Software and services o Tools The conference proceedings will be published by the IEEE Computer Society Press, USA and will be made available online through the IEEE Digital Library. Selected papers will be invited to submit extended versions to a special issue of the Future Generation Computer Systems (FGCS) journal. SUBMISSION PROCESS Authors are invited to submit papers with unpublished, original work of not more than 8 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages, as per IEEE 8.5 x 11 manuscript guidelines. (Up to 2 additional pages may be purchased for US$150/page) Templates are available from http://www.ieee.org/conferences_events/conferences/publishing/templates.html. 
Authors should submit a PDF file that will print on a PostScript printer to https://www.easychair.org/conferences/?conf=escience2012 (Note that paper submitters also must submit an abstract in advance of the paper deadline. This should be done through the same site where papers are submitted.) It is a requirement that at least one author of each accepted paper attend the conference. ORGANIZATION General Chair * *Ian Foster*, University of Chicago & Argonne National Laboratory, USA Program Co-Chairs * *Daniel S. Katz*, University of Chicago & Argonne National Laboratory, USA * *Heinz Stockinger*, SIB Swiss Institute of Bioinformatics, Switzerland Program Vice Co-Chairs * eScience Algorithms and Applications Track o *David Abramson*, Monash University, Australia o *Gabrielle Allen*, Louisiana State University, USA * Cyberinfrastructure to support eScience Track o *Rosa M. Badia*, Barcelona Supercomputing Center / CSIC, Spain o *Geoffrey Fox*, Indiana University, USA Sponsorship Chair * *Charlie Catlett*, Argonne National Laboratory, USA Conference Manager and Finance Chair * *Julie Wulf-Knoerzer*, University of Chicago & Argonne National Laboratory, USA Publicity Chairs * *Kento Aida*, National Institute of Informatics, Japan * *Ioan Raicu*, Illinois Institute of Technology, USA * *David Wallom*, Oxford e-Research Centre, UK Local Organizing Committee * *Ninfa Mayorga*, University of Chicago, USA * *Evelyn Rayburn*, University of Chicago, USA * *Lynn Valentini*, Argonne National Laboratory, USA Program Committee * eScience Algorithms and Applications Track o *Srinivas Aluru*, Iowa State University, USA o *Ashiq Anjum*, University of Derby, UK o *David A. 
Bader*, Georgia Institute of Technology, USA o *Jon Blower*, University of Reading, UK o *Paul Bonnington*, Monash University, Australia o *Simon Cox*, University of Southampton, UK o *David De Roure*, Oxford e-Research Centre, UK o *George Djorgovski*, California Institute of Technology, USA o *Anshu Dubey*, University of Chicago & Argonne National Laboratory, USA o *Yuri Estrin*, Monash University, Australia o *Dan Fay*, Microsoft, USA o *Jeremy Frey*, University of Southampton, UK o *Wolfgang Gentzsch*, HPC Consultant, Germany o *Lutz Gross*, The University of Queensland, Australia o *Sverker Holmgren*, Uppsala University, Sweden o *Bill Howe*, University of Washington, USA o *Marina Jirotka*, University of Oxford, UK o *Timoleon Kipouros*, University of Cambridge, UK o *Kerstin Kleese van Dam*, Pacific Northwest National Laboratory, USA o *Arun S. Konagurthu*, Monash University, Australia o *Peter Kunszt*, SystemsX.ch, Switzerland o *Alexey Lastovetsky*, University College Dublin, Ireland o *Andrew Lewis*, Griffith University, Australia o *Sergio Maffioletti*, University of Zurich, Switzerland o *Amitava Majumdar*, San Diego Supercomputer Center, University of California at San Diego, USA o *Rui Mao*, Shenzhen University, China o *Madhav V. Marathe*, Virginia Tech, USA o *Maryann Martone*, University of California at San Diego, USA o *Louis Moresi*, Monash University, Australia o *Riccardo Murri*, University of Zurich, Switzerland o *Silvia D. Olabarriaga*, Academic Medical Center of the University of Amsterdam, Netherlands o *Enrique S.
Quintana-Ortí*, Universidad Jaume I, Spain o *Abani Patra*, University at Buffalo, USA o *Rob Pennington*, NSF, USA o *Andrew Perry*, Monash University, Australia o *Beth Plale*, Indiana University, USA o *Michael Resch*, University of Stuttgart, Germany o *Adrian Sandu*, Virginia Tech, USA o *Mark Savill*, Cranfield University, UK o *Erik Schnetter*, Perimeter Institute for Theoretical Physics, Canada o *Edward Seidel*, Louisiana State University, USA o *Suzanne M. Shontz*, The Pennsylvania State University, USA o *David Skinner*, Lawrence Berkeley National Laboratory, USA o *Alan Sussman*, University of Maryland, USA o *Alex Szalay*, Johns Hopkins University, USA o *Domenico Talia*, ICAR-CNR & University of Calabria, Italy o *Jian Tao*, Louisiana State University, USA o *David Wallom*, Oxford e-Research Centre, UK o *Shaowen Wang*, University of Illinois at Urbana-Champaign, USA o *Michael Wilde*, Argonne National Laboratory & University of Chicago, USA o *Nancy Wilkins-Diehr*, San Diego Supercomputer Center, University of California at San Diego, USA o *Wu Zhang*, Shanghai University, China o *Yunquan Zhang*, Chinese Academy of Sciences, China * Cyberinfrastructure to support eScience Track o *Deb Agarwal*, Lawrence Berkeley National Laboratory, USA o *Ilkay Altintas*, San Diego Supercomputer Center, University of California at San Diego, USA o *Henri Bal*, Vrije Universiteit, Netherlands o *Roger Barga*, Microsoft, USA o *Martin Berzins*, University of Utah, USA o *John Brooke*, University of Manchester, UK o *Thomas Fahringer*, University of Innsbruck, Austria o *Gilles Fedak*, INRIA, France o *José A. B.
Fortes*, University of Florida, USA o *Yolanda Gil*, ISI/USC, USA o *Madhusudhan Govindaraju*, SUNY Binghamton, USA o *Thomas Hacker*, Purdue University, USA o *Ken Hawick*, Massey University, New Zealand o *Marty Humphrey*, University of Virginia, USA o *Hai Jin*, Huazhong University of Science and Technology, China o *Thilo Kielmann*, Vrije Universiteit, Netherlands o *Scott Klasky*, Oak Ridge National Laboratory, USA o *Isao Kojima*, AIST, Japan o *Tevfik Kosar*, University at Buffalo, USA o *Dieter Kranzlmueller*, LMU & LRZ Munich, Germany o *Erwin Laure*, KTH, Sweden o *Jysoo Lee*, KISTI, Korea o *Li Xiaoming*, Peking University, China o *Bertram Ludäscher*, University of California, Davis, USA o *Andrew Lumsdaine*, Indiana University, USA o *Tanu Malik*, University of Chicago, USA o *Satoshi Matsuoka*, Tokyo Institute of Technology, Japan o *Reagan Moore*, University of North Carolina at Chapel Hill, USA o *Shirley Moore*, University of Kentucky, USA o *Steven Newhouse*, EGI, Netherlands o *Dhabaleswar K. (DK) Panda*, The Ohio State University, USA o *Manish Parashar*, Rutgers University, USA o *Ron Perrott*, University of Oxford, UK o *Depei Qian*, Beihang University, China o *Judy Qiu*, Indiana University, USA o *Ioan Raicu*, Illinois Institute of Technology, USA o *Lavanya Ramakrishnan*, Lawrence Berkeley National Laboratory, USA o *Omer Rana*, Cardiff University, UK o *Paul Roe*, Queensland University of Technology, Australia o *Bruno Schulze*, LNCC, Brazil o *Marc Snir*, Argonne National Laboratory & University of Illinois at Urbana-Champaign, USA o *Xian-He Sun*, Illinois Institute of Technology, USA o *Yoshio Tanaka*, AIST, Japan o *Michela Taufer*, University of Delaware, USA o *Kerry Taylor*, CSIRO, Australia o *Douglas Thain*, University of Notre Dame, USA o *Paul Watson*, Newcastle University, UK o *Jun Zhao*, University of Oxford, UK -- ================================================================= Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From iraicu at cs.iit.edu Sat Jan 14 21:58:00 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sat, 14 Jan 2012 21:58:00 -0600 Subject: [Swift-devel] Call for Workshops at IEEE eScience, due January 23, 2012 Message-ID: <4F124EC8.2050308@cs.iit.edu> Call for Workshops 8th IEEE International Conference on eScience October 8-12, 2012 Chicago, IL, USA The 8th IEEE eScience conference (e-Science 2012), sponsored by the IEEE Computer Society's Technical Committee for Scalable Computing (TCSC), will be held in Chicago, Illinois, from 8th to 12th October 2012. The eScience 2012 conference is designed to bring together leading international and interdisciplinary research communities, developers, and users of eScience applications and enabling IT technologies. Multiple e-Science 2012 Workshops will be held on Monday and Tuesday, 8th and 9th October, co-located with the main conference. Workshops are an important part of the conference, giving researchers an opportunity to present their work in a more focused way than the conference itself and to discuss particular topics of interest to the community. We cordially invite you to submit workshop proposals on any eScience related topic to the Workshop Chair.
To help those interested know their purpose and scope, workshop proposals should include: * A description of the workshop, its focus, goals, and outcome * A draft call for papers * Names and affiliations of the organizers and tentative composition of the committees * Expected numbers of submissions and accepted papers * Prior history of this workshop, if any. Please include: number of submissions, number of accepted papers, and attendee count. Workshop organizers are responsible for establishing a program committee, collecting and evaluating submissions, notifying authors of acceptance or rejection in due time, ensuring a transparent and fair selection process, organizing selected papers into sessions, and assigning session chairs. Proposals will be selected that show clear focus and objectives in areas of emerging or developing interest guaranteed to generate significant interest in the community. Once accepted, the workshop should establish its own paper submission system. For each paper selected for publication, an author must be registered for eScience 2012. Each paper must be presented in person by at least one of the authors. It is expected that the proceedings of the eScience 2012 workshops will be published by the IEEE Computer Society Press, USA and will be made available online through the IEEE Digital Library. SUBMISSION PROCESS Workshop proposals should be emailed to escience2012-workshops at fnal.gov ORGANIZATION General Chair * *Ian Foster*, University of Chicago & Argonne National Laboratory, USA Program Co-Chairs * *Daniel S. 
Katz*, University of Chicago & Argonne National Laboratory, USA * *Heinz Stockinger*, SIB Swiss Institute of Bioinformatics, Switzerland Workshops Chair * *Ruth Pordes*, FNAL, USA Sponsorship Chair * *Charlie Catlett*, Argonne National Laboratory, USA Conference Manager and Finance Chair * *Julie Wulf-Knoerzer*, University of Chicago & Argonne National Laboratory, USA Publicity Chairs * *Kento Aida*, National Institute of Informatics, Japan * *Ioan Raicu*, Illinois Institute of Technology, USA * *David Wallom*, Oxford e-Research Centre, UK Local Organizing Committee * *Ninfa Mayorga*, University of Chicago, USA * *Evelyn Rayburn*, University of Chicago, USA * *Lynn Valentini*, Argonne National Laboratory, USA -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From hategan at mcs.anl.gov Mon Jan 16 03:36:59 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 16 Jan 2012 01:36:59 -0800 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> Message-ID: <1326706619.18881.0.camel@blabla> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: > Once I had done that, coasters booted OK.
Then I hit a suspected > problem in the local provider: it was not accepting jobs, but seemed > set up OK. Mihael is investigating. Works for me, so I suspect it's more subtle. If you see it again, can you send me a jstack output of the coaster service? From wilde at mcs.anl.gov Mon Jan 16 10:07:20 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 16 Jan 2012 10:07:20 -0600 (CST) Subject: [Swift-devel] askpass for command line ssh provider? In-Reply-To: <1326499758.1063.1.camel@blabla> Message-ID: <2133311711.144547.1326730040269.JavaMail.root@zimbra.anl.gov> Was: Re: [Swift-devel] command line ssh provider... After a bit more thought, it seems that enabling the ssh-cl provider to prompt for passwords is perhaps not a required feature. We will, for example, need to access many systems that need a one-time password. But it's likely that such mechanisms need to be set up outside of Swift (or at least outside the main line of the provider), using agents or master channels; otherwise the user would get multiple password prompts per endpoint. For now, we can do this outside of Swift proper (i.e., in the various portals, ideally via scripts that we package in swift/bin which can be used by both command line users and by portal code). Later we can consider whether it's reasonable to make the ssh-cl provider smart enough to invoke such channel or agent setup scripts automatically when needed. - Mike ----- Original Message ----- > From: "Mihael Hategan" > To: "Michael Wilde" > Cc: "Ben Clifford" , "Swift Devel" > Sent: Friday, January 13, 2012 6:09:18 PM > Subject: Re: [Swift-devel] command line ssh provider... > On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: > > Another good test is to access eg surveyor, and intrepid using an > > OTP via ssh-cl. > > A word of caution there: if the ssh client asks for the password on > the > command line (instead of through ssh-askpass or some other gui), > things > won't work very well.
It might be possible to add some detection for > that in the provider, but that's not a high priority given that there > is > a workaround (askpass). -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From jonmon at mcs.anl.gov Mon Jan 16 10:31:21 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Mon, 16 Jan 2012 10:31:21 -0600 Subject: [Swift-devel] askpass for command line ssh provider? In-Reply-To: <2133311711.144547.1326730040269.JavaMail.root@zimbra.anl.gov> References: <2133311711.144547.1326730040269.JavaMail.root@zimbra.anl.gov> Message-ID: <7E5428A4-6882-4D5A-8EFC-94849E7CC2E3@mcs.anl.gov> I always thought the solution to the OTP situation was to set up a master channel. Inside a portal this is easy. The portal knows which sites are used and which sites require an OTP. The portal can then set up a master channel. For the agent case, the portal can always create the agent itself after prompting for a password once, can't it? In both scenarios the portal creates the mechanisms that limit the number of password prompts. For Swift, I do not think that these solutions work, since Swift needs to be more general (the agent-creation approach might, but it won't work for OTP situations). On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: > Was: Re: [Swift-devel] command line ssh provider... > > After a bit more thought, it seems that enabling the ssh-cl provider to prompt for passwords is perhaps not a required feature. > > We will for example need to access many systems that needs a one time password. > > But its likely that such mechanisms need to be set up outside of Swift (or at least outside the main line of the provider), using agents or master channels, else the user would get multiple password prompts per endpoint.
> > For now, we can do this outside of Swift proper (ie in the various portals, ideally via scripts that we package in swift/bin which can be used by both command line users and by portal code). > > Later we can consider if its reasonable to make the ssh-cl provider smart enough to invoke such channel or agent setup scripts automatically when needed. > > - Mike > > > > ----- Original Message ----- >> From: "Mihael Hategan" >> To: "Michael Wilde" >> Cc: "Ben Clifford" , "Swift Devel" >> Sent: Friday, January 13, 2012 6:09:18 PM >> Subject: Re: [Swift-devel] command line ssh provider... >> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: >>> Another good test is to access eg surveyor, and intrepid using an >>> OTP via ssh-cl. >> >> A word of caution there: if the ssh client asks for the password on >> the >> command line (instead of through ssh-askpass or some other gui), >> things >> won't work very well. It might be possible to add some detection for >> that in the provider, but that's not a high priority given that there >> is >> a workaround (askpass). > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From foster at anl.gov Mon Jan 16 10:38:38 2012 From: foster at anl.gov (Ian Foster) Date: Mon, 16 Jan 2012 10:38:38 -0600 Subject: [Swift-devel] askpass for command line ssh provider? In-Reply-To: <7E5428A4-6882-4D5A-8EFC-94849E7CC2E3@mcs.anl.gov> References: <2133311711.144547.1326730040269.JavaMail.root@zimbra.anl.gov> <7E5428A4-6882-4D5A-8EFC-94849E7CC2E3@mcs.anl.gov> Message-ID: <5AA70E03-617F-437D-B8D5-1E7E5507F58C@anl.gov> I wonder if we can leverage what Globus Online is doing for this purpose? 
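A minimal sketch of the master-channel idea raised in this thread, assuming OpenSSH's connection-multiplexing options (-M, ControlPath, -O exit). The helper names, the socket directory, and the host login.example.org are hypothetical and are not part of Swift or the ssh-cl provider; the functions only build argument lists and run nothing.

```python
import os


def control_path(socket_dir="~/.ssh"):
    """Return a ControlPath pattern; ssh itself expands %r, %h and %p."""
    return os.path.join(os.path.expanduser(socket_dir), "swift-%r@%h:%p")


def master_command(host, socket_dir="~/.ssh"):
    """Open a background master connection; the user authenticates once here."""
    return ["ssh", "-M", "-N", "-f",
            "-o", "ControlPath=" + control_path(socket_dir),
            host]


def client_command(host, remote_command, socket_dir="~/.ssh"):
    """Run a command over the existing master connection, without a new prompt."""
    return ["ssh",
            "-o", "ControlMaster=no",
            "-o", "ControlPath=" + control_path(socket_dir),
            host, remote_command]


def close_command(host, socket_dir="~/.ssh"):
    """Ask the master connection to exit when the run is finished."""
    return ["ssh", "-O", "exit",
            "-o", "ControlPath=" + control_path(socket_dir),
            host]


if __name__ == "__main__":
    # Hypothetical host; print the commands a wrapper script might run.
    for argv in (master_command("login.example.org"),
                 client_command("login.example.org", "hostname"),
                 close_command("login.example.org")):
        print(" ".join(argv))
```

A portal or a wrapper script around the swift command could run the first command once per endpoint (answering the OTP prompt there) and leave later ssh invocations to reuse the socket; whether the ssh-cl provider would pick up such a channel automatically depends on the user's ssh configuration and is not confirmed here.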
On Jan 16, 2012, at 10:31 AM, Jonathan Monette wrote: > I always thought the solution to the OTP situation was to set up a master channel. Inside a portal this is easy. The portal knows which sites are used and which sites require a OTP. The portal can then set up a master channel. For the situation for the agents, the portal can always create the agent itself after prompting for a password once can't it? In both scenarios the portal creates the mechanisms to limit the number of passwords that are required. > > For Swift, I do not think that these solutions work since Swift needs to be more general(maybe creating agent approach but that won't work for OTP situations). > > On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: > >> Was: Re: [Swift-devel] command line ssh provider... >> >> After a bit more thought, it seems that enabling the ssh-cl provider to prompt for passwords is perhaps not a required feature. >> >> We will for example need to access many systems that needs a one time password. >> >> But its likely that such mechanisms need to be set up outside of Swift (or at least outside the main line of the provider), using agents or master channels, else the user would get multiple password prompts per endpoint. >> >> For now, we can do this outside of Swift proper (ie in the various portals, ideally via scripts that we package in swift/bin which can be used by both command line users and by portal code). >> >> Later we can consider if its reasonable to make the ssh-cl provider smart enough to invoke such channel or agent setup scripts automatically when needed. >> >> - Mike >> >> >> >> ----- Original Message ----- >>> From: "Mihael Hategan" >>> To: "Michael Wilde" >>> Cc: "Ben Clifford" , "Swift Devel" >>> Sent: Friday, January 13, 2012 6:09:18 PM >>> Subject: Re: [Swift-devel] command line ssh provider... >>> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: >>>> Another good test is to access eg surveyor, and intrepid using an >>>> OTP via ssh-cl. 
>>> >>> A word of caution there: if the ssh client asks for the password on >>> the >>> command line (instead of through ssh-askpass or some other gui), >>> things >>> won't work very well. It might be possible to add some detection for >>> that in the provider, but that's not a high priority given that there >>> is >>> a workaround (askpass). >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From wilde at mcs.anl.gov Mon Jan 16 10:44:30 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 16 Jan 2012 10:44:30 -0600 (CST) Subject: [Swift-devel] askpass for command line ssh provider? In-Reply-To: <7E5428A4-6882-4D5A-8EFC-94849E7CC2E3@mcs.anl.gov> Message-ID: <612981587.144852.1326732270215.JavaMail.root@zimbra.anl.gov> ----- Original Message ----- > From: "Jonathan Monette" > To: "Michael Wilde" > Cc: "Mihael Hategan" , "Swift Devel" > Sent: Monday, January 16, 2012 10:31:21 AM > Subject: Re: [Swift-devel] askpass for command line ssh provider? > I always thought the solution to the OTP situation was to set up a > master channel. Inside a portal this is easy. The portal knows which > sites are used and which sites require a OTP. The portal can then set > up a master channel. For the situation for the agents, the portal can > always create the agent itself after prompting for a password once > can't it? In both scenarios the portal creates the mechanisms to limit > the number of passwords that are required. I think I agree with this - it's similar to what I wrote below.
I'm not sure I fully understand yet when you need a master channel and when you want an agent. I *think* that you want a master channel whenever multi-hop SSH is needed, and an agent in the rest of the cases. There also might be some subtleties related to the various forward and reverse tunnels we've needed to set up for various coaster configurations in clouds and other firewalled environments. > For Swift, I do not think that these solutions work since Swift needs > to be more general(maybe creating agent approach but that won't work > for OTP situations). Can you clarify what you mean here? It seems that there are two issues to work through: - Can we and should we create a useful set of manually-executable scripts for Swift users that encapsulate the various useful ssh configurations and incantations? I hope the answer to this is "yes" to both. - Should the swift command invoke any of these scripts automatically from the ssh-cl provider (or some other point in processing)? I am less sure about the answer to this. I think the best approach is to initially make this manual, and show the user how to create wrapper scripts around the swift command that set up the necessary ssh access. Or possibly command-line options? Or a .swift-ssh-setup rc file run by the swift command? - Mike > On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: > > > Was: Re: [Swift-devel] command line ssh provider... > > > > After a bit more thought, it seems that enabling the ssh-cl provider > > to prompt for passwords is perhaps not a required feature. > > > > We will for example need to access many systems that needs a one > > time password. > > > > But its likely that such mechanisms need to be set up outside of > > Swift (or at least outside the main line of the provider), using > > agents or master channels, else the user would get multiple password > > prompts per endpoint.
> > > > For now, we can do this outside of Swift proper (ie in the various > > portals, ideally via scripts that we package in swift/bin which can > > be used by both command line users and by portal code). > > > > Later we can consider if its reasonable to make the ssh-cl provider > > smart enough to invoke such channel or agent setup scripts > > automatically when needed. > > > > - Mike > > > > > > > > ----- Original Message ----- > >> From: "Mihael Hategan" > >> To: "Michael Wilde" > >> Cc: "Ben Clifford" , "Swift Devel" > >> > >> Sent: Friday, January 13, 2012 6:09:18 PM > >> Subject: Re: [Swift-devel] command line ssh provider... > >> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: > >>> Another good test is to access eg surveyor, and intrepid using an > >>> OTP via ssh-cl. > >> > >> A word of caution there: if the ssh client asks for the password on > >> the > >> command line (instead of through ssh-askpass or some other gui), > >> things > >> won't work very well. It might be possible to add some detection > >> for > >> that in the provider, but that's not a high priority given that > >> there > >> is > >> a workaround (askpass). > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wilde at mcs.anl.gov Mon Jan 16 10:47:56 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 16 Jan 2012 10:47:56 -0600 (CST) Subject: [Swift-devel] askpass for command line ssh provider? 
In-Reply-To: <5AA70E03-617F-437D-B8D5-1E7E5507F58C@anl.gov> Message-ID: <1505380425.144865.1326732476501.JavaMail.root@zimbra.anl.gov> Hi Ian, Yes, we'd like very much to do that; it was always our first preference. When we explored it last quarter, GO did not yet have a solution for this, but the intent was to stay in touch and use any solutions that became available, especially as one of the portal mechanisms here is the "GO Swift" prototype. Is such a solution now available or in the works? Who's the point person for GO authentication? - Mike ----- Original Message ----- > From: "Ian Foster" > To: "Jonathan Monette" > Cc: "Michael Wilde" , "Swift Devel" > Sent: Monday, January 16, 2012 10:38:38 AM > Subject: Re: [Swift-devel] askpass for command line ssh provider? > I wonder if we can leverage what Globus Online is doing for this > purpose? > > On Jan 16, 2012, at 10:31 AM, Jonathan Monette wrote: > > > I always thought the solution to the OTP situation was to set up a > > master channel. Inside a portal this is easy. The portal knows which > > sites are used and which sites require a OTP. The portal can then > > set up a master channel. For the situation for the agents, the > > portal can always create the agent itself after prompting for a > > password once can't it? In both scenarios the portal creates the > > mechanisms to limit the number of passwords that are required. > > > > For Swift, I do not think that these solutions work since Swift > > needs to be more general(maybe creating agent approach but that > > won't work for OTP situations). > > > > On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: > > > >> Was: Re: [Swift-devel] command line ssh provider... > >> > >> After a bit more thought, it seems that enabling the ssh-cl > >> provider to prompt for passwords is perhaps not a required feature. > >> > >> We will for example need to access many systems that needs a one > >> time password. 
> >> > >> But its likely that such mechanisms need to be set up outside of > >> Swift (or at least outside the main line of the provider), using > >> agents or master channels, else the user would get multiple > >> password prompts per endpoint. > >> > >> For now, we can do this outside of Swift proper (ie in the various > >> portals, ideally via scripts that we package in swift/bin which can > >> be used by both command line users and by portal code). > >> > >> Later we can consider if its reasonable to make the ssh-cl provider > >> smart enough to invoke such channel or agent setup scripts > >> automatically when needed. > >> > >> - Mike > >> > >> > >> > >> ----- Original Message ----- > >>> From: "Mihael Hategan" > >>> To: "Michael Wilde" > >>> Cc: "Ben Clifford" , "Swift Devel" > >>> > >>> Sent: Friday, January 13, 2012 6:09:18 PM > >>> Subject: Re: [Swift-devel] command line ssh provider... > >>> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: > >>>> Another good test is to access eg surveyor, and intrepid using an > >>>> OTP via ssh-cl. > >>> > >>> A word of caution there: if the ssh client asks for the password > >>> on > >>> the > >>> command line (instead of through ssh-askpass or some other gui), > >>> things > >>> won't work very well. It might be possible to add some detection > >>> for > >>> that in the provider, but that's not a high priority given that > >>> there > >>> is > >>> a workaround (askpass). 
> >> > >> -- > >> Michael Wilde > >> Computation Institute, University of Chicago > >> Mathematics and Computer Science Division > >> Argonne National Laboratory > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From foster at anl.gov Mon Jan 16 10:50:23 2012 From: foster at anl.gov (Ian Foster) Date: Mon, 16 Jan 2012 10:50:23 -0600 Subject: [Swift-devel] askpass for command line ssh provider? In-Reply-To: <1505380425.144865.1326732476501.JavaMail.root@zimbra.anl.gov> References: <1505380425.144865.1326732476501.JavaMail.root@zimbra.anl.gov> Message-ID: <15EDF1F5-9DB5-45EE-AC6B-1447FBB85F4B@anl.gov> Good question: I am not sure if it is there yet either!! Rachana is the right person. On Jan 16, 2012, at 10:47 AM, Michael Wilde wrote: > Hi Ian, > > Yes, we'd like very much to do that; it was always our first preference. When we explored it last quarter, GO did not yet have a solution for this, but the intent was to stay in touch and use any solutions that became available, especially as one of the portal mechanisms here is the "GO Swift" prototype. > > Is such a solution now available or in the works? Who's the point person for GO authentication? > > - Mike > > > ----- Original Message ----- >> From: "Ian Foster" >> To: "Jonathan Monette" >> Cc: "Michael Wilde" , "Swift Devel" >> Sent: Monday, January 16, 2012 10:38:38 AM >> Subject: Re: [Swift-devel] askpass for command line ssh provider? >> I wonder if we can leverage what Globus Online is doing for this >> purpose? 
>> >> On Jan 16, 2012, at 10:31 AM, Jonathan Monette wrote: >> >>> I always thought the solution to the OTP situation was to set up a >>> master channel. Inside a portal this is easy. The portal knows which >>> sites are used and which sites require a OTP. The portal can then >>> set up a master channel. For the situation for the agents, the >>> portal can always create the agent itself after prompting for a >>> password once can't it? In both scenarios the portal creates the >>> mechanisms to limit the number of passwords that are required. >>> >>> For Swift, I do not think that these solutions work since Swift >>> needs to be more general(maybe creating agent approach but that >>> won't work for OTP situations). >>> >>> On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: >>> >>>> Was: Re: [Swift-devel] command line ssh provider... >>>> >>>> After a bit more thought, it seems that enabling the ssh-cl >>>> provider to prompt for passwords is perhaps not a required feature. >>>> >>>> We will for example need to access many systems that needs a one >>>> time password. >>>> >>>> But its likely that such mechanisms need to be set up outside of >>>> Swift (or at least outside the main line of the provider), using >>>> agents or master channels, else the user would get multiple >>>> password prompts per endpoint. >>>> >>>> For now, we can do this outside of Swift proper (ie in the various >>>> portals, ideally via scripts that we package in swift/bin which can >>>> be used by both command line users and by portal code). >>>> >>>> Later we can consider if its reasonable to make the ssh-cl provider >>>> smart enough to invoke such channel or agent setup scripts >>>> automatically when needed. >>>> >>>> - Mike >>>> >>>> >>>> >>>> ----- Original Message ----- >>>>> From: "Mihael Hategan" >>>>> To: "Michael Wilde" >>>>> Cc: "Ben Clifford" , "Swift Devel" >>>>> >>>>> Sent: Friday, January 13, 2012 6:09:18 PM >>>>> Subject: Re: [Swift-devel] command line ssh provider... 
>>>>> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: >>>>>> Another good test is to access eg surveyor, and intrepid using an >>>>>> OTP via ssh-cl. >>>>> >>>>> A word of caution there: if the ssh client asks for the password >>>>> on >>>>> the >>>>> command line (instead of through ssh-askpass or some other gui), >>>>> things >>>>> won't work very well. It might be possible to add some detection >>>>> for >>>>> that in the provider, but that's not a high priority given that >>>>> there >>>>> is >>>>> a workaround (askpass). >>>> >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >>>> >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >>> _______________________________________________ >>> Swift-devel mailing list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From ketancmaheshwari at gmail.com Mon Jan 16 11:05:38 2012 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 16 Jan 2012 11:05:38 -0600 Subject: [Swift-devel] timeout on OSG with coasters provider staging Message-ID: Hi Mihael, I could reproduce this timeout exception on OSG with catsn Swift jobs. These are 100 jobs with a data size of 10MB each. So, 2000MB of data movement in all. I tried with 1 worker running on a single OSG site. I tried three different OSG sites: Nebraska, UChicago and RENCI. 
In each of these cases, I run into the following timeout after ~4 minutes of running (15-70 jobs complete during this period):

Timeout
org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562, PUT): timed out receiving request. Last time 940817-011255.807, now: 120115-194100.072
    at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124)
    at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131)
    at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123)
    at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
Command(168, SUBMITJOB): handling reply timeout; sendReqTime=120115-193900.255, sendTime=120115-193900.255, now=120115-194100.416, channel=SC-null

This is followed by messages similar to the last line above, but the progress of the workflow halts.

Here is the tarball of the experiment:
http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz

It contains a README with the steps to run: basically start-service on localhost -> start worker on OSG site -> run swift

Regards,
--
Ketan

From jonmon at mcs.anl.gov Mon Jan 16 12:07:10 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 16 Jan 2012 12:07:10 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326706619.18881.0.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla>
Message-ID:

So I am still having problems getting an agent to be used. I am running this on communicado.
From my mac I ssh -A jonmon at login.ci.uchicago.edu then ssh -A jonmon at communicado.ci.uchicago.edu

Here is what shows up when I grep for ssh as Mike did.

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ env | grep -i ssh
SSH_CLIENT=128.135.125.155 48486 22
SSH_TTY=/dev/pts/0
RSHCOMMAND=/usr/bin/ssh
SSH_AUTH_SOCK=/tmp/ssh-mtYNz20726/agent.20726
PWD=/home/jonmon/Workspace/Swift/ssh-cl-test
SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass
CVS_RSH=ssh
SSH_CONNECTION=128.135.125.155 48486 128.135.125.17 22
HOSTFILE=/home/jonmon/.ssh/known_hosts

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ export | grep -i ssh
declare -x CVS_RSH="ssh"
declare -x HOSTFILE="/home/jonmon/.ssh/known_hosts"
declare -x PWD="/home/jonmon/Workspace/Swift/ssh-cl-test"
declare -x RSHCOMMAND="/usr/bin/ssh"
declare -x SSH_ASKPASS="/usr/libexec/openssh/gnome-ssh-askpass"
declare -x SSH_AUTH_SOCK="/tmp/ssh-mtYNz20726/agent.20726"
declare -x SSH_CLIENT="128.135.125.155 48486 22"
declare -x SSH_CONNECTION="128.135.125.155 48486 128.135.125.17 22"
declare -x SSH_TTY="/dev/pts/0"

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ll /tmp/ssh-mtYNz20726/
total 0
srwxr-xr-x 1 jonmon ci-users 0 Jan 16 12:01 agent.20726=

As you can see, there is an agent set up at the socket that the SSH_AUTH_SOCK variable points to. However, when I try to run my swift script that just runs hostname on the remote machine, I get prompted for a password.

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ll
total 6.0K
-rw-r--r-- 1 jonmon ci-users 297 Jul 14 2011 cf
-rw-r--r-- 1 jonmon ci-users 111 Jan 12 15:40 hostname.swift
-rw-r--r-- 1 jonmon ci-users 24 Jan 12 15:42 hostname.txt
-rwxr-xr-x 1 jonmon ci-users 79 Jan 12 13:09 run.sh*
-rw-r--r-- 1 jonmon ci-users 361 Jan 16 11:58 sites.xml
-rw-r--r-- 1 jonmon ci-users 46 Jan 12 13:12 tc

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ cat sites.xml
/home/jonmon/Workspace/Swift/swift.workdir
0.20
1

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ cat run.sh
#!/bin/bash

swift -config cf -tc.file tc -sites.file sites.xml hostname.swift
[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ./run.sh
Swift trunk swift-r5498 (swift modified locally) cog-r3347 (cog modified locally)

RunID: 20120116-1205-eke5ge02
Progress: time: Mon, 16 Jan 2012 12:05:22 -0600
Password:

Any ideas what to try? I also tried the -t option that Mike tried, but it had no effect. I was still prompted for a password.

On Jan 16, 2012, at 3:36 AM, Mihael Hategan wrote:

> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote:
>> Once I had done that, coasters booted OK. Then I hit a suspected
>> problem in the local provider: it was not accepting jobs, but seemed
>> set up OK. Mihael is investigating.
>
> Works for me, so I suspect it's more subtle. If you see it again, can
> you send me a jstack output of the coaster service?

From jonmon at mcs.anl.gov Mon Jan 16 12:18:53 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 16 Jan 2012 12:18:53 -0600
Subject: [Swift-devel] askpass for command line ssh provider?
In-Reply-To: <612981587.144852.1326732270215.JavaMail.root@zimbra.anl.gov> References: <612981587.144852.1326732270215.JavaMail.root@zimbra.anl.gov> Message-ID: <4ECF0140-9154-4A35-A735-63895138EF96@mcs.anl.gov> On Jan 16, 2012, at 10:44 AM, Michael Wilde wrote: > > > ----- Original Message ----- >> From: "Jonathan Monette" >> To: "Michael Wilde" >> Cc: "Mihael Hategan" , "Swift Devel" >> Sent: Monday, January 16, 2012 10:31:21 AM >> Subject: Re: [Swift-devel] askpass for command line ssh provider? >> I always thought the solution to the OTP situation was to set up a >> master channel. Inside a portal this is easy. The portal knows which >> sites are used and which sites require a OTP. The portal can then set >> up a master channel. For the situation for the agents, the portal can >> always create the agent itself after prompting for a password once >> can't it? In both scenarios the portal creates the mechanisms to limit >> the number of passwords that are required. > > I think I agree with this - its similar to what I wrote below. > > Im not sure I fully understand yet when you need a master channel and when you want an agent. I *think* that you want a master channel whenever multi-hop SSH is needed, and an agent in the rest of the cases. You also want a master channel for the OTP situation I believe. > There also might be some subtleties related to the various forward and reverse tunnels we've needed to set up for various coaster configurations in clouds and other firewalled environments. > >> For Swift, I do not think that these solutions work since Swift needs >> to be more general(maybe creating agent approach but that won't work >> for OTP situations). > > Can you clarify what you mean here? It seems that are 2 issues to work through: > > - can we and should we create a useful set of manually-executable scripts for Swift users that encapsulate the various useful ssh configurations and incantations? > > I hope the answer to this is "yes" to both. 
> > - Should the swift command invoke any of these scripts automatically from the ssh-cl provider (or some other point in processing)? > > I am less sure about the answer to this. I think the best approach is to initially make this manual, and show the user how to create wrapper scripts around the swift command this set up the necessary ssh access. Or possible, command line options? Or a .swift-ssh-setup rc file run by the swift command? If we require the user to set up the ssh configuration manually, either himself or using some scripts we provide, then the problem is solved but if we want Swift to set up the ssh configuration for the user then we run into issues. I think a rc file would be necessary where the user specifies what type of machine they are trying to access. This way Swift can determine which set of executable scripts it needs to use. The reason I said this may not work for Swift is because how does Swift know whether the machine is it trying to access uses OTP(BlueGene) or needs to be multi-hopped because it is behind a firewall(MCS cluster). I think doing the manual approach is best for now. > > - Mike > >> On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: >> >>> Was: Re: [Swift-devel] command line ssh provider... >>> >>> After a bit more thought, it seems that enabling the ssh-cl provider >>> to prompt for passwords is perhaps not a required feature. >>> >>> We will for example need to access many systems that needs a one >>> time password. >>> >>> But its likely that such mechanisms need to be set up outside of >>> Swift (or at least outside the main line of the provider), using >>> agents or master channels, else the user would get multiple password >>> prompts per endpoint. >>> >>> For now, we can do this outside of Swift proper (ie in the various >>> portals, ideally via scripts that we package in swift/bin which can >>> be used by both command line users and by portal code). 
>>> >>> Later we can consider if its reasonable to make the ssh-cl provider >>> smart enough to invoke such channel or agent setup scripts >>> automatically when needed. >>> >>> - Mike >>> >>> >>> >>> ----- Original Message ----- >>>> From: "Mihael Hategan" >>>> To: "Michael Wilde" >>>> Cc: "Ben Clifford" , "Swift Devel" >>>> >>>> Sent: Friday, January 13, 2012 6:09:18 PM >>>> Subject: Re: [Swift-devel] command line ssh provider... >>>> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: >>>>> Another good test is to access eg surveyor, and intrepid using an >>>>> OTP via ssh-cl. >>>> >>>> A word of caution there: if the ssh client asks for the password on >>>> the >>>> command line (instead of through ssh-askpass or some other gui), >>>> things >>>> won't work very well. It might be possible to add some detection >>>> for >>>> that in the provider, but that's not a high priority given that >>>> there >>>> is >>>> a workaround (askpass). >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >>> _______________________________________________ >>> Swift-devel mailing list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > From hategan at mcs.anl.gov Mon Jan 16 12:53:03 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 16 Jan 2012 10:53:03 -0800 Subject: [Swift-devel] askpass for command line ssh provider? In-Reply-To: <7E5428A4-6882-4D5A-8EFC-94849E7CC2E3@mcs.anl.gov> References: <2133311711.144547.1326730040269.JavaMail.root@zimbra.anl.gov> <7E5428A4-6882-4D5A-8EFC-94849E7CC2E3@mcs.anl.gov> Message-ID: <1326739983.19812.1.camel@blabla> The plain ssh provider does re-use connections (and credentials), whereas I don't think the agent deals with OTPs. 
Mihael On Mon, 2012-01-16 at 10:31 -0600, Jonathan Monette wrote: > I always thought the solution to the OTP situation was to set up a master channel. Inside a portal this is easy. The portal knows which sites are used and which sites require a OTP. The portal can then set up a master channel. For the situation for the agents, the portal can always create the agent itself after prompting for a password once can't it? In both scenarios the portal creates the mechanisms to limit the number of passwords that are required. > > For Swift, I do not think that these solutions work since Swift needs to be more general(maybe creating agent approach but that won't work for OTP situations). > > On Jan 16, 2012, at 10:07 AM, Michael Wilde wrote: > > > Was: Re: [Swift-devel] command line ssh provider... > > > > After a bit more thought, it seems that enabling the ssh-cl provider to prompt for passwords is perhaps not a required feature. > > > > We will for example need to access many systems that needs a one time password. > > > > But its likely that such mechanisms need to be set up outside of Swift (or at least outside the main line of the provider), using agents or master channels, else the user would get multiple password prompts per endpoint. > > > > For now, we can do this outside of Swift proper (ie in the various portals, ideally via scripts that we package in swift/bin which can be used by both command line users and by portal code). > > > > Later we can consider if its reasonable to make the ssh-cl provider smart enough to invoke such channel or agent setup scripts automatically when needed. > > > > - Mike > > > > > > > > ----- Original Message ----- > >> From: "Mihael Hategan" > >> To: "Michael Wilde" > >> Cc: "Ben Clifford" , "Swift Devel" > >> Sent: Friday, January 13, 2012 6:09:18 PM > >> Subject: Re: [Swift-devel] command line ssh provider... 
> >> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: > >>> Another good test is to access eg surveyor, and intrepid using an > >>> OTP via ssh-cl. > >> > >> A word of caution there: if the ssh client asks for the password on > >> the > >> command line (instead of through ssh-askpass or some other gui), > >> things > >> won't work very well. It might be possible to add some detection for > >> that in the provider, but that's not a high priority given that there > >> is > >> a workaround (askpass). > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > From hategan at mcs.anl.gov Mon Jan 16 12:58:27 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 16 Jan 2012 10:58:27 -0800 Subject: [Swift-devel] command line ssh provider... In-Reply-To: References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> Message-ID: <1326740307.19812.3.camel@blabla> Right. So now after getting to communicado, what happens if you manually type ssh -v bridled... Also, have you tried to add the "-v" in the ssh-cl provider and see what's in the logs? Mihael On Mon, 2012-01-16 at 12:07 -0600, Jonathan Monette wrote: > So I am still having problems getting an agent to be using. I am running this on communicado. From my mac I ssh -A jonmon at login.ci.uchicago.edu then ssh -A jonmon at communicado.ci.uchicago.edu > > Here is what shows up when I grep for ssh as Mike did. 
> > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ env | grep -i ssh > SSH_CLIENT=128.135.125.155 48486 22 > SSH_TTY=/dev/pts/0 > RSHCOMMAND=/usr/bin/ssh > SSH_AUTH_SOCK=/tmp/ssh-mtYNz20726/agent.20726 > PWD=/home/jonmon/Workspace/Swift/ssh-cl-test > SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass > CVS_RSH=ssh > SSH_CONNECTION=128.135.125.155 48486 128.135.125.17 22 > HOSTFILE=/home/jonmon/.ssh/known_hosts > > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ export | grep -i ssh > declare -x CVS_RSH="ssh" > declare -x HOSTFILE="/home/jonmon/.ssh/known_hosts" > declare -x PWD="/home/jonmon/Workspace/Swift/ssh-cl-test" > declare -x RSHCOMMAND="/usr/bin/ssh" > declare -x SSH_ASKPASS="/usr/libexec/openssh/gnome-ssh-askpass" > declare -x SSH_AUTH_SOCK="/tmp/ssh-mtYNz20726/agent.20726" > declare -x SSH_CLIENT="128.135.125.155 48486 22" > declare -x SSH_CONNECTION="128.135.125.155 48486 128.135.125.17 22" > declare -x SSH_TTY="/dev/pts/0" > > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ll /tmp/ssh-mtYNz20726/ > total 0 > srwxr-xr-x 1 jonmon ci-users 0 Jan 16 12:01 agent.20726= > > As you can see there is an agent set up where the SSH_AUTH_SOCK variable is pointing to. However when I try to run my swift script that just runs hostname on the remote machine I get prompted for a password. 
> > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ll > total 6.0K > -rw-r--r-- 1 jonmon ci-users 297 Jul 14 2011 cf > -rw-r--r-- 1 jonmon ci-users 111 Jan 12 15:40 hostname.swift > -rw-r--r-- 1 jonmon ci-users 24 Jan 12 15:42 hostname.txt > -rwxr-xr-x 1 jonmon ci-users 79 Jan 12 13:09 run.sh* > -rw-r--r-- 1 jonmon ci-users 361 Jan 16 11:58 sites.xml > -rw-r--r-- 1 jonmon ci-users 46 Jan 12 13:12 tc > > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ cat sites.xml > > > > > /home/jonmon/Workspace/Swift/swift.workdir > 0.20 > 1 > > > > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ cat run.sh > #!/bin/bash > > swift -config cf -tc.file tc -sites.file sites.xml hostname.swift > [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ./run.sh > Swift trunk swift-r5498 (swift modified locally) cog-r3347 (cog modified locally) > > RunID: 20120116-1205-eke5ge02 > Progress: time: Mon, 16 Jan 2012 12:05:22 -0600 > Password: > > Any ideas what to try? I also tried the -t option that Mike tried but there was not effect. Still prompted for a password. > > On Jan 16, 2012, at 3:36 AM, Mihael Hategan wrote: > > > On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: > >> Once I had done that, coasters booted OK. Then I hit a suspected > >> problem in the local provider: it was not accepting jobs, but seemed > >> set up OK. Mihael is investigating. > > > > Works for me, so I suspect it's more subtle. If you see it again, can > > you send me a jstack output of the coaster service? > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > From jonmon at mcs.anl.gov Mon Jan 16 13:10:38 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Mon, 16 Jan 2012 13:10:38 -0600 Subject: [Swift-devel] command line ssh provider... 
In-Reply-To: <1326740307.19812.3.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla>
Message-ID: <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>

On Jan 16, 2012, at 12:58 PM, Mihael Hategan wrote:

> Right. So now after getting to communicado, what happens if you manually
> type ssh -v bridled?

[jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ssh -v -A jonmon at bridled.ci.uchicago.edu
OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: Connecting to bridled.ci.uchicago.edu [128.135.125.18] port 22.
debug1: Connection established.
debug1: identity file /home/jonmon/.ssh/identity type -1
debug1: identity file /home/jonmon/.ssh/id_rsa type -1
debug1: identity file /home/jonmon/.ssh/id_dsa type -1
debug1: loaded 3 keys
debug1: Remote protocol version 2.0, remote software version OpenSSH_4.3
debug1: match: OpenSSH_4.3 pat OpenSSH*
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_4.3
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug1: kex: server->client aes128-ctr hmac-md5 none
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Host 'bridled.ci.uchicago.edu' is known and matches the RSA host key.
debug1: Found key in /home/jonmon/.ssh/known_hosts:16
debug1: ssh_rsa_verify: signature correct
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug1: Authentications that can continue: publickey,gssapi-with-mic,password,keyboard-interactive
debug1: Next authentication method: gssapi-with-mic
debug1: Unspecified GSS failure. Minor code may provide more information
No credentials cache found
debug1: Unspecified GSS failure. Minor code may provide more information
No credentials cache found
debug1: Unspecified GSS failure. Minor code may provide more information
No credentials cache found
debug1: Next authentication method: publickey
debug1: Offering public key: /Users/jonmon/.ssh/id_rsa
debug1: Server accepts key: pkalg ssh-rsa blen 277
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug1: Entering interactive session.
debug1: Requesting authentication agent forwarding.
Last login: Mon Nov 21 13:14:32 2011 from login.ci.uchicago.edu

I do not have a private key in my ~/.ssh directory. This is all done through agent forwarding from my mac (which does have the private key).

>
> Also, have you tried to add the "-v" in the ssh-cl provider and see
> what's in the logs?
>

I added the line:

cmdarray("-v");

as the first parameter in the ArrayList. However I cannot make anything show up in the logs. I tried entering in the wrong password as well as the correct password. Is there anything else that needs to be added to the JobSubmissionTaskHandler file? Or where else should I look besides the logs?

> Mihael
>
> On Mon, 2012-01-16 at 12:07 -0600, Jonathan Monette wrote:
>> So I am still having problems getting an agent to be using. I am running this on communicado.
From my mac I ssh -A jonmon at login.ci.uchicago.edu then ssh -A jonmon at communicado.ci.uchicago.edu >> >> Here is what shows up when I grep for ssh as Mike did. >> >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ env | grep -i ssh >> SSH_CLIENT=128.135.125.155 48486 22 >> SSH_TTY=/dev/pts/0 >> RSHCOMMAND=/usr/bin/ssh >> SSH_AUTH_SOCK=/tmp/ssh-mtYNz20726/agent.20726 >> PWD=/home/jonmon/Workspace/Swift/ssh-cl-test >> SSH_ASKPASS=/usr/libexec/openssh/gnome-ssh-askpass >> CVS_RSH=ssh >> SSH_CONNECTION=128.135.125.155 48486 128.135.125.17 22 >> HOSTFILE=/home/jonmon/.ssh/known_hosts >> >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ export | grep -i ssh >> declare -x CVS_RSH="ssh" >> declare -x HOSTFILE="/home/jonmon/.ssh/known_hosts" >> declare -x PWD="/home/jonmon/Workspace/Swift/ssh-cl-test" >> declare -x RSHCOMMAND="/usr/bin/ssh" >> declare -x SSH_ASKPASS="/usr/libexec/openssh/gnome-ssh-askpass" >> declare -x SSH_AUTH_SOCK="/tmp/ssh-mtYNz20726/agent.20726" >> declare -x SSH_CLIENT="128.135.125.155 48486 22" >> declare -x SSH_CONNECTION="128.135.125.155 48486 128.135.125.17 22" >> declare -x SSH_TTY="/dev/pts/0" >> >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ll /tmp/ssh-mtYNz20726/ >> total 0 >> srwxr-xr-x 1 jonmon ci-users 0 Jan 16 12:01 agent.20726= >> >> As you can see there is an agent set up where the SSH_AUTH_SOCK variable is pointing to. However when I try to run my swift script that just runs hostname on the remote machine I get prompted for a password. 
>> >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ll >> total 6.0K >> -rw-r--r-- 1 jonmon ci-users 297 Jul 14 2011 cf >> -rw-r--r-- 1 jonmon ci-users 111 Jan 12 15:40 hostname.swift >> -rw-r--r-- 1 jonmon ci-users 24 Jan 12 15:42 hostname.txt >> -rwxr-xr-x 1 jonmon ci-users 79 Jan 12 13:09 run.sh* >> -rw-r--r-- 1 jonmon ci-users 361 Jan 16 11:58 sites.xml >> -rw-r--r-- 1 jonmon ci-users 46 Jan 12 13:12 tc >> >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ cat sites.xml >> >> >> >> >> /home/jonmon/Workspace/Swift/swift.workdir >> 0.20 >> 1 >> >> >> >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ cat run.sh >> #!/bin/bash >> >> swift -config cf -tc.file tc -sites.file sites.xml hostname.swift >> [jonmon at communicado: ~/Workspace/Swift/ssh-cl-test]$ ./run.sh >> Swift trunk swift-r5498 (swift modified locally) cog-r3347 (cog modified locally) >> >> RunID: 20120116-1205-eke5ge02 >> Progress: time: Mon, 16 Jan 2012 12:05:22 -0600 >> Password: >> >> Any ideas what to try? I also tried the -t option that Mike tried but there was not effect. Still prompted for a password. >> >> On Jan 16, 2012, at 3:36 AM, Mihael Hategan wrote: >> >>> On Fri, 2012-01-13 at 18:00 -0600, Michael Wilde wrote: >>>> Once I had done that, coasters booted OK. Then I hit a suspected >>>> problem in the local provider: it was not accepting jobs, but seemed >>>> set up OK. Mihael is investigating. >>> >>> Works for me, so I suspect it's more subtle. If you see it again, can >>> you send me a jstack output of the coaster service? >>> >>> _______________________________________________ >>> Swift-devel mailing list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > From hategan at mcs.anl.gov Mon Jan 16 13:14:17 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 16 Jan 2012 11:14:17 -0800 Subject: [Swift-devel] command line ssh provider... 
In-Reply-To: <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla> <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
Message-ID: <1326741257.20114.0.camel@blabla>

On Mon, 2012-01-16 at 13:10 -0600, Jonathan Monette wrote:

> as the first parameter in the ArrayList. However I cannot make
> anything show up in the logs. I tried entering in the wrong password
> as well as the correct password. Is there anything else that needs to
> be added to the JobSubmissionTaskHandler file? Or where else should I
> look besides the logs?

You need to type a wrong password to make the job fail; the failure is what captures the output.

From hategan at mcs.anl.gov Mon Jan 16 13:16:31 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 16 Jan 2012 11:16:31 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326741257.20114.0.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla> <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov> <1326741257.20114.0.camel@blabla>
Message-ID: <1326741391.20114.2.camel@blabla>

On Mon, 2012-01-16 at 11:14 -0800, Mihael Hategan wrote:
> On Mon, 2012-01-16 at 13:10 -0600, Jonathan Monette wrote:
>
> > as the first parameter in the ArrayList. However I cannot make
> > anything show up in the logs. I tried entering in the wrong password
> > as well as the correct password. Is there anything else that needs to
> > be added to the JobSubmissionTaskHandler file? Or where else should I
> > look besides the logs?
>
> You need to type some wrong password to make the job fail which captures
> the output.

But then that won't work without a GUI askpass, so never mind. Can you try to kill the ssh sub-process?
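
The master-channel approach discussed in this thread (set up the connection once outside of Swift, then let later ssh invocations reuse it) can be sketched roughly as below. This is only a sketch under stated assumptions: the helper names (control_path, open_master, reuse_command, close_master) and the socket location under ~/.ssh are hypothetical and are not part of Swift or the ssh-cl provider. Only standard OpenSSH client options are used (-M, -S, -N, -f, -O exit, and BatchMode).

```shell
#!/bin/bash
# Sketch only: these helper names and the socket location are hypothetical,
# not part of Swift or of the ssh-cl provider.

# Path of the control socket used for one endpoint (for example user@host).
control_path() {
    printf '%s/.ssh/swift-master-%s\n' "$HOME" "$1"
}

# Open the master channel once. Any password or OTP prompt happens here,
# outside of Swift. -M requests master mode, -S names the control socket,
# -N runs no remote command, -f backgrounds ssh after authentication.
open_master() {
    ssh -M -S "$(control_path "$1")" -N -f "$1"
}

# Print a command line that reuses the channel. BatchMode=yes makes ssh
# fail instead of prompting if it cannot authenticate without user input.
reuse_command() {
    printf 'ssh -S %s -o BatchMode=yes %s\n' "$(control_path "$1")" "$1"
}

# Close the master channel when the run is finished.
close_master() {
    ssh -S "$(control_path "$1")" -O exit "$1"
}
```

A wrapper script around the swift command could call open_master for each endpoint before the run and close_master afterwards. Whether the ssh-cl provider picks up the control socket depends on how it builds its ssh command line, which this sketch does not assume.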

From ketancmaheshwari at gmail.com Mon Jan 16 13:30:04 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 16 Jan 2012 13:30:04 -0600
Subject: [Swift-devel] java heap space exceeded ssh-SGE
Message-ID:

Hi Mihael,

As discussed in the Swift call last Wednesday, here is the log for the
java heap space exceeded run:

http://ci.uchicago.edu/~ketan/postproc-20120116-1139-evtqkpj9.log.tar.gz

Here is the stderr message:

Execution failed: GC overhead limit exceeded
Progress: time: Mon, 16 Jan 2012 12:38:41 -0600 Initializing:18 Selecting site:1421 Submitting:62 Active:167 Stage out:9 Finished successfully:3702 Failed but can retry:216
Progress: time: Mon, 16 Jan 2012 12:38:53 -0600 Initializing:18 Selecting site:1421 Submitting:62 Active:166 Stage out:10 Finished successfully:3702 Failed but can retry:216
Uncaught exception: java.lang.OutOfMemoryError: GC overhead limit exceeded in vdl:new @ postproc.kml, line: 2009
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.Arrays.copyOfRange(Arrays.java:3209)
    at java.lang.String.<init>(String.java:215)
    at java.lang.StringBuilder.toString(StringBuilder.java:430)
    at org.griphyn.vdl.mapping.AbstractDataNode.closeShallow(AbstractDataNode.java:413)
    at org.griphyn.vdl.mapping.AbstractDataNode.setValue(AbstractDataNode.java:358)
    at org.griphyn.vdl.mapping.RootDataNode.setValue(RootDataNode.java:229)
    at org.griphyn.vdl.karajan.lib.New.function(New.java:129)
    at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62)
    at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29)
    at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20)
    at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139)
    at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197)
    at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104)
    at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
Exception is: java.lang.OutOfMemoryError: GC overhead limit exceeded
Near Karajan line: vdl:new @ postproc.kml, line: 2009
Progress: time: Mon, 16 Jan 2012 12:39:02 -0600 Initializing:19 Selecting site:1423 Submitting:62 Active:159 Stage out:15 Finished successfully:3704 Failed but can retry:216

In the next run, I will try to capture memory usage snapshots at time
intervals and see if that provides some more info. Meanwhile, maybe you
can get some idea from the log.

Regards,
--
Ketan

From hategan at mcs.anl.gov Mon Jan 16 13:38:24 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 16 Jan 2012 11:38:24 -0800
Subject: [Swift-devel] timeout on OSG with coasters provider staging
In-Reply-To:
References:
Message-ID: <1326742704.20900.0.camel@blabla>

Nothing interesting there. Do you also happen to have the service and
worker logs?

On Mon, 2012-01-16 at 11:05 -0600, Ketan Maheshwari wrote:
> Hi Mihael,
>
> I could reproduce this timeout exception on OSG with catsn Swift jobs.
>
> These are 100 jobs with a data size of 10MB each. So, 2000MB of data
> movement in all.
>
> I tried with 1 worker running on a single OSG site. I tried three
> different OSG sites: Nebraska, UChicago and RENCI.
>
> In each of these cases, I run into the following timeout after ~4
> minutes of run (15-70 jobs complete during this period):
>
> Timeout
> org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562,
> PUT): timed out receiving request. Last time 940817-011255.807, now:
> 120115-194100.072
>     at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123)
>     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116)
>     at java.util.TimerThread.mainLoop(Timer.java:512)
>     at java.util.TimerThread.run(Timer.java:462)
> Command(168, SUBMITJOB): handling reply timeout;
> sendReqTime=120115-193900.255, sendTime=120115-193900.255,
> now=120115-194100.416, channel=SC-null
>
> This is followed by messages similar to the above last line but the
> progress of workflow halts.
>
> Here is the tarball of the
> experiment: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz
>
> It contains a README which has the steps to run: basically
> start-service on localhost -> start worker on OSG site -> run swift
>
> Regards,
> --
> Ketan
>

From ketancmaheshwari at gmail.com Mon Jan 16 14:24:35 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 16 Jan 2012 14:24:35 -0600
Subject: [Swift-devel] timeout on OSG with coasters provider staging
In-Reply-To: <1326742704.20900.0.camel@blabla>
References: <1326742704.20900.0.camel@blabla>
Message-ID:

Mihael,

Please find the service log here:
http://ci.uchicago.edu/~ketan/swift.log.tar.gz

The worker logs seem to have been lost. I'll see if I can find them.

Regards,
Ketan

On Mon, Jan 16, 2012 at 1:38 PM, Mihael Hategan wrote:

> Nothing interesting there. Do you also happen to have the service and
> worker logs?
>
> On Mon, 2012-01-16 at 11:05 -0600, Ketan Maheshwari wrote:
> > Hi Mihael,
> >
> > I could reproduce this timeout exception on OSG with catsn Swift jobs.
> >
> > These are 100 jobs with a data size of 10MB each. So, 2000MB of data
> > movement in all.
> >
> > I tried with 1 worker running on a single OSG site. I tried three
> > different OSG sites: Nebraska, UChicago and RENCI.
> >
> > In each of these cases, I run into the following timeout after ~4
> > minutes of run (15-70 jobs complete during this period):
> >
> > Timeout
> > org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562,
> > PUT): timed out receiving request. Last time 940817-011255.807, now:
> > 120115-194100.072
> >     at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124)
> >     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131)
> >     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123)
> >     at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116)
> >     at java.util.TimerThread.mainLoop(Timer.java:512)
> >     at java.util.TimerThread.run(Timer.java:462)
> > Command(168, SUBMITJOB): handling reply timeout;
> > sendReqTime=120115-193900.255, sendTime=120115-193900.255,
> > now=120115-194100.416, channel=SC-null
> >
> > This is followed by messages similar to the above last line but the
> > progress of workflow halts.
> >
> > Here is the tarball of the
> > experiment: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz
> >
> > It contains a README which has the steps to run: basically
> > start-service on localhost -> start worker on OSG site -> run swift
> >
> > Regards,
> > --
> > Ketan
> >
>
>

--
Ketan

From jonmon at mcs.anl.gov Mon Jan 16 15:24:27 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 16 Jan 2012 15:24:27 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326741391.20114.2.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
Message-ID:

So. I started a Swift run and when it was waiting for me to type in a password I opened another terminal and killed what looked like the process that was connecting to bridled. Swift just started a new process. I then ran kill -9 on the process and this time Swift crashed. Still no ssh debug messages appear in the logs.

On Jan 16, 2012, at 1:16 PM, Mihael Hategan wrote:
> On Mon, 2012-01-16 at 11:14 -0800, Mihael Hategan wrote:
>> On Mon, 2012-01-16 at 13:10 -0600, Jonathan Monette wrote:
>>
>>> as the first parameter in the ArrayList. However I cannot make
>>> anything show up in the logs. I tried entering in the wrong password
>>> as well as the correct password. Is there anything else that needs to
>>> be added to the JobSubmissionTaskHandler file? Or where else should I
>>> look besides the logs?
>>
>> You need to type some wrong password to make the job fail which captures
>> the output.
>
> But then that won't work without a GUI askpass so, nevermind. You can
> try to kill the ssh sub-process?
>

From hategan at mcs.anl.gov Mon Jan 16 21:29:02 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 16 Jan 2012 19:29:02 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To:
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
Message-ID: <1326770942.26577.0.camel@blabla>

Hmm.

I added logging of stdout/stderr in the local provider when the job
fails. You may need to enable DEBUG in log4j.properties on
org.globus.cog.abstraction.impl.execution.local.JobSubmissionTaskHandler.

Mihael

On Mon, 2012-01-16 at 15:24 -0600, Jonathan Monette wrote:
> So. I started a Swift run and when it was waiting for me to type in a password I opened another terminal and killed what looked like the process that was connecting to bridled. Swift just started a new process. I then ran kill -9 on the process and this time Swift crashed. Still no ssh debug messages appear in the logs.
> On Jan 16, 2012, at 1:16 PM, Mihael Hategan wrote:
>
> > On Mon, 2012-01-16 at 11:14 -0800, Mihael Hategan wrote:
> >> On Mon, 2012-01-16 at 13:10 -0600, Jonathan Monette wrote:
> >>
> >>> as the first parameter in the ArrayList. However I cannot make
> >>> anything show up in the logs. I tried entering in the wrong password
> >>> as well as the correct password. Is there anything else that needs to
> >>> be added to the JobSubmissionTaskHandler file? Or where else should I
> >>> look besides the logs?
> >>
> >> You need to type some wrong password to make the job fail which captures
> >> the output.
> >
> > But then that won't work without a GUI askpass so, nevermind. You can
> > try to kill the ssh sub-process?
> >
>

From jonmon at mcs.anl.gov Mon Jan 16 21:31:22 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 16 Jan 2012 21:31:22 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326770942.26577.0.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
Message-ID: <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>

I will give that a try to see if that sheds light on why the agent is not being used.

On Jan 16, 2012, at 9:29 PM, Mihael Hategan wrote:

> Hmm.
>
> I added logging of stdout/stderr in the local provider when the job
> fails. You may need to enable DEBUG in log4j.properties on
> org.globus.cog.abstraction.impl.execution.local.JobSubmissionTaskHandler.
>
> Mihael
>
> On Mon, 2012-01-16 at 15:24 -0600, Jonathan Monette wrote:
>> So. I started a Swift run and when it was waiting for me to type in a password I opened another terminal and killed what looked like the process that was connecting to bridled. Swift just started a new process. I then ran kill -9 on the process and this time Swift crashed. Still no ssh debug messages appear in the logs.
>> On Jan 16, 2012, at 1:16 PM, Mihael Hategan wrote:
>>
>>> On Mon, 2012-01-16 at 11:14 -0800, Mihael Hategan wrote:
>>>> On Mon, 2012-01-16 at 13:10 -0600, Jonathan Monette wrote:
>>>>
>>>>> as the first parameter in the ArrayList. However I cannot make
>>>>> anything show up in the logs. I tried entering in the wrong password
>>>>> as well as the correct password. Is there anything else that needs to
>>>>> be added to the JobSubmissionTaskHandler file? Or where else should I
>>>>> look besides the logs?
>>>>
>>>> You need to type some wrong password to make the job fail which captures
>>>> the output.
>>>
>>> But then that won't work without a GUI askpass so, nevermind. You can
>>> try to kill the ssh sub-process?
>>>
>>
>
>

From jonmon at mcs.anl.gov Tue Jan 17 11:33:43 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 17 Jan 2012 11:33:43 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
 <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
Message-ID:

I am assuming that I have to get the sshcl provider to fail again for the stdout/stderr to be recorded? I typed the correct password in and this is what showed up:

2-01-17 11:25:16,375-0600 DEBUG JobSubmissionTaskHandler Job:
executable: /bin/bash
arguments: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1/shared/_swiftwrap hostname-km209rlk -jobdir k -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
stdout: null
stderr: null
directory: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1
batch: false
redirected: false
attributes: null
env: SWIFT_GEN_SCRIPTS=1

This is what shows up in ps:
jonmon 19162 0.0 0.0 59456 3344 pts/0 S+ 11:25 0:00 ssh -v bridled.ci.uchicago.edu /bin/bash -s

so -v is being used in the ssh command but Swift isn't recording the output.

I have also tried entering in the wrong password but Swift just keeps asking for the password until I type the correct one in.

I see in the JobSubmissionTaskHandler file there is a way to specify a username. In the sites file I use the line:
provider="ssh-cl" username="jonmon" url="bridled.ci.uchicago.edu"/>

but the username is not used, I do not see the -l in the command line of the sshcl provider. I wanted to try providing a username to the ssh command to see if that somehow made it realize to use the agent set up for the user jonmon.

On Jan 16, 2012, at 9:31 PM, Jonathan Monette wrote:
> I will give that a try to see if that sheds light on why the agent is not being used.
>
> On Jan 16, 2012, at 9:29 PM, Mihael Hategan wrote:
>
>> Hmm.
>>
>> I added logging of stdout/stderr in the local provider when the job
>> fails. You may need to enable DEBUG in log4j.properties on
>> org.globus.cog.abstraction.impl.execution.local.JobSubmissionTaskHandler.
>>
>> Mihael
>>
>> On Mon, 2012-01-16 at 15:24 -0600, Jonathan Monette wrote:
>>> So. I started a Swift run and when it was waiting for me to type in a password I opened another terminal and killed what looked like the process that was connecting to bridled. Swift just started a new process. I then ran kill -9 on the process and this time Swift crashed. Still no ssh debug messages appear in the logs.
>>> On Jan 16, 2012, at 1:16 PM, Mihael Hategan wrote:
>>>
>>>> On Mon, 2012-01-16 at 11:14 -0800, Mihael Hategan wrote:
>>>>> On Mon, 2012-01-16 at 13:10 -0600, Jonathan Monette wrote:
>>>>>
>>>>>> as the first parameter in the ArrayList. However I cannot make
>>>>>> anything show up in the logs. I tried entering in the wrong password
>>>>>> as well as the correct password. Is there anything else that needs to
>>>>>> be added to the JobSubmissionTaskHandler file? Or where else should I
>>>>>> look besides the logs?
>>>>>
>>>>> You need to type some wrong password to make the job fail which captures
>>>>> the output.
>>>>
>>>> But then that won't work without a GUI askpass so, nevermind. You can
>>>> try to kill the ssh sub-process?
>>>>
>>>
>>
>>
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From hategan at mcs.anl.gov Tue Jan 17 12:48:48 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 17 Jan 2012 10:48:48 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To:
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
 <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
Message-ID: <1326826128.31470.2.camel@blabla>

On Tue, 2012-01-17 at 11:33 -0600, Jonathan Monette wrote:
> I am assuming that I have to get the sshcl provider to fail again for
> the stdout/stderr to be recorded?

Yes!

> I typed the correct password in and this is what showed up:
>
> 2-01-17 11:25:16,375-0600 DEBUG JobSubmissionTaskHandler Job:
> executable: /bin/bash
> arguments: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1/shared/_swiftwrap hostname-km209rlk -jobdir k -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
> stdout: null
> stderr: null
> directory: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1
> batch: false
> redirected: false
> attributes: null
> env: SWIFT_GEN_SCRIPTS=1
>
> This is what shows up in ps:
> jonmon 19162 0.0 0.0 59456 3344 pts/0 S+ 11:25 0:00 ssh -v bridled.ci.uchicago.edu /bin/bash -s
>
> so -v is being used in the ssh command but Swift isn't recording the output.

Right. Only in the case of failure is it logged.

>
> I have also tried entering in the wrong password but Swift just keeps
> asking for the password until I type the correct one in.

It's not swift asking for the password, but ssh. You have to mistype it
enough times so that ssh will fail.

>
> I see in the JobSubmissionTaskHandler file there is a way to specify a
> username. In the sites file I use the line:
> provider="ssh-cl" username="jonmon" url="bridled.ci.uchicago.edu"/>

Yeah, I'll remove that. Use ~/.ssh/config instead.

>
> but the username is not used, I do not see the -l in the command line of the sshcl provider. I wanted to try providing a username to the ssh command to see if that somehow made it realize to use the agent set up for the user jonmon.
>

From jonmon at mcs.anl.gov Tue Jan 17 13:20:51 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 17 Jan 2012 13:20:51 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326826128.31470.2.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
 <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
 <1326826128.31470.2.camel@blabla>
Message-ID: <0C8E7E0F-F0CF-4977-81D0-9FFD01CC3360@mcs.anl.gov>

Making ssh fail due to an invalid password does not log the stdout/stderr.

2012-01-17 13:19:05,117-0600 INFO GridExec TASK_DEFINITION: Task(type=JOB_SUBMISSION, identity=urn:0-1-1-1326827944463) is /bin/bash shared/_swiftwrap hostname-c8cjdrlk -jobdir c -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
2012-01-17 13:19:05,123-0600 DEBUG JobSubmissionTaskHandler Job:
executable: /bin/bash
arguments: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1319-e2aautsf/shared/_swiftwrap hostname-c8cjdrlk -jobdir c -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
stdout: null
stderr: null
directory: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1319-e2aautsf
batch: false
redirected: false
attributes: null
env: SWIFT_GEN_SCRIPTS=1

On Jan 17, 2012, at 12:48 PM, Mihael Hategan wrote:
> On Tue, 2012-01-17 at 11:33 -0600, Jonathan Monette wrote:
>> I am assuming that I have to get the sshcl provider to fail again for
>> the stdout/stderr to be recorded?
>
> Yes!
>
>> I typed the correct password in and this is what showed up:
>>
>> 2-01-17 11:25:16,375-0600 DEBUG JobSubmissionTaskHandler Job:
>> executable: /bin/bash
>> arguments: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1/shared/_swiftwrap hostname-km209rlk -jobdir k -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
>> stdout: null
>> stderr: null
>> directory: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1
>> batch: false
>> redirected: false
>> attributes: null
>> env: SWIFT_GEN_SCRIPTS=1
>>
>> This is what shows up in ps:
>> jonmon 19162 0.0 0.0 59456 3344 pts/0 S+ 11:25 0:00 ssh -v bridled.ci.uchicago.edu /bin/bash -s
>>
>> so -v is being used in the ssh command but Swift isn't recording the output.
>
> Right. Only in the case of failure is it logged.
>
>>
>> I have also tried entering in the wrong password but Swift just keeps
>> asking for the password until I type the correct one in.
>
> It's not swift asking for the password, but ssh. You have to mistype it
> enough times so that ssh will fail.
>
>>
>> I see in the JobSubmissionTaskHandler file there is a way to specify a
>> username. In the sites file I use the line:
>> provider="ssh-cl" username="jonmon" url="bridled.ci.uchicago.edu"/>
>
> Yeah, I'll remove that. Use ~/.ssh/config instead.
>
>>
>> but the username is not used, I do not see the -l in the command line of the sshcl provider. I wanted to try providing a username to the ssh command to see if that somehow made it realize to use the agent set up for the user jonmon.
>>
>
>

From jonmon at mcs.anl.gov Tue Jan 17 13:49:39 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 17 Jan 2012 13:49:39 -0600
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326826128.31470.2.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
 <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
 <1326826128.31470.2.camel@blabla>
Message-ID:

Ok. Mike gave some suggestions and got it working. I have the line 1. Apparently, if any env variables are set in the sites file, other env's are overwritten. Once this line is removed, the normal env variables are kept. This is why the sshcl couldn't find the agent: the variables that pointed to it were removed. I am filing a bugzilla ticket for this env issue as I do not think this is the desired behavior.

On Jan 17, 2012, at 12:48 PM, Mihael Hategan wrote:
> On Tue, 2012-01-17 at 11:33 -0600, Jonathan Monette wrote:
>> I am assuming that I have to get the sshcl provider to fail again for
>> the stdout/stderr to be recorded?
>
> Yes!
>
>> I typed the correct password in and this is what showed up:
>>
>> 2-01-17 11:25:16,375-0600 DEBUG JobSubmissionTaskHandler Job:
>> executable: /bin/bash
>> arguments: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1/shared/_swiftwrap hostname-km209rlk -jobdir k -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
>> stdout: null
>> stderr: null
>> directory: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1
>> batch: false
>> redirected: false
>> attributes: null
>> env: SWIFT_GEN_SCRIPTS=1
>>
>> This is what shows up in ps:
>> jonmon 19162 0.0 0.0 59456 3344 pts/0 S+ 11:25 0:00 ssh -v bridled.ci.uchicago.edu /bin/bash -s
>>
>> so -v is being used in the ssh command but Swift isn't recording the output.
>
> Right. Only in the case of failure is it logged.
>
>>
>> I have also tried entering in the wrong password but Swift just keeps
>> asking for the password until I type the correct one in.
>
> It's not swift asking for the password, but ssh. You have to mistype it
> enough times so that ssh will fail.
>
>>
>> I see in the JobSubmissionTaskHandler file there is a way to specify a
>> username. In the sites file I use the line:
>> provider="ssh-cl" username="jonmon" url="bridled.ci.uchicago.edu"/>
>
> Yeah, I'll remove that. Use ~/.ssh/config instead.
>
>>
>> but the username is not used, I do not see the -l in the command line of the sshcl provider. I wanted to try providing a username to the ssh command to see if that somehow made it realize to use the agent set up for the user jonmon.
>>
>
>

From hategan at mcs.anl.gov Tue Jan 17 16:44:24 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 17 Jan 2012 14:44:24 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To:
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
 <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
 <1326826128.31470.2.camel@blabla>
Message-ID: <1326840264.328.3.camel@blabla>

That seems to be a limitation of Runtime.exec(). If you don't supply an
env array, the subprocess inherits the environment from the parent. If
you do, it overrides it (including unsetting the unspecified variables).

Traditionally, this was the only way of starting a subprocess. Starting
with 1.5, ProcessBuilder was added, which offers a solution to this
problem. I will change the code to use that.

On Tue, 2012-01-17 at 13:49 -0600, Jonathan Monette wrote:
> Ok. Mike gave some suggestions and got it working. I have the line 1. Apparently, if any env variables are set in the sites file, other env's are overwritten. Once this line is removed, the normal env variables are kept. This is why the sshcl couldn't find the agent: the variables that pointed to it were removed. I am filing a bugzilla ticket for this env issue as I do not think this is the desired behavior.
>
> On Jan 17, 2012, at 12:48 PM, Mihael Hategan wrote:
>
> > On Tue, 2012-01-17 at 11:33 -0600, Jonathan Monette wrote:
> >> I am assuming that I have to get the sshcl provider to fail again for
> >> the stdout/stderr to be recorded?
> >
> > Yes!
> >
> >> I typed the correct password in and this is what showed up:
> >>
> >> 2-01-17 11:25:16,375-0600 DEBUG JobSubmissionTaskHandler Job:
> >> executable: /bin/bash
> >> arguments: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1/shared/_swiftwrap hostname-km209rlk -jobdir k -scratch -e /bin/hostname -out hostname.txt -err stderr.txt -i -d -if -of hostname.txt -k -cdmfile -status provider -a
> >> stdout: null
> >> stderr: null
> >> directory: /home/jonmon/Workspace/Swift/swift.workdir/hostname-20120117-1125-ora1c4k1
> >> batch: false
> >> redirected: false
> >> attributes: null
> >> env: SWIFT_GEN_SCRIPTS=1
> >>
> >> This is what shows up in ps:
> >> jonmon 19162 0.0 0.0 59456 3344 pts/0 S+ 11:25 0:00 ssh -v bridled.ci.uchicago.edu /bin/bash -s
> >>
> >> so -v is being used in the ssh command but Swift isn't recording the output.
> >
> > Right. Only in the case of failure is it logged.
> >
> >>
> >> I have also tried entering in the wrong password but Swift just keeps
> >> asking for the password until I type the correct one in.
> >
> > It's not swift asking for the password, but ssh. You have to mistype it
> > enough times so that ssh will fail.
> >
> >>
> >> I see in the JobSubmissionTaskHandler file there is a way to specify a
> >> username. In the sites file I use the line:
> >> provider="ssh-cl" username="jonmon" url="bridled.ci.uchicago.edu"/>
> >
> > Yeah, I'll remove that. Use ~/.ssh/config instead.
> >
> >>
> >> but the username is not used, I do not see the -l in the command line of the sshcl provider. I wanted to try providing a username to the ssh command to see if that somehow made it realize to use the agent set up for the user jonmon.
> >>
> >
> >
>

From hategan at mcs.anl.gov Tue Jan 17 17:05:46 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 17 Jan 2012 15:05:46 -0800
Subject: [Swift-devel] command line ssh provider...
In-Reply-To: <1326840264.328.3.camel@blabla>
References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov>
 <1326706619.18881.0.camel@blabla>
 <1326740307.19812.3.camel@blabla>
 <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov>
 <1326741257.20114.0.camel@blabla>
 <1326741391.20114.2.camel@blabla>
 <1326770942.26577.0.camel@blabla>
 <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov>
 <1326826128.31470.2.camel@blabla>
 <1326840264.328.3.camel@blabla>
Message-ID: <1326841546.5626.1.camel@blabla>

On Tue, 2012-01-17 at 14:44 -0800, Mihael Hategan wrote:
> That seems to be a limitation of Runtime.exec(). If you don't supply an
> env array, the subprocess inherits the environment from the parent. If
> you do, it overrides it (including unsetting the unspecified variables).
>
> Traditionally, this was the only way of starting a subprocess. Starting
> with 1.5, ProcessBuilder was added, which offers a solution to this
> problem. I will change the code to use that.

Though now that I think about it, this wasn't strictly necessary since
for the ssh-cl provider, the task environment variables shouldn't be
passed to the ssh subprocess (they go on the command line of the ssh
shell).

Anyway, I changed the code to use ProcessBuilder. This at least fixes
similar issues with the local provider (and also ssh-cl).
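
For readers following the thread, here is a minimal sketch of the difference between the two launch styles discussed above. It is not the actual JobSubmissionTaskHandler change. The class name EnvInheritanceSketch and its methods are hypothetical, /usr/bin/env is assumed to exist on the machine, and SWIFT_GEN_SCRIPTS=1 is reused from the log output quoted earlier. SSH_AUTH_SOCK is mentioned only as a typical example of an agent variable that the parent might hold.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

public class EnvInheritanceSketch {

    // Collects everything the child prints on stdout, one entry per line.
    static List<String> run(Process p) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                lines.add(line);
            }
            p.waitFor();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return lines;
    }

    // Runtime.exec with an explicit envp array: the child sees only these
    // variables, so parent variables (PATH, SSH_AUTH_SOCK, ...) are dropped.
    public static List<String> envViaRuntimeExec() {
        try {
            String[] cmd = {"/usr/bin/env"};          // assumed location of env
            String[] envp = {"SWIFT_GEN_SCRIPTS=1"};
            return run(Runtime.getRuntime().exec(cmd, envp));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // ProcessBuilder: environment() starts as a copy of the parent's
    // environment, and put() adds one variable without removing the rest.
    public static List<String> envViaProcessBuilder() {
        try {
            ProcessBuilder pb = new ProcessBuilder("/usr/bin/env");
            pb.environment().put("SWIFT_GEN_SCRIPTS", "1");
            return run(pb.start());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Runtime.exec with envp: "
                + envViaRuntimeExec().size() + " variable(s)");
        System.out.println("ProcessBuilder:         "
                + envViaProcessBuilder().size() + " variable(s)");
    }
}
```

With the first style the child environment contains only what was passed in, which matches the symptom reported above (ssh could no longer locate the agent once any env entry was set in the sites file). With the second, the added variable sits alongside whatever the parent process already had.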
Mihael From jonmon at mcs.anl.gov Tue Jan 17 17:08:16 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Tue, 17 Jan 2012 17:08:16 -0600 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <1326841546.5626.1.camel@blabla> References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla> <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov> <1326741257.20114.0.camel@blabla> <1326741391.20114.2.camel@blabla> <1326770942.26577.0.camel@blabla> <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov> <1326826128.31470.2.camel@blabla> <1326840264.328.3.camel@blabla> <1326841546.5626.1.camel@blabla> Message-ID: <397D42CF-5025-4A30-B9EA-03FC4247FFE5@mcs.anl.gov> On Jan 17, 2012, at 5:05 PM, Mihael Hategan wrote: > On Tue, 2012-01-17 at 14:44 -0800, Mihael Hategan wrote: >> That seems to be a limitation of Runtime.exec(). If you don't supply and >> env array, the subprocess inherits the environment from the parent. If >> you do, it overrides it (including unsetting the unspecified variables). >> >> Traditionally, this was the only way of starting a subprocess. Starting >> with 1.5, ProcessBuilder was added, which offers a solution to this >> problem. I will change the code to use that. > > Though now that I think about it, this wasn't strictly necessary since > for the ssh-cl provider, the task environment variables shouldn't be > passed to the ssh subprocess (they go on the command line of the ssh > shell). What do you mean by this? When I saw the command line of the ssh shell no environment variables are being passed. > > Anyway, I changed the code to use ProcessBuilder. This at least fixes > similar issues with the local provider (and also ssh-cl). > Thanks. 
I'll give it a try again to see if it allows me to set the environment variable > Mihael > From jonmon at mcs.anl.gov Tue Jan 17 17:12:00 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Tue, 17 Jan 2012 17:12:00 -0600 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <397D42CF-5025-4A30-B9EA-03FC4247FFE5@mcs.anl.gov> References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla> <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov> <1326741257.20114.0.camel@blabla> <1326741391.20114.2.camel@blabla> <1326770942.26577.0.camel@blabla> <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov> <1326826128.31470.2.camel@blabla> <1326840264.328.3.camel@blabla> <1326841546.5626.1.camel@blabla> <397D42CF-5025-4A30-B9EA-03FC4247FFE5@mcs.anl.gov> Message-ID: The change to ProcessBuilder fixed this issue. On Jan 17, 2012, at 5:08 PM, Jonathan Monette wrote: > > On Jan 17, 2012, at 5:05 PM, Mihael Hategan wrote: > >> On Tue, 2012-01-17 at 14:44 -0800, Mihael Hategan wrote: >>> That seems to be a limitation of Runtime.exec(). If you don't supply an >>> env array, the subprocess inherits the environment from the parent. If >>> you do, it overrides it (including unsetting the unspecified variables). >>> >>> Traditionally, this was the only way of starting a subprocess. Starting >>> with 1.5, ProcessBuilder was added, which offers a solution to this >>> problem. I will change the code to use that. >> >> Though now that I think about it, this wasn't strictly necessary since >> for the ssh-cl provider, the task environment variables shouldn't be >> passed to the ssh subprocess (they go on the command line of the ssh >> shell). > What do you mean by this? When I saw the command line of the ssh shell no environment variables are being passed. > >> >> Anyway, I changed the code to use ProcessBuilder. This at least fixes >> similar issues with the local provider (and also ssh-cl). >> > Thanks.
I'll give it a try again to see if it allows me to set the environment variable > >> Mihael >> > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Jan 17 17:12:24 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 17 Jan 2012 15:12:24 -0800 Subject: [Swift-devel] command line ssh provider... In-Reply-To: <397D42CF-5025-4A30-B9EA-03FC4247FFE5@mcs.anl.gov> References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla> <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov> <1326741257.20114.0.camel@blabla> <1326741391.20114.2.camel@blabla> <1326770942.26577.0.camel@blabla> <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov> <1326826128.31470.2.camel@blabla> <1326840264.328.3.camel@blabla> <1326841546.5626.1.camel@blabla> <397D42CF-5025-4A30-B9EA-03FC4247FFE5@mcs.anl.gov> Message-ID: <1326841944.5730.1.camel@blabla> On Tue, 2012-01-17 at 17:08 -0600, Jonathan Monette wrote: > > Though now that I think about it, this wasn't strictly necessary since > > for the ssh-cl provider, the task environment variables shouldn't be > > passed to the ssh subprocess (they go on the command line of the ssh > > shell). > What do you mean by this? When I saw the command line of the ssh shell no environment variables are being passed. 1. java forks ssh 2. ssh -> sshd 3. sshd forks shell There is no point in passing task env vars in (1). You need them somewhere in (3). You don't see them in the shell command line because they are passed on its stdin (i.e. as if typed). From jonmon at mcs.anl.gov Tue Jan 17 17:17:44 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Tue, 17 Jan 2012 17:17:44 -0600 Subject: [Swift-devel] command line ssh provider... 
In-Reply-To: <1326841944.5730.1.camel@blabla> References: <1935551723.142011.1326499201599.JavaMail.root@zimbra.anl.gov> <1326706619.18881.0.camel@blabla> <1326740307.19812.3.camel@blabla> <71A31CC9-EE7C-47AF-9F3A-5BAFC1DF19D9@mcs.anl.gov> <1326741257.20114.0.camel@blabla> <1326741391.20114.2.camel@blabla> <1326770942.26577.0.camel@blabla> <23BC1A56-0E52-4FCD-8CF7-61A210D095ED@mcs.anl.gov> <1326826128.31470.2.camel@blabla> <1326840264.328.3.camel@blabla> <1326841546.5626.1.camel@blabla> <397D42CF-5025-4A30-B9EA-03FC4247FFE5@mcs.anl.gov> <1326841944.5730.1.camel@blabla> Message-ID: <50518C44-0B05-495B-9BF8-5B66DE957CE6@mcs.anl.gov> Ok. I understand now. Anyways the switch to ProcessBuilder fixed the issue. On Jan 17, 2012, at 5:12 PM, Mihael Hategan wrote: > On Tue, 2012-01-17 at 17:08 -0600, Jonathan Monette wrote: >>> Though now that I think about it, this wasn't strictly necessary since >>> for the ssh-cl provider, the task environment variables shouldn't be >>> passed to the ssh subprocess (they go on the command line of the ssh >>> shell). >> What do you mean by this? When I saw the command line of the ssh shell no environment variables are being passed. > > 1. java forks ssh > 2. ssh -> sshd > 3. sshd forks shell > > There is no point in passing task env vars in (1). You need them > somewhere in (3). > > You don't see them in the shell command line because they are passed on > its stdin (i.e. as if typed). > From wilde at mcs.anl.gov Tue Jan 17 18:22:18 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 17 Jan 2012 18:22:18 -0600 (CST) Subject: [Swift-devel] command line ssh provider... In-Reply-To: <50518C44-0B05-495B-9BF8-5B66DE957CE6@mcs.anl.gov> Message-ID: <1995757495.151685.1326846138451.JavaMail.root@zimbra.anl.gov> I think the reason this fixed the ssh problem is that the SSH_AUTH_SOCK etc vars were getting erased at step 1, and this fix stopped these local env vars from getting lost, right? 
- Mike ----- Original Message ----- > From: "Jonathan Monette" > To: "Mihael Hategan" > Cc: "Swift Devel" > Sent: Tuesday, January 17, 2012 5:17:44 PM > Subject: Re: [Swift-devel] command line ssh provider... > Ok. I understand now. Anyways the switch to ProcessBuilder fixed the > issue. > > On Jan 17, 2012, at 5:12 PM, Mihael Hategan wrote: > > > On Tue, 2012-01-17 at 17:08 -0600, Jonathan Monette wrote: > >>> Though now that I think about it, this wasn't strictly necessary > >>> since > >>> for the ssh-cl provider, the task environment variables shouldn't > >>> be > >>> passed to the ssh subprocess (they go on the command line of the > >>> ssh > >>> shell). > >> What do you mean by this? When I saw the command line of the ssh > >> shell no environment variables are being passed. > > > > 1. java forks ssh > > 2. ssh -> sshd > > 3. sshd forks shell > > > > There is no point in passing task env vars in (1). You need them > > somewhere in (3). > > > > You don't see them in the shell command line because they are passed > > on > > its stdin (i.e. as if typed). > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Tue Jan 17 19:57:51 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 17 Jan 2012 17:57:51 -0800 Subject: [Swift-devel] command line ssh provider... 
In-Reply-To: <1995757495.151685.1326846138451.JavaMail.root@zimbra.anl.gov> References: <1995757495.151685.1326846138451.JavaMail.root@zimbra.anl.gov> Message-ID: <1326851871.6961.0.camel@blabla> On Tue, 2012-01-17 at 18:22 -0600, Michael Wilde wrote: > I think the reason this fixed the ssh problem is that the SSH_AUTH_SOCK etc vars were getting erased at step 1, and this fix stopped these local env vars from getting lost, right? > Yes. From ketancmaheshwari at gmail.com Wed Jan 18 15:59:41 2012 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Wed, 18 Jan 2012 15:59:41 -0600 Subject: [Swift-devel] Handling Reply timeout Message-ID: Hi Mihael, During the staging of data, I get this message alongside the progress messages: Command(282, HEARTBEAT): handling reply timeout; sendReqTime=120118-155328.718, sendTime=691231-180000.000, now=120118-155528.764, channel=SC-null What is the meaning of this message? What kind of reply does Swift expect here? Is there a way to set this to a long or infinite time? This is with provider staging connecting worker from OSG site to the PADS. There is no other exception message. Swift continues to run but the progress seems to have stopped. Regards, -- Ketan From benc at hawaga.org.uk Wed Jan 18 17:04:39 2012 From: benc at hawaga.org.uk (Ben Clifford) Date: Thu, 19 Jan 2012 00:04:39 +0100 Subject: [Swift-devel] Swift Licensing In-Reply-To: <1481066739.16139.1323201129384.JavaMail.root@zimbra-mb2.anl.gov> References: <1481066739.16139.1323201129384.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <6B8A3EFA-1576-4043-895C-05CF23D1FC3C@hawaga.org.uk> On Dec 6, 2011, at 8:52 PM, David Kelly wrote: > Copyright 2011 University of Chicago > Licensed under the Apache License, Version 2.0 (the "License"); > you may not use this file except in compliance with the License.
At least anything that I contributed under a personal basis and outside of my employment with the university of chicago is neither copyright the university of chicago, nor licensed under the apache license. I have no idea if there is any such code in the tree at the moment (nor indeed if there ever has been) but previously Argonne seemed very excited about the idea that there might be and that I should license it to them (not under the apache license). Also, I think Milena had personal copyright to her contributions (though I don't know if those are still in the tree) sometime around 2008-ish. All of the above mentioned copyrights started earlier than 2011. -- From wilde at mcs.anl.gov Wed Jan 18 17:57:19 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 18 Jan 2012 17:57:19 -0600 (CST) Subject: [Swift-devel] Swift Licensing In-Reply-To: <6B8A3EFA-1576-4043-895C-05CF23D1FC3C@hawaga.org.uk> Message-ID: <1359618100.156433.1326931039412.JavaMail.root@zimbra.anl.gov> > From: "Ben Clifford" > Sent: Wednesday, January 18, 2012 5:04:39 PM > ... > At least anything that I contributed under a personal basis and > outside of my employment with the university of chicago is neither > copyright the university of chicago, nor licensed under the apache > license. > > I have no idea if there is any such code in the tree at the moment > (nor indeed if there ever has been) but previously Argonne seemed very > excited about the idea that there might be and that I should license > it to them (not under the apache license). We'll check the svn logs to see if any such code was committed. > Also, I think Milena had personal copyright to her contributions > (though I don't know if those are still in the tree) sometime around > 2008-ish. 
Presumably Milena's code was committed (by Ben) just prior to this message: ----- Forwarded Message ----- From: "Ben Clifford" To: swift-devel at ci.uchicago.edu Sent: Tuesday, July 29, 2008 4:12:02 AM Subject: [Swift-devel] more compile time type checking I just committed Milena's work on compile-time type checking. ... > All of the above mentioned copyrights started earlier than 2011. GSoC's rules from 2008 regarding code copyright are here: http://code.google.com/opensource/gsoc/2008/faqs.html#0.1_owns_code Did Milena's code have any copyright declarations in it when you integrated it? We may need to follow with Milena, as I suspect that nothing was explicitly done in this regard when her code was added in 2008. - Mike From hategan at mcs.anl.gov Wed Jan 18 18:01:34 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 18 Jan 2012 16:01:34 -0800 Subject: [Swift-devel] Swift Licensing In-Reply-To: <6B8A3EFA-1576-4043-895C-05CF23D1FC3C@hawaga.org.uk> References: <1481066739.16139.1323201129384.JavaMail.root@zimbra-mb2.anl.gov> <6B8A3EFA-1576-4043-895C-05CF23D1FC3C@hawaga.org.uk> Message-ID: <1326931294.16589.3.camel@blabla> On Thu, 2012-01-19 at 00:04 +0100, Ben Clifford wrote: > Also, I think Milena had personal copyright to her contributions > (though I don't know if those are still in the tree) sometime around > 2008-ish. I'm not saying whether that's true or not, but want to point out that GSOC is meant to get students to participate in OSS development. I find it hard to believe that they didn't think about licensing issues, and I suspect that the students get asked to sign something to that extent. 
From hategan at mcs.anl.gov Wed Jan 18 18:04:33 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 18 Jan 2012 16:04:33 -0800 Subject: [Swift-devel] Swift Licensing In-Reply-To: <1326931294.16589.3.camel@blabla> References: <1481066739.16139.1323201129384.JavaMail.root@zimbra-mb2.anl.gov> <6B8A3EFA-1576-4043-895C-05CF23D1FC3C@hawaga.org.uk> <1326931294.16589.3.camel@blabla> Message-ID: <1326931473.16589.4.camel@blabla> Nevermind. I read the relevant GSOC page. On Wed, 2012-01-18 at 16:01 -0800, Mihael Hategan wrote: > On Thu, 2012-01-19 at 00:04 +0100, Ben Clifford wrote: > > > Also, I think Milena had personal copyright to her contributions > > (though I don't know if those are still in the tree) sometime around > > 2008-ish. > > I'm not saying whether that's true or not, but want to point out that > GSOC is meant to get students to participate in OSS development. I find > it hard to believe that they didn't think about licensing issues, and I > suspect that the students get asked to sign something to that extent. > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From benc at hawaga.org.uk Thu Jan 19 03:57:45 2012 From: benc at hawaga.org.uk (Ben Clifford) Date: Thu, 19 Jan 2012 10:57:45 +0100 Subject: [Swift-devel] Swift Licensing In-Reply-To: <1359618100.156433.1326931039412.JavaMail.root@zimbra.anl.gov> References: <1359618100.156433.1326931039412.JavaMail.root@zimbra.anl.gov> Message-ID: <35437738-4DF9-4A61-BA0B-A9AC3159CB25@hawaga.org.uk> On Jan 19, 2012, at 12:57 AM, Michael Wilde wrote: > Did Milena's code have any copyright declarations in it when you integrated it? > > We may need to follow with Milena, as I suspect that nothing was explicitly done in this regard when her code was added in 2008. 
She would have done the dev.globus licensing stuff that was sent to argonne, as that was the fashionable thing to do at the time, I think - so there should be reasonable BSD-like licensing documentation of that around somewhere. -- From ketancmaheshwari at gmail.com Thu Jan 19 13:54:19 2012 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 19 Jan 2012 13:54:19 -0600 Subject: [Swift-devel] timeout on OSG with coasters provider staging In-Reply-To: References: <1326742704.20900.0.camel@blabla> Message-ID: Mihael, I have the logs now. Filed as bug 690: https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=690 Regards, Ketan On Mon, Jan 16, 2012 at 2:24 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Mihael, > > Please find service log here: > http://ci.uchicago.edu/~ketan/swift.log.tar.gz > > worker logs seems to have lost. I'll see if I can find'em. > > Regards, > Ketan > > On Mon, Jan 16, 2012 at 1:38 PM, Mihael Hategan wrote: > >> Nothing interesting there. Do you also happen to have the service and >> worker logs? >> >> On Mon, 2012-01-16 at 11:05 -0600, Ketan Maheshwari wrote: >> > Hi Mihael, >> > >> > >> > I could reproduce this timeout exception on OSG with catsn Swift jobs. >> > >> > >> > These are 100 jobs with a data size of 10MB each. So, 2000MB of data >> > movement in all. >> > >> > >> > I tried with 1 worker running on a single OSG site. I tried three >> > different OSG sites: Nebraska, UChicago and RENCI. >> > >> > >> > In each of these cases, I run into the following timeout after ~4 >> > minutes of run (15-70 jobs complete during this period) . : >> > >> > >> > Timeout >> > org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562, >> > PUT): timed out receiving request. 
Last time 940817-011255.807, now: >> > 120115-194100.072 >> > at >> > >> org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124) >> > at >> > >> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131) >> > at >> > >> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123) >> > at >> > >> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116) >> > at java.util.TimerThread.mainLoop(Timer.java:512) >> > at java.util.TimerThread.run(Timer.java:462) >> > Command(168, SUBMITJOB): handling reply timeout; >> > sendReqTime=120115-193900.255, sendTime=120115-193900.255, >> > now=120115-194100.416, channel=SC-null >> > >> > >> > This is followed by messages similar to the above last line but the >> > progress of workflow halts. >> > >> > >> > Here is the tarball of the >> > experiment: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz >> > >> > >> > It contains a README which has the steps to run: basically >> > start-service on localhost -> start worker on OSG site -> run swift >> > >> > >> > Regards, >> > -- >> > Ketan >> > >> > >> > >> >> >> > > > -- > Ketan > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Thu Jan 19 17:22:19 2012 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 19 Jan 2012 17:22:19 -0600 Subject: [Swift-devel] timeout on OSG with coasters provider staging In-Reply-To: References: <1326742704.20900.0.camel@blabla> Message-ID: Here is another worker log this one is for a real SCEC run: ci.uchicago.edu/~ketan/timeout_worker_log_scec.txt On Thu, Jan 19, 2012 at 1:54 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Mihael, > > I have the logs now. 
Filed as bug 690: > > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=690 > > Regards, > Ketan > > On Mon, Jan 16, 2012 at 2:24 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com> wrote: > >> Mihael, >> >> Please find service log here: >> http://ci.uchicago.edu/~ketan/swift.log.tar.gz >> >> worker logs seems to have lost. I'll see if I can find'em. >> >> Regards, >> Ketan >> >> On Mon, Jan 16, 2012 at 1:38 PM, Mihael Hategan wrote: >> >>> Nothing interesting there. Do you also happen to have the service and >>> worker logs? >>> >>> On Mon, 2012-01-16 at 11:05 -0600, Ketan Maheshwari wrote: >>> > Hi Mihael, >>> > >>> > >>> > I could reproduce this timeout exception on OSG with catsn Swift jobs. >>> > >>> > >>> > These are 100 jobs with a data size of 10MB each. So, 2000MB of data >>> > movement in all. >>> > >>> > >>> > I tried with 1 worker running on a single OSG site. I tried three >>> > different OSG sites: Nebraska, UChicago and RENCI. >>> > >>> > >>> > In each of these cases, I run into the following timeout after ~4 >>> > minutes of run (15-70 jobs complete during this period) . : >>> > >>> > >>> > Timeout >>> > org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562, >>> > PUT): timed out receiving request. 
Last time 940817-011255.807, now: >>> > 120115-194100.072 >>> > at >>> > >>> org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124) >>> > at >>> > >>> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131) >>> > at >>> > >>> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123) >>> > at >>> > >>> org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116) >>> > at java.util.TimerThread.mainLoop(Timer.java:512) >>> > at java.util.TimerThread.run(Timer.java:462) >>> > Command(168, SUBMITJOB): handling reply timeout; >>> > sendReqTime=120115-193900.255, sendTime=120115-193900.255, >>> > now=120115-194100.416, channel=SC-null >>> > >>> > >>> > This is followed by messages similar to the above last line but the >>> > progress of workflow halts. >>> > >>> > >>> > Here is the tarball of the >>> > experiment: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz >>> > >>> > >>> > It contains a README which has the steps to run: basically >>> > start-service on localhost -> start worker on OSG site -> run swift >>> > >>> > >>> > Regards, >>> > -- >>> > Ketan >>> > >>> > >>> > >>> >>> >>> >> >> >> -- >> Ketan >> >> >> > > > -- > Ketan > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Fri Jan 20 03:40:19 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 20 Jan 2012 01:40:19 -0800 Subject: [Swift-devel] timeout on OSG with coasters provider staging In-Reply-To: References: <1326742704.20900.0.camel@blabla> Message-ID: <1327052419.3479.2.camel@blabla> Thanks! Most of the coaster staging problems seem to be between the worker and the service, so those are most likely the most important logs for these issues. 
Mihael On Thu, 2012-01-19 at 17:22 -0600, Ketan Maheshwari wrote: > > Here is another worker log this one is for a real SCEC run: > > > ci.uchicago.edu/~ketan/timeout_worker_log_scec.txt > > On Thu, Jan 19, 2012 at 1:54 PM, Ketan Maheshwari > wrote: > Mihael, > > > I have the logs now. Filed as bug 690: > > > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=690 > > Regards, > Ketan > > > On Mon, Jan 16, 2012 at 2:24 PM, Ketan Maheshwari > wrote: > Mihael, > > > Please find service log here: > http://ci.uchicago.edu/~ketan/swift.log.tar.gz > > worker logs seems to have lost. I'll see if I can > find'em. > > Regards, > Ketan > > > On Mon, Jan 16, 2012 at 1:38 PM, Mihael Hategan > wrote: > Nothing interesting there. Do you also happen > to have the service and > worker logs? > > > On Mon, 2012-01-16 at 11:05 -0600, Ketan > Maheshwari wrote: > > Hi Mihael, > > > > > > I could reproduce this timeout exception on > OSG with catsn Swift jobs. > > > > > > These are 100 jobs with a data size of 10MB > each. So, 2000MB of data > > movement in all. > > > > > > I tried with 1 worker running on a single > OSG site. I tried three > > different OSG sites: Nebraska, UChicago and > RENCI. > > > > > > In each of these cases, I run into the > following timeout after ~4 > > minutes of run (15-70 jobs complete during > this period) . : > > > > > > Timeout > > > org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562, > > PUT): timed out receiving request. 
Last time > 940817-011255.807, now: > > 120115-194100.072 > > at > > > org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124) > > at > > > org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131) > > at > > > org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123) > > at > > > org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116) > > at > java.util.TimerThread.mainLoop(Timer.java:512) > > at java.util.TimerThread.run(Timer.java:462) > > Command(168, SUBMITJOB): handling reply > timeout; > > sendReqTime=120115-193900.255, > sendTime=120115-193900.255, > > now=120115-194100.416, channel=SC-null > > > > > > This is followed by messages similar to the > above last line but the > > progress of workflow halts. > > > > > > Here is the tarball of the > > experiment: > http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz > > > > > > It contains a README which has the steps to > run: basically > > start-service on localhost -> start worker > on OSG site -> run swift > > > > > > Regards, > > -- > > Ketan > > > > > > > > > > > > > > -- > Ketan > > > > > > > > -- > Ketan > > > > > > > > -- > Ketan > > > From hategan at mcs.anl.gov Mon Jan 23 15:04:19 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 23 Jan 2012 13:04:19 -0800 Subject: [Swift-devel] merging Message-ID: <1327352659.23014.1.camel@blabla> Hi, So I merged 0.93 to 0.93.1. The merge to trunk seems to require a bit more work because there are some conflicts that need to be solved manually. Mihael From wilde at mcs.anl.gov Wed Jan 25 08:33:41 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 25 Jan 2012 08:33:41 -0600 (CST) Subject: [Swift-devel] Progress on Bug 690? 
- Re: timeout on OSG with coasters provider staging In-Reply-To: Message-ID: <811412679.176148.1327502021856.JavaMail.root@zimbra.anl.gov> Mihael, Ketan, can you send an update on this, and escalate the priority of resolving this problem? A resolution is needed rather urgently for the ExTENCI project. Mihael, do you know where the problem lies, and have a strategy for a fix? Thanks, - Mike ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Mihael Hategan" > Cc: "Swift Devel" > Sent: Thursday, January 19, 2012 5:22:19 PM > Subject: Re: [Swift-devel] timeout on OSG with coasters provider staging > Here is another worker log this one is for a real SCEC run: > > > ci.uchicago.edu/~ketan/timeout_worker_log_scec.txt > > > On Thu, Jan 19, 2012 at 1:54 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com > wrote: > > > Mihael, > > > I have the logs now. Filed as bug 690: > > > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=690 > > Regards, > Ketan > > > > > > On Mon, Jan 16, 2012 at 2:24 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com > wrote: > > > Mihael, > > > Please find service log here: > http://ci.uchicago.edu/~ketan/swift.log.tar.gz > > worker logs seems to have lost. I'll see if I can find'em. > > Regards, > Ketan > > > > > > On Mon, Jan 16, 2012 at 1:38 PM, Mihael Hategan < hategan at mcs.anl.gov > > wrote: > > > Nothing interesting there. Do you also happen to have the service and > worker logs? > > > > > On Mon, 2012-01-16 at 11:05 -0600, Ketan Maheshwari wrote: > > Hi Mihael, > > > > > > I could reproduce this timeout exception on OSG with catsn Swift > > jobs. > > > > > > These are 100 jobs with a data size of 10MB each. So, 2000MB of data > > movement in all. > > > > > > I tried with 1 worker running on a single OSG site. I tried three > > different OSG sites: Nebraska, UChicago and RENCI. > > > > > > In each of these cases, I run into the following timeout after ~4 > > minutes of run (15-70 jobs complete during this period) . 
: > > > > > > Timeout > > org.globus.cog.karajan.workflow.service.TimeoutException: > > Handler(562, > > PUT): timed out receiving request. Last time 940817-011255.807, now: > > 120115-194100.072 > > at > > org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124) > > at > > org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131) > > at > > org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123) > > at > > org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116) > > at java.util.TimerThread.mainLoop(Timer.java:512) > > at java.util.TimerThread.run(Timer.java:462) > > Command(168, SUBMITJOB): handling reply timeout; > > sendReqTime=120115-193900.255, sendTime=120115-193900.255, > > now=120115-194100.416, channel=SC-null > > > > > > This is followed by messages similar to the above last line but the > > progress of workflow halts. > > > > > > Here is the tarball of the > > experiment: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz > > > > > > It contains a README which has the steps to run: basically > > start-service on localhost -> start worker on OSG site -> run swift > > > > > > Regards, > > -- > > Ketan > > > > > > > > > > > > > -- > Ketan > > > > > > > -- > Ketan > > > > > > > -- > Ketan > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Wed Jan 25 13:15:12 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 25 Jan 2012 11:15:12 -0800 Subject: [Swift-devel] Progress on Bug 690? 
- Re: timeout on OSG with coasters provider staging
In-Reply-To: <811412679.176148.1327502021856.JavaMail.root@zimbra.anl.gov>
References: <811412679.176148.1327502021856.JavaMail.root@zimbra.anl.gov>
Message-ID: <1327518912.3840.2.camel@blabla>

Sorry. I was with the sshcl provider and the merging. I'll have to look at it this weekend.

On Wed, 2012-01-25 at 08:33 -0600, Michael Wilde wrote:
> Mihael, Ketan, can you send an update on this, and escalate the priority of resolving this problem?
>
> A resolution is needed rather urgently for the ExTENCI project.
>
> Mihael, do you know where the problem lies, and have a strategy for a fix?
>
> Thanks,
>
> - Mike
>
> ----- Original Message -----
> > From: "Ketan Maheshwari"
> > To: "Mihael Hategan"
> > Cc: "Swift Devel"
> > Sent: Thursday, January 19, 2012 5:22:19 PM
> > Subject: Re: [Swift-devel] timeout on OSG with coasters provider staging
> > Here is another worker log this one is for a real SCEC run:
> >
> > ci.uchicago.edu/~ketan/timeout_worker_log_scec.txt
> >
> > On Thu, Jan 19, 2012 at 1:54 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com > wrote:
> >
> > Mihael,
> >
> > I have the logs now. Filed as bug 690:
> >
> > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=690
> >
> > Regards,
> > Ketan
> >
> > On Mon, Jan 16, 2012 at 2:24 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com > wrote:
> >
> > Mihael,
> >
> > Please find service log here:
> > http://ci.uchicago.edu/~ketan/swift.log.tar.gz
> >
> > worker logs seems to have lost. I'll see if I can find'em.
> >
> > Regards,
> > Ketan
> >
> > On Mon, Jan 16, 2012 at 1:38 PM, Mihael Hategan < hategan at mcs.anl.gov > wrote:
> >
> > Nothing interesting there. Do you also happen to have the service and worker logs?
> >
> > On Mon, 2012-01-16 at 11:05 -0600, Ketan Maheshwari wrote:
> > > Hi Mihael,
> > >
> > > I could reproduce this timeout exception on OSG with catsn Swift jobs.
> > >
> > > These are 100 jobs with a data size of 10MB each. So, 2000MB of data movement in all.
> > >
> > > I tried with 1 worker running on a single OSG site. I tried three different OSG sites: Nebraska, UChicago and RENCI.
> > >
> > > In each of these cases, I run into the following timeout after ~4 minutes of run (15-70 jobs complete during this period) . :
> > >
> > > Timeout
> > > org.globus.cog.karajan.workflow.service.TimeoutException: Handler(562, PUT): timed out receiving request. Last time 940817-011255.807, now: 120115-194100.072
> > > at org.globus.cog.karajan.workflow.service.handlers.RequestHandler.handleTimeout(RequestHandler.java:124)
> > > at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:131)
> > > at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel.checkTimeouts(AbstractKarajanChannel.java:123)
> > > at org.globus.cog.karajan.workflow.service.channels.AbstractKarajanChannel$1.run(AbstractKarajanChannel.java:116)
> > > at java.util.TimerThread.mainLoop(Timer.java:512)
> > > at java.util.TimerThread.run(Timer.java:462)
> > > Command(168, SUBMITJOB): handling reply timeout; sendReqTime=120115-193900.255, sendTime=120115-193900.255, now=120115-194100.416, channel=SC-null
> > >
> > > This is followed by messages similar to the above last line but the progress of workflow halts.
> > >
> > > Here is the tarball of the experiment: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz
> > >
> > > It contains a README which has the steps to run: basically start-service on localhost -> start worker on OSG site -> run swift
> > >
> > > Regards,
> > > --
> > > Ketan
> >
> > --
> > Ketan
> >
> > --
> > Ketan
> >
> > --
> > Ketan
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From ketancmaheshwari at gmail.com Wed Jan 25 13:54:58 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Wed, 25 Jan 2012 13:54:58 -0600
Subject: [Swift-devel] merging
In-Reply-To: <1327352659.23014.1.camel@blabla>
References: <1327352659.23014.1.camel@blabla>
Message-ID:

On Mon, Jan 23, 2012 at 3:04 PM, Mihael Hategan wrote:
> Hi,
>
> So I merged 0.93 to 0.93.1.
>
Does it mean the gridFTP provider staging feature is available on 0.93 now?

>
> The merge to trunk seems to require a bit more work because there are some conflicts that need to be solved manually.
>
> Mihael
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

-- Ketan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hategan at mcs.anl.gov Wed Jan 25 16:18:33 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 25 Jan 2012 14:18:33 -0800
Subject: [Swift-devel] merging
In-Reply-To:
References: <1327352659.23014.1.camel@blabla>
Message-ID: <1327529913.5868.7.camel@blabla>

On Wed, 2012-01-25 at 13:54 -0600, Ketan Maheshwari wrote:
>
> On Mon, Jan 23, 2012 at 3:04 PM, Mihael Hategan
> wrote:
> Hi,
>
> So I merged 0.93 to 0.93.1.
> > Does it mean the gridFTP provider staging feature is available on 0.93 > now? No. 0.93 has been released as far as I am concerned, and that means it's not going to change. What it means is that bug fixes from 0.93 are also now in 0.93.1 Mihael From iraicu at cs.iit.edu Wed Jan 25 21:04:23 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Wed, 25 Jan 2012 21:04:23 -0600 Subject: [Swift-devel] Fwd: Reminder of CFP for SIGMOD workshop SWEET'12 References: Message-ID: <489F81AB-1789-4D56-8E5A-B8DB815A1DDA@cs.iit.edu> Hi all, I think this workshop seems relevant to the Swift community. Cheers, Ioan -- ================================================ Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================ Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================ Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================ ================================================ Begin forwarded message: > From: Jan Hidders > Date: January 25, 2012 4:02:18 PM CST > To: iraicu at cs.iit.edu > Subject: Reminder of CFP for SIGMOD workshop SWEET'12 > > Dear Ioan Raicu, > > Given your expertise in the relevant area we would like to remind you of the SIGMOD workshop SWEET'12 on scalable workflow enactment engines and technologies. Enclosed you will find the final call for papers. Please note that the submission deadline, 19 February, is rapidly approaching. We hope you will have the opportunity to submit a high quality paper. 
> > On behalf of the organizers, > > Jan Hidders, TU Delft, The Netherlands > Paolo Missier, Newcastle University, UK > Jacek Sroka, University of Warsaw, Poland > > > > ************************* > * Final Call for Papers * > ************************* > > SWEET'12 > 1st International Workshop on Scalable Workflow Enactment Engines and Technologies > http://sites.google.com/site/sweetworkshop2012 > inquiries: sweet2012 at easychair.org > > Held in conjunction with SIGMOD 2012 > Scottsdale, Arizona, USA, May 20, 2012 > http://www.sigmod.org/2012/ > > ---------------- > IMPORTANT DATES: > ---------------- > Papers submission deadline: February, 19th, 2012 > Authors notification: April 8th > Deadline for camera-ready copy: May 13th > Workshop: May 20 > > ----- > FOCUS > ----- > The goal of the workshop is to bring together researchers and practitioners to explore the potential of cloud-based computing in facilitating the convergence between workflows and large-scale data processing. Concretely, the workshop is expected to provide insight into: > > - performance issues: efficient data processing using cloud-based workflows, > - modelling issues: best practices in data-intensive workflow modelling and enactment, > - support technology issues: how the potential synergy between large-scale data processing and workflow technology can be exploited in a principled way. > > The workshop aims to address issues of (i) Architecture, (ii) Models and Languages, (iii) Applications of cloud-based workflows. Specific topics include (but, as usual, are not limited to): > > Architectures: > + cloud-based, scalable workflow enactment architectures, > + efficient data storage for data-intensive workflows, > + optimizing execution of data-intensive workflows, > + workflow scheduling in cloud computing. 
> > Models, Languages: > + languages for data-intensive workflows, data processing pipelines and data-mashups, > + verification and validation of data-intensive workflows, > + programming models for cloud computing, > + access control and authorisation models, privacy, security, risk and trust issues, > + workflow patterns for data-intensive workflows. > > Applications of cloud-based workflow: > + bioinformatics, > + data mashups, > + semantic web data management, > + big data analytics. > > ---------------- > SUBMISSION GUIDELINES > ---------------- > We invite full research or experience papers (up to 12 pages), or short papers (up to 6 pages) describing research in progress, > formatted using the ACM proceedings style (http://www.acm.org/sigs/publications/proceedings-templates) > > ---------------- > PUBLICATION > ---------------- > The workshop proceedings will be part published by CEUR and will be included in the ACM DL. > > In addition, we have an agreement with the Fundamenta Informaticae journal to fast-track a few selected paper for further publication. > > --------------------------- > KEYNOTE > --------------------------- > Dr. Pawel Garbacki from Google Inc.: "Data Processing at Scale" > > --------------------------- > CHAIRS > --------------------------- > Jan Hidders, TU Delft, The Netherlands > Jacek Sroka University of Warsaw, Poland > Paolo Missier, Newcastle University, UK > > > --------------------------- > Program Committee > --------------------------- > > Sarah Cohen-Boulakia, LRI, Universite Paris-Sud, France > Juliana Freire, NYU Poly, USA > Khalid Belhajjame, University of Manchester, UK > Vasa Curcin, Imperial college, London, UK > Paul Groth, VU University Amsterdam, NL > Paul Watson, Newcastle University, UK > Hugo Hiden, Newcastle University, UK > Matthew Jones, University of California Santa Barbara, USA > Bertram Ludaescher, UC Davis, USA > Marta Mattoso, COPPE- Federal Univ. 
Rio de Janeiro, Brasil > Norman Paton, University of Manchester, UK > Jelena Pjesivac-Grbovic, Google, USA > Benjamin Reed, Yahoo! Research > Yogesh Simmhan, University of Southern California, USA > Krzysztof Stencel, University of Warsaw, Poland > Wei Tan, J.T. Watson IBM Research, USA > Giovanni Tummarello, DERI, National University of Ireland Galway, Ireland > Jerzy Tyszkiewicz, Institute of Informatics, Warsaw University, PL > Jan Van Den Bussche, Hasselt University & Transnational University of Limburg, Belgium > Aad Van Moorsel, Newcastle University, UK, USA > Simon Woodman, Newcastle University, UK > Suraj Pandey, University of Melbourne, Australia > Jianwu Wang, University of California, San Diego, USA -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Wed Jan 25 21:22:18 2012 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Wed, 25 Jan 2012 21:22:18 -0600 Subject: [Swift-devel] Progress on Bug 690? - Re: timeout on OSG with coasters provider staging In-Reply-To: <1327518912.3840.2.camel@blabla> References: <811412679.176148.1327502021856.JavaMail.root@zimbra.anl.gov> <1327518912.3840.2.camel@blabla> Message-ID: I could reproduce the bug going from bridled to mcs with the same configuration. I am seeing 2 timeouts: one is the HEARTBEAT and other similar timeout messages and second is the register timeout message when trying to start a worker after about a gap of 5 minutes. This is a very similar scenario to OSG since the workers will only start after a delay (often long). The exact message is: Failed to register (timeout) So, Mihael, if you try the catsn example that I sent you from any machine to mcs workstations, you should be able to see the symptoms. 
Following are the config etc files that you could use:

====config======
wrapperlog.always.transfer=false
sitedir.keep=true
execution.retries=0
lazy.errors=false
status.mode=provider
use.provider.staging=true
provider.staging.pin.swiftfiles=false
foreach.max.threads=200
==========

=====sites.xml=====
passive 1 0.02 10000 proxy DEBUG /tmp/ketan
==============

====tc======
grid cat /bin/cat null null null
======

The catsn example tarball is here: http://ci.uchicago.edu/~ketan/catsn-exp-formihael.tgz

Regards,
Ketan

-- Ketan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ketancmaheshwari at gmail.com Wed Jan 25 23:57:30 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Wed, 25 Jan 2012 23:57:30 -0600
Subject: [Swift-devel] Progress on Bug 690? - Re: timeout on OSG with coasters provider staging
In-Reply-To:
References: <811412679.176148.1327502021856.JavaMail.root@zimbra.anl.gov> <1327518912.3840.2.camel@blabla>
Message-ID:

I further tried a real scec workflow on all 10 mcs machines from Bridled. I did get timeout exception. Please find workerlogs from all mcs machines here: http://www.mcs.anl.gov/~ketan/mcsworkerlogs.tar.gz

-- Ketan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wilde at mcs.anl.gov Thu Jan 26 15:33:53 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 26 Jan 2012 15:33:53 -0600 (CST)
Subject: [Swift-devel] Need pointers to BG/P info for Swift execution
Message-ID: <1216945426.183702.1327613633637.JavaMail.root@zimbra.anl.gov>

Justin, Zhao, All,

Its been a long time since I have personally run Swift scripts on the BG/Ps.

I am trying to help Jon get started on the BG/P, but I have forgotten some of the basics. Can you point out the info a Swift BG/P user needs regarding:

- how to make sure that workers and the swift app() programs they launch have the right Linux environment (ie, full bash, full env, not limited "busybox" tools)

- how to ssh/telnet to worker nodes

- anything else different from a normal cluster?
- tips for working with Cobalt Thanks, - Mike -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From iraicu at cs.iit.edu Thu Jan 26 15:59:41 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Thu, 26 Jan 2012 15:59:41 -0600 Subject: [Swift-devel] CFP: The 9th Int. Conf. on Autonomic Computing (ICAC 2012) -- San Jose CA Message-ID: <4F21CCCD.6070106@cs.iit.edu> CALL FOR PAPERS and WORKSHOP PROPOSALS The 9th International Conference on Autonomic Computing (ICAC 2012) September 17-21, 2012. San Jose, CA, USA http://icac2012.cs.fiu.edu/ ----------------------------------------------------------------- IMPORTANT DATES Paper and Poster Submission: March 9, 2012, 11:59pm PST Notification: May 18, 2012 Camera-ready Due: June 8, 2012 Workshop Proposal Submission: February 10, 2012 ----------------------------------------------------------------- OVERVIEW ICAC is the leading conference on autonomic computing techniques, foundations, and applications. Autonomic computing refers to methods and means for automated management of performance, fault, security, and configuration with little involvement of users or administrators. Systems introducing new autonomic features are becoming increasingly prevalent, motivating research that spans a variety of areas, from computer systems, networking, software engineering, and data management to machine learning, control theory, and bio-inspired computing. ICAC brings together researchers and practitioners across these disciplines to address multiple facets of adaptation and self-management in computing systems and applications from different perspectives. Autonomic computing solutions are sought for clouds, grids, data centers, enterprise software, internet services, data services, smart phones, embedded systems, and sensor networks. 
In these environments, resources and applications must be managed to maximize performance and minimize cost, while maintaining predictable and reliable behavior in the face of varying workloads, failures, and malicious threats. Papers are solicited from all areas of autonomic computing, including (but not limited to): * End-to-end techniques for management of resources, workloads, performance, faults, power/cooling, security, and others. * Self-managing components, such as server, storage, network protocols, or specific application elements, and embedded and mobile end systems such as smart phones. * Decision and analysis techniques and their use, such as machine learning, control theory, predictive methods, probability and stochastic processes, queuing theory methodologies, emergent behavior, rule-based systems, and bio-inspired techniques. * Monitoring systems for autonomic computing. * Hypervisor, operating systems, hardware, or application support for autonomic computing. * Novel human interfaces for monitoring and controlling autonomic systems. * Management topics, such as specification and modeling of service-level agreements, behavior enforcement and tie-in with IT governance. * Toolkits, frameworks, principles and architectures, from software engineering practices and experimental methodologies to agent-based techniques and virtualization. * Fundamental science and theory of self-managing systems: understanding, controlling or exploiting system behaviors to enforce autonomic properties. * Applications of autonomic computing and experiences with prototyped or deployed systems solving real-world problems in science, engineering, business and society. Papers will be judged on originality, significance, interest, correctness, clarity and relevance to the broader community. Papers should report on experiences, measurements, user studies, or other evaluations, as appropriate. 
Evaluations of a prototype or large-scale deployment of systems and applications is expected. PAPER AND POSTER SUBMISSIONS Full papers (a maximum of 10 pages in the two-column ACM proceedings format) and posters (2 pages) are invited on a wide variety of topics relating to autonomic computing. Submitted papers must be original work, and may not be under consideration for another conference or journal. Complete formatting and submission instructions can be found on the conference web site. Accepted papers and posters will appear in proceedings distributed at the conference and available electronically. Relevant top ICAC'12 papers will be invited for "fast-track" submissions to the ACM Transactions on Autonomous and Adaptive Systems (TAAS). WORKSHOPS, DEMONSTRATIONS AND EXHIBITION ICAC'12 welcomes proposals for co-located workshops on topics of interest to the autonomic computing community. Workshop proposals should be submitted to the Workshop Chair, Fred Douglis (f.douglis at computer.org) by February 10, 2012. Workshops are expected to publish proceedings, and should cover areas that complement the main program. ICAC'12 will also feature a demonstration and exhibition session consisting of prototypes and technology artifacts such as demonstrating autonomic software or autonomic computing principles. Entries will be judged by a separate committee led by the demo/exhibit chair. INDUSTRY SESSION One of ICAC's important roles is to bring together researchers and practitioners from academia and industry. In its industry session, ICAC helps fulfill this role by presenting an industry viewpoint on technologies, products, and market needs. The industry session also addresses current challenges, and opportunities for academic and corporate research collaborations. 
We encourage industry leaders, including entrepreneurs, product developers, architects, managers, marketers and end users, to submit their papers and posters reflecting such industry perspectives as part of the regular submission process.

------------------------------------------------------------------
ORGANIZERS

GENERAL CHAIR
Dejan Milojicic, HP Labs

PROGRAM CHAIRS
Dongyan Xu, Purdue University
Vanish Talwar, HP Labs

INDUSTRY CHAIR
Xiaoyun Zhu, VMware

WORKSHOPS CHAIR
Fred Douglis, EMC

POSTERS/DEMO/EXHIBITS CHAIR
Eno Thereska, Microsoft Research

FINANCE CHAIR
Michael Kozuch, Intel

LOCAL ARRANGEMENT CHAIR
Jessica Blaine

PUBLICITY CHAIRS
Daniel Batista, University of São Paulo
Vartan Padaryan, ISP/Russian Academy of Sci.
Ioan Raicu, Illinois Inst. of Technology
Jianfeng Zhan, ICT/Chinese Academy of Sci.
Ming Zhao, Florida Intl. University

PROGRAM COMMITTEE
Tarek Abdelzaher, UIUC
Umesh Bellur, IIT, Bombay
Ken Birman, Cornell University
Rajkumar Buyya, Univ. of Melbourne
Rocky Chang, Hong Kong Polytechnic University
Yuan Chen, HP Labs
Alva Couch, Tufts University
Peter Dinda, Northwestern University
Fred Douglis, EMC
Renato Figueiredo, University of Florida
Mohamed Hefeeda, Qatar Computing Research Institute
Joe Hellerstein, Google
Geoff Jiang, NEC Labs
Jeff Kephart, IBM Research
Emre Kiciman, Microsoft Research
Fabio Kon, University of São Paulo
Michael Kozuch, Intel
Dejan Milojicic, HP Labs
Klara Nahrstedt, UIUC
Priya Narasimhan, CMU
Manish Parashar, Rutgers University
Ioan Raicu, Illinois Inst. of Technology
Omer Rana, Cardiff University
Masoud Sadjadi, Florida Intl. University
Rick Schlichting, AT&T Labs
Hartmut Schmeck, KIT
Karsten Schwan, Georgia Tech
Onn Shehory, IBM Research
Eno Thereska, Microsoft Research
Xiaoyun Zhu, VMware

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From iraicu at cs.iit.edu Thu Jan 26 16:35:52 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Thu, 26 Jan 2012 16:35:52 -0600 Subject: [Swift-devel] CFP: 3rd Cloud Futures Workshop 2012: Hot Topics in Research and Education -- Berkeley CA Message-ID: <4F21D548.6040606@cs.iit.edu> *3rd Cloud Futures Workshop 2012*: Hot Topics in Research and Education May 7--8, 2012 | Berkeley, California, United States *Website:*http://research.microsoft.com/en-US/events/cloudfutures2012/ Cloud computing is an exciting platform for research and education. It has the potential to advance scientific and technological progress by making data and computing resources readily available at unprecedented economy of scale and nearly infinite scalability. To realize the full promise of cloud computing for research and education, however, we must think about the cloud as a holistic platform for creating new services, new experiences, and new methods to pursue research, teaching, and scholarly communication. This goal presents a broad range of interesting questions. The Cloud Futures Workshop series brings together thought leaders from academia, industry, and government to discuss the role of cloud computing across a variety of research and educational areas. 
Presentations, posters and discussions will investigate how new programming techniques, software platforms, software engineering and methodology, and methods of research and teaching in the cloud may solve distinct challenges arising in diverse areas of society.

*General Co-Chairs*
Michael J Franklin, University of California, Berkeley
Tony Hey, Microsoft Research

*Keynote speakers*
Joseph L Hellerstein, Manager, Big Science, Google Inc.
Yousef Khalidi, Distinguished Engineer, Microsoft Corporation

*Call for Abstracts and Participation*
This year, we are looking for extended abstracts on hot topics in cloud computing to be presented at the workshop, either as talks or posters. Abstracts should highlight how new techniques and methods of research in the cloud may solve distinct challenges arising in diverse areas, including computer science, engineering, earth sciences, healthcare, humanities, interactive games, life sciences, and social sciences. We encourage abstracts that describe practical experiences, experimental results, and vision papers. All papers will be peer-reviewed.

*Submission Instructions*
- Submit abstracts of five pages, including references.
- Your submission should include a bio (150 words maximum).
- Submit your abstracts through the online form.

*Important Dates*
- Abstracts due: February 29, 2012
- Results available: March 23, 2012
- Workshop: May 7-8, 2012

*About the Workshop*
The Cloud Futures 2012 workshop is a joint venture between the Microsoft Research Connections, Azure Research Engagement, and Developer & Platform Evangelism Academic groups, and is in association with and co-supported by the University of California, Berkeley.

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================
=================================================================

From zhaozhang at uchicago.edu Fri Jan 27 12:54:03 2012
From: zhaozhang at uchicago.edu (ZHAO ZHANG)
Date: Fri, 27 Jan 2012 12:54:03 -0600
Subject: [Swift-devel] Need pointers to BG/P info for Swift execution
In-Reply-To: <1216945426.183702.1327613633637.JavaMail.root@zimbra.anl.gov>
References: <1216945426.183702.1327613633637.JavaMail.root@zimbra.anl.gov>
Message-ID: <4F22F2CB.9060608@uchicago.edu>

Hi, Mike

I am cc'ing Jon here. I am not sure how Swift is currently configured on BG/P.
Here are some instructions for running things there.

On 1/26/2012 3:33 PM, Michael Wilde wrote:
> Justin, Zhao, All,
>
> It's been a long time since I have personally run Swift scripts on the BG/Ps.
>
> I am trying to help Jon get started on the BG/P, but I have forgotten some of the basics. Can you point out the info a Swift BG/P user needs regarding:
>
> - how to make sure that workers and the swift app() programs they launch have the right Linux environment (i.e., full bash, full env, not limited "busybox" tools)

Please see the .sh file in the attachment. To run it, execute
"cqsub -p MTCScienceApps -q prod-devel -k zepto-vn-eval $PATH/cnip-start.sh"
on either challenger or surveyor.

> - how to ssh/telnet to worker nodes

On surveyor and intrepid, run "cqstat | grep running" or "cqstat -f | grep running", which gives output such as

467064 toussain 02:00:00 512 running ANL-R03-M0-512

Running "/soft/apps/ZeptoOS/bin/listip ANL-R11-M0-512" returns the list of the IO nodes:

172.16.5.9
172.16.5.10
172.16.5.11
172.16.5.12
172.16.5.13
172.16.5.14
172.16.5.15
172.16.5.16

Then you can ssh to those IO nodes.

On Challenger, it is a bit different. Given a job status such as

467717 felker 00:30:00 64 running CHR-R00-M1-N08-64

running "nslookup R00-M1-N08-J00" returns the IP of the IO node, 172.16.9.49 in this case.

Each IO node has 64 compute nodes attached to it. Their IP addresses run from 192.168.1.1 to 192.168.1.64. From the compute nodes' point of view, the IO node's address is 192.168.1.254. We have to telnet to those compute nodes.

The first couple of lines of cnip-start.sh set the IP on the compute nodes. That is the IP on the torus network, which is a global network across all compute nodes within a single allocation.

best
zhao

> - anything else different from a normal cluster?
>
> - tips for working with Cobalt
>
> Thanks,
>
> - Mike

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cnip-start.sh
URL:

From wozniak at mcs.anl.gov Fri Jan 27 13:09:05 2012
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Fri, 27 Jan 2012 13:09:05 -0600 (Central Standard Time)
Subject: [Swift-devel] Need pointers to BG/P info for Swift execution
In-Reply-To: <4F22F2CB.9060608@uchicago.edu>
References: <1216945426.183702.1327613633637.JavaMail.root@zimbra.anl.gov> <4F22F2CB.9060608@uchicago.edu>
Message-ID:

There are some things on the old wiki site:

http://www.ci.uchicago.edu/wiki/bin/view/SWFT/WebHome

including:

http://www.ci.uchicago.edu/wiki/bin/view/SWFT/BgpCookbook

As we improve these notes, we should move everything to:

https://sites.google.com/site/swiftdevel/sites

Here are my notes from performance runs on the BG/P:

http://www.ci.uchicago.edu/wiki/bin/view/SWFT/PerformanceNotes

All of those tests are in my:

https://svn.mcs.anl.gov/repos/wozniak/collab/cdm

which I can share if you're interested.

Justin

--
Justin M Wozniak

From hategan at mcs.anl.gov Sat Jan 28 23:01:54 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 28 Jan 2012 21:01:54 -0800
Subject: [Swift-devel] merge 0.93 -> trunk
Message-ID: <1327813314.27724.0.camel@blabla>

Did the merge. I still need to do some sanity checks, so it may be shaky at the moment.

Mihael

From wilde at mcs.anl.gov Sun Jan 29 10:27:18 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 29 Jan 2012 10:27:18 -0600 (CST)
Subject: [Swift-devel] merge 0.93 -> trunk
In-Reply-To: <1327813314.27724.0.camel@blabla>
Message-ID: <114784283.189433.1327854438865.JavaMail.root@zimbra.anl.gov>

Excellent - thanks!
David, can you tell us how the nightly tests in trunk were affected by the integration?

- Mike

----- Original Message -----
> From: "Mihael Hategan"
> To: "Swift Devel"
> Sent: Saturday, January 28, 2012 11:01:54 PM
> Subject: [Swift-devel] merge 0.93 -> trunk
> Did the merge. I still need to do some sanity checks, so it may be
> shaky
> at the moment.
>
> Mihael
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Sun Jan 29 13:25:23 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sun, 29 Jan 2012 13:25:23 -0600
Subject: [Swift-devel] ImageMagick on OSG
Message-ID:

ImageMagick is now installed on the following OSG sites:
(location: $OSG_APP/ImageMagick-install)

gate02.grid.umich.edu
cit-gatekeeper2.ultralight.org
fermigridosg1.fnal.gov
osggrid01.hep.wisc.edu
ce.grid.unesp.br
osg-nemo-ce.phys.uwm.edu
nys1.cac.cornell.edu
grid1.oscer.ou.edu
brgw1.renci.org
osg-ce.sprace.org.br
gk01.atlas-swt2.org
top.ucr.edu
osg-gw-4.t2.ucsd.edu
gluskap.phys.uconn.edu
osg.hpc.ufl.edu
umiss001.hep.olemiss.edu
gk04.swt2.uta.edu
ce1.accre.vanderbilt.edu

David, if you have modis ready, could you give it a try on one or two of these sites to see if it works.

Regards,
--
Ketan

From davidk at ci.uchicago.edu Sun Jan 29 16:07:14 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Sun, 29 Jan 2012 16:07:14 -0600 (CST)
Subject: [Swift-devel] merge 0.93 -> trunk
In-Reply-To: <114784283.189433.1327854438865.JavaMail.root@zimbra.anl.gov>
Message-ID: <1687246729.93896.1327874834624.JavaMail.root@zimbra-mb2.anl.gov>

It looks like the compile failed and the test did not run last night.
Here is the error I am getting:

[javac] /swift/swift-trunk/cog/modules/provider-coaster/src/org/globus/cog/abstraction/coaster/service/LocalTCPService.java:29: org.globus.cog.abstraction.coaster.service.LocalTCPService is not abstract and does not override abstract method registrationReceived(java.lang.String,java.lang.String,org.globus.cog.karajan.workflow.service.channels.KarajanChannel,java.util.Map) in org.globus.cog.abstraction.coaster.service.Registering
[javac] public class LocalTCPService extends GSSService implements Registering {
[javac] ^
[javac] /swift/swift-trunk/cog/modules/provider-coaster/src/org/globus/cog/abstraction/coaster/service/LocalTCPService.java:64: registrationReceived(java.lang.String,java.lang.String,java.lang.String,org.globus.cog.karajan.workflow.service.channels.ChannelContext,java.util.Map) in org.globus.cog.abstraction.coaster.service.RegistrationManager cannot be applied to (java.lang.String,java.lang.String,java.lang.String,org.globus.cog.karajan.workflow.service.channels.ChannelContext)
[javac] registrationManager.registrationReceived(blockid, wid, url, cc);
[javac] ^
[javac] Note: /swift/swift-trunk/cog/modules/provider-coaster/src/org/globus/cog/abstraction/coaster/service/job/manager/Block.java uses or overrides a deprecated API.
[javac] Note: Recompile with -Xlint:deprecation for details.
[javac] Note: /swift/swift-trunk/cog/modules/provider-coaster/src/org/globus/cog/abstraction/coaster/service/job/manager/BQPStatusHandler.java uses unchecked or unsafe operations.
[javac] Note: Recompile with -Xlint:unchecked for details.
[javac] 2 errors

BUILD FAILED
/swift/swift-trunk/cog/modules/swift/build.xml:73: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:445: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:79: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:52: The following error occurred while executing this line:
/swift/swift-trunk/cog/modules/swift/dependencies.xml:13: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:163: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:168: The following error occurred while executing this line:
/swift/swift-trunk/cog/modules/provider-coaster/build.xml:59: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:466: The following error occurred while executing this line:
/swift/swift-trunk/cog/mbuild.xml:229: Compile failed; see the compiler error output for details.

From hategan at mcs.anl.gov Sun Jan 29 18:15:44 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sun, 29 Jan 2012 16:15:44 -0800
Subject: [Swift-devel] merge 0.93 -> trunk
In-Reply-To: <1687246729.93896.1327874834624.JavaMail.root@zimbra-mb2.anl.gov>
References: <1687246729.93896.1327874834624.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <1327882544.4083.0.camel@blabla>

Maybe the checkout happened in the middle of a commit?

Is anybody seeing this with a clean checkout?

On Sun, 2012-01-29 at 16:07 -0600, David Kelly wrote:
> It looks like the compile failed and the test did not run last night.

From jonmon at mcs.anl.gov Mon Jan 30 08:45:06 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 30 Jan 2012 08:45:06 -0600
Subject: [Swift-devel] merge 0.93 -> trunk
In-Reply-To: <1327882544.4083.0.camel@blabla>
References: <1687246729.93896.1327874834624.JavaMail.root@zimbra-mb2.anl.gov> <1327882544.4083.0.camel@blabla>
Message-ID:

I am seeing the same error when trying to compile trunk.

On Jan 29, 2012, at 6:15 PM, Mihael Hategan wrote:

> Maybe the checkout happened in the middle of a commit?
>
> Is anybody seeing this with a clean checkout?
>
> On Sun, 2012-01-29 at 16:07 -0600, David Kelly wrote:
>> It looks like the compile failed and the test did not run last night.
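
For readers following the build failure in this thread: both javac errors read as if a registrationReceived method gained a trailing java.util.Map parameter in one place while an implementing class and a caller still used the older, shorter signature. The sketch below only illustrates that general failure mode. It is not the Swift/CoG source: the Channel class, the simplified Registering interface, the LocalService class, and all parameter names and values are stand-ins assumed for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class RegistrationSketch {
    /** Hypothetical stand-in for a channel type; not the real KarajanChannel. */
    static class Channel {}

    /** Simplified stand-in interface whose method takes a trailing Map of options. */
    interface Registering {
        String registrationReceived(String blockId, String url, Channel channel,
                                    Map<String, String> options);
    }

    /**
     * Compiles because the signature matches the interface exactly.
     * Removing the Map parameter here would give javac's
     * "is not abstract and does not override abstract method" error.
     */
    static class LocalService implements Registering {
        @Override
        public String registrationReceived(String blockId, String url, Channel channel,
                                           Map<String, String> options) {
            return blockId + "@" + url + " options=" + options.size();
        }
    }

    public static void main(String[] args) {
        Registering service = new LocalService();
        Map<String, String> options = new HashMap<>();
        options.put("workers", "4");
        // Callers must pass the extra argument too; leaving it out gives
        // javac's "cannot be applied to (...)" error at the call site.
        System.out.println(service.registrationReceived(
                "block-1", "http://localhost:1984", new Channel(), options));
        // An empty map is one way for an older caller to adapt.
        System.out.println(service.registrationReceived(
                "block-2", "http://localhost:1984", new Channel(),
                Collections.<String, String>emptyMap()));
    }
}
```

If this is what happened in trunk, the usual remedy is to update every implementer and every call site in the same commit as the interface change, which would also fit the suggestion above that a checkout may have caught a partial commit.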