From ketancmaheshwari at gmail.com Wed Jun 1 11:38:04 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Wed, 01 Jun 2011 11:38:04 -0500
Subject: [Swift-devel] SCEC postproc workflow unresponsive after first 2 tasks
Message-ID: <4DE66AEC.7000409@gmail.com>

Allan,

I tried to run the postproc workflow on the OSG whitelisted resources.
However, the workflow stops responding after completing the first two
tasks. The progress output just repeats like this:

Progress: Selecting site:248 Stage in:2 Finished successfully:2
Progress: Selecting site:248 Stage in:2 Finished successfully:2
Progress: Selecting site:248 Stage in:2 Finished successfully:2
Progress: Selecting site:248 Stage in:2 Finished successfully:2
Progress: Selecting site:248 Stage in:2 Finished successfully:2
..
..
..

The sites.xml, tc.data and log files are on bridled:

/home/ketan/osg-tg-effort/cybershake/condor_osg.xml
/home/ketan/osg-tg-effort/cybershake/tc.data
/home/ketan/osg-tg-effort/cybershake/postproc-20110601-0951-43n3a22g.log

Swift is:

[bridled.ci.uchicago.edu:cybershake]$ which swift
swift is /home/ketan/swift-0.92.1/bin/swift

I have memcached on, have sourced /opt/osg-1.x.x/setup.sh, and have a
valid proxy.

Could you suggest the first debugging steps I should take on OSG in
this situation?

Thanks,
Ketan

From aespinosa at cs.uchicago.edu Wed Jun 1 13:02:43 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Wed, 1 Jun 2011 12:02:43 -0600
Subject: [Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks
In-Reply-To: <4DE66AEC.7000409@gmail.com>
References: <4DE66AEC.7000409@gmail.com>
Message-ID:

Hi Ketan,

Could you add debugging for Swift's vdl:stagein calls? Also, is this
using the stable branch?

Here's the log4j.properties I always use:

# Set root category priority to INFO and its appenders to CONSOLE and FILE.
log4j.rootCategory=INFO, CONSOLE, FILE

log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender
log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout
log4j.appender.CONSOLE.Threshold=INFO
log4j.appender.CONSOLE.layout.ConversionPattern=%m%n

log4j.appender.FILE=org.apache.log4j.FileAppender
log4j.appender.FILE.File=swift.log
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss,SSSZZZZZ} %-5p %c{1} %m%n

log4j.logger.swift=DEBUG

log4j.logger.org.apache.axis.utils=ERROR

log4j.logger.org.globus.swift.trace=INFO

log4j.logger.org.griphyn.vdl.karajan.Loader=DEBUG
log4j.logger.org.globus.cog.karajan.workflow.events.WorkerSweeper=WARN
log4j.logger.org.globus.cog.karajan.workflow.nodes.FlowNode=WARN
log4j.logger.org.globus.cog.karajan.scheduler.WeightedHostScoreScheduler=DEBUG
log4j.logger.org.griphyn.vdl.toolkit.VDLt2VDLx=DEBUG
log4j.logger.org.griphyn.vdl.karajan.VDL2ExecutionContext=DEBUG
log4j.logger.org.globus.cog.abstraction.impl.common.task.TaskImpl=INFO
log4j.logger.org.griphyn.vdl.karajan.lib.GetFieldValue=DEBUG
log4j.logger.org.griphyn.vdl.engine.Karajan=INFO
log4j.logger.org.globus.cog.abstraction.coaster.rlog=DEBUG

# log4j.logger.org.globus.swift.data.Director=DEBUG
log4j.logger.org.griphyn.vdl.karajan.lib=INFO

log4j.logger.org.griphyn.vdl.karajan.lib.SetFieldValue=OFF
log4j.logger.org.griphyn.vdl.mapping.AbstractDataNode=OFF

# Transfer
#log4j.logger.org.globus.ftp=DEBUG
#log4j.logger.org.globus.gridftp=DEBUG

-Allan
--
Allan M. Espinosa
PhD student, Computer Science
University of Chicago

From wilde at mcs.anl.gov Thu Jun 2 09:03:33 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 2 Jun 2011 09:03:33 -0500 (CDT)
Subject: [Swift-devel] Tim Armstrong's Master's thesis: Swift semantics in Python
In-Reply-To:
Message-ID: <116109542.134994.1307023413252.JavaMail.root@zimbra.anl.gov>

Hi All,

Tim has created a Python package called PyDFlow which implements
Swift-like implicitly parallel data flow semantics. His side-by-side
examples of Swift scripts in Python syntax show that the approach is
very promising.
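
As a rough illustration of what "implicitly parallel data flow" means,
here is a minimal sketch built only on Python's standard
concurrent.futures module. It is not PyDFlow's API, which this message
does not show; the dataflow decorator and the square/add functions are
hypothetical names made up for the example. Each call returns a future
at once, and a call that receives futures as arguments waits for them,
so independent calls may run in parallel while dependent calls are
ordered by their data.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Shared worker pool; the size of 4 is an arbitrary choice for this sketch.
_pool = ThreadPoolExecutor(max_workers=4)


def dataflow(func):
    """Hypothetical decorator: calling the wrapped function returns a Future."""

    def wrapper(*args):
        def run():
            # Wait for any inputs that are themselves futures, then call func.
            values = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*values)

        return _pool.submit(run)

    return wrapper


@dataflow
def square(x):
    return x * x


@dataflow
def add(a, b):
    return a + b


# square(3) and square(4) do not depend on each other, so they may run
# concurrently; add() computes its sum once both results are available.
total = add(square(3), square(4))
print(total.result())  # 25
```

A waiting task holds a worker thread while its inputs finish, so this
is only a toy for showing the ordering-by-data idea, not a scheduler.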
- Mike

----- Forwarded Message -----
From: "Tim Armstrong"
To: "Michael Wilde", "Justin M Wozniak"
Sent: Thursday, June 2, 2011 7:01:16 AM
Subject: Final version of Masters paper

Hi Guys,

I've put the final version of my Masters paper up on my website - feel
free to distribute it now:

http://people.cs.uchicago.edu/~tga/pubs/armstrong-masters.pdf

- Tim

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Thu Jun 2 10:30:39 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Thu, 02 Jun 2011 10:30:39 -0500
Subject: [Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks
In-Reply-To:
References: <4DE66AEC.7000409@gmail.com>
Message-ID: <4DE7AC9F.7070701@gmail.com>

OK, done. On another trial the workflow seems to progress; however,
after a few hours I get the following errors:

Final status: Initializing:2636 Failed:393 Finished successfully:395
The following errors have occurred:
1. Server refused performing the request. Custom message: Bad password.
(error code 1) [Nested exception message: Custom message: Unexpected reply:
530-Login incorrect. : globus_gss_assist: Gridmap lookup failure: Could not
map /DC=org/DC=doegrids/OU=People/CN=Ketan Maheshwari 64821
530-
530 End.] (393 times)
2. Application seispeak_agg not executed due to errors in dependencies (216 times)
3. Application seispeak_local not executed due to errors in dependencies (2420 times)

While I am debugging, kindly let me know if these messages ring any
bells.

Regards,
Ketan

From aespinosa at cs.uchicago.edu Thu Jun 2 11:06:04 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Thu, 2 Jun 2011 10:06:04 -0600
Subject: [Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks
In-Reply-To: <4DE7AC9F.7070701@gmail.com>
References: <4DE66AEC.7000409@gmail.com> <4DE7AC9F.7070701@gmail.com>
Message-ID:

Hi Ketan,

It may be the case that you are not included in the Engage VOMRS. If
you are included, you could file an OSG ticket about this issue with
the Grid Operations Center.

-Allan

--
Allan M. Espinosa
PhD student, Computer Science
University of Chicago

From ketancmaheshwari at gmail.com Thu Jun 2 11:07:50 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Thu, 02 Jun 2011 11:07:50 -0500
Subject: [Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks
In-Reply-To:
References: <4DE66AEC.7000409@gmail.com> <4DE7AC9F.7070701@gmail.com>
Message-ID: <4DE7B556.8090300@gmail.com>

Is there a definitive way to know whether I am included in the Engage
VOMRS? I have been able to create a proxy with VO membership for quite
some time now.

From aespinosa at cs.uchicago.edu Thu Jun 2 11:16:36 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Thu, 2 Jun 2011 10:16:36 -0600
Subject: [Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks
In-Reply-To: <4DE7B556.8090300@gmail.com>
References: <4DE66AEC.7000409@gmail.com> <4DE7AC9F.7070701@gmail.com> <4DE7B556.8090300@gmail.com>
Message-ID:

You could log in at the VOMRS webpage and have them identify your
membership through your DN. I do occasionally have some sites reject
my DN after a while, but I haven't investigated the reason for it.

-Allan

From wilde at mcs.anl.gov Thu Jun 2 11:16:46 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 2 Jun 2011 11:16:46 -0500 (CDT)
Subject: [Swift-devel] Re: SCEC postproc workflow unresponsive after first 2 tasks
In-Reply-To: <4DE7B556.8090300@gmail.com>
Message-ID: <260026145.135955.1307031406483.JavaMail.root@zimbra.anl.gov>

Yes. Run globus-job-run under that proxy against any site in the Engage
list, with /usr/bin/id as the command, to see which UNIX login and
group you are mapped to.

- Mike

> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From bresnaha at mcs.anl.gov Thu Jun 2 13:42:19 2011
From: bresnaha at mcs.anl.gov (John Bresnahan)
Date: Thu, 02 Jun 2011 08:42:19 -1000
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To:
References: <4DD70560.1020901@mcs.anl.gov>
Message-ID: <4DE7D98B.9090407@mcs.anl.gov>

I have not forgotten about you guys, but the GPFS file system on
hotel.futuregrid.org continues to have problems. I arrive in Chicago
on Friday, so we should have something straightened out next week.
Sorry about the delay.

John

On 05/23/2011 05:24 AM, David Kelly wrote:
> Hi John,
>
> I now have a futuregrid account and am added to a project. I am now
> trying to get our scripts working together.
>
> I ran into a few problems at first when trying to run the futuregrid
> scripts. On the first system I tried I was getting a traceback. It is
> possible that the system I was using has older versions of some of the
> needed libraries. Then I tried it on a system that is more frequently
> updated - my laptop running Ubuntu 10.10. It needed a newer version of
> the Python crypto tools installed, so I installed that (and the python
> development libraries) and that part seems fine now.
>
> I am now up to the point of the install script where it is trying to
> register keys, but it is failing. My guess is that I need to change
> FUTUREGRID_IAAS_ACCESS_KEY and FUTUREGRID_IAAS_SECRET_KEY in env.sh.
> I'm not sure what these should be exactly. Are these the contents of
> my ssh keys, an ssh key and a passphrase, or some other type of
> security?
I've tried a few combinations of different > things but haven't had much luck yet. > > Thanks! > > Regards, > David > > > Traceback from earlier: > Installing setuptools.......................done. > Complete output from command /autonfs/home/davidk/swift-vm-...ython > /autonfs/home/davidk/swift-vm-...stall pip: > Searching for pip > Reading http://pypi.python.org/simple/pip/ > Reading http://pip.openplans.org > Reading http://www.pip-installer.org > Best match: pip 1.0.1 > Downloading > http://pypi.python.org/packages/source/p/pip/pip-1.0.1.tar.gz#md5=28dcc70225e5bf925532abc5b087a94b > Processing pip-1.0.1.tar.gz > Running pip-1.0.1/setup.py -q bdist_egg --dist-dir > /tmp/easy_install-GHsjHX/pip-1.0.1/egg-dist-tmp-rXjQ7L > Traceback (most recent call last): > File "/autonfs/home/davidk/swift-vm-boot/ve/bin/easy_install", line 8, in > load_entry_point('setuptools==0.6c11', 'console_scripts', 'easy_install')() > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 1712, in main > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 1700, in with_ei_usage > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 1716, in > File "/soft/python-2.6.1-r1/lib/python2.6/distutils/core.py", line 152, in setup > dist.run_commands() > File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 975, in run_commands > self.run_command(cmd) > File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 995, in run_command > cmd_obj.run() > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 211, in run > File > 
"/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 446, in easy_install > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 476, in install_item > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 655, in install_eggs > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 930, in build_and_install > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", > line 919, in run_setup > File > "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/sandbox.py", > line 52, in run_setup > AttributeError: 'module' object has no attribute '__getstate__' > ---------------------------------------- > Traceback (most recent call last): > File "bin/virtualenv.py", line 1647, in > main() > File "bin/virtualenv.py", line 558, in main > prompt=options.prompt) > File "bin/virtualenv.py", line 656, in create_environment > install_pip(py_executable) > File "bin/virtualenv.py", line 415, in install_pip > filter_stdout=_filter_setup) > File "bin/virtualenv.py", line 624, in call_subprocess > % (cmd_desc, proc.returncode)) > OSError: Command /autonfs/home/davidk/swift-vm-...ython /autonfs/home/davidk/swift-vm-...stall pip > failed with error code 1 > Failed to created the needed python virtual environment > > On Fri, May 20, 2011 at 7:20 PM, John Bresnahan > > wrote: > > Our phone call today left me motiviated to show you guys how easy it is to get virtual machines > for use with swift on FutureGrid. > > I made some small scripts around the Nimbus tool cloudinitd. 
The scripts just make installing > the software and running it trivial. With a single command you can get N VMs from the > FutureGrid Nimbus clouds (N can be on the order of hundreds). When the tool is done it outputs > a line separated list of hostnames. All of these hostnames have root access available via your > ~/.ssh/id_rsa keys. > > If/when you have FutureGrid credentials, untar the attachment and give it a try. There are a > few minor configurations needed: > > > 1) edit the file env.sh and set your FutureGrid security credentials: > > % cat env.sh > export FUTUREGRID_IAAS_ACCESS_KEY=XXXXXXXXXXXXXXXXXX > export FUTUREGRID_IAAS_SECRET_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX > > export FUTUREGRID_HOTEL_NODES=2 > export FUTUREGRID_SIERRA_NODES=2 > > You can also change the value '2' to be whatever number of VMs you want. > > > 2) install it on your system. (this single command downloads and installs everything you need > under the cwd): > > % ./install.sh > > 3) boot the VMs > % ./bin/bootit.sh. > > You will see much status output, but the last several lines will be the hostnames acquired from > the cloud. > > Let me know when you guys are ready to check this out! > > From dsk at ci.uchicago.edu Thu Jun 2 13:42:52 2011 From: dsk at ci.uchicago.edu (Daniel S. Katz) Date: Thu, 2 Jun 2011 13:42:52 -0500 Subject: [Swift-devel] OSG usage of Swift Message-ID: <40ED7ABB-36B4-4C19-8346-75565902CED9@ci.uchicago.edu> If possible, we should think about Swift/Coasters along the lines of the table at https://twiki.grid.iu.edu/bin/view/Documentation/JobSubmissionComparison#High_Level_Functionality_Compari Dan -- Daniel S. Katz University of Chicago (773) 834-7186 (voice) (773) 834-6818 (fax) d.katz at ieee.org or dsk at ci.uchicago.edu http://www.ci.uchicago.edu/~dsk/
From jonmon at utexas.edu Thu Jun 2 14:10:22 2011 From: jonmon at utexas.edu (Jonathan S Monette) Date: Thu, 2 Jun 2011 14:10:22 -0500 Subject: [Swift-devel] InternalHostname in sites file Message-ID: Mihael, I believe we have talked about this before but why is it necessary for an InternalHostname to be specified for PADS? I know that the address that coasters connects to is wrong but I do not remember why that was. Could you give an explanation on why internalHostname needs to be set? -- Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein
From hategan at mcs.anl.gov Thu Jun 2 14:45:24 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 02 Jun 2011 12:45:24 -0700 Subject: [Swift-devel] Re: InternalHostname in sites file In-Reply-To: References: Message-ID: <1307043924.19522.4.camel@blabla2.none> Right. If the head node has multiple network interfaces, only one of which is visible from the worker nodes. The choice of which interface is the one that worker nodes can connect to is a matter of the particular cluster. It's not particularly easy to have an automated mechanism that figures it out. We tried some scheme to pass all the interface addresses to the worker and let it try to connect to all of them in order, but that didn't work very well. Of course, there might be a scheme that works, but I didn't want to spend too much time on that. So that's why it's needed. To clarify to the workers which exact interface on the head node they are to try to connect to. Mihael On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote: > Mihael, > I believe we have talked about this before but why is it necessary > for an InternalHostname to be specified for PADS? I know that the > address that coasters connects to is wrong but I do not remember why > that was.
Could you give an explanation on why internalHostname needs > to be set? > > -- > > > Any intelligent fool can make things bigger and more complex... It > takes a touch of genius - and a lot of courage to move in the opposite > direction. > > - Albert Einstein > > > From wilde at mcs.anl.gov Thu Jun 2 15:08:55 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 2 Jun 2011 15:08:55 -0500 (CDT) Subject: [Swift-devel] Re: InternalHostname in sites file In-Reply-To: <1307043924.19522.4.camel@blabla2.none> Message-ID: <346384249.137621.1307045335440.JavaMail.root@zimbra.anl.gov> Thanks for clarifying. Jon and/or David, can you address this with a cookbook entry on Coasters that heads towards a users guide section? We should tell users what they can run on their cluster (eg ping or telnet-style connect tests) to validate the setting of internalHostName. - Mike ----- Original Message ----- > Right. If the head node has multiple network interfaces, only one of > which is visible from the worker nodes. > > The choice of which interface is the one that worker nodes can connect > to is a matter of the particular cluster. It's not particularly easy > to > have an automated mechanism that figures it out. We tried some scheme > to > pass all the interface addresses to the worker and let it try to > connect > to all of them in order, but that didn't work very well. Of course, > there might be a scheme that works, but I didn't want to spend too > much > time on that. > > So that's why it's needed. To clarify to the workers which exact > interface on the head node they are to try to connect to. > > Mihael > > On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote: > > Mihael, > > I believe we have talked about this before but why is it > > necessary > > for an InternalHostname to be specified for PADS? I know that the > > address that coasters connects to is wrong but I do not remember why > > that was. 
Could you give an explanation on why internalHostname > > needs > > to be set? > > > > -- > > > > > > Any intelligent fool can make things bigger and more complex... It > > takes a touch of genius - and a lot of courage to move in the > > opposite > > direction. > > > > - Albert Einstein > > > > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From jonmon at utexas.edu Thu Jun 2 15:10:38 2011 From: jonmon at utexas.edu (Jonathan S Monette) Date: Thu, 2 Jun 2011 15:10:38 -0500 Subject: [Swift-devel] Re: InternalHostname in sites file In-Reply-To: <346384249.137621.1307045335440.JavaMail.root@zimbra.anl.gov> References: <1307043924.19522.4.camel@blabla2.none> <346384249.137621.1307045335440.JavaMail.root@zimbra.anl.gov> Message-ID: I can write something up and send it to David to add somewhere for the userguide. Not sure where the files for the userguide are kept. On Thu, Jun 2, 2011 at 3:08 PM, Michael Wilde wrote: > Thanks for clarifying. Jon and/or David, can you address this with a > cookbook entry on Coasters that heads towards a users guide section? > > We should tell users what they can run on their cluster (eg ping or > telnet-style connect tests) to validate the setting of internalHostName. > > - Mike > > > > ----- Original Message ----- > > Right. If the head node has multiple network interfaces, only one of > > which is visible from the worker nodes. > > > > The choice of which interface is the one that worker nodes can connect > > to is a matter of the particular cluster. It's not particularly easy > > to > > have an automated mechanism that figures it out. 
We tried some scheme > > to > > pass all the interface addresses to the worker and let it try to > > connect > > to all of them in order, but that didn't work very well. Of course, > > there might be a scheme that works, but I didn't want to spend too > > much > > time on that. > > > > So that's why it's needed. To clarify to the workers which exact > > interface on the head node they are to try to connect to. > > > > Mihael > > > > On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote: > > > Mihael, > > > I believe we have talked about this before but why is it > > > necessary > > > for an InternalHostname to be specified for PADS? I know that the > > > address that coasters connects to is wrong but I do not remember why > > > that was. Could you give an explanation on why internalHostname > > > needs > > > to be set? > > > > > > -- > > > > > > > > > Any intelligent fool can make things bigger and more complex... It > > > takes a touch of genius - and a lot of courage to move in the > > > opposite > > > direction. > > > > > > - Albert Einstein > > > > > > > > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > -- Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein
From tim.g.armstrong at gmail.com Thu Jun 2 15:24:04 2011 From: tim.g.armstrong at gmail.com (Tim Armstrong) Date: Thu, 2 Jun 2011 15:24:04 -0500 Subject: [Swift-devel] recent error on beagle In-Reply-To: <1306442497.16145.1.camel@blabla2.none> References: <2008592710.93053.1306015923349.JavaMail.root@zimbra.anl.gov> <4DD91203.6090000@gmail.com> <1306090299.2956.1.camel@blabla2.none> <1306442497.16145.1.camel@blabla2.none> Message-ID: Any word on this bug? I have a nice use-case for SwiftR where it would be very handy to take advantage of Swift's dynamic resource procurement. - Tim On Thu, May 26, 2011 at 3:41 PM, Mihael Hategan wrote: > Given that this has now been reported a number of times, it may make > sense to backport the fix from trunk and make a patch release for 0.92. > > Objections? > > On Thu, 2011-05-26 at 14:59 -0500, Tim Armstrong wrote: > > Hi, > > I've encountered this issue with SwiftR, running release 0.92 from > > the svn repository. The issue occurs when > > GLOBUS::maxWallTime="03:55:00" in tc and maxTime is 4 hours in > > sites.xml. After 5 minutes (or whatever the difference is between the > > two times), I get the exception copied below. A tarball is attached > > with the logs, script, etc. replicate.sh shows how to replicate the > > issue on PADS. > > > > Assuming that my problem is the same as the others, it would be good > > if the fix could be merged to release 0.92, as I'm trying to bundle > > stable swift releases with SwiftR.
> > > > - Tim > > > > > > Swift svn swift-r4336 cog-r3096 (cog modified locally) > > > > RunID: 20110526-1317-2c8ybi10 > > Progress: > > SwiftScript trace: top of loop: rserver waiting for input > > on, /tmp/nbest/SwiftR/swift.0827/requestpipe > > Progress: Active:1 > > Progress: Finished successfully:1 > > SwiftScript trace: rserver: got > > dir, /tmp/nbest/SwiftR/requests.P09626/R0000007 > > Progress: uninitialized:1 Finished successfully:1 > > Progress: Submitted:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > queuedsize > 0 but no job dequeued. Queued: {} > > java.lang.Throwable > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > > queuedsize > 0 but no job dequeued. 
Queued: {} > > java.lang.Throwable > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > > Progress: Finished successfully:1 Failed but can retry:1 > > > > > > On Sun, May 22, 2011 at 1:51 PM, Mihael Hategan > > wrote: > > The second one looks to me like a coaster problem. Can't say > > much about > > the first issue. > > > > Can you try with plain pbs if you want to test the pbs > > provider? > > > > Mihael > > > > > > On Sun, 2011-05-22 at 08:39 -0500, ketan wrote: > > > I can confirm that the trunk is not usable for pbs provider. > > I am using > > > trunk for submitting jobs on beagle and I see a few > > unexpected things: > > > > > > 1. The stderr is showing inconsistent messages: The results > > are getting > > > written to the output even though stderr doesn't report any. > > > 2. qsub jobs being cancelled inadvertantly: I submitted 40 > > of them > > > yesterday, however, only 2 survived today. The log is here: > > > > > > > > > http://www.ci.uchicago.edu/~ketan/files/ftdock-20110521-0337-pokpgg89.log > > > > > > In addition, the ssh-pbs provider does not seem to be > > working for large > > > runs (it worked for a small number of test runs): Getting > > unexpected > > > stdouts. 
Following is the stdout: > > > > > > http://www.ci.uchicago.edu/~ketan/files/ssh-pbs.stdout > > > > > > Following is the log file for the above run: > > > > > > > > > http://www.ci.uchicago.edu/~ketan/files/ftdock-20110521-1750-b0cot9sa.log > > > > > > > > > Ketan > > > > > > On 5/21/11 5:12 PM, Michael Wilde wrote: > > > > > > > > ----- Original Message ----- > > > >> On Sat, 2011-05-21 at 17:06 -0400, Glen Hocky wrote: > > > >>> as I mentioned, I've been running with Mike's swift > > which was > > > >>> patched > > > >>> for beagle. are all the things that make running on > > beagle work in > > > >>> trunk? > > > >> No idea. > > > >> > > > >> Mike? > > > > Justin, working with Ketan, just applied changes to trunk > > which should make it work now on Beagle (or any Cray XT5+ or > > XE). This uses a different set of sites.xml tags than the > > prototype in the current Beagle swift 0.92.1 module. Justin > > has a note on this at: > > > > https://sites.google.com/site/swiftdevel/sites/pbs/cray > > > > > > > > It was working before for one-node worker jobs; now it > > should work for multi-node worker jobs as well. > > > > > > > > Justin and Ketan should comment on the state of testing > > and readiness of this trunk feature. Don't try trunk on > > Beagle till they give the go-ahead. > > > > > > > > - Mike > > > > > > > >>> If so i'll update to the latest and test. I don't > > think I'm > > > >>> using stable... > > > >> Ok > > > >> > > > >> Mihael > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > >
From wilde at mcs.anl.gov Thu Jun 2 15:31:51 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 2 Jun 2011 15:31:51 -0500 (CDT) Subject: [Swift-devel] recent error on beagle In-Reply-To: Message-ID: <1641234307.137796.1307046711695.JavaMail.root@zimbra.anl.gov> Tim, in addition: what's the status of the problem of not being able to launch two concurrent R applications on the same compute node? The problem below implies that you've resolved this prior problem? If so, what was the resolution? Thanks, Mike ----- Original Message ----- Any word on this bug? I have a nice use-case for SwiftR where it would be very handy to take advantage of Swift's dynamic resource procurement. - Tim On Thu, May 26, 2011 at 3:41 PM, Mihael Hategan < hategan at mcs.anl.gov > wrote: Given that this has now been reported a number of times, it may make sense to backport the fix from trunk and make a patch release for 0.92. Objections? On Thu, 2011-05-26 at 14:59 -0500, Tim Armstrong wrote: > Hi, > I've encountered this issue with SwiftR, running release 0.92 from > the svn repository. The issue occurs when > GLOBUS::maxWallTime="03:55:00" in tc and maxTime is 4 hours in > sites.xml. After 5 minutes (or whatever the difference is between the > two times), I get the exception copied below. A tarball is attached > with the logs, script, etc. replicate.sh shows how to replicate the > issue on PADS. > > Assuming that my problem is the same as the others, it would be good > if the fix could be merged to release 0.92, as I'm trying to bundle > stable swift releases with SwiftR.
> > - Tim > > > Swift svn swift-r4336 cog-r3096 (cog modified locally) > > RunID: 20110526-1317-2c8ybi10 > Progress: > SwiftScript trace: top of loop: rserver waiting for input > on, /tmp/nbest/SwiftR/swift.0827/requestpipe > Progress: Active:1 > Progress: Finished successfully:1 > SwiftScript trace: rserver: got > dir, /tmp/nbest/SwiftR/requests.P09626/R0000007 > Progress: uninitialized:1 Finished successfully:1 > Progress: Submitted:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > Progress: Active:1 Finished successfully:1 > queuedsize > 0 but no job dequeued. Queued: {} > java.lang.Throwable > at > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > at > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > at > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > queuedsize > 0 but no job dequeued. Queued: {} > java.lang.Throwable > at > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > at > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > at > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > Progress: Finished successfully:1 Failed but can retry:1 > > > On Sun, May 22, 2011 at 1:51 PM, Mihael Hategan < hategan at mcs.anl.gov > > wrote: > The second one looks to me like a coaster problem. Can't say > much about > the first issue. 
> > Can you try with plain pbs if you want to test the pbs > provider? > > Mihael > > > On Sun, 2011-05-22 at 08:39 -0500, ketan wrote: > > I can confirm that the trunk is not usable for pbs provider. > I am using > > trunk for submitting jobs on beagle and I see a few > unexpected things: > > > > 1. The stderr is showing inconsistent messages: The results > are getting > > written to the output even though stderr doesn't report any. > > 2. qsub jobs being cancelled inadvertantly: I submitted 40 > of them > > yesterday, however, only 2 survived today. The log is here: > > > > > http://www.ci.uchicago.edu/~ketan/files/ftdock-20110521-0337-pokpgg89.log > > > > In addition, the ssh-pbs provider does not seem to be > working for large > > runs (it worked for a small number of test runs): Getting > unexpected > > stdouts. Following is the stdout: > > > > http://www.ci.uchicago.edu/~ketan/files/ssh-pbs.stdout > > > > Following is the log file for the above run: > > > > > http://www.ci.uchicago.edu/~ketan/files/ftdock-20110521-1750-b0cot9sa.log > > > > > > Ketan > > > > On 5/21/11 5:12 PM, Michael Wilde wrote: > > > > > > ----- Original Message ----- > > >> On Sat, 2011-05-21 at 17:06 -0400, Glen Hocky wrote: > > >>> as I mentioned, I've been running with Mike's swift > which was > > >>> patched > > >>> for beagle. are all the things that make running on > beagle work in > > >>> trunk? > > >> No idea. > > >> > > >> Mike? > > > Justin, working with Ketan, just applied changes to trunk > which should make it work now on Beagle (or any Cray XT5+ or > XE). This uses a different set of sites.xml tags than the > prototype in the current Beagle swift 0.92.1 module. Justin > has a note on this at: > > > https://sites.google.com/site/swiftdevel/sites/pbs/cray > > > > > > It was working before for one-node worker jobs; now it > should work for multi-node worker jobs as well. > > > > > > Justin and Ketan should comment on the state of testing > and readiness of this trunk feature. 
Don't try trunk on > Beagle till they give the go-ahead. > > > > > > - Mike > > > > > >>> If so i'll update to the latest and test. I don't > think I'm > > >>> using stable... > > >> Ok > > >> > > >> Mihael > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory
From hategan at mcs.anl.gov Thu Jun 2 15:39:42 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 02 Jun 2011 13:39:42 -0700 Subject: [Swift-devel] recent error on beagle In-Reply-To: References: <2008592710.93053.1306015923349.JavaMail.root@zimbra.anl.gov> <4DD91203.6090000@gmail.com> <1306090299.2956.1.camel@blabla2.none> <1306442497.16145.1.camel@blabla2.none> Message-ID: <1307047182.20017.2.camel@blabla2.none> Yes. Sorry about the delay. The word is that I need to backport the patch from trunk to 0.92 and then have a patch release. I was waiting for words from other folks, and I got that yesterday. I will be doing this as soon as I have some time, which is probably somewhere between today and next Tuesday. Mihael On Thu, 2011-06-02 at 15:24 -0500, Tim Armstrong wrote: > Any word on this bug? I have a nice use-case for SwiftR where it > would be very handy to take advantage of Swift's dynamic resource > procurement.
> > - Tim > > On Thu, May 26, 2011 at 3:41 PM, Mihael Hategan > wrote: > Given that this has now been reported a number of times, it > may make > sense to backport the fix from trunk and make a patch release > for 0.92. > > Objections? > > > On Thu, 2011-05-26 at 14:59 -0500, Tim Armstrong wrote: > > Hi, > > I've encountered this issue with SwiftR, running release > 0.92 from > > the svn repository. The issue occurs when > > GLOBUS::maxWallTime="03:55:00" in tc and maxTime is 4 hours > in > > sites.xml. After 5 minutes (or whatever the difference is > between the > > two times), I get the exception copied below. A tarball is > attached > > with the logs, script, etc. replicate.sh shows how to > replicate the > > issue on PADS. > > > > Assuming that my problem is the same as the others, it would > be good > > if the fix could be merged to release 0.92, as I'm trying to > bundle > > stable swift releases with SwiftR. > > > > - Tim > > > > > > Swift svn swift-r4336 cog-r3096 (cog modified locally) > > > > RunID: 20110526-1317-2c8ybi10 > > Progress: > > SwiftScript trace: top of loop: rserver waiting for input > > on, /tmp/nbest/SwiftR/swift.0827/requestpipe > > Progress: Active:1 > > Progress: Finished successfully:1 > > SwiftScript trace: rserver: got > > dir, /tmp/nbest/SwiftR/requests.P09626/R0000007 > > Progress: uninitialized:1 Finished successfully:1 > > Progress: Submitted:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > Progress: Active:1 Finished successfully:1 > > queuedsize > 0 but no job dequeued. 
Queued: {} > > java.lang.Throwable > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > > queuedsize > 0 but no job dequeued. Queued: {} > > java.lang.Throwable > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > > at > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > > Progress: Finished successfully:1 Failed but can retry:1 > > > > > > On Sun, May 22, 2011 at 1:51 PM, Mihael Hategan > > > wrote: > > The second one looks to me like a coaster problem. > Can't say > > much about > > the first issue. > > > > Can you try with plain pbs if you want to test the > pbs > > provider? > > > > Mihael > > > > > > On Sun, 2011-05-22 at 08:39 -0500, ketan wrote: > > > I can confirm that the trunk is not usable for pbs > provider. > > I am using > > > trunk for submitting jobs on beagle and I see a > few > > unexpected things: > > > > > > 1. The stderr is showing inconsistent messages: > The results > > are getting > > > written to the output even though stderr doesn't > report any. > > > 2. qsub jobs being cancelled inadvertantly: I > submitted 40 > > of them > > > yesterday, however, only 2 survived today. 
The log > is here: > > > > > > > > > http://www.ci.uchicago.edu/~ketan/files/ftdock-20110521-0337-pokpgg89.log > > > > > > In addition, the ssh-pbs provider does not seem to > be > > working for large > > > runs (it worked for a small number of test runs): > Getting > > unexpected > > > stdouts. Following is the stdout: > > > > > > > http://www.ci.uchicago.edu/~ketan/files/ssh-pbs.stdout > > > > > > Following is the log file for the above run: > > > > > > > > > http://www.ci.uchicago.edu/~ketan/files/ftdock-20110521-1750-b0cot9sa.log > > > > > > > > > Ketan > > > > > > On 5/21/11 5:12 PM, Michael Wilde wrote: > > > > > > > > ----- Original Message ----- > > > >> On Sat, 2011-05-21 at 17:06 -0400, Glen Hocky > wrote: > > > >>> as I mentioned, I've been running with Mike's > swift > > which was > > > >>> patched > > > >>> for beagle. are all the things that make > running on > > beagle work in > > > >>> trunk? > > > >> No idea. > > > >> > > > >> Mike? > > > > Justin, working with Ketan, just applied changes > to trunk > > which should make it work now on Beagle (or any Cray > XT5+ or > > XE). This uses a different set of sites.xml tags > than the > > prototype in the current Beagle swift 0.92.1 module. > Justin > > has a note on this at: > > > > > https://sites.google.com/site/swiftdevel/sites/pbs/cray > > > > > > > > It was working before for one-node worker jobs; > now it > > should work for multi-node worker jobs as well. > > > > > > > > Justin and Ketan should comment on the state of > testing > > and readiness of this trunk feature. Don't try > trunk on > > Beagle till they give the go-ahead. > > > > > > > > - Mike > > > > > > > >>> If so i'll update to the latest and test. I > don't > > think I'm > > > >>> using stable... 
> > > >> Ok
> > > >>
> > > >> Mihael

From tim.g.armstrong at gmail.com Thu Jun 2 16:21:42 2011
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Thu, 2 Jun 2011 16:21:42 -0500
Subject: [Swift-devel] recent error on beagle
In-Reply-To: <1307047182.20017.2.camel@blabla2.none>
References: <2008592710.93053.1306015923349.JavaMail.root@zimbra.anl.gov> <4DD91203.6090000@gmail.com> <1306090299.2956.1.camel@blabla2.none> <1306442497.16145.1.camel@blabla2.none> <1307047182.20017.2.camel@blabla2.none>
Message-ID:

Mihael: thanks, I appreciate it, sorry to bug you.

Mike: this problem was occurring for me on PADS (the thread was originally about a similar problem on Beagle). I haven't made any progress debugging the issue on Beagle, beyond coming up with the minimal example to replicate it. I managed to pare down the example even more: it deadlocks if the pthread library is linked dynamically, even if no pthreads functions are actually used. I.e., the deadlock happens at the time the shared library is loaded. I unsuccessfully attempted some different workarounds. I'm pretty much out of ideas on how to make progress on this - getting Cray in on the problem might be best at this point.

- Tim

On Thu, Jun 2, 2011 at 3:39 PM, Mihael Hategan wrote:
> Yes. Sorry about the delay. The word is that I need to backport the
> patch from trunk to 0.92 and then have a patch release. I was waiting
> for words from other folks, and I got that yesterday. I will be doing
> this as soon as I have some time, which is probably somewhere between
> today and next Tuesday.
> > Mihael > > On Thu, 2011-06-02 at 15:24 -0500, Tim Armstrong wrote: > > Any word on this bug? I have a nice use-case for SwiftR where it > > would be very handy to take advantage of Swift's dynamic resource > > procurement. > > > > - Tim > > > > On Thu, May 26, 2011 at 3:41 PM, Mihael Hategan > > wrote: > > Given that this has now been reported a number of times, it > > may make > > sense to backport the fix from trunk and make a patch release > > for 0.92. > > > > Objections? > > > > > > On Thu, 2011-05-26 at 14:59 -0500, Tim Armstrong wrote: > > > Hi, > > > I've encountered this issue with SwiftR, running release > > 0.92 from > > > the svn repository. The issue occurs when > > > GLOBUS::maxWallTime="03:55:00" in tc and maxTime is 4 hours > > in > > > sites.xml. After 5 minutes (or whatever the difference is > > between the > > > two times), I get the exception copied below. A tarball is > > attached > > > with the logs, script, etc. replicate.sh shows how to > > replicate the > > > issue on PADS. > > > > > > Assuming that my problem is the same as the others, it would > > be good > > > if the fix could be merged to release 0.92, as I'm trying to > > bundle > > > stable swift releases with SwiftR. 
> > > > > > - Tim > > > > > > > > > Swift svn swift-r4336 cog-r3096 (cog modified locally) > > > > > > RunID: 20110526-1317-2c8ybi10 > > > Progress: > > > SwiftScript trace: top of loop: rserver waiting for input > > > on, /tmp/nbest/SwiftR/swift.0827/requestpipe > > > Progress: Active:1 > > > Progress: Finished successfully:1 > > > SwiftScript trace: rserver: got > > > dir, /tmp/nbest/SwiftR/requests.P09626/R0000007 > > > Progress: uninitialized:1 Finished successfully:1 > > > Progress: Submitted:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > Progress: Active:1 Finished successfully:1 > > > queuedsize > 0 but no job dequeued. Queued: {} > > > java.lang.Throwable > > > at > > > > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.requeueNonFitting(BlockQueueProcessor.java:252) > > > at > > > > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.updatePlan(BlockQueueProcessor.java:520) > > > at > > > > > > org.globus.cog.abstraction.coaster.service.job.manager.BlockQueueProcessor.run(BlockQueueProcessor.java:109) > > > queuedsize > 0 but no job dequeued. 
> > > > >> Ok > > > > >> > > > > >> Mihael > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidkelly999 at gmail.com Fri Jun 3 09:00:38 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Fri, 3 Jun 2011 09:00:38 -0500 Subject: [Swift-devel] Walltime in PBS submit files Message-ID: Hello, I am trying to use the shared queue on Fusion. This queue requires a walltime of less than one hour. I have all of the applications in my tc.data file set up with a walltime of 30 seconds. In my sites.xml, I specify a maxtime of 10. However, when Swift generates the PBS submit file, it specifies a walltime of 00:00:00 which prevents it from running. How can I make Swift set the walltime in these PBS submit scripts? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PBS3006846224310926023.submit Type: application/octet-stream Size: 544 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sites.xml Type: text/xml Size: 780 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: tc.data
Type: application/octet-stream
Size: 562 bytes
Desc: not available
URL:

From wilde at mcs.anl.gov Fri Jun 3 09:54:52 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 3 Jun 2011 09:54:52 -0500 (CDT)
Subject: [Swift-devel] Walltime in PBS submit files
In-Reply-To:
Message-ID: <1655081156.937.1307112892201.JavaMail.root@zimbra.anl.gov>

David,

I'm not sure what coasters is thinking here. But the first thing I would try is to set the maxtime value in sites.xml (which is in integer seconds) to something like 300, to make coasters realize that it can fit at least 10 30-second app calls into the 300-second coaster block.

With maxtime < maxwalltime, coasters may be erroneously starting a block but finding that it's unable to fit *any* app calls into it.

There are also settings you can apply (overallocation etc.) to make maxtime more of an "exact time" setting. For example, if you do this:

1800
100
100

...then each coaster block started should have a maxwalltime of 1800 secs.

But your app maxwalltime needs to fit into this block time.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
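Mike's point about fitting app calls into a coaster block comes down to simple arithmetic, sketched below in Python. This is only an illustration under an assumption: that a block holds whole app calls run back to back in one worker slot. It is not the coaster scheduler's actual algorithm, and `app_calls_per_block` is a hypothetical helper, not a Swift function.

```python
def app_calls_per_block(block_maxtime_s: int, app_walltime_s: int) -> int:
    """Return how many whole app calls fit back to back in one block.

    Illustrative arithmetic only (assumption: one worker slot, no overhead);
    this is not how the coaster service actually plans blocks.
    """
    if app_walltime_s <= 0:
        raise ValueError("app walltime must be positive")
    # Integer division: a block shorter than one app call fits zero calls.
    return max(block_maxtime_s, 0) // app_walltime_s


# Values taken from the thread: 30-second apps, maxtime of 10, 300 and 1800.
for maxtime in (10, 300, 1800):
    print(maxtime, app_calls_per_block(maxtime, 30))
```

With the values from the thread, a maxtime of 10 leaves no room for even one 30-second app call, while 300 leaves room for 10, which matches Mike's description.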
From davidkelly999 at gmail.com Fri Jun 3 10:11:27 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Fri, 3 Jun 2011 10:11:27 -0500
Subject: [Swift-devel] Walltime in PBS submit files
In-Reply-To: <1655081156.937.1307112892201.JavaMail.root@zimbra.anl.gov>
References: <1655081156.937.1307112892201.JavaMail.root@zimbra.anl.gov>
Message-ID:

That did the trick. I bumped up the maxtime to 300 and it ran immediately. Thanks for the info!

David

From wilde at mcs.anl.gov Fri Jun 3 10:15:28 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 3 Jun 2011 10:15:28 -0500 (CDT)
Subject: [Swift-devel] Walltime in PBS submit files
In-Reply-To:
Message-ID: <1016644273.1082.1307114128373.JavaMail.root@zimbra.anl.gov>

Could you add a note on this to the trunk user guide?

Thanks,

Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From tim.g.armstrong at gmail.com Fri Jun 3 10:21:31 2011
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Fri, 3 Jun 2011 10:21:31 -0500
Subject: [Swift-devel] Walltime in PBS submit files
In-Reply-To: <1016644273.1082.1307114128373.JavaMail.root@zimbra.anl.gov>
References: <1016644273.1082.1307114128373.JavaMail.root@zimbra.anl.gov>
Message-ID:

I'd been trying separately to understand the effect of the maxTime parameter: it doesn't do what I intuitively expect it to either. It seems like if I set maxtime to x number of minutes, it submits jobs of at most x - 1 minutes duration.

Is this the intended behaviour? Is maxTime rounded down to the nearest minute? Should I understand maxTime as meaning "only submit jobs that are strictly less than durations"?

- Tim

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From davidkelly999 at gmail.com Fri Jun 3 13:11:06 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Fri, 3 Jun 2011 13:11:06 -0500 Subject: [Swift-devel] Re: InternalHostname in sites file In-Reply-To: References: <1307043924.19522.4.camel@blabla2.none> <346384249.137621.1307045335440.JavaMail.root@zimbra.anl.gov> Message-ID: Sure, Jon. The userguide is now kept in the docs/userguide directory of Swift. It's in asciidoc format. Feel free to make changes there, or email me the text and I can add it for you. David On Thu, Jun 2, 2011 at 3:10 PM, Jonathan S Monette wrote: > I can write something up and send it to David to add somewhere for the > userguide. Not sure where the files for the userguide are kept. > > > On Thu, Jun 2, 2011 at 3:08 PM, Michael Wilde wrote: > >> Thanks for clarifying. Jon and/or David, can you address this with a >> cookbook entry on Coasters that heads towards a users guide section? >> >> We should tell users what they can run on their cluster (eg ping or >> telnet-style connect tests) to validate the setting of internalHostName. >> >> - Mike >> >> >> >> ----- Original Message ----- >> > Right. If the head node has multiple network interfaces, only one of >> > which is visible from the worker nodes. >> > >> > The choice of which interface is the one that worker nodes can connect >> > to is a matter of the particular cluster. It's not particularly easy >> > to >> > have an automated mechanism that figures it out. We tried some scheme >> > to >> > pass all the interface addresses to the worker and let it try to >> > connect >> > to all of them in order, but that didn't work very well. Of course, >> > there might be a scheme that works, but I didn't want to spend too >> > much >> > time on that. >> > >> > So that's why it's needed. To clarify to the workers which exact >> > interface on the head node they are to try to connect to. 
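Mike's suggestion of a telnet-style connect test can be sketched in a few lines of Python, run from a worker node against each candidate head-node address. This is only a sketch under stated assumptions: the addresses and the port below are hypothetical placeholders (the thread does not say which port the coaster service listens on), and `can_connect` is an illustrative helper, not part of Swift.

```python
import socket


def can_connect(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical head-node interface addresses and service port;
    # substitute the values that apply to your own cluster.
    candidate_addresses = ["192.0.2.10", "198.51.100.10"]
    service_port = 50000
    for address in candidate_addresses:
        status = "reachable" if can_connect(address, service_port, 1.0) else "unreachable"
        print(address, status)
```

An address that is reachable from a worker node is a candidate value for internalHostname. The check covers TCP reachability only, so it says nothing about whether the service behind the port is the one the workers expect.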
>> >
>> > Mihael
>> >
>> > On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote:
>> > > Mihael,
>> > > I believe we have talked about this before but why is it necessary
>> > > for an InternalHostname to be specified for PADS? I know that the
>> > > address that coasters connects to is wrong but I do not remember why
>> > > that was. Could you give an explanation on why internalHostname needs
>> > > to be set?
>>
>> --
>> Michael Wilde
>> Computation Institute, University of Chicago
>> Mathematics and Computer Science Division
>> Argonne National Laboratory

From jonmon at utexas.edu Fri Jun 3 14:22:39 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Fri, 3 Jun 2011 14:22:39 -0500
Subject: [Swift-devel] Re: InternalHostname in sites file
In-Reply-To:
References: <1307043924.19522.4.camel@blabla2.none> <346384249.137621.1307045335440.JavaMail.root@zimbra.anl.gov>
Message-ID:

OK. I have one I wrote up based on what Mihael mentioned. But should we just explain why someone would use the internalhostname key in sites.xml, or should we also explain how to tell whether this value needs to be set and how to find it?

--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From jonmon at utexas.edu Sat Jun 4 11:50:09 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 11:50:09 -0500
Subject: [Swift-devel] catsn on beagle
Message-ID:

Hello,
I am trying to run the catsn test on beagle using the files in ~ketan/catsn.
I have copied this directory over to my home directory and I believe I set it up correctly. I did module load swift and then ran run.sh, which was in this directory. I get this error:

/soft/swift/0.92/bin/swift: eval: line 152: syntax error near unexpected token `('
/soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. -Dswift.home=/soft/swift/0.92/bin/.. -Djava.security.egd=file:///dev/urandom -classpath /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-provider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/
swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_service.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:
/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft/swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' 'beagle-coaster.xml' 'catsn.swift' '-n=1''

Is there something else I need to load on beagle to make swift run properly?

--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From ketancmaheshwari at gmail.com Sat Jun 4 12:00:25 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 4 Jun 2011 12:00:25 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References:
Message-ID:

Jon,

Thanks for trying out catsn on Beagle. I just tried it myself but could not reproduce the error you are getting.

Have you made the changes that are mentioned in the README?

==
Change workdir location in beagle-coaster.xml
Change the project entry in the beagle-coaster.xml to your project name
==

Ketan

On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette wrote:
> Hello,
> I am trying to run the catsn test on beagle using the files in
> ~ketan/catsn. I have copied over this directory over to my home directory
> and I believe I set it up correctly.
I did module load swift and the ran > run.sh that was in this directory. I get this error. > > /soft/swift/0.92/bin/swift: eval: line 152: syntax error near unexpected > token `(' > /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M > -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 > -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) > -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. > -Dswift.home=/soft/swift/0.92/bin/.. -Djava.security.egd=file:///dev/urandom > -classpath > /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-provider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanu
tils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_service.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_cor
e.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft/swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: > org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' > 'beagle-coaster.xml' 'catsn.swift' '-n=1'' > > I there something else I need to load on beagle to make swift run > accordingly? > > -- > > Any intelligent fool can make things bigger and more complex... It takes a > touch of genius - and a lot of courage to move in the opposite direction. > > - Albert Einstein > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonmon at utexas.edu Sat Jun 4 12:03:20 2011 From: jonmon at utexas.edu (Jonathan S Monette) Date: Sat, 4 Jun 2011 12:03:20 -0500 Subject: [Swift-devel] Re: catsn on beagle In-Reply-To: References: Message-ID: I did change the work directory. I however did not change the project name. I do not know my project name so I kept the same project that was in there. Here is my modified beagle-coasters.xml CI-CCR000013 24:cray:pack 24 1000 1 1 1 .63 10000 /lustre/beagle/jonmon/Swift/work On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Jon, > > Thanks for trying out catsn on Beagle. > > I just tried it myself but could not reproduce the error you are getting. 
> Have you made the changes that are mentioned in the README: > == > Change workdir location in beagle-coaster.xml > Change the project entry in the beagle-coaster.xml to your project name > == > > Ketan > > > On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette wrote: > >> Hello, >> I am trying to run the catsn test on beagle using the files in >> ~ketan/catsn. I have copied over this directory over to my home directory >> and I believe I set it up correctly. I did module load swift and the ran >> run.sh that was in this directory. I get this error. >> >> /soft/swift/0.92/bin/swift: eval: line 152: syntax error near unexpected >> token `(' >> /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M >> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 >> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >> -Dswift.home=/soft/swift/0.92/bin/.. -Djava.security.egd=file:///dev/urandom >> -classpath >> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-provider-local-2.2.jar:/soft/
swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_service.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2.8.jar:/soft/swift/0.92/bin/../lib/naming-comm
on.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft/swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' >> 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >> >> I there something else I need to load on beagle to make swift run >> accordingly? >> >> -- >> >> Any intelligent fool can make things bigger and more complex... It takes a >> touch of genius - and a lot of courage to move in the opposite direction. >> >> - Albert Einstein >> >> > -- Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein -------------- next part -------------- An HTML attachment was scrubbed... 
URL: 

From ketancmaheshwari at gmail.com Sat Jun 4 12:10:38 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Sat, 04 Jun 2011 12:10:38 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To: 
References: 
Message-ID: <4DEA670E.3010909@gmail.com>

Can you try the "projects --avail" command on beagle to see if you are a member of a project? Otherwise you will need to request membership. You can do this from this page:

http://pads.ci.uchicago.edu/access/

In any case, your error does not seem to be caused by the above. It looks like a '(' has sneaked in because of some unexpected typo in the command line. Can you double-check it?

Also, can you make sure that you have all the files (tc, beagle-coaster.xml, cf) in PATH, i.e. are you able to access them without ./?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jonmon at utexas.edu Sat Jun 4 12:18:01 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 12:18:01 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To: <4DEA670E.3010909@gmail.com>
References: <4DEA670E.3010909@gmail.com>
Message-ID: 

I do have that project. I ran projects --avail and got back:

Project       PI             Title
------------------------------------------------------------------------------
CI-CCR000013  Michael Wilde  The Swift Parallel Scripting System

Here is my run.sh file. I execute it with ./run.sh after running chmod +x run.sh:

#!/bin/bash
swift -config cf -tc.file tc -sites.file beagle-coaster.xml catsn.swift -n=1

I do have sh and cat in my path, and I can execute them. Here is what "which sh" and "which cat" produced:

sh is /usr/bin/sh
cat is /bin/cat

--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.
- Albert Einstein -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Sat Jun 4 12:24:00 2011 From: ketancmaheshwari at gmail.com (ketan) Date: Sat, 04 Jun 2011 12:24:00 -0500 Subject: [Swift-devel] Re: catsn on beagle In-Reply-To: References: <4DEA670E.3010909@gmail.com> Message-ID: <4DEA6A30.2040503@gmail.com> can you try running the commandline by copying from run.sh and pasting it at the prompt. On 6/4/11 12:18 PM, Jonathan S Monette wrote: > I do have that project. I did projects --avail and got back > > Project PI Title > ------------------------------------------------------------------------------ > CI-CCR000013 Michael Wilde The Swift Parallel Scripting System > > Here is my run.sh file. I execute with ./run.sh after I did chmod +x > run.sh > > #!/bin/bash > swift -config cf -tc.file tc -sites.file beagle-coaster.xml > catsn.swift -n=1 > > I do have sh and cat in my path. I can execute them. Here is what > which sh and which cat produced. > > sh is /usr/bin/sh > cat is /bin/cat > > On Sat, Jun 4, 2011 at 12:10 PM, ketan > wrote: > > Can you try "projects --avail" command on beagle to see if you are > member of a project. > > Else you will need to request membership. You can do this from > this page: > > http://pads.ci.uchicago.edu/access/ > > In any case, your error does not seem to be because of the above. > Looks like '(' has sneaked in because of some unexpected typo in > commandline .. can you doublecheck it. > > Also can you make sure that you have all the files: tc, > beagle-coasters.xml cf in PATH, ie. are you able to access them > w/o ./ > > > On 6/4/11 12:03 PM, Jonathan S Monette wrote: >> I did change the work directory. I however did not change the >> project name. I do not know my project name so I kept the same >> project that was in there. 
Here is my modified beagle-coasters.xml >> >> >> >> >> CI-CCR000013 >> >> 24:cray:pack >> >> 24 >> 1000 >> 1 >> 1 >> 1 >> >> .63 >> 10000 >> >> >> /lustre/beagle/jonmon/Swift/work >> >> >> >> >> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari >> > >> wrote: >> >> Jon, >> >> Thanks for trying out catsn on Beagle. >> >> I just tried it myself but could not reproduce the error you >> are getting. Have you made the changes that are mentioned in >> the README: >> == >> Change workdir location in beagle-coaster.xml >> Change the project entry in the beagle-coaster.xml to your >> project name >> == >> >> Ketan >> >> >> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette >> > wrote: >> >> Hello, >> I am trying to run the catsn test on beagle using the >> files in ~ketan/catsn. I have copied over this directory >> over to my home directory and I believe I set it up >> correctly. I did module load swift and the ran run.sh >> that was in this directory. I get this error. >> >> /soft/swift/0.92/bin/swift: eval: line 152: syntax error >> near unexpected token `(' >> /soft/swift/0.92/bin/swift: eval: line 152: `java >> -Xmx8192M >> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed >> -DUID=1881 >> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >> -Dswift.home=/soft/swift/0.92/bin/.. 
>> -Djava.security.egd=file:///dev/urandom -classpath >> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' >> 'tc' '-sites.file' 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >> >> I there something else I need to load on beagle to make >> swift run accordingly? >> >> -- >> >> Any intelligent fool can make things bigger and more >> complex... It takes a touch of genius - and a lot of >> courage to move in the opposite direction. >> >> - Albert Einstein >> >> >> >> >> >> >> -- >> >> Any intelligent fool can make things bigger and more complex... >> It takes a touch of genius - and a lot of courage to move in the >> opposite direction. >> >> - Albert Einstein >> >> > > > > -- > > Any intelligent fool can make things bigger and more complex... It > takes a touch of genius - and a lot of courage to move in the opposite > direction. 
> > - Albert Einstein > >
From jonmon at utexas.edu Sat Jun 4 12:26:56 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 12:26:56 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To: <4DEA6A30.2040503@gmail.com>
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

Same result. Same error

On Sat, Jun 4, 2011 at 12:24 PM, ketan wrote: > can you try running the commandline by copying from run.sh and pasting it > at the prompt. > > > On 6/4/11 12:18 PM, Jonathan S Monette wrote: > > I do have that project. I did projects --avail and got back > > Project PI Title > > ------------------------------------------------------------------------------ > CI-CCR000013 Michael Wilde The Swift Parallel Scripting System > > Here is my run.sh file. I execute with ./run.sh after I did chmod +x > run.sh > > #!/bin/bash > swift -config cf -tc.file tc -sites.file beagle-coaster.xml catsn.swift > -n=1 > > I do have sh and cat in my path. I can execute them. Here is what which > sh and which cat produced. > > sh is /usr/bin/sh > cat is /bin/cat > > On Sat, Jun 4, 2011 at 12:10 PM, ketan wrote: > >> Can you try "projects --avail" command on beagle to see if you are member >> of a project. >> >> Else you will need to request membership. You can do this from this page: >> >> http://pads.ci.uchicago.edu/access/ >> >> In any case, your error does not seem to be because of the above. Looks >> like '(' has sneaked in because of some unexpected typo in commandline .. >> can you doublecheck it. >> >> Also can you make sure that you have all the files: tc, >> beagle-coasters.xml cf in PATH, ie. are you able to access them w/o ./ >> >> >> On 6/4/11 12:03 PM, Jonathan S Monette wrote: >> >> I did change the work directory. I however did not change the project >> name. I do not know my project name so I kept the same project that was in >> there. 
Here is my modified beagle-coasters.xml >> >> >> >> >> CI-CCR000013 >> >> 24:cray:pack >> >> 24 >> 1000 >> 1 >> 1 >> 1 >> >> .63 >> 10000 >> >> >> /lustre/beagle/jonmon/Swift/work >> >> >> >> >> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari < >> ketancmaheshwari at gmail.com> wrote: >> >>> Jon, >>> >>> Thanks for trying out catsn on Beagle. >>> >>> I just tried it myself but could not reproduce the error you are getting. >>> Have you made the changes that are mentioned in the README: >>> == >>> Change workdir location in beagle-coaster.xml >>> Change the project entry in the beagle-coaster.xml to your project name >>> == >>> >>> Ketan >>> >>> >>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette wrote: >>> >>>> Hello, >>>> I am trying to run the catsn test on beagle using the files in >>>> ~ketan/catsn. I have copied over this directory over to my home directory >>>> and I believe I set it up correctly. I did module load swift and the ran >>>> run.sh that was in this directory. I get this error. >>>> >>>> /soft/swift/0.92/bin/swift: eval: line 152: syntax error near >>>> unexpected token `(' >>>> /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M >>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 >>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>> -Dswift.home=/soft/swift/0.92/bin/.. 
-Djava.security.egd= >>>> file:///dev/urandom -classpath >>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >>>> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' >>>> 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >>>> >>>> I there something else I need to load on beagle to make swift run >>>> accordingly? >>>> >>>> -- >>>> >>>> Any intelligent fool can make things bigger and more complex... It takes >>>> a touch of genius - and a lot of courage to move in the opposite direction. >>>> >>>> - Albert Einstein >>>> >>>> >>> >> >> >> -- >> >> Any intelligent fool can make things bigger and more complex... It takes a >> touch of genius - and a lot of courage to move in the opposite direction. >> >> - Albert Einstein >> >> > > > -- > > Any intelligent fool can make things bigger and more complex... It takes a > touch of genius - and a lot of courage to move in the opposite direction. 
> > - Albert Einstein > >
--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.
- Albert Einstein

From jonmon at utexas.edu Sat Jun 4 12:36:55 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 12:36:55 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

I may have found the error. I downloaded the 0.92 binaries and changed the swift command script. I added

On Sat, Jun 4, 2011 at 12:26 PM, Jonathan S Monette wrote: > Same result. Same error > > > On Sat, Jun 4, 2011 at 12:24 PM, ketan wrote: > >> can you try running the commandline by copying from run.sh and pasting it >> at the prompt. >> >> >> On 6/4/11 12:18 PM, Jonathan S Monette wrote: >> >> I do have that project. I did projects --avail and got back >> >> Project PI Title >> >> ------------------------------------------------------------------------------ >> CI-CCR000013 Michael Wilde The Swift Parallel Scripting System >> >> Here is my run.sh file. I execute with ./run.sh after I did chmod +x >> run.sh >> >> #!/bin/bash >> swift -config cf -tc.file tc -sites.file beagle-coaster.xml catsn.swift >> -n=1 >> >> I do have sh and cat in my path. I can execute them. Here is what >> which sh and which cat produced. >> >> sh is /usr/bin/sh >> cat is /bin/cat >> >> On Sat, Jun 4, 2011 at 12:10 PM, ketan wrote: >> >>> Can you try "projects --avail" command on beagle to see if you are >>> member of a project. >>> >>> Else you will need to request membership. You can do this from this page: >>> >>> http://pads.ci.uchicago.edu/access/ >>> >>> In any case, your error does not seem to be because of the above. Looks >>> like '(' has sneaked in because of some unexpected typo in commandline .. >>> can you doublecheck it. 
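The `(` in the error message can be reproduced outside Swift. The following is a minimal bash sketch, assuming the launcher assembles its java command as a string and passes it to eval (as the "eval: line 152" message suggests). The helper names clean_hostname and quote_for_eval are hypothetical and not part of Swift; the hostname value is the one reported in this thread.

```bash
#!/bin/bash
# Hypothetical helper (not part of Swift): drop a trailing ".(none)"
# placeholder from a hostname, leaving other names untouched.
clean_hostname() {
    local name=$1
    local suffix='.(none)'
    # Quoting "$suffix" makes the pattern match literally.
    printf '%s\n' "${name%"$suffix"}"
}

# Hypothetical helper: escape one word so eval treats it as literal text.
quote_for_eval() {
    printf '%q' "$1"
}

host='login2.beagle.ci.uchicago.edu.(none)'

# Unescaped, eval re-parses the string and stops at the bare "(".
# The subshell keeps the failure from affecting the rest of the script.
if ! ( eval "echo -DGLOBUS_HOSTNAME=$host" ) 2>/dev/null; then
    echo "eval rejected the unescaped value"
fi

# Escaped, the same value survives the second round of parsing.
eval "echo -DGLOBUS_HOSTNAME=$(quote_for_eval "$host")"

# Alternatively, remove the placeholder before building the option.
echo "-DGLOBUS_HOSTNAME=$(clean_hostname "$host")"
```

Either approach only avoids the parse error. Neither explains why the hostname carries (none) in the first place, so checking how the login node reports its domain name is probably still worthwhile.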
>>> >>> Also can you make sure that you have all the files: tc, >>> beagle-coasters.xml cf in PATH, ie. are you able to access them w/o ./ >>> >>> >>> On 6/4/11 12:03 PM, Jonathan S Monette wrote: >>> >>> I did change the work directory. I however did not change the project >>> name. I do not know my project name so I kept the same project that was in >>> there. Here is my modified beagle-coasters.xml >>> >>> >>> >>> >>> CI-CCR000013 >>> >>> 24:cray:pack >>> >>> 24 >>> 1000 >>> 1 >>> 1 >>> 1 >>> >>> .63 >>> 10000 >>> >>> >>> /lustre/beagle/jonmon/Swift/work >>> >>> >>> >>> >>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari < >>> ketancmaheshwari at gmail.com> wrote: >>> >>>> Jon, >>>> >>>> Thanks for trying out catsn on Beagle. >>>> >>>> I just tried it myself but could not reproduce the error you are >>>> getting. Have you made the changes that are mentioned in the README: >>>> == >>>> Change workdir location in beagle-coaster.xml >>>> Change the project entry in the beagle-coaster.xml to your project name >>>> == >>>> >>>> Ketan >>>> >>>> >>>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette >>> > wrote: >>>> >>>>> Hello, >>>>> I am trying to run the catsn test on beagle using the files in >>>>> ~ketan/catsn. I have copied over this directory over to my home directory >>>>> and I believe I set it up correctly. I did module load swift and the ran >>>>> run.sh that was in this directory. I get this error. >>>>> >>>>> /soft/swift/0.92/bin/swift: eval: line 152: syntax error near >>>>> unexpected token `(' >>>>> /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M >>>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 >>>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>>> -Dswift.home=/soft/swift/0.92/bin/.. 
-Djava.security.egd= >>>>> file:///dev/urandom -classpath >>>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >>>>> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>>> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' >>>>> 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >>>>> >>>>> I there something else I need to load on beagle to make swift run >>>>> accordingly? >>>>> >>>>> -- >>>>> >>>>> Any intelligent fool can make things bigger and more complex... It >>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>> direction. >>>>> >>>>> - Albert Einstein >>>>> >>>>> >>>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more complex... It takes >>> a touch of genius - and a lot of courage to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >> >> >> -- >> >> Any intelligent fool can make things bigger and more complex... 
It takes a >> touch of genius - and a lot of courage to move in the opposite direction. >> >> - Albert Einstein >> >> > > > -- > > Any intelligent fool can make things bigger and more complex... It takes a > touch of genius - and a lot of courage to move in the opposite direction. > > - Albert Einstein > >
--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.
- Albert Einstein

From ketancmaheshwari at gmail.com Sat Jun 4 12:37:05 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Sat, 04 Jun 2011 12:37:05 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID: <4DEA6D41.9070704@gmail.com>

OK, I see that this option on the command line might be causing the error:

-DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)

I do not know why there is a (none) in your environment.

On 6/4/11 12:26 PM, Jonathan S Monette wrote: > Same result. Same error > > On Sat, Jun 4, 2011 at 12:24 PM, ketan > wrote: > > can you try running the commandline by copying from run.sh and > pasting it at the prompt. > > > On 6/4/11 12:18 PM, Jonathan S Monette wrote: >> I do have that project. I did projects --avail and got back >> >> Project PI Title >> ------------------------------------------------------------------------------ >> CI-CCR000013 Michael Wilde The Swift Parallel Scripting >> System >> >> Here is my run.sh file. I execute with ./run.sh after I did >> chmod +x run.sh >> >> #!/bin/bash >> swift -config cf -tc.file tc -sites.file beagle-coaster.xml >> catsn.swift -n=1 >> >> I do have sh and cat in my path. I can execute them. Here is >> what which sh and which cat produced. 
>> >> sh is /usr/bin/sh >> cat is /bin/cat >> >> On Sat, Jun 4, 2011 at 12:10 PM, ketan >> > >> wrote: >> >> Can you try "projects --avail" command on beagle to see if >> you are member of a project. >> >> Else you will need to request membership. You can do this >> from this page: >> >> http://pads.ci.uchicago.edu/access/ >> >> In any case, your error does not seem to be because of the >> above. Looks like '(' has sneaked in because of some >> unexpected typo in commandline .. can you doublecheck it. >> >> Also can you make sure that you have all the files: tc, >> beagle-coasters.xml cf in PATH, ie. are you able to access >> them w/o ./ >> >> >> On 6/4/11 12:03 PM, Jonathan S Monette wrote: >>> I did change the work directory. I however did not change >>> the project name. I do not know my project name so I kept >>> the same project that was in there. Here is my modified >>> beagle-coasters.xml >>> >>> >>> >>> >>> CI-CCR000013 >>> >>> 24:cray:pack >>> >>> 24 >>> 1000 >>> 1 >>> 1 >>> 1 >>> >>> .63 >>> 10000 >>> >>> >>> /lustre/beagle/jonmon/Swift/work >>> >>> >>> >>> >>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari >>> >> > wrote: >>> >>> Jon, >>> >>> Thanks for trying out catsn on Beagle. >>> >>> I just tried it myself but could not reproduce the error >>> you are getting. Have you made the changes that are >>> mentioned in the README: >>> == >>> Change workdir location in beagle-coaster.xml >>> Change the project entry in the beagle-coaster.xml to >>> your project name >>> == >>> >>> Ketan >>> >>> >>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette >>> > wrote: >>> >>> Hello, >>> I am trying to run the catsn test on beagle using >>> the files in ~ketan/catsn. I have copied over this >>> directory over to my home directory and I believe I >>> set it up correctly. I did module load swift and >>> the ran run.sh that was in this directory. I get >>> this error. 
>>> >>> /soft/swift/0.92/bin/swift: eval: line 152: syntax >>> error near unexpected token `(' >>> /soft/swift/0.92/bin/swift: eval: line 152: `java >>> -Xmx8192M >>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed >>> -DUID=1881 >>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>> -Dswift.home=/soft/swift/0.92/bin/.. >>> -Djava.security.egd=file:///dev/urandom -classpath >>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >>> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>> org.griphyn.vdl.karajan.Loader '-config' 'cf' >>> '-tc.file' 'tc' '-sites.file' 'beagle-coaster.xml' >>> 'catsn.swift' '-n=1'' >>> >>> I there something else I need to load on beagle to >>> make swift run accordingly? >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more >>> complex... It takes a touch of genius - and a lot of >>> courage to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more >>> complex... It takes a touch of genius - and a lot of courage >>> to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >> >> >> >> -- >> >> Any intelligent fool can make things bigger and more complex... 
>> It takes a touch of genius - and a lot of courage to move in the >> opposite direction. >> >> - Albert Einstein >> >> > > > > -- > > Any intelligent fool can make things bigger and more complex... It > takes a touch of genius - and a lot of courage to move in the opposite > direction. > > - Albert Einstein > >
From jonmon at utexas.edu Sat Jun 4 12:38:33 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 12:38:33 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

I added

echo ${OPTIONS}
echo ${COG_OPTS}
echo ${LOCALCLASSPATH}
echo ${EXEC}
echo ${CMDLINE}

to the script. This is what the $OPTIONS variable contains:

-Xmx8192M -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/.. -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/.. -Djava.security.egd=file:///dev/urandom

It includes the option -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none). I believe that parenthesis is what causes this to fail. Now, how do I fix it?

On Sat, Jun 4, 2011 at 12:36 PM, Jonathan S Monette wrote: > I may have found the error. I download 0.92 binaries and changed the swift > command script. I added > > > On Sat, Jun 4, 2011 at 12:26 PM, Jonathan S Monette wrote: > >> Same result. Same error >> >> >> On Sat, Jun 4, 2011 at 12:24 PM, ketan wrote: >> >>> can you try running the commandline by copying from run.sh and pasting >>> it at the prompt. >>> >>> >>> On 6/4/11 12:18 PM, Jonathan S Monette wrote: >>> >>> I do have that project. 
>>> I did projects --avail and got back
>>>
>>> Project        PI              Title
>>> ------------------------------------------------------------------------------
>>> CI-CCR000013   Michael Wilde   The Swift Parallel Scripting System
>>>
>>> Here is my run.sh file. I execute with ./run.sh after I did chmod +x
>>> run.sh
>>>
>>> #!/bin/bash
>>> swift -config cf -tc.file tc -sites.file beagle-coaster.xml catsn.swift
>>> -n=1
>>>
>>> I do have sh and cat in my path. I can execute them. Here is what
>>> which sh and which cat produced.
>>>
>>> sh is /usr/bin/sh
>>> cat is /bin/cat

--
Any intelligent fool can make things bigger and more complex... It takes a
touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From ketancmaheshwari at gmail.com Sat Jun 4 12:39:45 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 4 Jun 2011 12:39:45 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

This surprises me, because I do not see that (none) in my command line.

On Sat, Jun 4, 2011 at 12:38 PM, Jonathan S Monette wrote:

> I added
> echo ${OPTIONS}
> echo ${COG_OPTS}
> echo ${LOCALCLASSPATH}
> echo ${EXEC}
> echo ${CMDLINE}
>
> to the script. It seems in the $OPTIONS variable this appears.
>
> -Xmx8192M
> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed
> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)
> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/..
> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/..
> -Djava.security.egd=file:///dev/urandom
>
> There is the line DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) I
> believe that is the parenthesis that is causing this to fail. Now How do I
> fix it.
>
> On Sat, Jun 4, 2011 at 12:36 PM, Jonathan S Monette wrote:
>
>> I may have found the error.
>> I download 0.92 binaries and changed the
>> swift command script. I added

From jonmon at utexas.edu Sat Jun 4 12:44:33 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 12:44:33 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

I am not sure why that (none) appears. Is this a variable that the system
sets, or do I need to set it?

On Sat, Jun 4, 2011 at 12:39 PM, Ketan Maheshwari <
ketancmaheshwari at gmail.com> wrote:

> This surprises me because, I do not see that (none) in my commandline.
>
>
> On Sat, Jun 4, 2011 at 12:38 PM, Jonathan S Monette wrote:
>
>> I added
>> echo ${OPTIONS}
>> echo ${COG_OPTS}
>> echo ${LOCALCLASSPATH}
>> echo ${EXEC}
>> echo ${CMDLINE}
>>
>> to the script. It seems in the $OPTIONS variable this appears.
>>
>> -Xmx8192M
>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed
>> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)
>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/..
>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/..
>> -Djava.security.egd=file:///dev/urandom
>>
>> There is the line DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) I
>> believe that is the parenthesis that is causing this to fail. Now How do I
>> fix it.
>>
>> On Sat, Jun 4, 2011 at 12:36 PM, Jonathan S Monette wrote:
>>
>>> I may have found the error. I download 0.92 binaries and changed the
>>> swift command script. I added

--
Any intelligent fool can make things bigger and more complex... It takes a
touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From ketancmaheshwari at gmail.com Sat Jun 4 12:47:03 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 4 Jun 2011 12:47:03 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

What does echo $HOSTNAME give?

On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S Monette wrote:

> I am not sure why that (none) appears. Is this a variable that the system
> sets or do I need to set it?
>
>
> On Sat, Jun 4, 2011 at 12:39 PM, Ketan Maheshwari <
> ketancmaheshwari at gmail.com> wrote:
>
>> This surprises me because, I do not see that (none) in my commandline.
>> On Sat, Jun 4, 2011 at 12:38 PM, Jonathan S Monette wrote:
>>
>>> I added
>>> echo ${OPTIONS}
>>> echo ${COG_OPTS}
>>> echo ${LOCALCLASSPATH}
>>> echo ${EXEC}
>>> echo ${CMDLINE}
>>>
>>> to the script. It seems in the $OPTIONS variable this appears.
>>>
>>> -Xmx8192M
>>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed
>>> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)
>>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/..
>>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/..
>>> -Djava.security.egd=file:///dev/urandom
>>>
>>> There is the line DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)
>>> I believe that is the parenthesis that is causing this to fail. Now How do
>>> I fix it.
>>>
>>> On Sat, Jun 4, 2011 at 12:36 PM, Jonathan S Monette wrote:
>>>
>>>> I may have found the error. I download 0.92 binaries and changed the
>>>> swift command script. I added
>>>>
>>>> On Sat, Jun 4, 2011 at 12:26 PM, Jonathan S Monette wrote:
>>>>
>>>>> Same result. Same error
>>>>>
>>>>> On Sat, Jun 4, 2011 at 12:24 PM, ketan wrote:
>>>>>
>>>>>> can you try running the commandline by copying from run.sh and
>>>>>> pasting it at the prompt.
>>>>>>
>>>>>> On 6/4/11 12:18 PM, Jonathan S Monette wrote:
>>>>>>
>>>>>> I do have that project. I did projects --avail and got back
>>>>>>
>>>>>> Project PI Title
>>>>>> ------------------------------------------------------------------------------
>>>>>> CI-CCR000013 Michael Wilde The Swift Parallel Scripting System
>>>>>>
>>>>>> Here is my run.sh file. I execute with ./run.sh after I did chmod
>>>>>> +x run.sh
>>>>>>
>>>>>> #!/bin/bash
>>>>>> swift -config cf -tc.file tc -sites.file beagle-coaster.xml
>>>>>> catsn.swift -n=1
>>>>>>
>>>>>> I do have sh and cat in my path. I can execute them. Here is what
>>>>>> which sh and which cat produced.
>>>>>>
>>>>>> sh is /usr/bin/sh
>>>>>> cat is /bin/cat
>>>>>>
>>>>>> On Sat, Jun 4, 2011 at 12:10 PM, ketan wrote:
>>>>>>
>>>>>>> Can you try "projects --avail" command on beagle to see if you are
>>>>>>> member of a project.
>>>>>>>
>>>>>>> Else you will need to request membership. You can do this from this
>>>>>>> page:
>>>>>>>
>>>>>>> http://pads.ci.uchicago.edu/access/
>>>>>>>
>>>>>>> In any case, your error does not seem to be because of the above.
>>>>>>> Looks like '(' has sneaked in because of some unexpected typo in commandline
>>>>>>> .. can you doublecheck it.
>>>>>>>
>>>>>>> Also can you make sure that you have all the files: tc,
>>>>>>> beagle-coasters.xml cf in PATH, ie. are you able to access them w/o ./
>>>>>>>
>>>>>>> On 6/4/11 12:03 PM, Jonathan S Monette wrote:
>>>>>>>
>>>>>>> I did change the work directory. I however did not change the
>>>>>>> project name. I do not know my project name so I kept the same project that
>>>>>>> was in there. Here is my modified beagle-coasters.xml
>>>>>>>
>>>>>>> CI-CCR000013
>>>>>>>
>>>>>>> 24:cray:pack
>>>>>>>
>>>>>>> 24
>>>>>>> 1000
>>>>>>> 1
>>>>>>> 1
>>>>>>> 1
>>>>>>>
>>>>>>> .63
>>>>>>> 10000
>>>>>>>
>>>>>>> /lustre/beagle/jonmon/Swift/work
>>>>>>>
>>>>>>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari <
>>>>>>> ketancmaheshwari at gmail.com> wrote:
>>>>>>>
>>>>>>>> Jon,
>>>>>>>>
>>>>>>>> Thanks for trying out catsn on Beagle.
>>>>>>>>
>>>>>>>> I just tried it myself but could not reproduce the error you are
>>>>>>>> getting. Have you made the changes that are mentioned in the README:
>>>>>>>> ==
>>>>>>>> Change workdir location in beagle-coaster.xml
>>>>>>>> Change the project entry in the beagle-coaster.xml to your project
>>>>>>>> name
>>>>>>>> ==
>>>>>>>>
>>>>>>>> Ketan
>>>>>>>>
>>>>>>>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette <
>>>>>>>> jonmon at utexas.edu> wrote:
>>>>>>>>
>>>>>>>>> Hello,
>>>>>>>>> I am trying to run the catsn test on beagle using the files in
>>>>>>>>> ~ketan/catsn. I have copied over this directory over to my home directory
>>>>>>>>> and I believe I set it up correctly. I did module load swift and the ran
>>>>>>>>> run.sh that was in this directory. I get this error.
>>>>>>>>>
>>>>>>>>> /soft/swift/0.92/bin/swift: eval: line 152: syntax error near
>>>>>>>>> unexpected token `('
>>>>>>>>> /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M
>>>>>>>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881
>>>>>>>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)
>>>>>>>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/..
>>>>>>>>> -Dswift.home=/soft/swift/0.92/bin/..
>>>>>>>>> -Djava.security.egd=file:///dev/urandom -classpath
>>>>>>>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-provider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_service.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft/swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar:
>>>>>>>>> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file'
>>>>>>>>> 'beagle-coaster.xml' 'catsn.swift' '-n=1''
>>>>>>>>>
>>>>>>>>> I there something else I need to load on beagle to make swift run
>>>>>>>>> accordingly?
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> Any intelligent fool can make things bigger and more complex... It
>>>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite
>>>>>>>>> direction.
>>>>>>>>>
>>>>>>>>> - Albert Einstein
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Any intelligent fool can make things bigger and more complex... It
>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite
>>>>>>> direction.
>>>>>>> - Albert Einstein

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jonmon at utexas.edu Sat Jun 4 12:49:18 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 12:49:18 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

login2.beagle.ci.uchicago.edu.(none)

When you ran your test, was it on login1 or login2?

On Sat, Jun 4, 2011 at 12:47 PM, Ketan Maheshwari <ketancmaheshwari at gmail.com> wrote:

> What does
>
> echo $HOSTNAME give?
>
> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S Monette wrote:
>
>> I am not sure why that (none) appears. Is this a variable that the system
>> sets or do I need to set it?
From ketancmaheshwari at gmail.com Sat Jun 4 12:51:53 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 4 Jun 2011 12:51:53 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID:

Alright, so this is the issue. GLOBUS_HOSTNAME is picked up from HOSTNAME, and I do not know where this (none) comes from in your case.
In my case I get hostname: login2.beagle.ci.uchicago.edu

See if you can manually change this env variable and try to run again.

On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette wrote:

> login2.beagle.ci.uchicago.edu.(none)
>
> when you ran your test was in on login1 or login2?
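For readers following the thread, the failure and the suggested manual change can be sketched in a few lines of shell. This is only an illustration under stated assumptions: sanitize_hostname is a hypothetical helper, not part of Swift, and the option string is copied from the error output quoted in the thread rather than produced by the real launcher. It shows that eval rejects a bare "(" and that stripping the trailing ".(none)" gives a string eval accepts.

```shell
#!/bin/sh
# Hypothetical helper (not part of Swift): strip a trailing ".(none)".
sanitize_hostname() {
    suffix='.(none)'
    # Quoting "$suffix" makes the suffix match literally.
    name=${1%"$suffix"}
    printf '%s\n' "$name"
}

bad_opts='-DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)'
# eval re-parses its argument as shell code, so the bare "(" is a syntax
# error. The subshell keeps that error from ending this script.
if ! (eval "echo $bad_opts") 2>/dev/null; then
    echo "eval rejected: $bad_opts"
fi

good_opts="-DGLOBUS_HOSTNAME=$(sanitize_hostname 'login2.beagle.ci.uchicago.edu.(none)')"
eval "echo $good_opts"
```

If the launcher really does build GLOBUS_HOSTNAME from HOSTNAME as described, running export HOSTNAME="$(sanitize_hostname "$HOSTNAME")" before calling swift would be one way to make the manual change; that has not been checked against the 0.92 script.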
>>>>>>>>> >>>>>>>>> - Albert Einstein >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>> direction. >>>>>>>> >>>>>>>> - Albert Einstein >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>> direction. >>>>>>> >>>>>>> - Albert Einstein >>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> >>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>> direction. >>>>>> >>>>>> - Albert Einstein >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Any intelligent fool can make things bigger and more complex... It >>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>> direction. >>>>> >>>>> - Albert Einstein >>>>> >>>>> >>>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more complex... It takes >>> a touch of genius - and a lot of courage to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >> > > > -- > > Any intelligent fool can make things bigger and more complex... It takes a > touch of genius - and a lot of courage to move in the opposite direction. > > - Albert Einstein > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonmon at utexas.edu Sat Jun 4 13:01:12 2011 From: jonmon at utexas.edu (Jonathan S Monette) Date: Sat, 4 Jun 2011 13:01:12 -0500 Subject: [Swift-devel] Re: catsn on beagle In-Reply-To: References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> Message-ID: Ok. I have my job submitted and it is currently waiting in the queue. besides manually resetting this variable everytime I am on beagle what else can be done? 
Is this something that I should bring to the attention of the sys admins for beagle?

On Sat, Jun 4, 2011 at 12:51 PM, Ketan Maheshwari <ketancmaheshwari at gmail.com> wrote:

> Alright, so this is the issue. GLOBUS_HOSTNAME is picked up from HOSTNAME
> and I do not know where does this none comes in your case.
>
> In my case I get hostname: login2.beagle.ci.uchicago.edu
>
> see if you can manually change this env variable and try to run again.
>
>
> On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette wrote:
>
>> login2.beagle.ci.uchicago.edu.(none)
>>
>> when you ran your test was in on login1 or login2?
>>
>>
>> On Sat, Jun 4, 2011 at 12:47 PM, Ketan Maheshwari <
>> ketancmaheshwari at gmail.com> wrote:
>>
>>> What does
>>>
>>> echo $HOSTNAME give?
>>>
>>>
>>> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S Monette wrote:
>>>
>>>> I am not sure why that (none) appears. Is this a variable that the
>>>> system sets or do I need to set it?
>>>>
>>>>
>>>> On Sat, Jun 4, 2011 at 12:39 PM, Ketan Maheshwari <
>>>> ketancmaheshwari at gmail.com> wrote:
>>>>
>>>>> This suprises me because, I do not see that (none) in my commandline.
>>>>>
>>>>>
>>>>> On Sat, Jun 4, 2011 at 12:38 PM, Jonathan S Monette wrote:
>>>>>
>>>>>> I added
>>>>>> echo ${OPTIONS}
>>>>>> echo ${COG_OPTS}
>>>>>> echo ${LOCALCLASSPATH}
>>>>>> echo ${EXEC}
>>>>>> echo ${CMDLINE}
>>>>>>
>>>>>> to the script. It seems in the $OPTIONS variable this appears.
>>>>>>
>>>>>> -Xmx8192M
>>>>>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed
>>>>>> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none)
>>>>>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/..
>>>>>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/..
>>>>>> -Djava.security.egd=file:///dev/urandom
>>>>>>
>>>>>> There is the line
>>>>>> DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) I believe that is the
>>>>>> parenthesis that is causing this to fail. Now How do I fix it.
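The workaround discussed in this thread (manually resetting the host name variable before running swift) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the fix the Beagle sysadmins applied: it assumes, as the thread suggests, that the swift launcher takes GLOBUS_HOSTNAME from HOSTNAME, and the helper name sanitize_hostname is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not part of Swift): strip a trailing ".(none)"
# from a host name so that no parentheses reach the launcher's eval line.
sanitize_hostname() {
    local name="$1"
    local suffix='.(none)'
    # Quoting "$suffix" makes the suffix match literally, not as a glob.
    name=${name%"$suffix"}
    printf '%s\n' "$name"
}

# Assumption from the thread: the swift launcher picks up GLOBUS_HOSTNAME
# from HOSTNAME, so export a cleaned value before invoking swift.
HOSTNAME="$(sanitize_hostname "${HOSTNAME:-}")"
export HOSTNAME
```

These lines could be placed at the top of run.sh, ahead of the existing swift command, so the variable does not have to be reset by hand in every login session. The underlying "(none)" in the login nodes' host name is still something for the sysadmins to correct.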
From ketancmaheshwari at gmail.com Sat Jun 4 13:09:35 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Sat, 04 Jun 2011 13:09:35 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com>
Message-ID: <4DEA74DF.5040107@gmail.com>

Yes, you should bring this to the attention of beagle sysadmins.

On 6/4/11 1:01 PM, Jonathan S Monette wrote:
> Ok. I have my job submitted and it is currently waiting in the queue.
> besides manually resetting this variable everytime I am on beagle
> what else can be done? Is this something that I should bring to the
> attention of the sys admins for beagle?

From jonmon at utexas.edu Sat Jun 4 13:12:11 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 13:12:11 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To: <4DEA74DF.5040107@gmail.com>
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com>
Message-ID:

Ok. Thanks. The (none) also appeared when I was on login1.beagle.ci.uchicago.edu. Thanks Ketan.

On Sat, Jun 4, 2011 at 1:09 PM, ketan wrote:

> Yes, you should bring this to the attention of beagle sysadmins.
>>>>>>>>>>>>> .8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>>>>>>>>>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>>>>>>>>>>> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' >>>>>>>>>>>>> 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >>>>>>>>>>>>> >>>>>>>>>>>>> I there something else I need to load on beagle to make swift >>>>>>>>>>>>> run accordingly? >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> >>>>>>>>>>>>> Any intelligent fool can make things bigger and more complex... >>>>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>>>> direction. >>>>>>>>>>>>> >>>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> >>>>>>>>>>> Any intelligent fool can make things bigger and more complex... 
>>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>> direction. >>>>>>>>>>> >>>>>>>>>>> - Albert Einstein >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>> direction. >>>>>>>>>> >>>>>>>>>> - Albert Einstein >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> >>>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>> direction. >>>>>>>>> >>>>>>>>> - Albert Einstein >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>> direction. >>>>>>>> >>>>>>>> - Albert Einstein >>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>> direction. >>>>>>> >>>>>>> - Albert Einstein >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Any intelligent fool can make things bigger and more complex... It >>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>> direction. >>>>> >>>>> - Albert Einstein >>>>> >>>>> >>>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more complex... It takes >>> a touch of genius - and a lot of courage to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >> > > > -- > > Any intelligent fool can make things bigger and more complex... It takes a > touch of genius - and a lot of courage to move in the opposite direction. > > - Albert Einstein > > -- Any intelligent fool can make things bigger and more complex... 
It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein

From jonmon at utexas.edu Sat Jun 4 13:33:26 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sat, 4 Jun 2011 13:33:26 -0500
Subject: [Swift-devel] Illegal Extra Argument vdl-int.k
Message-ID:

Hello,

I have been using the 0.92 release for my workflows. I checked out trunk to test it and got this error right away:

RunID: 20110604-1327-u8rkc4d2
(input): found 10 files
Progress: time: Sat, 04 Jun 2011 13:27:52 -0500
Execution failed:
Illegal extra argument `raw_dir/raw_image_6.fits' to sys:each @ vdl-int.k, line: 447

My workflows run fine under the 0.92 release. The log file is located in ~jonmon/Workspace/Swift/Montage/m101_tutorial/run.0033

--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein

From iraicu at cs.iit.edu Sun Jun 5 09:44:59 2011
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sun, 05 Jun 2011 09:44:59 -0500
Subject: [Swift-devel] ACM ScienceCloud2011 is 3 days away -- final program is online
Message-ID: <4DEB966B.1030603@cs.iit.edu>

Dear all,

We just wanted to bring to your attention that ScienceCloud2011 is only 3 days away and will take place on Wednesday, June 8th, from 9AM to 5PM, in San Jose, California, co-located with the HPDC 2011 and FCRC 2011 conferences. We have posted the final program online at http://www.cs.iit.edu/~iraicu/ScienceCloud2011/program.htm; note that we have already posted the paper PDF files, and we will post the slides in the coming days.

We have an exciting keynote given by Dr. Ion Stoica on "Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center". Dr.
Stoica is an Associate Professor in the EECS Department at the University of California at Berkeley, where he does research on cloud computing and networked computer systems. He is also the co-founder of Conviva, a startup to commercialize technologies for large-scale video distribution.

You will find 8 exciting talks that span industry (Yahoo, Microsoft), government labs (LBNL, ANL, JPL), and academia (UCBerkeley, USC, UWashington, CalTech, UVirginia, UCSB). The presentations are:

1. The Case for Being Lazy: How to Leverage Lazy Evaluation in MapReduce
2. Debunking some Common Misconceptions of Science in the Cloud
3. Experiences Using Cloud Computing for A Scientific Workflow Application
4. Cumulus: Open Source Storage Cloud for Science
5. Adaptive Rate Stream Processing for Smart Grid Applications on Clouds
6. An Automated Approach to Cloud Storage Service Selection
7. Magellan: Experiences from a Science Cloud
8. Neptune: A Domain Specific Language for Deploying HPC Software on Cloud Platforms

For more information about the workshop, please visit http://www.cs.iit.edu/~iraicu/ScienceCloud2011/. We look forward to seeing you at the workshop!

Regards,
Ioan Raicu, Pete Beckman, Ian Foster & Yogesh Simmhan
ScienceCloud2011 Co-Chairs & Program Chair
http://www.cs.iit.edu/~iraicu/ScienceCloud2011/

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From jonmon at utexas.edu Sun Jun 5 13:35:22 2011 From: jonmon at utexas.edu (Jonathan S Monette) Date: Sun, 5 Jun 2011 13:35:22 -0500 Subject: [Swift-devel] Re: catsn on beagle In-Reply-To: References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com> Message-ID: I still have problems running this script. I does not seem to ever execute. I get this to stdout Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified locally) RunID: 20110604-2338-5eik2hbb Progress: Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Canceling job 173155.sdb Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 
Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Canceling job 173182.sdb Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 Progress: Submitted:1 I had to C-c the run. I would check qstat and it would say the job was executing in the development queue but it seems like it never ran. Also it seems that the coasters job was cancelled twice during the Swift execution. On Sat, Jun 4, 2011 at 1:12 PM, Jonathan S Monette wrote: > Ok. Thanks. and the (none) also appeared when I was on > login1.beagle.ci.uchicago.edu. Thanks Ketan. > > > On Sat, Jun 4, 2011 at 1:09 PM, ketan wrote: > >> Yes, you should bring this to the attention of beagle sysadmins. >> >> >> On 6/4/11 1:01 PM, Jonathan S Monette wrote: >> >> Ok. I have my job submitted and it is currently waiting in the queue. >> besides manually resetting this variable everytime I am on beagle what else >> can be done? Is this something that I should bring to the attention of the >> sys admins for beagle? 
--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein

From ketancmaheshwari at gmail.com Sun Jun 5 20:13:44 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Sun, 05 Jun 2011 20:13:44 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com>
Message-ID: <4DEC29C8.6020507@gmail.com>

I do not know why you are getting this. The PBS submission writes the following to stderr:

aprun: Apid 327667: Caught signal Terminated, sending to application

On another Swift submission of mine, the PBS submission's stderr shows:

[NID 00065] 2011-06-05 19:34:26 distributeControlMsg: Apid 327688 write failure to node 66, 10.128.0.67, port 607, Connection reset by peer

I suspect the last maintenance/upgrade of Beagle might have caused this.

Ketan

On 6/5/11 1:35 PM, Jonathan S Monette wrote:
> I still have problems running this script. I does not seem to ever
> execute.
I get this to stdout > > Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified > locally) > > RunID: 20110604-2338-5eik2hbb > Progress: > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Canceling job 173155.sdb > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Canceling job 173182.sdb > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: 
Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > > I had to C-c the run. I would check qstat and it would say the job > was executing in the development queue but it seems like it never ran. > Also it seems that the coasters job was cancelled twice during the > Swift execution. > > On Sat, Jun 4, 2011 at 1:12 PM, Jonathan S Monette > wrote: > > Ok. Thanks. and the (none) also appeared when I was on > login1.beagle.ci.uchicago.edu > . Thanks Ketan. > > > On Sat, Jun 4, 2011 at 1:09 PM, ketan > wrote: > > Yes, you should bring this to the attention of beagle sysadmins. > > > On 6/4/11 1:01 PM, Jonathan S Monette wrote: >> Ok. I have my job submitted and it is currently waiting in >> the queue. besides manually resetting this variable >> everytime I am on beagle what else can be done? Is this >> something that I should bring to the attention of the sys >> admins for beagle? >> >> On Sat, Jun 4, 2011 at 12:51 PM, Ketan Maheshwari >> > > wrote: >> >> Alright, so this is the issue. GLOBUS_HOSTNAME is picked >> up from HOSTNAME and I do not know where does this none >> comes in your case. >> >> In my case I get hostname: login2.beagle.ci.uchicago.edu >> >> >> see if you can manually change this env variable and try >> to run again. >> >> >> On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette >> > wrote: >> >> login2.beagle.ci.uchicago.edu.(none) >> >> when you ran your test was in on login1 or login2? >> >> >> On Sat, Jun 4, 2011 at 12:47 PM, Ketan Maheshwari >> > > wrote: >> >> What does >> >> echo $HOSTNAME give? >> >> >> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S >> Monette > > wrote: >> >> I am not sure why that (none) appears. Is >> this a variable that the system sets or do I >> need to set it? 
>> >> >> On Sat, Jun 4, 2011 at 12:39 PM, Ketan >> Maheshwari > > wrote: >> >> This suprises me because, I do not see >> that (none) in my commandline. >> >> >> On Sat, Jun 4, 2011 at 12:38 PM, Jonathan >> S Monette > > wrote: >> >> I added >> echo ${OPTIONS} >> echo ${COG_OPTS} >> echo ${LOCALCLASSPATH} >> echo ${EXEC} >> echo ${CMDLINE} >> >> to the script. It seems in the >> $OPTIONS variable this appears. >> >> -Xmx8192M >> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed >> -DUID=1881 >> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/.. >> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/.. >> -Djava.security.egd=file:///dev/urandom >> >> There is the line >> DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >> I believe that is the parenthesis >> that is causing this to fail. Now >> How do I fix it. >> >> On Sat, Jun 4, 2011 at 12:36 PM, >> Jonathan S Monette > > wrote: >> >> I may have found the error. I >> download 0.92 binaries and >> changed the swift command script. >> I added >> >> >> On Sat, Jun 4, 2011 at 12:26 PM, >> Jonathan S Monette >> > > wrote: >> >> Same result. Same error >> >> >> On Sat, Jun 4, 2011 at 12:24 >> PM, ketan >> > > >> wrote: >> >> can you try running the >> commandline by copying >> from run.sh and pasting >> it at the prompt. >> >> >> On 6/4/11 12:18 PM, >> Jonathan S Monette wrote: >>> I do have that project. >>> I did projects --avail >>> and got back >>> >>> Project PI >>> Title >>> ------------------------------------------------------------------------------ >>> CI-CCR000013 Michael >>> Wilde The >>> Swift Parallel Scripting >>> System >>> >>> Here is my run.sh file. >>> I execute with ./run.sh >>> after I did chmod +x run.sh >>> >>> #!/bin/bash >>> swift -config cf >>> -tc.file tc -sites.file >>> beagle-coaster.xml >>> catsn.swift -n=1 >>> >>> I do have sh and cat in >>> my path. I can execute >>> them. 
Here is what >>> which sh and which cat >>> produced. >>> >>> sh is /usr/bin/sh >>> cat is /bin/cat >>> >>> On Sat, Jun 4, 2011 at >>> 12:10 PM, ketan >>> >> > >>> wrote: >>> >>> Can you try >>> "projects --avail" >>> command on beagle to >>> see if you are >>> member of a project. >>> >>> Else you will need >>> to request >>> membership. You can >>> do this from this page: >>> >>> http://pads.ci.uchicago.edu/access/ >>> >>> In any case, your >>> error does not seem >>> to be because of the >>> above. Looks like >>> '(' has sneaked in >>> because of some >>> unexpected typo in >>> commandline .. can >>> you doublecheck it. >>> >>> Also can you make >>> sure that you have >>> all the files: tc, >>> beagle-coasters.xml >>> cf in PATH, ie. are >>> you able to access >>> them w/o ./ >>> >>> >>> On 6/4/11 12:03 PM, >>> Jonathan S Monette >>> wrote: >>>> I did change the >>>> work directory. I >>>> however did not >>>> change the project >>>> name. I do not >>>> know my project >>>> name so I kept the >>>> same project that >>>> was in there. Here >>>> is my modified >>>> beagle-coasters.xml >>>> >>>> >>>> >>>> >>> provider="coaster" >>>> jobmanager="local:pbs"/> >>>> >>> namespace="globus" >>>> key="project">CI-CCR000013 >>>> >>>> >>> namespace="globus" >>>> key="ppn">24:cray:pack >>>> >>>> >>> namespace="globus" >>>> key="workersPerNode">24 >>>> >>> namespace="globus" >>>> key="maxTime">1000 >>>> >>> namespace="globus" >>>> key="slots">1 >>>> >>> namespace="globus" >>>> key="nodeGranularity">1 >>>> >>> namespace="globus" >>>> key="maxNodes">1 >>>> >>>> >>> namespace="karajan" >>>> key="jobThrottle">.63 >>>> >>> namespace="karajan" >>>> key="initialScore">10000 >>>> >>>> >>> provider="local"/> >>>> >>> >/lustre/beagle/jonmon/Swift/work >>>> >>>> >>>> >>>> >>>> On Sat, Jun 4, 2011 >>>> at 12:00 PM, Ketan >>>> Maheshwari >>>> >>> > >>>> wrote: >>>> >>>> Jon, >>>> >>>> Thanks for >>>> trying out >>>> catsn on Beagle. 
>>>> >>>> I just tried it >>>> myself but >>>> could not >>>> reproduce the >>>> error you are >>>> getting. Have >>>> you made the >>>> changes that >>>> are mentioned >>>> in the README: >>>> == >>>> Change workdir >>>> location in >>>> beagle-coaster.xml >>>> Change the >>>> project entry >>>> in the >>>> beagle-coaster.xml >>>> to your project >>>> name >>>> == >>>> >>>> Ketan >>>> >>>> >>>> On Sat, Jun 4, >>>> 2011 at 11:50 >>>> AM, Jonathan S >>>> Monette >>>> >>> > >>>> wrote: >>>> >>>> Hello, >>>> I am >>>> trying to >>>> run the >>>> catsn test >>>> on beagle >>>> using the >>>> files in >>>> ~ketan/catsn. >>>> I have >>>> copied over >>>> this >>>> directory >>>> over to my >>>> home >>>> directory >>>> and I >>>> believe I >>>> set it up >>>> correctly. >>>> I did >>>> module load >>>> swift and >>>> the ran >>>> run.sh that >>>> was in this >>>> directory. >>>> I get this >>>> error. >>>> >>>> /soft/swift/0.92/bin/swift: >>>> eval: line >>>> 152: syntax >>>> error near >>>> unexpected >>>> token `(' >>>> /soft/swift/0.92/bin/swift: >>>> eval: line >>>> 152: `java >>>> -Xmx8192M >>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed >>>> -DUID=1881 >>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>> -Dswift.home=/soft/swift/0.92/bin/.. 
>>>> -Djava.security.egd=file:///dev/urandom >>>> -classpath >>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >>>> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>> org.griphyn.vdl.karajan.Loader >>>> '-config' >>>> 'cf' >>>> '-tc.file' >>>> 'tc' >>>> '-sites.file' >>>> 'beagle-coaster.xml' >>>> 'catsn.swift' >>>> '-n=1'' >>>> >>>> I there >>>> something >>>> else I need >>>> to load on >>>> beagle to >>>> make swift >>>> run >>>> accordingly? >>>> >>>> -- >>>> >>>> Any >>>> intelligent >>>> fool can >>>> make things >>>> bigger and >>>> more >>>> complex... >>>> It takes a >>>> touch of >>>> genius - >>>> and a lot >>>> of courage >>>> to move in >>>> the >>>> opposite >>>> direction. >>>> >>>> - Albert >>>> Einstein >>>> >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> >>>> Any intelligent >>>> fool can make >>>> things bigger and >>>> more complex... It >>>> takes a touch of >>>> genius - and a lot >>>> of courage to move >>>> in the opposite >>>> direction. 
>>>> - Albert Einstein

-------------- next part --------------
An HTML attachment was scrubbed...
From jonmon at utexas.edu Sun Jun 5 20:17:55 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Sun, 5 Jun 2011 20:17:55 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To: <4DEC29C8.6020507@gmail.com>
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com> <4DEC29C8.6020507@gmail.com>
Message-ID:

So you are seeing the same problem?

On Sun, Jun 5, 2011 at 8:13 PM, ketan wrote:

> I do not know the reason why you are getting this. The PBS submit throws
> the following stderr:
>
> aprun: Apid 327667: Caught signal Terminated, sending to application
>
> Also, on another Swift submission of mine I am getting this as PBS submit stderr:
>
> [NID 00065] 2011-06-05 19:34:26 distributeControlMsg: Apid 327688 write
> failure to node 66, 10.128.0.67, port 607, Connection reset by peer
>
> I suspect the last maintenance/upgrade of Beagle might have caused this.
>
> Ketan
>
> On 6/5/11 1:35 PM, Jonathan S Monette wrote:
>
> I still have problems running this script. It does not seem to ever
> execute.
I get this to stdout > > Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified > locally) > > RunID: 20110604-2338-5eik2hbb > Progress: > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Canceling job 173155.sdb > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Canceling job 173182.sdb > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: 
Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > Progress: Submitted:1 > > I had to C-c the run. I would check qstat and it would say the job was > executing in the development queue but it seems like it never ran. Also it > seems that the coasters job was cancelled twice during the Swift execution. > > On Sat, Jun 4, 2011 at 1:12 PM, Jonathan S Monette wrote: > >> Ok. Thanks. and the (none) also appeared when I was on >> login1.beagle.ci.uchicago.edu. Thanks Ketan. >> >> >> On Sat, Jun 4, 2011 at 1:09 PM, ketan wrote: >> >>> Yes, you should bring this to the attention of beagle sysadmins. >>> >>> >>> On 6/4/11 1:01 PM, Jonathan S Monette wrote: >>> >>> Ok. I have my job submitted and it is currently waiting in the queue. >>> besides manually resetting this variable everytime I am on beagle what else >>> can be done? Is this something that I should bring to the attention of the >>> sys admins for beagle? >>> >>> On Sat, Jun 4, 2011 at 12:51 PM, Ketan Maheshwari < >>> ketancmaheshwari at gmail.com> wrote: >>> >>>> Alright, so this is the issue. GLOBUS_HOSTNAME is picked up from >>>> HOSTNAME and I do not know where does this none comes in your case. >>>> >>>> In my case I get hostname: login2.beagle.ci.uchicago.edu >>>> >>>> see if you can manually change this env variable and try to run again. >>>> >>>> >>>> On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette wrote: >>>> >>>>> login2.beagle.ci.uchicago.edu.(none) >>>>> >>>>> when you ran your test was in on login1 or login2? >>>>> >>>>> >>>>> On Sat, Jun 4, 2011 at 12:47 PM, Ketan Maheshwari < >>>>> ketancmaheshwari at gmail.com> wrote: >>>>> >>>>>> What does >>>>>> >>>>>> echo $HOSTNAME give? 
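
To illustrate why a hostname such as login2.beagle.ci.uchicago.edu.(none) can break the launch, here is a small sketch of how eval treats an unquoted parenthesis. It does not reproduce the actual swift wrapper script; the variable names are made up for illustration, and only the sample hostname value comes from the thread.

```shell
#!/bin/bash
# Sample value taken from the thread; everything else here is illustrative.
value='login2.beagle.ci.uchicago.edu.(none)'

# Unquoted: eval re-parses the string, so "(" is read as shell syntax.
# Run in a child shell so the syntax error cannot abort this script.
if bash -c 'eval "echo -DGLOBUS_HOSTNAME=$1"' _ "$value" 2>/dev/null; then
    unquoted_status=ok
else
    unquoted_status=failed
fi
echo "unquoted eval: ${unquoted_status}"

# Quoted: single quotes inside the eval string keep the parentheses literal.
# (This simple quoting would itself break if the value contained a single quote.)
quoted_output="$(eval "printf '%s\n' '-DGLOBUS_HOSTNAME=${value}'")"
echo "quoted eval: ${quoted_output}"
```

The first form fails with a syntax error of the same kind as the one reported for the swift script, while the second passes the value through unchanged. Correcting the hostname itself is probably still the cleaner fix, since other tools may also trip over the "(none)" suffix.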
>>>>>> >>>>>> >>>>>> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S Monette < >>>>>> jonmon at utexas.edu> wrote: >>>>>> >>>>>>> I am not sure why that (none) appears. Is this a variable that the >>>>>>> system sets or do I need to set it? >>>>>>> >>>>>>> >>>>>>> On Sat, Jun 4, 2011 at 12:39 PM, Ketan Maheshwari < >>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>> >>>>>>>> This suprises me because, I do not see that (none) in my >>>>>>>> commandline. >>>>>>>> >>>>>>>> >>>>>>>> On Sat, Jun 4, 2011 at 12:38 PM, Jonathan S Monette < >>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>> >>>>>>>>> I added >>>>>>>>> echo ${OPTIONS} >>>>>>>>> echo ${COG_OPTS} >>>>>>>>> echo ${LOCALCLASSPATH} >>>>>>>>> echo ${EXEC} >>>>>>>>> echo ${CMDLINE} >>>>>>>>> >>>>>>>>> to the script. It seems in the $OPTIONS variable this appears. >>>>>>>>> >>>>>>>>> -Xmx8192M >>>>>>>>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed >>>>>>>>> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>>>>>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>>>>>>>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>>>>>>>> -Djava.security.egd=file:///dev/urandom >>>>>>>>> >>>>>>>>> There is the line >>>>>>>>> DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) I believe that is the >>>>>>>>> parenthesis that is causing this to fail. Now How do I fix it. >>>>>>>>> >>>>>>>>> On Sat, Jun 4, 2011 at 12:36 PM, Jonathan S Monette < >>>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>>> >>>>>>>>>> I may have found the error. I download 0.92 binaries and changed >>>>>>>>>> the swift command script. I added >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Sat, Jun 4, 2011 at 12:26 PM, Jonathan S Monette < >>>>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>>>> >>>>>>>>>>> Same result. 
Same error >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Sat, Jun 4, 2011 at 12:24 PM, ketan < >>>>>>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> can you try running the commandline by copying from run.sh and >>>>>>>>>>>> pasting it at the prompt. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 6/4/11 12:18 PM, Jonathan S Monette wrote: >>>>>>>>>>>> >>>>>>>>>>>> I do have that project. I did projects --avail and got back >>>>>>>>>>>> >>>>>>>>>>>> Project PI Title >>>>>>>>>>>> >>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>> CI-CCR000013 Michael Wilde The Swift Parallel >>>>>>>>>>>> Scripting System >>>>>>>>>>>> >>>>>>>>>>>> Here is my run.sh file. I execute with ./run.sh after I did >>>>>>>>>>>> chmod +x run.sh >>>>>>>>>>>> >>>>>>>>>>>> #!/bin/bash >>>>>>>>>>>> swift -config cf -tc.file tc -sites.file beagle-coaster.xml >>>>>>>>>>>> catsn.swift -n=1 >>>>>>>>>>>> >>>>>>>>>>>> I do have sh and cat in my path. I can execute them. Here is >>>>>>>>>>>> what which sh and which cat produced. >>>>>>>>>>>> >>>>>>>>>>>> sh is /usr/bin/sh >>>>>>>>>>>> cat is /bin/cat >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Jun 4, 2011 at 12:10 PM, ketan < >>>>>>>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Can you try "projects --avail" command on beagle to see if you >>>>>>>>>>>>> are member of a project. >>>>>>>>>>>>> >>>>>>>>>>>>> Else you will need to request membership. You can do this from >>>>>>>>>>>>> this page: >>>>>>>>>>>>> >>>>>>>>>>>>> http://pads.ci.uchicago.edu/access/ >>>>>>>>>>>>> >>>>>>>>>>>>> In any case, your error does not seem to be because of the >>>>>>>>>>>>> above. Looks like '(' has sneaked in because of some unexpected typo in >>>>>>>>>>>>> commandline .. can you doublecheck it. >>>>>>>>>>>>> >>>>>>>>>>>>> Also can you make sure that you have all the files: tc, >>>>>>>>>>>>> beagle-coasters.xml cf in PATH, ie. 
are you able to access them w/o ./ >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 6/4/11 12:03 PM, Jonathan S Monette wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> I did change the work directory. I however did not change >>>>>>>>>>>>> the project name. I do not know my project name so I kept the same project >>>>>>>>>>>>> that was in there. Here is my modified beagle-coasters.xml >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> key="project">CI-CCR000013 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> key="ppn">24:cray:pack >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> key="workersPerNode">24 >>>>>>>>>>>>> 1000 >>>>>>>>>>>>> 1 >>>>>>>>>>>>> >>>>>>>>>>>> key="nodeGranularity">1 >>>>>>>>>>>>> 1 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> key="jobThrottle">.63 >>>>>>>>>>>>> >>>>>>>>>>>> key="initialScore">10000 >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >/lustre/beagle/jonmon/Swift/work >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari < >>>>>>>>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Jon, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for trying out catsn on Beagle. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I just tried it myself but could not reproduce the error you >>>>>>>>>>>>>> are getting. Have you made the changes that are mentioned in the README: >>>>>>>>>>>>>> == >>>>>>>>>>>>>> Change workdir location in beagle-coaster.xml >>>>>>>>>>>>>> Change the project entry in the beagle-coaster.xml to your >>>>>>>>>>>>>> project name >>>>>>>>>>>>>> == >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ketan >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette < >>>>>>>>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello, >>>>>>>>>>>>>>> I am trying to run the catsn test on beagle using the >>>>>>>>>>>>>>> files in ~ketan/catsn. 
I have copied over this directory over to my home >>>>>>>>>>>>>>> directory and I believe I set it up correctly. I did module load swift and >>>>>>>>>>>>>>> the ran run.sh that was in this directory. I get this error. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> /soft/swift/0.92/bin/swift: eval: line 152: syntax error >>>>>>>>>>>>>>> near unexpected token `(' >>>>>>>>>>>>>>> /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M >>>>>>>>>>>>>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 >>>>>>>>>>>>>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>>>>>>>>>>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>>>>>>>>>>>>> -Dswift.home=/soft/swift/0.92/bin/.. -Djava.security.egd= >>>>>>>>>>>>>>> file:///dev/urandom -classpath >>>>>>>>>>>>>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>>>>>>>>>>>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>>>>>>>>>>>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 
>>>>>>>>>>>>>>> .8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>>>>>>>>>>>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>>>>>>>>>>>>> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' >>>>>>>>>>>>>>> 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I there something else I need to load on beagle to make >>>>>>>>>>>>>>> swift run accordingly? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Any intelligent fool can make things bigger and more >>>>>>>>>>>>>>> complex... It takes a touch of genius - and a lot of courage to move in the >>>>>>>>>>>>>>> opposite direction. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> >>>>>>>>>>>>> Any intelligent fool can make things bigger and more complex... 
>>>>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>>>> direction. >>>>>>>>>>>>> >>>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> >>>>>>>>>>>> Any intelligent fool can make things bigger and more complex... >>>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>>> direction. >>>>>>>>>>>> >>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> >>>>>>>>>>> Any intelligent fool can make things bigger and more complex... >>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>> direction. >>>>>>>>>>> >>>>>>>>>>> - Albert Einstein >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>> direction. >>>>>>>>>> >>>>>>>>>> - Albert Einstein >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> >>>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>> direction. >>>>>>>>> >>>>>>>>> - Albert Einstein >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> >>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>> direction. >>>>>>> >>>>>>> - Albert Einstein >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Any intelligent fool can make things bigger and more complex... It >>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>> direction. >>>>> >>>>> - Albert Einstein >>>>> >>>>> >>>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more complex... 
It takes >>> a touch of genius - and a lot of courage to move in the opposite direction. >>> >>> - Albert Einstein

--
Any intelligent fool can make things bigger and more complex... It takes a
touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From ketancmaheshwari at gmail.com Sun Jun 5 20:18:18 2011
From: ketancmaheshwari at gmail.com (ketan)
Date: Sun, 05 Jun 2011 20:18:18 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To:
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com> <4DEC29C8.6020507@gmail.com>
Message-ID: <4DEC2ADA.7090704@gmail.com>

yes.

On 6/5/11 8:17 PM, Jonathan S Monette wrote:
> So you are seeing the same problem?
>
> On Sun, Jun 5, 2011 at 8:13 PM, ketan wrote:
>
> I do not know the reason why you are getting this. The PBS submit
> throws the following stderr:
>
> aprun: Apid 327667: Caught signal Terminated, sending to application
>
> Also, on another Swift submission of mine I am getting this as PBS
> submit stderr:
>
> [NID 00065] 2011-06-05 19:34:26 distributeControlMsg: Apid 327688
> write failure to node 66, 10.128.0.67, port 607, Connection reset
> by peer
>
> I suspect the last maintenance/upgrade of Beagle might have caused
> this.
>
> Ketan
>
> On 6/5/11 1:35 PM, Jonathan S Monette wrote:
>> I still have problems running this script. It does not seem to
>> ever execute.
I get this to stdout >> >> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog >> modified locally) >> >> RunID: 20110604-2338-5eik2hbb >> Progress: >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Canceling job 173155.sdb >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Canceling job 173182.sdb >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> 
Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> >> I had to C-c the run. I would check qstat and it would say the >> job was executing in the development queue but it seems like it >> never ran. Also it seems that the coasters job was cancelled >> twice during the Swift execution. >> >> On Sat, Jun 4, 2011 at 1:12 PM, Jonathan S Monette >> > wrote: >> >> Ok. Thanks. and the (none) also appeared when I was on >> login1.beagle.ci.uchicago.edu >> . Thanks Ketan. >> >> >> On Sat, Jun 4, 2011 at 1:09 PM, ketan >> > > wrote: >> >> Yes, you should bring this to the attention of beagle >> sysadmins. >> >> >> On 6/4/11 1:01 PM, Jonathan S Monette wrote: >>> Ok. I have my job submitted and it is currently waiting >>> in the queue. besides manually resetting this variable >>> everytime I am on beagle what else can be done? Is this >>> something that I should bring to the attention of the >>> sys admins for beagle? >>> >>> On Sat, Jun 4, 2011 at 12:51 PM, Ketan Maheshwari >>> >> > wrote: >>> >>> Alright, so this is the issue. GLOBUS_HOSTNAME is >>> picked up from HOSTNAME and I do not know where does >>> this none comes in your case. >>> >>> In my case I get hostname: >>> login2.beagle.ci.uchicago.edu >>> >>> >>> see if you can manually change this env variable and >>> try to run again. >>> >>> >>> On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette >>> > wrote: >>> >>> login2.beagle.ci.uchicago.edu.(none) >>> >>> when you ran your test was in on login1 or login2? >>> >>> >>> On Sat, Jun 4, 2011 at 12:47 PM, Ketan >>> Maheshwari >> > wrote: >>> >>> What does >>> >>> echo $HOSTNAME give? 
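A minimal sketch of the workaround suggested in this exchange (manually changing the HOSTNAME variable before launching Swift). It assumes, as stated in the thread, that the swift launcher derives GLOBUS_HOSTNAME from HOSTNAME; the helper name strip_none_suffix is made up for illustration and is not part of Swift.

```shell
#!/bin/sh
# Hypothetical helper (not part of Swift): print a host name with a
# trailing ".(none)" removed; any other name is printed unchanged.
strip_none_suffix() {
    suffix='.(none)'
    # Quoting "$suffix" makes the parentheses match literally.
    printf '%s\n' "${1%"$suffix"}"
}

# Fall back to `uname -n` when HOSTNAME is not set in this shell.
HOSTNAME=$(strip_none_suffix "${HOSTNAME:-$(uname -n)}")
export HOSTNAME
echo "Using HOSTNAME=$HOSTNAME"

# After this, launch Swift from the same shell, for example: ./run.sh
```

This only hides the symptom for the current shell session; the stray ".(none)" suffix would still need to be fixed on the login node itself.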
>>> >>> >>> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S >>> Monette >> > wrote: >>> >>> I am not sure why that (none) appears. >>> Is this a variable that the system sets >>> or do I need to set it? >>> >>> >>> On Sat, Jun 4, 2011 at 12:39 PM, Ketan >>> Maheshwari >> > wrote: >>> >>> This suprises me because, I do not >>> see that (none) in my commandline. >>> >>> >>> On Sat, Jun 4, 2011 at 12:38 PM, >>> Jonathan S Monette >>> >> > wrote: >>> >>> I added >>> echo ${OPTIONS} >>> echo ${COG_OPTS} >>> echo ${LOCALCLASSPATH} >>> echo ${EXEC} >>> echo ${CMDLINE} >>> >>> to the script. It seems in the >>> $OPTIONS variable this appears. >>> >>> -Xmx8192M >>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed >>> -DUID=1881 >>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>> -Djava.security.egd=file:///dev/urandom >>> >>> There is the line >>> DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>> I believe that is the >>> parenthesis that is causing this >>> to fail. Now How do I fix it. >>> >>> On Sat, Jun 4, 2011 at 12:36 PM, >>> Jonathan S Monette >>> >> > wrote: >>> >>> I may have found the error. >>> I download 0.92 binaries >>> and changed the swift >>> command script. I added >>> >>> >>> On Sat, Jun 4, 2011 at 12:26 >>> PM, Jonathan S Monette >>> >> > >>> wrote: >>> >>> Same result. Same error >>> >>> >>> On Sat, Jun 4, 2011 at >>> 12:24 PM, ketan >>> >> > >>> wrote: >>> >>> can you try running >>> the commandline by >>> copying from run.sh >>> and pasting it at >>> the prompt. >>> >>> >>> On 6/4/11 12:18 PM, >>> Jonathan S Monette >>> wrote: >>>> I do have that >>>> project. 
I did >>>> projects --avail >>>> and got back >>>> >>>> Project PI >>>> Title >>>> ------------------------------------------------------------------------------ >>>> CI-CCR000013 >>>> Michael Wilde >>>> The Swift >>>> Parallel Scripting >>>> System >>>> >>>> Here is my run.sh >>>> file. I execute >>>> with ./run.sh after >>>> I did chmod +x run.sh >>>> >>>> #!/bin/bash >>>> swift -config cf >>>> -tc.file tc >>>> -sites.file >>>> beagle-coaster.xml >>>> catsn.swift -n=1 >>>> >>>> I do have sh and >>>> cat in my path. I >>>> can execute them. >>>> Here is what which >>>> sh and which cat >>>> produced. >>>> >>>> sh is /usr/bin/sh >>>> cat is /bin/cat >>>> >>>> On Sat, Jun 4, 2011 >>>> at 12:10 PM, ketan >>>> >>> > >>>> wrote: >>>> >>>> Can you try >>>> "projects >>>> --avail" >>>> command on >>>> beagle to see >>>> if you are >>>> member of a >>>> project. >>>> >>>> Else you will >>>> need to request >>>> membership. You >>>> can do this >>>> from this page: >>>> >>>> http://pads.ci.uchicago.edu/access/ >>>> >>>> In any case, >>>> your error does >>>> not seem to be >>>> because of the >>>> above. Looks >>>> like '(' has >>>> sneaked in >>>> because of some >>>> unexpected typo >>>> in commandline >>>> .. can you >>>> doublecheck it. >>>> >>>> Also can you >>>> make sure that >>>> you have all >>>> the files: tc, >>>> beagle-coasters.xml >>>> cf in PATH, ie. >>>> are you able to >>>> access them w/o ./ >>>> >>>> >>>> On 6/4/11 12:03 >>>> PM, Jonathan S >>>> Monette wrote: >>>>> I did change >>>>> the work >>>>> directory. I >>>>> however did >>>>> not change the >>>>> project name. >>>>> I do not know >>>>> my project >>>>> name so I kept >>>>> the same >>>>> project that >>>>> was in there. 
>>>>> Here is my modified beagle-coasters.xml
>>>>>
>>>>> <config>
>>>>> <pool handle="pbs">
>>>>> <execution provider="coaster" jobmanager="local:pbs"/>
>>>>> <profile namespace="globus" key="project">CI-CCR000013</profile>
>>>>> <profile namespace="globus" key="ppn">24:cray:pack</profile>
>>>>> <profile namespace="globus" key="workersPerNode">24</profile>
>>>>> <profile namespace="globus" key="maxTime">1000</profile>
>>>>> <profile namespace="globus" key="slots">1</profile>
>>>>> <profile namespace="globus" key="nodeGranularity">1</profile>
>>>>> <profile namespace="globus" key="maxNodes">1</profile>
>>>>> <profile namespace="karajan" key="jobThrottle">.63</profile>
>>>>> <profile namespace="karajan" key="initialScore">10000</profile>
>>>>> <filesystem provider="local"/>
>>>>> <workdirectory>/lustre/beagle/jonmon/Swift/work</workdirectory>
>>>>> </pool>
>>>>> </config>
>>>>>
>>>>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari wrote:
>>>>>
>>>>> Jon,
>>>>>
>>>>> Thanks for trying out catsn on Beagle.
>>>>>
>>>>> I just tried it myself but could not reproduce the error you are getting. Have you made the changes that are mentioned in the README:
>>>>> ==
>>>>> Change workdir location in beagle-coaster.xml
>>>>> Change the project entry in the beagle-coaster.xml to your project name
>>>>> ==
>>>>>
>>>>> Ketan
>>>>>
>>>>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette wrote:
>>>>>
>>>>> Hello,
>>>>> I am trying to run the catsn test on beagle using the files in ~ketan/catsn. I have copied over this directory over to my home directory and I believe I set it up correctly. 
>>>>> I did >>>>> module >>>>> load >>>>> swift >>>>> and >>>>> the >>>>> ran >>>>> run.sh >>>>> that >>>>> was in >>>>> this >>>>> directory. >>>>> I get >>>>> this >>>>> error. >>>>> >>>>> /soft/swift/0.92/bin/swift: >>>>> eval: >>>>> line >>>>> 152: >>>>> syntax >>>>> error >>>>> near >>>>> unexpected >>>>> token `(' >>>>> /soft/swift/0.92/bin/swift: >>>>> eval: >>>>> line >>>>> 152: >>>>> `java >>>>> -Xmx8192M >>>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed >>>>> -DUID=1881 >>>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>>> -Dswift.home=/soft/swift/0.92/bin/.. >>>>> -Djava.security.egd=file:///dev/urandom >>>>> -classpath >>>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >>>>> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>>> org.griphyn.vdl.karajan.Loader >>>>> '-config' >>>>> 'cf' >>>>> '-tc.file' >>>>> 'tc' >>>>> '-sites.file' >>>>> 'beagle-coaster.xml' >>>>> 'catsn.swift' >>>>> '-n=1'' >>>>> >>>>> I >>>>> there >>>>> something >>>>> else I >>>>> need >>>>> to >>>>> load >>>>> on >>>>> beagle >>>>> to >>>>> make >>>>> swift >>>>> run >>>>> accordingly? >>>>> >>>>> -- >>>>> >>>>> Any >>>>> intelligent >>>>> fool >>>>> can >>>>> make >>>>> things >>>>> bigger >>>>> and >>>>> more >>>>> complex... >>>>> It >>>>> takes >>>>> a >>>>> touch >>>>> of >>>>> genius >>>>> - and >>>>> a lot >>>>> of >>>>> courage to >>>>> move >>>>> in the >>>>> opposite >>>>> direction. 
>>>>> >>>>> - >>>>> Albert >>>>> Einstein
From jonmon at utexas.edu Sun Jun 5 21:13:23 2011 From: jonmon at utexas.edu (Jonathan S Monette) Date: Sun, 5 Jun 2011 21:13:23 -0500 Subject: [Swift-devel] Re: InternalHostname in sites file In-Reply-To: References: <1307043924.19522.4.camel@blabla2.none> <346384249.137621.1307045335440.JavaMail.root@zimbra.anl.gov> Message-ID: This problem seemed to have been fixed. I just tried a run where I didn't specify the internalhostname on PADS and it still ran. I am not sure when this was fixed. I am using the swift-0.92 binaries I downloaded from the website. On Fri, Jun 3, 2011 at 2:22 PM, Jonathan S Monette wrote: > Ok. I have one I wrote up based on what Mihael mentioned. But should we > just explain why someone would use the internalhostname key in sites.xml or > should we also explain how to tell if you need to set this value and also > here is how to find it? > > > On Fri, Jun 3, 2011 at 1:11 PM, David Kelly wrote: > >> Sure, Jon. The userguide is now kept in the docs/userguide directory of >> Swift. It's in asciidoc format. Feel free to make changes there, or email me >> the text and I can add it for you. 
>> >> David >> >> >> On Thu, Jun 2, 2011 at 3:10 PM, Jonathan S Monette wrote: >> >>> I can write something up and send it to David to add somewhere for the >>> userguide. Not sure where the files for the userguide are kept. >>> >>> >>> On Thu, Jun 2, 2011 at 3:08 PM, Michael Wilde wrote: >>> >>>> Thanks for clarifying. Jon and/or David, can you address this with a >>>> cookbook entry on Coasters that heads towards a users guide section? >>>> >>>> We should tell users what they can run on their cluster (eg ping or >>>> telnet-style connect tests) to validate the setting of internalHostName. >>>> >>>> - Mike >>>> >>>> >>>> >>>> ----- Original Message ----- >>>> > Right. If the head node has multiple network interfaces, only one of >>>> > which is visible from the worker nodes. >>>> > >>>> > The choice of which interface is the one that worker nodes can connect >>>> > to is a matter of the particular cluster. It's not particularly easy >>>> > to >>>> > have an automated mechanism that figures it out. We tried some scheme >>>> > to >>>> > pass all the interface addresses to the worker and let it try to >>>> > connect >>>> > to all of them in order, but that didn't work very well. Of course, >>>> > there might be a scheme that works, but I didn't want to spend too >>>> > much >>>> > time on that. >>>> > >>>> > So that's why it's needed. To clarify to the workers which exact >>>> > interface on the head node they are to try to connect to. >>>> > >>>> > Mihael >>>> > >>>> > On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote: >>>> > > Mihael, >>>> > > I believe we have talked about this before but why is it >>>> > > necessary >>>> > > for an InternalHostname to be specified for PADS? I know that the >>>> > > address that coasters connects to is wrong but I do not remember why >>>> > > that was. Could you give an explanation on why internalHostname >>>> > > needs >>>> > > to be set? 
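A rough sketch of the "telnet-style connect test" mentioned in this thread, meant to be run on a worker node to see which head-node interface it can actually reach. Everything concrete here is an assumption: the function names are invented for illustration, and the addresses and port are placeholders that depend on the cluster and on the port the service is listening on, which the thread does not give. It relies on bash's /dev/tcp redirection and the coreutils timeout command.

```shell
#!/bin/bash
# can_connect HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened within 5 seconds.
can_connect() {
    # A child bash opens the connection via /dev/tcp and exits at once;
    # a failed or timed-out connection yields a non-zero exit status.
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}

# check_candidates PORT ADDR [ADDR...]
# Reports, for each candidate head-node address, whether PORT is reachable.
check_candidates() {
    port="$1"
    shift
    for addr in "$@"; do
        if can_connect "$addr" "$port"; then
            echo "$addr:$port reachable"
        else
            echo "$addr:$port NOT reachable"
        fi
    done
}

# Example call with placeholder values (replace with real ones):
#   check_candidates 50000 10.0.0.1 192.168.1.1
```

An address that shows up as reachable from the worker nodes is a candidate value for internalHostname in sites.xml; an address that is not reachable would explain workers failing to connect back.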
>>>> > >>>> > _______________________________________________ >>>> > Swift-devel mailing list >>>> > Swift-devel at ci.uchicago.edu >>>> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel >>>> >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory
-- Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein
From hategan at mcs.anl.gov Mon Jun 6 02:55:16 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 06 Jun 2011 00:55:16 -0700 Subject: [Swift-devel] recent error on beagle In-Reply-To: References: <2008592710.93053.1306015923349.JavaMail.root@zimbra.anl.gov> <4DD91203.6090000@gmail.com> <1306090299.2956.1.camel@blabla2.none> <1306442497.16145.1.camel@blabla2.none> <1307047182.20017.2.camel@blabla2.none> Message-ID: <1307346916.18043.8.camel@blabla2.none> On Thu, 2011-06-02 at 16:21 -0500, Tim Armstrong wrote: > Mihael: thanks, I appreciate it, sorry to bug you No problem. I did the backport to the branch. However, I do want to stress the fact that this should be tested by folk independently. > > Mike: this problem was occuring to me on PADS (the thread was > originally about a similar problem on Beagle). I haven't made any > progress debugging the issue on beagle, beyond coming up with the > minimal example to replicate it. I managed to pare down the example > even more: it deadlocks if the pthread library is linked dynamically, > even if no pthreads functions are actually used. Ie. the deadlock > happens at the time the shared library is loaded. Have you tried an strace? I believe that libraries support some kind of initialization routine, and pthreads may make use of that, and that may explain why it locks up even without pthreads stuff being called explicitly. Mihael
From ketancmaheshwari at gmail.com Mon Jun 6 14:16:09 2011 From: ketancmaheshwari at gmail.com (ketan) Date: Mon, 06 Jun 2011 14:16:09 -0500 Subject: [Swift-devel] Re: catsn on beagle In-Reply-To: References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com> <4DEC29C8.6020507@gmail.com> Message-ID: <4DED2779.7000509@gmail.com> Jon, Turns out beagle was not properly up and circumstantial evidence indicates it was causing the issue that seems now to be resolved (see quoted mesg from beagle admins below). 
Could you try it again and report back what you get. Thanks, Ketan ==== The five Beagle storage nodes were found offline this morning. These nodes provide DSL, GPFS, and NFS via DVS to compute nodes throughout the system. These nodes have been rebooted and the jobs are running through the scheduler as expected. We are continuing to investigate why the storage nodes went offline, but users may resume running their jobs at this time ==== On 6/5/11 8:17 PM, Jonathan S Monette wrote: > So you are seeing the same problem? > > On Sun, Jun 5, 2011 at 8:13 PM, ketan > wrote: > > I do not know the reason why you are getting this. The PBS submit > throws the following stderr: > > aprun: Apid 327667: Caught signal Terminated, sending to application > > > Also on my another swift submission I am getting this as PBS > submit stderr: > > [NID 00065] 2011-06-05 19:34:26 distributeControlMsg: Apid 327688 > write failure to node 66, 10.128.0.67, port 607, Connection reset > by peer > > I suspect the last maintenance/upgrade of Beagle might have caused > this. > > Ketan > > > > On 6/5/11 1:35 PM, Jonathan S Monette wrote: >> I still have problems running this script. I does not seem to >> ever execute. 
I get this to stdout >> >> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog >> modified locally) >> >> RunID: 20110604-2338-5eik2hbb >> Progress: >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Canceling job 173155.sdb >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Canceling job 173182.sdb >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> 
Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> >> I had to C-c the run. I would check qstat and it would say the >> job was executing in the development queue but it seems like it >> never ran. Also it seems that the coasters job was cancelled >> twice during the Swift execution. >> >> On Sat, Jun 4, 2011 at 1:12 PM, Jonathan S Monette >> > wrote: >> >> Ok. Thanks. and the (none) also appeared when I was on >> login1.beagle.ci.uchicago.edu >> . Thanks Ketan. >> >> >> On Sat, Jun 4, 2011 at 1:09 PM, ketan >> > > wrote: >> >> Yes, you should bring this to the attention of beagle >> sysadmins. >> >> >> On 6/4/11 1:01 PM, Jonathan S Monette wrote: >>> Ok. I have my job submitted and it is currently waiting >>> in the queue. besides manually resetting this variable >>> everytime I am on beagle what else can be done? Is this >>> something that I should bring to the attention of the >>> sys admins for beagle? >>> >>> On Sat, Jun 4, 2011 at 12:51 PM, Ketan Maheshwari >>> >> > wrote: >>> >>> Alright, so this is the issue. GLOBUS_HOSTNAME is >>> picked up from HOSTNAME and I do not know where does >>> this none comes in your case. >>> >>> In my case I get hostname: >>> login2.beagle.ci.uchicago.edu >>> >>> >>> see if you can manually change this env variable and >>> try to run again. >>> >>> >>> On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette >>> > wrote: >>> >>> login2.beagle.ci.uchicago.edu.(none) >>> >>> when you ran your test was in on login1 or login2? >>> >>> >>> On Sat, Jun 4, 2011 at 12:47 PM, Ketan >>> Maheshwari >> > wrote: >>> >>> What does >>> >>> echo $HOSTNAME give? 
>>> >>> >>> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S >>> Monette >> > wrote: >>> >>> I am not sure why that (none) appears. >>> Is this a variable that the system sets >>> or do I need to set it? >>> >>> >>> On Sat, Jun 4, 2011 at 12:39 PM, Ketan >>> Maheshwari >> > wrote: >>> >>> This suprises me because, I do not >>> see that (none) in my commandline. >>> >>> >>> On Sat, Jun 4, 2011 at 12:38 PM, >>> Jonathan S Monette >>> >> > wrote: >>> >>> I added >>> echo ${OPTIONS} >>> echo ${COG_OPTS} >>> echo ${LOCALCLASSPATH} >>> echo ${EXEC} >>> echo ${CMDLINE} >>> >>> to the script. It seems in the >>> $OPTIONS variable this appears. >>> >>> -Xmx8192M >>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed >>> -DUID=1881 >>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>> -Djava.security.egd=file:///dev/urandom >>> >>> There is the line >>> DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>> I believe that is the >>> parenthesis that is causing this >>> to fail. Now How do I fix it. >>> >>> On Sat, Jun 4, 2011 at 12:36 PM, >>> Jonathan S Monette >>> >> > wrote: >>> >>> I may have found the error. >>> I download 0.92 binaries >>> and changed the swift >>> command script. I added >>> >>> >>> On Sat, Jun 4, 2011 at 12:26 >>> PM, Jonathan S Monette >>> >> > >>> wrote: >>> >>> Same result. Same error >>> >>> >>> On Sat, Jun 4, 2011 at >>> 12:24 PM, ketan >>> >> > >>> wrote: >>> >>> can you try running >>> the commandline by >>> copying from run.sh >>> and pasting it at >>> the prompt. >>> >>> >>> On 6/4/11 12:18 PM, >>> Jonathan S Monette >>> wrote: >>>> I do have that >>>> project. 
I did >>>> projects --avail >>>> and got back >>>> >>>> Project PI >>>> Title >>>> ------------------------------------------------------------------------------ >>>> CI-CCR000013 >>>> Michael Wilde >>>> The Swift >>>> Parallel Scripting >>>> System >>>> >>>> Here is my run.sh >>>> file. I execute >>>> with ./run.sh after >>>> I did chmod +x run.sh >>>> >>>> #!/bin/bash >>>> swift -config cf >>>> -tc.file tc >>>> -sites.file >>>> beagle-coaster.xml >>>> catsn.swift -n=1 >>>> >>>> I do have sh and >>>> cat in my path. I >>>> can execute them. >>>> Here is what which >>>> sh and which cat >>>> produced. >>>> >>>> sh is /usr/bin/sh >>>> cat is /bin/cat >>>> >>>> On Sat, Jun 4, 2011 >>>> at 12:10 PM, ketan >>>> >>> > >>>> wrote: >>>> >>>> Can you try >>>> "projects >>>> --avail" >>>> command on >>>> beagle to see >>>> if you are >>>> member of a >>>> project. >>>> >>>> Else you will >>>> need to request >>>> membership. You >>>> can do this >>>> from this page: >>>> >>>> http://pads.ci.uchicago.edu/access/ >>>> >>>> In any case, >>>> your error does >>>> not seem to be >>>> because of the >>>> above. Looks >>>> like '(' has >>>> sneaked in >>>> because of some >>>> unexpected typo >>>> in commandline >>>> .. can you >>>> doublecheck it. >>>> >>>> Also can you >>>> make sure that >>>> you have all >>>> the files: tc, >>>> beagle-coasters.xml >>>> cf in PATH, ie. >>>> are you able to >>>> access them w/o ./ >>>> >>>> >>>> On 6/4/11 12:03 >>>> PM, Jonathan S >>>> Monette wrote: >>>>> I did change >>>>> the work >>>>> directory. I >>>>> however did >>>>> not change the >>>>> project name. >>>>> I do not know >>>>> my project >>>>> name so I kept >>>>> the same >>>>> project that >>>>> was in there. 
>>>>> Here is my modified beagle-coasters.xml
>>>>>
>>>>> <config>
>>>>> <pool handle="pbs">
>>>>> <execution provider="coaster" jobmanager="local:pbs"/>
>>>>> <profile namespace="globus" key="project">CI-CCR000013</profile>
>>>>>
>>>>> <profile namespace="globus" key="ppn">24:cray:pack</profile>
>>>>>
>>>>> <profile namespace="globus" key="workersPerNode">24</profile>
>>>>> <profile namespace="globus" key="maxTime">1000</profile>
>>>>> <profile namespace="globus" key="slots">1</profile>
>>>>> <profile namespace="globus" key="nodeGranularity">1</profile>
>>>>> <profile namespace="globus" key="maxNodes">1</profile>
>>>>>
>>>>> <profile namespace="karajan" key="jobThrottle">.63</profile>
>>>>> <profile namespace="karajan" key="initialScore">10000</profile>
>>>>>
>>>>> <filesystem provider="local"/>
>>>>> <workdirectory>/lustre/beagle/jonmon/Swift/work</workdirectory>
>>>>> </pool>
>>>>> </config>
>>>>>
>>>>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari wrote:
>>>>>
>>>>> Jon,
>>>>>
>>>>> Thanks for trying out catsn on Beagle.
>>>>>
>>>>> I just tried it myself but could not reproduce the error you are
>>>>> getting. Have you made the changes that are mentioned in the README:
>>>>> ==
>>>>> Change workdir location in beagle-coaster.xml
>>>>> Change the project entry in the beagle-coaster.xml to your project name
>>>>> ==
>>>>>
>>>>> Ketan
>>>>>
>>>>>
>>>>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette wrote:
>>>>>
>>>>> Hello,
>>>>> I am trying to run the catsn test on beagle using the files in
>>>>> ~ketan/catsn. I have copied this directory over to my home directory
>>>>> and I believe I set it up correctly.
>>>>> I did >>>>> module >>>>> load >>>>> swift >>>>> and >>>>> the >>>>> ran >>>>> run.sh >>>>> that >>>>> was in >>>>> this >>>>> directory. >>>>> I get >>>>> this >>>>> error. >>>>> >>>>> /soft/swift/0.92/bin/swift: >>>>> eval: >>>>> line >>>>> 152: >>>>> syntax >>>>> error >>>>> near >>>>> unexpected >>>>> token `(' >>>>> /soft/swift/0.92/bin/swift: >>>>> eval: >>>>> line >>>>> 152: >>>>> `java >>>>> -Xmx8192M >>>>> -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed >>>>> -DUID=1881 >>>>> -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>>> -Dswift.home=/soft/swift/0.92/bin/.. >>>>> -Djava.security.egd=file:///dev/urandom >>>>> -classpath >>>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 >>>>> 
.8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>>> org.griphyn.vdl.karajan.Loader >>>>> '-config' >>>>> 'cf' >>>>> '-tc.file' >>>>> 'tc' >>>>> '-sites.file' >>>>> 'beagle-coaster.xml' >>>>> 'catsn.swift' >>>>> '-n=1'' >>>>> >>>>> I >>>>> there >>>>> something >>>>> else I >>>>> need >>>>> to >>>>> load >>>>> on >>>>> beagle >>>>> to >>>>> make >>>>> swift >>>>> run >>>>> accordingly? >>>>> >>>>> -- >>>>> >>>>> Any >>>>> intelligent >>>>> fool >>>>> can >>>>> make >>>>> things >>>>> bigger >>>>> and >>>>> more >>>>> complex... >>>>> It >>>>> takes >>>>> a >>>>> touch >>>>> of >>>>> genius >>>>> - and >>>>> a lot >>>>> of >>>>> courage to >>>>> move >>>>> in the >>>>> opposite >>>>> direction. 
>>>>> >>>>> - >>>>> Albert >>>>> Einstein >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> >>>>> Any >>>>> intelligent >>>>> fool can make >>>>> things bigger >>>>> and more >>>>> complex... It >>>>> takes a touch >>>>> of genius - >>>>> and a lot of >>>>> courage to >>>>> move in the >>>>> opposite >>>>> direction. >>>>> >>>>> - Albert Einstein >>>>> >>>>> >>>> >>>> >>>> >>>> -- >>>> >>>> Any intelligent >>>> fool can make >>>> things bigger and >>>> more complex... It >>>> takes a touch of >>>> genius - and a lot >>>> of courage to move >>>> in the opposite >>>> direction. >>>> >>>> - Albert Einstein >>>> >>>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can >>> make things bigger and >>> more complex... It takes >>> a touch of genius - and >>> a lot of courage to move >>> in the opposite direction. >>> >>> - Albert Einstein >>> >>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can >>> make things bigger and more >>> complex... It takes a touch >>> of genius - and a lot of >>> courage to move in the >>> opposite direction. >>> >>> - Albert Einstein >>> >>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can make >>> things bigger and more >>> complex... It takes a touch of >>> genius - and a lot of courage to >>> move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things >>> bigger and more complex... It takes a >>> touch of genius - and a lot of courage >>> to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and >>> more complex... It takes a touch of genius - and >>> a lot of courage to move in the opposite direction. >>> >>> - Albert Einstein >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Any intelligent fool can make things bigger and more >>> complex... It takes a touch of genius - and a lot of >>> courage to move in the opposite direction. 
>>>
>>> - Albert Einstein

From jonmon at utexas.edu Mon Jun 6 14:30:28 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Mon, 6 Jun 2011 14:30:28 -0500
Subject: [Swift-devel] Re: catsn on beagle
In-Reply-To: <4DED2779.7000509@gmail.com>
References: <4DEA670E.3010909@gmail.com> <4DEA6A30.2040503@gmail.com> <4DEA74DF.5040107@gmail.com> <4DEC29C8.6020507@gmail.com> <4DED2779.7000509@gmail.com>
Message-ID:

This fixed it. I can now run the catsn test script on beagle. Thanks Ketan.

On Mon, Jun 6, 2011 at 2:16 PM, ketan wrote:

> Jon,
>
> Turns out beagle was not properly up and circumstantial evidence indicates
> it was causing the issue that seems now to be resolved (see quoted mesg from
> beagle admins below).
>
> Could you try it again and report back what you get.
>
> Thanks,
> Ketan
>
> ====
>
> The five Beagle storage nodes were found offline this morning. These nodes
> provide DSL, GPFS, and NFS via DVS to compute nodes throughout the system.
> These nodes have been rebooted and the jobs are running through the
> scheduler as expected. We are continuing to investigate why the storage
> nodes went offline, but users may resume running their jobs at this time
>
> ====
>
>
> On 6/5/11 8:17 PM, Jonathan S Monette wrote:
>
> So you are seeing the same problem?
> > On Sun, Jun 5, 2011 at 8:13 PM, ketan wrote: > >> I do not know the reason why you are getting this. The PBS submit throws >> the following stderr: >> >> aprun: Apid 327667: Caught signal Terminated, sending to application >> >> >> Also on my another swift submission I am getting this as PBS submit >> stderr: >> >> [NID 00065] 2011-06-05 19:34:26 distributeControlMsg: Apid 327688 write >> failure to node 66, 10.128.0.67, port 607, Connection reset by peer >> >> I suspect the last maintenance/upgrade of Beagle might have caused this. >> >> Ketan >> >> >> >> On 6/5/11 1:35 PM, Jonathan S Monette wrote: >> >> I still have problems running this script. I does not seem to ever >> execute. I get this to stdout >> >> Swift svn swift-r4252 (swift modified locally) cog-r3088 (cog modified >> locally) >> >> RunID: 20110604-2338-5eik2hbb >> Progress: >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Canceling job 173155.sdb >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: 
Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Canceling job 173182.sdb >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> Progress: Submitted:1 >> >> I had to C-c the run. I would check qstat and it would say the job was >> executing in the development queue but it seems like it never ran. Also it >> seems that the coasters job was cancelled twice during the Swift execution. >> >> On Sat, Jun 4, 2011 at 1:12 PM, Jonathan S Monette wrote: >> >>> Ok. Thanks. and the (none) also appeared when I was on >>> login1.beagle.ci.uchicago.edu. Thanks Ketan. >>> >>> >>> On Sat, Jun 4, 2011 at 1:09 PM, ketan wrote: >>> >>>> Yes, you should bring this to the attention of beagle sysadmins. >>>> >>>> >>>> On 6/4/11 1:01 PM, Jonathan S Monette wrote: >>>> >>>> Ok. I have my job submitted and it is currently waiting in the queue. >>>> besides manually resetting this variable everytime I am on beagle what else >>>> can be done? Is this something that I should bring to the attention of the >>>> sys admins for beagle? 
>>>> >>>> On Sat, Jun 4, 2011 at 12:51 PM, Ketan Maheshwari < >>>> ketancmaheshwari at gmail.com> wrote: >>>> >>>>> Alright, so this is the issue. GLOBUS_HOSTNAME is picked up from >>>>> HOSTNAME and I do not know where does this none comes in your case. >>>>> >>>>> In my case I get hostname: login2.beagle.ci.uchicago.edu >>>>> >>>>> see if you can manually change this env variable and try to run again. >>>>> >>>>> >>>>> On Sat, Jun 4, 2011 at 12:49 PM, Jonathan S Monette >>>> > wrote: >>>>> >>>>>> login2.beagle.ci.uchicago.edu.(none) >>>>>> >>>>>> when you ran your test was in on login1 or login2? >>>>>> >>>>>> >>>>>> On Sat, Jun 4, 2011 at 12:47 PM, Ketan Maheshwari < >>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>> >>>>>>> What does >>>>>>> >>>>>>> echo $HOSTNAME give? >>>>>>> >>>>>>> >>>>>>> On Sat, Jun 4, 2011 at 12:44 PM, Jonathan S Monette < >>>>>>> jonmon at utexas.edu> wrote: >>>>>>> >>>>>>>> I am not sure why that (none) appears. Is this a variable that the >>>>>>>> system sets or do I need to set it? >>>>>>>> >>>>>>>> >>>>>>>> On Sat, Jun 4, 2011 at 12:39 PM, Ketan Maheshwari < >>>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>>> >>>>>>>>> This suprises me because, I do not see that (none) in my >>>>>>>>> commandline. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Sat, Jun 4, 2011 at 12:38 PM, Jonathan S Monette < >>>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>>> >>>>>>>>>> I added >>>>>>>>>> echo ${OPTIONS} >>>>>>>>>> echo ${COG_OPTS} >>>>>>>>>> echo ${LOCALCLASSPATH} >>>>>>>>>> echo ${EXEC} >>>>>>>>>> echo ${CMDLINE} >>>>>>>>>> >>>>>>>>>> to the script. It seems in the $OPTIONS variable this appears. >>>>>>>>>> >>>>>>>>>> -Xmx8192M >>>>>>>>>> -Djava.endorsed.dirs=/home/jonmon/Library/Swift/swift-0.92/bin/../lib/endorsed >>>>>>>>>> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>>>>>>> -DCOG_INSTALL_PATH=/home/jonmon/Library/Swift/swift-0.92/bin/.. >>>>>>>>>> -Dswift.home=/home/jonmon/Library/Swift/swift-0.92/bin/.. 
>>>>>>>>>> -Djava.security.egd=file:///dev/urandom >>>>>>>>>> >>>>>>>>>> There is the line >>>>>>>>>> DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) I believe that is the >>>>>>>>>> parenthesis that is causing this to fail. Now How do I fix it. >>>>>>>>>> >>>>>>>>>> On Sat, Jun 4, 2011 at 12:36 PM, Jonathan S Monette < >>>>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>>>> >>>>>>>>>>> I may have found the error. I download 0.92 binaries and changed >>>>>>>>>>> the swift command script. I added >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Sat, Jun 4, 2011 at 12:26 PM, Jonathan S Monette < >>>>>>>>>>> jonmon at utexas.edu> wrote: >>>>>>>>>>> >>>>>>>>>>>> Same result. Same error >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Sat, Jun 4, 2011 at 12:24 PM, ketan < >>>>>>>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> can you try running the commandline by copying from run.sh and >>>>>>>>>>>>> pasting it at the prompt. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On 6/4/11 12:18 PM, Jonathan S Monette wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> I do have that project. I did projects --avail and got back >>>>>>>>>>>>> >>>>>>>>>>>>> Project PI Title >>>>>>>>>>>>> >>>>>>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>>>>>> CI-CCR000013 Michael Wilde The Swift Parallel >>>>>>>>>>>>> Scripting System >>>>>>>>>>>>> >>>>>>>>>>>>> Here is my run.sh file. I execute with ./run.sh after I did >>>>>>>>>>>>> chmod +x run.sh >>>>>>>>>>>>> >>>>>>>>>>>>> #!/bin/bash >>>>>>>>>>>>> swift -config cf -tc.file tc -sites.file beagle-coaster.xml >>>>>>>>>>>>> catsn.swift -n=1 >>>>>>>>>>>>> >>>>>>>>>>>>> I do have sh and cat in my path. I can execute them. Here >>>>>>>>>>>>> is what which sh and which cat produced. 
>>>>>>>>>>>>> >>>>>>>>>>>>> sh is /usr/bin/sh >>>>>>>>>>>>> cat is /bin/cat >>>>>>>>>>>>> >>>>>>>>>>>>> On Sat, Jun 4, 2011 at 12:10 PM, ketan < >>>>>>>>>>>>> ketancmaheshwari at gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Can you try "projects --avail" command on beagle to see if >>>>>>>>>>>>>> you are member of a project. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Else you will need to request membership. You can do this from >>>>>>>>>>>>>> this page: >>>>>>>>>>>>>> >>>>>>>>>>>>>> http://pads.ci.uchicago.edu/access/ >>>>>>>>>>>>>> >>>>>>>>>>>>>> In any case, your error does not seem to be because of the >>>>>>>>>>>>>> above. Looks like '(' has sneaked in because of some unexpected typo in >>>>>>>>>>>>>> commandline .. can you doublecheck it. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Also can you make sure that you have all the files: tc, >>>>>>>>>>>>>> beagle-coasters.xml cf in PATH, ie. are you able to access them w/o ./ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 6/4/11 12:03 PM, Jonathan S Monette wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> I did change the work directory. I however did not change >>>>>>>>>>>>>> the project name. I do not know my project name so I kept the same project >>>>>>>>>>>>>> that was in there. 
Here is my modified beagle-coasters.xml
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <config>
>>>>>>>>>>>>>> <pool handle="pbs">
>>>>>>>>>>>>>> <execution provider="coaster" jobmanager="local:pbs"/>
>>>>>>>>>>>>>> <profile namespace="globus" key="project">CI-CCR000013</profile>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <profile namespace="globus" key="ppn">24:cray:pack</profile>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <profile namespace="globus" key="workersPerNode">24</profile>
>>>>>>>>>>>>>> <profile namespace="globus" key="maxTime">1000</profile>
>>>>>>>>>>>>>> <profile namespace="globus" key="slots">1</profile>
>>>>>>>>>>>>>> <profile namespace="globus" key="nodeGranularity">1</profile>
>>>>>>>>>>>>>> <profile namespace="globus" key="maxNodes">1</profile>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <profile namespace="karajan" key="jobThrottle">.63</profile>
>>>>>>>>>>>>>> <profile namespace="karajan" key="initialScore">10000</profile>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> <filesystem provider="local"/>
>>>>>>>>>>>>>> <workdirectory>/lustre/beagle/jonmon/Swift/work</workdirectory>
>>>>>>>>>>>>>> </pool>
>>>>>>>>>>>>>> </config>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Jun 4, 2011 at 12:00 PM, Ketan Maheshwari <ketancmaheshwari at gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jon,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for trying out catsn on Beagle.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I just tried it myself but could not reproduce the error you
>>>>>>>>>>>>>>> are getting. Have you made the changes that are mentioned in the README:
>>>>>>>>>>>>>>> ==
>>>>>>>>>>>>>>> Change workdir location in beagle-coaster.xml
>>>>>>>>>>>>>>> Change the project entry in the beagle-coaster.xml to your
>>>>>>>>>>>>>>> project name
>>>>>>>>>>>>>>> ==
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Ketan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Jun 4, 2011 at 11:50 AM, Jonathan S Monette <jonmon at utexas.edu> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>>>>> I am trying to run the catsn test on beagle using the
>>>>>>>>>>>>>>>> files in ~ketan/catsn. I have copied this directory over to my home
>>>>>>>>>>>>>>>> directory and I believe I set it up correctly. I did module load swift and
>>>>>>>>>>>>>>>> then ran run.sh that was in this directory. I get this error.
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> /soft/swift/0.92/bin/swift: eval: line 152: syntax error >>>>>>>>>>>>>>>> near unexpected token `(' >>>>>>>>>>>>>>>> /soft/swift/0.92/bin/swift: eval: line 152: `java >>>>>>>>>>>>>>>> -Xmx8192M -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed >>>>>>>>>>>>>>>> -DUID=1881 -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) >>>>>>>>>>>>>>>> -DCOG_INSTALL_PATH=/soft/swift/0.92/bin/.. >>>>>>>>>>>>>>>> -Dswift.home=/soft/swift/0.92/bin/.. -Djava.security.egd= >>>>>>>>>>>>>>>> file:///dev/urandom -classpath >>>>>>>>>>>>>>>> /soft/swift/0.92/bin/../etc:/soft/swift/0.92/bin/../libexec:/soft/swift/0.92/bin/../lib/addressing-1.0.jar:/soft/swift/0.92/bin/../lib/ant.jar:/soft/swift/0.92/bin/../lib/antlr-2.7.5.jar:/soft/swift/0.92/bin/../lib/axis.jar:/soft/swift/0.92/bin/../lib/axis-url.jar:/soft/swift/0.92/bin/../lib/backport-util-concurrent.jar:/soft/swift/0.92/bin/../lib/castor-0.9.6.jar:/soft/swift/0.92/bin/../lib/coaster-bootstrap.jar:/soft/swift/0.92/bin/../lib/cog-abstraction-common-2.4.jar:/soft/swift/0.92/bin/../lib/cog-axis.jar:/soft/swift/0.92/bin/../lib/cog-grapheditor-0.47.jar:/soft/swift/0.92/bin/../lib/cog-jglobus-1.7.0.jar:/soft/swift/0.92/bin/../lib/cog-karajan-0.36-dev.jar:/soft/swift/0.92/bin/../lib/cog-provider-clref-gt4_0_0.jar:/soft/swift/0.92/bin/../lib/cog-provider-coaster-0.3.jar:/soft/swift/0.92/bin/../lib/cog-provider-dcache-0.1.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt2-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-gt4_0_0-2.5.jar:/soft/swift/0.92/bin/../lib/cog-pro >>>>>>>>>>>>>>>> 
vider-local-2.2.jar:/soft/swift/0.92/bin/../lib/cog-provider-localscheduler-0.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-ssh-2.4.jar:/soft/swift/0.92/bin/../lib/cog-provider-webdav-2.1.jar:/soft/swift/0.92/bin/../lib/cog-resources-1.0.jar:/soft/swift/0.92/bin/../lib/cog-swift-svn.jar:/soft/swift/0.92/bin/../lib/cog-trap-1.0.jar:/soft/swift/0.92/bin/../lib/cog-url.jar:/soft/swift/0.92/bin/../lib/cog-util-0.92.jar:/soft/swift/0.92/bin/../lib/commonj.jar:/soft/swift/0.92/bin/../lib/commons-beanutils.jar:/soft/swift/0.92/bin/../lib/commons-collections-3.0.jar:/soft/swift/0.92/bin/../lib/commons-digester.jar:/soft/swift/0.92/bin/../lib/commons-discovery.jar:/soft/swift/0.92/bin/../lib/commons-httpclient.jar:/soft/swift/0.92/bin/../lib/commons-logging-1.1.jar:/soft/swift/0.92/bin/../lib/concurrent.jar:/soft/swift/0.92/bin/../lib/cryptix32.jar:/soft/swift/0.92/bin/../lib/cryptix-asn1.jar:/soft/swift/0.92/bin/../lib/cryptix.jar:/soft/swift/0.92/bin/../lib/globus_delegation_servic >>>>>>>>>>>>>>>> e.jar:/soft/swift/0.92/bin/../lib/globus_delegation_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_mds_aggregator_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_service.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rendezvous_stubs.jar:/soft/swift/0.92/bin/../lib/globus_wsrf_rft_stubs.jar:/soft/swift/0.92/bin/../lib/gram-client.jar:/soft/swift/0.92/bin/../lib/gram-stubs.jar:/soft/swift/0.92/bin/../lib/gram-utils.jar:/soft/swift/0.92/bin/../lib/j2ssh-common-0.2.2.jar:/soft/swift/0.92/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/soft/swift/0.92/bin/../lib/jakarta-regexp-1.2.jar:/soft/swift/0.92/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/soft/swift/0.92/bin/../lib/jaxrpc.jar:/soft/swift/0.92/bin/../lib/jce-jdk13-131.jar:/soft/swift/0.92/bin/../lib/jgss.jar:/soft/swift/0.92/bin/../lib/jline-0.9.94.jar:/soft/swift/0.92/bin/../lib/jsr173_1.0_api.jar:/soft/swift/0.92/bin/../lib/jug-lgpl-2.0.0.jar:/soft/swift/0.92/bin/../lib/junit.jar:/soft/swift/0.92/bin/../lib/log4j-1.2 
>>>>>>>>>>>>>>>> .8.jar:/soft/swift/0.92/bin/../lib/naming-common.jar:/soft/swift/0.92/bin/../lib/naming-factory.jar:/soft/swift/0.92/bin/../lib/naming-java.jar:/soft/swift/0.92/bin/../lib/naming-resources.jar:/soft/swift/0.92/bin/../lib/opensaml.jar:/soft/swift/0.92/bin/../lib/puretls.jar:/soft/swift/0.92/bin/../lib/resolver.jar:/soft/swift/0.92/bin/../lib/saaj.jar:/soft/swift/0.92/bin/../lib/stringtemplate.jar:/soft/swift/0.92/bin/../lib/vdldefinitions.jar:/soft/swift/0.92/bin/../lib/wsdl4j.jar:/soft/swift/0.92/bin/../lib/wsrf_core.jar:/soft/swift/0.92/bin/../lib/wsrf_core_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_index_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_mds_usefulrp_schema_stubs.jar:/soft/swift/0.92/bin/../lib/wsrf_provider_jce.jar:/soft/swift/0.92/bin/../lib/wsrf_tools.jar:/soft/swift/0.92/bin/../lib/wss4j.jar:/soft/swift/0.92/bin/../lib/xalan.jar:/soft/swift/0.92/bin/../lib/xbean.jar:/soft/swift/0.92/bin/../lib/xbean_xpath.jar:/soft/swift/0.92/bin/../lib/xercesImpl.jar:/soft >>>>>>>>>>>>>>>> /swift/0.92/bin/../lib/xml-apis.jar:/soft/swift/0.92/bin/../lib/xmlsec.jar:/soft/swift/0.92/bin/../lib/xpp3-1.1.3.4d_b4_min.jar:/soft/swift/0.92/bin/../lib/xstream-1.1.1-patched.jar: >>>>>>>>>>>>>>>> org.griphyn.vdl.karajan.Loader '-config' 'cf' '-tc.file' 'tc' '-sites.file' >>>>>>>>>>>>>>>> 'beagle-coaster.xml' 'catsn.swift' '-n=1'' >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I there something else I need to load on beagle to make >>>>>>>>>>>>>>>> swift run accordingly? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Any intelligent fool can make things bigger and more >>>>>>>>>>>>>>>> complex... It takes a touch of genius - and a lot of courage to move in the >>>>>>>>>>>>>>>> opposite direction. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> >>>>>>>>>>>>>> Any intelligent fool can make things bigger and more >>>>>>>>>>>>>> complex... It takes a touch of genius - and a lot of courage to move in the >>>>>>>>>>>>>> opposite direction. >>>>>>>>>>>>>> >>>>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> >>>>>>>>>>>>> Any intelligent fool can make things bigger and more complex... >>>>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>>>> direction. >>>>>>>>>>>>> >>>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> >>>>>>>>>>>> Any intelligent fool can make things bigger and more complex... >>>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>>> direction. >>>>>>>>>>>> >>>>>>>>>>>> - Albert Einstein >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> >>>>>>>>>>> Any intelligent fool can make things bigger and more complex... >>>>>>>>>>> It takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>>> direction. >>>>>>>>>>> >>>>>>>>>>> - Albert Einstein >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> >>>>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>>>> direction. >>>>>>>>>> >>>>>>>>>> - Albert Einstein >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> >>>>>>>> Any intelligent fool can make things bigger and more complex... It >>>>>>>> takes a touch of genius - and a lot of courage to move in the opposite >>>>>>>> direction. 
>>>>>>>>
>>>>>>>> - Albert Einstein

--
Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From jonmon at utexas.edu Mon Jun 6 16:21:30 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Mon, 6 Jun 2011 16:21:30 -0500
Subject: [Swift-devel] Coasters Service
Message-ID:

I have a question on coasters. If I am on a machine, say bridled, and I set up automatic coasters with a jobmanager of ssh:pbs for PADS and Beagle in my sites file, where does the coasters service run? Is it on the head node of both PADS and Beagle or is it on my local machine Bridled?

--
Any intelligent fool can make things bigger and more complex...
It takes a touch of genius - and a lot of courage to move in the opposite direction.

- Albert Einstein

From aespinosa at cs.uchicago.edu Mon Jun 6 16:23:50 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Mon, 6 Jun 2011 15:23:50 -0600
Subject: [Swift-devel] Coasters Service
In-Reply-To:
References:
Message-ID: <20110606212350.GE6327@parker.scd.ucar.edu>

PADS and beagle

On Mon, Jun 06, 2011 at 04:21:30PM -0500, Jonathan S Monette wrote:
> I have a question on coasters. If I am on a machine, say bridled, and I set
> up automatic coasters with a jobmanager of ssh:pbs for PADS and Beagle in my
> sites file, where does the coasters service run? Is it on the head node of
> both PADS and Beagle or is it on my local machine Bridled?
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel

From jonmon at utexas.edu Mon Jun 6 16:25:10 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Mon, 6 Jun 2011 16:25:10 -0500
Subject: [Swift-devel] Coasters Service
In-Reply-To: <20110606212350.GE6327@parker.scd.ucar.edu>
References: <20110606212350.GE6327@parker.scd.ucar.edu>
Message-ID:

Thanks.

On Mon, Jun 6, 2011 at 4:23 PM, Allan Espinosa wrote:

> PADS and beagle
>
> On Mon, Jun 06, 2011 at 04:21:30PM -0500, Jonathan S Monette wrote:
> > I have a question on coasters. If I am on a machine, say bridled, and I set
> > up automatic coasters with a jobmanager of ssh:pbs for PADS and Beagle in my
> > sites file, where does the coasters service run? Is it on the head node of
> > both PADS and Beagle or is it on my local machine Bridled?
> > > > -- > > > > Any intelligent fool can make things bigger and more complex... It takes > a > > touch of genius - and a lot of courage to move in the opposite direction. > > > > - Albert Einstein > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > -- Any intelligent fool can make things bigger and more complex... It takes a touch of genius - and a lot of courage to move in the opposite direction. - Albert Einstein -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Tue Jun 7 12:15:09 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 07 Jun 2011 10:15:09 -0700 Subject: [Swift-devel] Walltime in PBS submit files In-Reply-To: References: <1016644273.1082.1307114128373.JavaMail.root@zimbra.anl.gov> Message-ID: <1307466909.3889.4.camel@blabla2.none> Oops. The email server certificate changed and I didn't notice it until today. So sorry for the late reply. The maxTime parameter is there to enforce queue constrains so that blocks don't trip over those. The blocks also consider some margin of error when scheduling jobs. This is currently 30 seconds. So the maximum possible walltime for a job will be maxTime - 30s. Even if that was not the case, you would still not be able to fit a job with a walltime of maxTime, since some seconds are lost between the time a worker is started and the time a job is actually sent to that worker. So you should understand maxTime as "never submit blocks larger than this time". On Fri, 2011-06-03 at 10:21 -0500, Tim Armstrong wrote: > I'd been trying separately to understand the effect of the maxTime > parameter: it doesn't do what I intuitively expect it to either. 
It > seems like if I set maxtime to x number of minutes, it submits jobs of > at most x - 1 minutes duration. > > Is this the intended behaviour? Is maxTime rounded down to the > nearest minute? Should I understand maxTime as meaning "only submit > jobs that are strictly less than durations"? > > - Tim > > On Fri, Jun 3, 2011 at 10:15 AM, Michael Wilde > wrote: > Could you add a note on this to the trunk user guide? > > > Thanks, > > > Mike > > > > ______________________________________________________________ > > That did the trick. I bumped up the maxtime to 300 and > it ran immediately. Thanks for the info! > > David > > On Fri, Jun 3, 2011 at 9:54 AM, Michael Wilde > wrote: > David, > > > Im not sure what coasters is thinking here. > But first thing I would try is to set the > maxtime value in sites.xml (which is in > integer seconds) to something like 300 to make > coasters realize that it can fit at least 10 > 30-second app calls into the 300 second > coaster block. > > > With maxtime < maxwalltime coasters may be > erroneously starting a block but finding that > its unable to fit *any* app calls into it. > > > There are also settings you can apply > (overallocation etc) to make maxtime more of > an "exact time" setting. For example, if you > do this: > > > key="maxTime">1800 > key="lowOverAllocation">100 > key="highOverAllocation">100 > > > ...then each coaster block started should have > a maxwalltime of 1800 secs. > > > But your app maxwalltime needs to fit into > this block time. > > > - Mike > > > > > > > > > > ______________________________________________ > > Hello, > > I am trying to use the shared queue on > Fusion. This queue requires a walltime > of less than one hour. I have all of > the applications in my tc.data file > set up with a walltime of 30 seconds. > In my sites.xml, I specify a maxtime > of 10. However, when Swift generates > the PBS submit file, it specifies a > walltime of 00:00:00 which prevents it > from running. 
> > How can I make Swift set the walltime > in these PBS submit scripts? > > Thanks, > David > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Jun 7 12:16:46 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 07 Jun 2011 10:16:46 -0700 Subject: [Swift-devel] catsn on beagle In-Reply-To: References: Message-ID: <1307467006.3889.5.camel@blabla2.none> On Sat, 2011-06-04 at 11:50 -0500, Jonathan S Monette wrote: > Hello, > I am trying to run the catsn test on beagle using the files in > ~ketan/catsn. I have copied over this directory over to my home > directory and I believe I set it up correctly. I did module load > swift and the ran run.sh that was in this directory. I get this > error. > > > /soft/swift/0.92/bin/swift: eval: line 152: syntax error near > unexpected token `(' > /soft/swift/0.92/bin/swift: eval: line 152: `java -Xmx8192M > -Djava.endorsed.dirs=/soft/swift/0.92/bin/../lib/endorsed -DUID=1881 > -DGLOBUS_HOSTNAME=login2.beagle.ci.uchicago.edu.(none) What's the "(none)" in there and where is it coming from? 
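The eval failure quoted above is what a shell does with unquoted parentheses, so one stopgap is to remove the "(none)" suffix before the swift launcher sees it. What follows is only a minimal sketch. It assumes that the launcher builds -DGLOBUS_HOSTNAME from the HOSTNAME environment variable and that a login file such as ~/.profile is read before swift is started. strip_none is a hypothetical helper name, not part of Swift.

```shell
# Hypothetical helper (not part of Swift): print a host name with any
# trailing "(none)" suffix, and the dots in front of it, removed.
strip_none() {
    # In a basic regular expression the parentheses are literal characters.
    printf '%s\n' "$1" | sed 's/\.*(none)$//'
}

# Assumption: the swift launcher reads HOSTNAME from the environment.
HOSTNAME=$(strip_none "${HOSTNAME:-}")
export HOSTNAME
```

With this in a login file, strip_none 'login2.beagle.ci.uchicago.edu.(none)' prints login2.beagle.ci.uchicago.edu, and a name without the suffix is printed unchanged. It hides the symptom only; where the suffix comes from is still a question for the site administrators.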
From jonmon at utexas.edu Tue Jun 7 12:28:27 2011
From: jonmon at utexas.edu (Jonathan S Monette)
Date: Tue, 7 Jun 2011 12:28:27 -0500
Subject: [Swift-devel] catsn on beagle
In-Reply-To: <1307467006.3889.5.camel@blabla2.none>
References: <1307467006.3889.5.camel@blabla2.none>
Message-ID:

-DGLOBUS_HOSTNAME is getting its value from $HOSTNAME, which for some reason has "(none)" appended to the end. It doesn't matter whether I am on login1.beagle.ci.uchicago.edu or login2.beagle.ci.uchicago.edu; "(none)" is there. I have already emailed support-beagle to ask why it shows up.

Is there a way to set the value for -DGLOBUS_HOSTNAME in the sites.xml file or something until the problem is resolved? Right now I have to override the variable manually, but if I could just add a line to my sites file that would be better.

On Tue, Jun 7, 2011 at 12:16 PM, Mihael Hategan wrote:
> What's the "(none)" in there and where is it coming from?

From hategan at mcs.anl.gov Tue Jun 7 12:34:00 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 07 Jun 2011 10:34:00 -0700
Subject: [Swift-devel] catsn on beagle
In-Reply-To:
References: <1307467006.3889.5.camel@blabla2.none>
Message-ID: <1307468040.5540.2.camel@blabla2.none>

You could filter out the "(none)" part in your .profile.

On Tue, 2011-06-07 at 12:28 -0500, Jonathan S Monette wrote:
> -DGLOBUS_HOSTNAME is getting it's value from $HOSTNAME which for some
> reason has (none) appended to the end. It doesn't matter if I am on
> login1.beagle.ci.uchicago.edu or login2.beagle.ci.uchicago.edu, (none)
> is there. I already emailed support-beagle inquiring why it shows
> up. Is there a way to set the value for -DGLOBUS_HOSTNAME in the
> sites.xml file or something until the problem is resolved? Right now
> I have to override the variable manually but if I can just add a line
> to my sites file that would be better.

From wilde at mcs.anl.gov Tue Jun 7 14:26:20 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 7 Jun 2011 14:26:20 -0500 (CDT)
Subject: [Swift-devel] Re: InternalHostname in sites file
In-Reply-To:
Message-ID: <152192967.13844.1307474780510.JavaMail.root@zimbra.anl.gov>

David,

I really like the asciidoc version.

Can we move to a userguide/ directory with one file per chapter? (to reduce collisions as more people start editing it)

Also, can we set the tables so that they size their columns as needed instead of taking up the entire page width?

- Mike

----- Original Message -----

Sure, Jon. The userguide is now kept in the docs/userguide directory of Swift. It's in asciidoc format. Feel free to make changes there, or email me the text and I can add it for you.

David

On Thu, Jun 2, 2011 at 3:10 PM, Jonathan S Monette < jonmon at utexas.edu > wrote:

I can write something up and send it to David to add somewhere for the userguide. Not sure where the files for the userguide are kept.

On Thu, Jun 2, 2011 at 3:08 PM, Michael Wilde < wilde at mcs.anl.gov > wrote:

Thanks for clarifying. Jon and/or David, can you address this with a cookbook entry on Coasters that heads towards a users guide section?

We should tell users what they can run on their cluster (eg ping or telnet-style connect tests) to validate the setting of internalHostName.

- Mike

----- Original Message -----
> Right. If the head node has multiple network interfaces, only one of
> which is visible from the worker nodes.
>
> The choice of which interface is the one that worker nodes can connect
> to is a matter of the particular cluster. It's not particularly easy to
> have an automated mechanism that figures it out. We tried some scheme to
> pass all the interface addresses to the worker and let it try to connect
> to all of them in order, but that didn't work very well. Of course,
> there might be a scheme that works, but I didn't want to spend too much
> time on that.
>
> So that's why it's needed. To clarify to the workers which exact
> interface on the head node they are to try to connect to.
>
> Mihael
>
> On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote:
> > Mihael,
> > I believe we have talked about this before but why is it necessary
> > for an InternalHostname to be specified for PADS? I know that the
> > address that coasters connects to is wrong but I do not remember why
> > that was. Could you give an explanation on why internalHostname needs
> > to be set?

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

-------------- next part -------------- An HTML attachment was scrubbed...
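The "telnet-style connect test" suggested above could look roughly like the following, run from a worker node against the address chosen for internalHostName. This is only a sketch: check_port is a hypothetical helper, the address and port in the usage note are placeholders, and it assumes that bash (for its /dev/tcp redirection) and the coreutils timeout command are available on the node.

```shell
# Hypothetical helper: report whether a TCP connection to HOST PORT can be
# opened within five seconds. Relies on bash's /dev/tcp pseudo-device.
check_port() {
    # Inside the inner bash, $0 is the host and $1 is the port.
    if timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null; then
        echo reachable
    else
        echo unreachable
    fi
}
```

For example, check_port 192.168.1.1 50000 on a worker node, substituting the head node's internal address and whatever port the coaster service is actually listening on (both values here are made up). An "unreachable" result for the address given in internalHostName would suggest that the workers cannot use that interface.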
URL: From davidkelly999 at gmail.com Tue Jun 7 14:39:20 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Tue, 7 Jun 2011 14:39:20 -0500 Subject: [Swift-devel] Re: Getting VMs from FG for use with swift In-Reply-To: <4DDC12A7.8010304@mcs.anl.gov> References: <4DD70560.1020901@mcs.anl.gov> <4DDC12A7.8010304@mcs.anl.gov> Message-ID: Hello John, I have attached a quickstart guide I wrote on how to get Swift working with futuregrid by using the new Swift coaster service scripts. This will require the latest development version of Swift. Instructions on how to download/install are in the document. Please let me know if you have any questions, if anything is unclear, or if you run into any problems. Thank you! Regards, David On Tue, May 24, 2011 at 3:18 PM, John Bresnahan wrote: > The GPFS server on the FG cluster hotel died yesterday so I cannot get you > your credentials. I'll get back to you when it is up again. Once it is > back the process for getting the needed access keys is described here: > > https://portal.futuregrid.org/tutorials/nimbus > > > On 05/23/2011 05:24 AM, David Kelly wrote: > >> Hi John, >> >> I now have a futuregrid account and am added to a project. I am now trying >> to get our scripts >> working together. >> >> I ran into a few problems at first when trying to run the futuregrid >> scripts. On the first system I >> tried I was getting a traceback. It is possible that the system I was >> using has older versions of >> some of the needed libraries. Then I tried it on a more system that is >> more frequently updated - my >> laptop running Ubuntu 10.10. It needed a newer version of the Python >> crypto tools installed, so I >> installed that (and the python development libraries) and that part seems >> fine now. >> >> I am now up to the point of the install script where it is trying to >> register keys, but it is >> failing. My guess is that I need to change FUTUREGRID_IAAS_ACCESS_KEY and >> FUTUREGRID_IAAS_SECRET_KEY >> in env.sh. 
I'm not sure what these should be exactly. Are these the >> contents of my ssh keys, an ssh >> key and a passphrase, or some other type of security? I've tried a few >> combinations of different >> things but haven't had much luck yet. >> >> Thanks! >> >> Regards, >> David >> >> >> Traceback from earlier: >> Installing setuptools.......................done. >> Complete output from command /autonfs/home/davidk/swift-vm-...ython >> /autonfs/home/davidk/swift-vm-...stall pip: >> Searching for pip >> Reading http://pypi.python.org/simple/pip/ >> Reading http://pip.openplans.org >> Reading http://www.pip-installer.org >> Best match: pip 1.0.1 >> Downloading >> >> http://pypi.python.org/packages/source/p/pip/pip-1.0.1.tar.gz#md5=28dcc70225e5bf925532abc5b087a94b >> Processing pip-1.0.1.tar.gz >> Running pip-1.0.1/setup.py -q bdist_egg --dist-dir >> /tmp/easy_install-GHsjHX/pip-1.0.1/egg-dist-tmp-rXjQ7L >> Traceback (most recent call last): >> File "/autonfs/home/davidk/swift-vm-boot/ve/bin/easy_install", line 8, >> in >> load_entry_point('setuptools==0.6c11', 'console_scripts', >> 'easy_install')() >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 1712, in main >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 1700, in with_ei_usage >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 1716, in >> File "/soft/python-2.6.1-r1/lib/python2.6/distutils/core.py", line 152, >> in setup >> dist.run_commands() >> File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 975, >> in run_commands >> self.run_command(cmd) >> File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 995, >> in run_command >> cmd_obj.run() >> File >> >> 
"/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 211, in run >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 446, in easy_install >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 476, in install_item >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 655, in install_eggs >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 930, in build_and_install >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", >> line 919, in run_setup >> File >> >> "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/sandbox.py", >> line 52, in run_setup >> AttributeError: 'module' object has no attribute '__getstate__' >> ---------------------------------------- >> Traceback (most recent call last): >> File "bin/virtualenv.py", line 1647, in >> main() >> File "bin/virtualenv.py", line 558, in main >> prompt=options.prompt) >> File "bin/virtualenv.py", line 656, in create_environment >> install_pip(py_executable) >> File "bin/virtualenv.py", line 415, in install_pip >> filter_stdout=_filter_setup) >> File "bin/virtualenv.py", line 624, in call_subprocess >> % (cmd_desc, proc.returncode)) >> OSError: Command /autonfs/home/davidk/swift-vm-...ython >> /autonfs/home/davidk/swift-vm-...stall pip >> failed with error code 1 >> Failed to created the needed python virtual environment >> >> On Fri, May 20, 2011 at 7:20 PM, John Bresnahan > bresnaha at 
mcs.anl.gov>> >> >> wrote: >> >> Our phone call today left me motiviated to show you guys how easy it is >> to get virtual machines >> for use with swift on FutureGrid. >> >> I made some small scripts around the Nimbus tool cloudinitd. The >> scripts just make installing >> the software and running it trivial. With a single command you can get >> N VMs from the >> FutureGrid Nimbus clouds (N can be on the order of hundreds). When the >> tool is done it outputs >> a line separated list of hostnames. All of these hostnames have root >> access available via your >> ~/.ssh/id_rsa keys. >> >> If/when you have FutureGrid credentials, untar the attachment and give >> it a try. There are a >> few minor configurations needed: >> >> >> 1) edit the file env.sh and set your FutureGrid security credentials: >> >> % cat env.sh >> export FUTUREGRID_IAAS_ACCESS_KEY=XXXXXXXXXXXXXXXXXX >> export FUTUREGRID_IAAS_SECRET_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX >> >> export FUTUREGRID_HOTEL_NODES=2 >> export FUTUREGRID_SIERRA_NODES=2 >> >> You can also change the value '2' to be whatever number of VMs you >> want. >> >> >> 2) install it on your system. (this single command downloads and >> installs everything you need >> under the cwd): >> >> % ./install.sh >> >> 3) boot the VMs >> % ./bin/bootit.sh. >> You will see much status output, but the last several lines will be the >> hostnames acquired from >> the cloud. >> >> Let me know when you guys are ready to check this out! >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From davidkelly999 at gmail.com Tue Jun 7 17:09:33 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Tue, 7 Jun 2011 17:09:33 -0500 Subject: [Swift-devel] Re: InternalHostname in sites file In-Reply-To: <152192967.13844.1307474780510.JavaMail.root@zimbra.anl.gov> References: <152192967.13844.1307474780510.JavaMail.root@zimbra.anl.gov> Message-ID: Mike, I separated all the guides into chapters. They also now use autowidth to determine table sizes. This works for HTML, but not supported for PDF. It is updated on the website now. David On Tue, Jun 7, 2011 at 2:26 PM, Michael Wilde wrote: > David, > > I really like the asciidoc version. > > Can we move to a userguide/ directory with one file per chapter? (to reduce > collisions as more people start editing it) > > Also, can we set the tables so that they size their columns as needed > instead of taking up the entire page width? > > - Mike > > ------------------------------ > > Sure, Jon. The userguide is now kept in the docs/userguide directory of > Swift. It's in asciidoc format. Feel free to make changes there, or email me > the text and I can add it for you. > > David > > On Thu, Jun 2, 2011 at 3:10 PM, Jonathan S Monette wrote: > >> I can write something up and send it to David to add somewhere for the >> userguide. Not sure where the files for the userguide are kept. >> >> >> On Thu, Jun 2, 2011 at 3:08 PM, Michael Wilde wrote: >> >>> Thanks for clarifying. Jon and/or David, can you address this with a >>> cookbook entry on Coasters that heads towards a users guide section? >>> >>> We should tell users what they can run on their cluster (eg ping or >>> telnet-style connect tests) to validate the setting of internalHostName. >>> >>> - Mike >>> >>> >>> >>> ----- Original Message ----- >>> > Right. If the head node has multiple network interfaces, only one of >>> > which is visible from the worker nodes. 
>>> > >>> > The choice of which interface is the one that worker nodes can connect >>> > to is a matter of the particular cluster. It's not particularly easy >>> > to >>> > have an automated mechanism that figures it out. We tried some scheme >>> > to >>> > pass all the interface addresses to the worker and let it try to >>> > connect >>> > to all of them in order, but that didn't work very well. Of course, >>> > there might be a scheme that works, but I didn't want to spend too >>> > much >>> > time on that. >>> > >>> > So that's why it's needed. To clarify to the workers which exact >>> > interface on the head node they are to try to connect to. >>> > >>> > Mihael >>> > >>> > On Thu, 2011-06-02 at 14:10 -0500, Jonathan S Monette wrote: >>> > > Mihael, >>> > > I believe we have talked about this before but why is it >>> > > necessary >>> > > for an InternalHostname to be specified for PADS? I know that the >>> > > address that coasters connects to is wrong but I do not remember why >>> > > that was. Could you give an explanation on why internalHostname >>> > > needs >>> > > to be set? >>> > > >>> > > -- >>> > > >>> > > >>> > > Any intelligent fool can make things bigger and more complex... It >>> > > takes a touch of genius - and a lot of courage to move in the >>> > > opposite >>> > > direction. >>> > > >>> > > - Albert Einstein >>> > > >>> > > >>> > > >>> > >>> > >>> > _______________________________________________ >>> > Swift-devel mailing list >>> > Swift-devel at ci.uchicago.edu >>> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >>> >> >> >> -- >> >> Any intelligent fool can make things bigger and more complex... It takes a >> touch of genius - and a lot of courage to move in the opposite direction. 
>> >> - Albert Einstein >> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel >> >> > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bresnaha at mcs.anl.gov Tue Jun 7 18:59:15 2011 From: bresnaha at mcs.anl.gov (John Bresnahan) Date: Tue, 7 Jun 2011 18:59:15 -0500 (CDT) Subject: [Swift-devel] Re: Getting VMs from FG for use with swift In-Reply-To: Message-ID: <1246650579.15563.1307491155177.JavaMail.root@zimbra.anl.gov> great! I will take a look at that when I can (I am on travel right now). Getting back to using cloudinit.d for swift... we you able to use the VMs you got from FutureGrid with swift? I was hoping we could start encorperating that into swift, or at least acquire a set of VMs and use those as you currently use static machines in some demos. ----- Original Message ----- From: "David Kelly" To: "John Bresnahan" Cc: "Mike Wilde" , "swift-devel" Sent: Tuesday, June 7, 2011 2:39:20 PM Subject: Re: Getting VMs from FG for use with swift Hello John, I have attached a quickstart guide I wrote on how to get Swift working with futuregrid by using the new Swift coaster service scripts. This will require the latest development version of Swift. Instructions on how to download/install are in the document. Please let me know if you have any questions, if anything is unclear, or if you run into any problems. Thank you! Regards, David On Tue, May 24, 2011 at 3:18 PM, John Bresnahan < bresnaha at mcs.anl.gov > wrote: The GPFS server on the FG cluster hotel died yesterday so I cannot get you your credentials. ?I'll get back to you when it is up again. 
?Once it is back the process for getting the needed access keys is described here: https://portal.futuregrid.org/tutorials/nimbus On 05/23/2011 05:24 AM, David Kelly wrote: Hi John, I now have a futuregrid account and am added to a project. I am now trying to get our scripts working together. I ran into a few problems at first when trying to run the futuregrid scripts. On the first system I tried I was getting a traceback. It is possible that the system I was using has older versions of some of the needed libraries. Then I tried it on a more system that is more frequently updated - my laptop running Ubuntu 10.10. ?It needed a newer version of the Python crypto tools installed, so I installed that (and the python development libraries) and that part seems fine now. I am now up to the point of the install script where it is trying to register keys, but it is failing. My guess is that I need to change FUTUREGRID_IAAS_ACCESS_KEY and FUTUREGRID_IAAS_SECRET_KEY in env.sh. I'm not sure what these should be exactly. Are these the contents of my ssh keys, an ssh key and a passphrase, or some other type of security? I've tried a few combinations of different things but haven't had much luck yet. Thanks! Regards, David Traceback from earlier: Installing setuptools.......................done. ? Complete output from command /autonfs/home/davidk/swift-vm-...ython /autonfs/home/davidk/swift-vm-...stall pip: ? Searching for pip Reading http://pypi.python.org/simple/pip/ Reading http://pip.openplans.org Reading http://www.pip-installer.org Best match: pip 1.0.1 Downloading http://pypi.python.org/packages/source/p/pip/pip-1.0.1.tar.gz#md5=28dcc70225e5bf925532abc5b087a94b Processing pip-1.0.1.tar.gz Running pip-1.0.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-GHsjHX/pip-1.0.1/egg-dist-tmp-rXjQ7L Traceback (most recent call last): ? File "/autonfs/home/davidk/swift-vm-boot/ve/bin/easy_install", line 8, in ? ? 
load_entry_point('setuptools==0.6c11', 'console_scripts', 'easy_install')() ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 1712, in main ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 1700, in with_ei_usage ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 1716, in ? File "/soft/python-2.6.1-r1/lib/python2.6/distutils/core.py", line 152, in setup ? ? dist.run_commands() ? File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 975, in run_commands ? ? self.run_command(cmd) ? File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 995, in run_command ? ? cmd_obj.run() ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 211, in run ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 446, in easy_install ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 476, in install_item ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 655, in install_eggs ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 930, in build_and_install ? File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 919, in run_setup ? 
File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/sandbox.py", line 52, in run_setup AttributeError: 'module' object has no attribute '__getstate__' ---------------------------------------- Traceback (most recent call last): ? File "bin/virtualenv.py", line 1647, in ? ? main() ? File "bin/virtualenv.py", line 558, in main ? ? prompt=options.prompt) ? File "bin/virtualenv.py", line 656, in create_environment ? ? install_pip(py_executable) ? File "bin/virtualenv.py", line 415, in install_pip ? ? filter_stdout=_filter_setup) ? File "bin/virtualenv.py", line 624, in call_subprocess ? ? % (cmd_desc, proc.returncode)) OSError: Command /autonfs/home/davidk/swift-vm-...ython /autonfs/home/davidk/swift-vm-...stall pip failed with error code 1 Failed to created the needed python virtual environment On Fri, May 20, 2011 at 7:20 PM, John Bresnahan < bresnaha at mcs.anl.gov > wrote: ? ?Our phone call today left me motiviated to show you guys how easy it is to get virtual machines ? ?for use with swift on FutureGrid. ? ?I made some small scripts around the Nimbus tool cloudinitd. ?The scripts just make installing ? ?the software and running it trivial. ?With a single command you can get N VMs from the ? ?FutureGrid Nimbus clouds (N can be on the order of hundreds). ?When the tool is done it outputs ? ?a line separated list of hostnames. ?All of these hostnames have root access available via your ? ?~/.ssh/id_rsa keys. ? ?If/when you have FutureGrid credentials, untar the attachment and give it a try. ?There are a ? ?few minor configurations needed: ? ?1) edit the file env.sh and set your FutureGrid security credentials: ? ?% cat env.sh ? ?export FUTUREGRID_IAAS_ACCESS_KEY=XXXXXXXXXXXXXXXXXX ? ?export FUTUREGRID_IAAS_SECRET_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX ? ?export FUTUREGRID_HOTEL_NODES=2 ? ?export FUTUREGRID_SIERRA_NODES=2 ? ?You can also change the value '2' to be whatever number of VMs you want. ? 

    2) install it on your system (this single command downloads and installs everything you need
    under the cwd):

    % ./install.sh

    3) boot the VMs

    % ./bin/bootit.sh

    You will see much status output, but the last several lines will be the hostnames acquired from
    the cloud.

    Let me know when you guys are ready to check this out!

From davidkelly999 at gmail.com Tue Jun 7 20:01:22 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Tue, 7 Jun 2011 20:01:22 -0500
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To: <1246650579.15563.1307491155177.JavaMail.root@zimbra.anl.gov>
References: <1246650579.15563.1307491155177.JavaMail.root@zimbra.anl.gov>
Message-ID:

Yep, I am using the VMs I get from cloudinitd with Swift. The only problem I sometimes notice is that cloudinitd can be pretty slow to finish: it stalls somewhere after it prints the information about the hosts (hostname, instance id, etc.) but before it prints the success message and exits. I was thinking maybe it's related to the hotel filesystem problems from last week, but I'm not sure. Other times cloudinitd finishes quickly without any major delays.

Once it gets past initializing the VMs it runs great. I've been testing with a script called hostsn.swift, which basically just calls 'hostname' several times and sends the output to a file. It's useful for verifying that all the VMs are processing work. When you get a chance to test it out, the hostsn script is in the swift examples directory. You can call it with something like this:

swift -sites.file sites.xml -tc.file tc.data -config cf hostsn.swift -n=100

Here n is the number of 'hostname' processes to launch; the files will be created in a directory called outdir.

Feel free to send me an email when you get a chance to look at it and we can talk more about it.

David

On Tue, Jun 7, 2011 at 6:59 PM, John Bresnahan wrote:
> great! I will take a look at that when I can (I am on travel right now).
>
> Getting back to using cloudinit.d for swift... were you able to use the VMs
> you got from FutureGrid with swift? I was hoping we could start
> incorporating that into swift, or at least acquire a set of VMs and use
> those as you currently use static machines in some demos.
-------------- next part --------------
An HTML attachment was scrubbed...
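Editor's sketch for the hostsn check described in the message above: one way to confirm that all the VMs are processing work is to tally which hostnames show up in the files under outdir. This is a minimal sketch, not part of Swift or of the scripts discussed in this thread. It assumes (hypothetically) that every regular file directly under outdir holds the output of a single 'hostname' call; the function name count_hosts is made up for illustration.

```python
from collections import Counter
from pathlib import Path


def count_hosts(outdir):
    """Count how many output files report each hostname.

    Assumption (not confirmed by the thread): each regular file directly
    under `outdir` contains the output of one `hostname` call.
    """
    counts = Counter()
    for path in sorted(Path(outdir).iterdir()):
        if path.is_file():
            host = path.read_text().strip()
            if host:  # skip empty files from jobs that produced no output
                counts[host] += 1
    return counts


if __name__ == "__main__":
    # Only report if the directory from the example run is present.
    if Path("outdir").is_dir():
        for host, n in sorted(count_hosts("outdir").items()):
            print(f"{host}\t{n}")
```

If the number of distinct hostnames printed is lower than the number of VMs that were booted, some VMs probably did not receive any work.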
URL:
From bresnaha at mcs.anl.gov Tue Jun 7 23:27:35 2011
From: bresnaha at mcs.anl.gov (John Bresnahan)
Date: Tue, 07 Jun 2011 18:27:35 -1000
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To:
References: <1246650579.15563.1307491155177.JavaMail.root@zimbra.anl.gov>
Message-ID: <4DEEFA37.8080906@mcs.anl.gov>

Are we at a point where we can have cloudinit.d called from swift and have the hostnames parsed from the json output file? That might make a good demo.

On 06/07/2011 03:01 PM, David Kelly wrote:
> Yep, I am using the VMs I get from cloudinitd with Swift. The only problem I sometimes notice is
> that cloudinitd can be pretty slow to finish.. somewhere between the point where it prints the
> information about the hosts (hostname, instance id, etc) but before it prints the success message
> and exits. I was thinking maybe it's related to the hotel filesystem problems from last week, but
> I'm not sure. Other times cloudinitd finishes quickly without any major delays.
>
> Once it gets past initializing the VMs it runs great. I've been testing with a script called
> hostsn.swift which basically just calls 'hostname' several times and sends the output to a file.
> It's useful for verifying that all the VMs are processing work. When you get a chance to test it
> out, the hostsn script is in the swift examples directory. You can call it with something like this:
>
> swift -sites.file sites.xml -tc.file tc.data -config cf hostsn.swift -n=100
>
> N is the number of 'hostname' processes to launch.. the files will be created in a directory called
> outdir.
>
> Feel free to send me an email when you get a chance to look at it and we can talk more about it.
>
> David

From davidkelly999 at gmail.com Wed Jun 8 07:38:36 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Wed, 8 Jun 2011 07:38:36 -0500
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To: <4DEEFA37.8080906@mcs.anl.gov>
References: <1246650579.15563.1307491155177.JavaMail.root@zimbra.anl.gov> <4DEEFA37.8080906@mcs.anl.gov>
Message-ID:

It's not called directly from Swift, but the start-coaster-service script does that.
It calls bootit.sh (cloudinitd), extracts information from the JSON output, sets up reverse SSH tunneling if needed, starts the worker Perl script on each node, and then generates Swift configuration files with the correct values.

David

On Tue, Jun 7, 2011 at 11:27 PM, John Bresnahan wrote:
> Are we at a point where we can have cloudinit.d called from swift and have
> the hostnames parsed from the json output file? That might make a good
> demo.
-------------- next part --------------
An HTML attachment was scrubbed...
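Editor's sketch for the "extracts information from the JSON output" step mentioned in the message above. The layout of the cloudinitd JSON file is not shown anywhere in this thread, so the code below does not assume a schema: it walks the whole document and collects every string stored under a given key. The key name "hostname" and the function names are assumptions made for illustration; this is not the code used by start-coaster-service.

```python
import json


def extract_hostnames(report, key="hostname"):
    """Recursively collect every string value stored under `key`.

    `report` is an already-parsed JSON value (dict, list, or scalar).
    The default key name "hostname" is a guess, not a documented field.
    """
    found = []
    if isinstance(report, dict):
        for name, value in report.items():
            if name == key and isinstance(value, str):
                found.append(value)
            else:
                found.extend(extract_hostnames(value, key))
    elif isinstance(report, list):
        for item in report:
            found.extend(extract_hostnames(item, key))
    return found


def load_hostnames(path, key="hostname"):
    """Read a JSON file and return the hostnames found in it, in document order."""
    with open(path) as handle:
        return extract_hostnames(json.load(handle), key)
```

A list produced this way could then be written out one hostname per line, which matches the line-separated form that bootit.sh is described as printing.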
URL:
From bresnaha at mcs.anl.gov Wed Jun 8 08:44:40 2011
From: bresnaha at mcs.anl.gov (John Bresnahan)
Date: Wed, 08 Jun 2011 03:44:40 -1000
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To:
References: <1246650579.15563.1307491155177.JavaMail.root@zimbra.anl.gov> <4DEEFA37.8080906@mcs.anl.gov>
Message-ID: <4DEF7CC8.7020600@mcs.anl.gov>

That sounds great. Perhaps we should have another meeting to discuss what we can demo from here.

On 06/08/2011 02:38 AM, David Kelly wrote:
> It's not called directly from Swift, but the start-coaster-service script does that. It calls
> bootit.sh(cloudinitd), extracts information from the JSON output, sets up reverse SSH tunneling if
> needed, starts the worker perl script on each node, and then generates Swift configuration files
> with the correct values.
>
> David
>
> On Tue, Jun 7, 2011 at 11:27 PM, John Bresnahan wrote:
>
>     Are we at a point where we can have cloudinit.d called from swift and have the hostnames parsed
>     from the json output file? That might make a good demo.
>
>
>     On 06/07/2011 03:01 PM, David Kelly wrote:
>
>         Yep, I am using the VMs I get from cloudinitd with Swift. The only problem I sometimes notice is
>         that cloudinitd can be pretty slow to finish.. somewhere between the point where it prints the
>         information about the hosts (hostname, instance id, etc) but before it prints the success message
>         and exits. I was thinking maybe it's related to the hotel filesystem problems from last week, but
>         I'm not sure. Other times cloudinitd finishes quickly without any major delays.
>
>         Once it gets past initializing the VMs it runs great. I've been testing with a script called
>         hostsn.swift which basically just calls 'hostname' several times and sends the output to a file.
>         It's useful for verifying that all the VMs are processing work. When you get a chance to test it
>         out, the hostsn script is in the swift examples directory.
You can call it with something > like this: > > swift -sites.file sites.xml -tc.file tc.data -config cf hostsn.swift -n=100 > > N is the number of 'hostname' processes to launch.. the files will be created in a directory > called > outdir. > > Feel free to send me an email when you get a chance to look at it and we can talk more about it. > > David > > On Tue, Jun 7, 2011 at 6:59 PM, John Bresnahan >> > wrote: > > great! I will take a look at that when I can (I am on travel right now). > > Getting back to using cloudinit.d for swift... we you able to use the VMs you got from > FutureGrid with swift? I was hoping we could start encorperating that into swift, or at > least > acquire a set of VMs and use those as you currently use static machines in some demos. > > ----- Original Message ----- > From: "David Kelly" > >> > To: "John Bresnahan" > >> > Cc: "Mike Wilde" >>, "swift-devel" > > >> > > Sent: Tuesday, June 7, 2011 2:39:20 PM > Subject: Re: Getting VMs from FG for use with swift > > Hello John, > > I have attached a quickstart guide I wrote on how to get Swift working with futuregrid > by using > the new Swift coaster service scripts. This will require the latest development version of > Swift. Instructions on how to download/install are in the document. > > Please let me know if you have any questions, if anything is unclear, or if you run into any > problems. Thank you! > > Regards, > David > > > On Tue, May 24, 2011 at 3:18 PM, John Bresnahan < bresnaha at mcs.anl.gov > > > > wrote: > > > The GPFS server on the FG cluster hotel died yesterday so I cannot get you your credentials. > I'll get back to you when it is up again. Once it is back the process for getting the > needed > access keys is described here: > > https://portal.futuregrid.org/tutorials/nimbus > > > > > On 05/23/2011 05:24 AM, David Kelly wrote: > > > > > > Hi John, > > I now have a futuregrid account and am added to a project. I am now trying to get our > scripts > working together. 
>
> I ran into a few problems at first when trying to run the futuregrid scripts. On the first
> system I tried I was getting a traceback. It is possible that the system I was using has older
> versions of some of the needed libraries. Then I tried it on a system that is more frequently
> updated - my laptop running Ubuntu 10.10. It needed a newer version of the Python crypto tools
> installed, so I installed that (and the python development libraries) and that part seems fine
> now.
>
> I am now up to the point of the install script where it is trying to register keys, but it is
> failing. My guess is that I need to change FUTUREGRID_IAAS_ACCESS_KEY and
> FUTUREGRID_IAAS_SECRET_KEY in env.sh. I'm not sure what these should be exactly. Are these the
> contents of my ssh keys, an ssh key and a passphrase, or some other type of security? I've
> tried a few combinations of different things but haven't had much luck yet.
>
> Thanks!
>
> Regards,
> David
>
> Traceback from earlier:
> Installing setuptools.......................done.
> Complete output from command /autonfs/home/davidk/swift-vm-...ython /autonfs/home/davidk/swift-vm-...stall pip:
> Searching for pip
> Reading http://pypi.python.org/simple/pip/
> Reading http://pip.openplans.org
> Reading http://www.pip-installer.org
> Best match: pip 1.0.1
> Downloading http://pypi.python.org/packages/source/p/pip/pip-1.0.1.tar.gz#md5=28dcc70225e5bf925532abc5b087a94b
> Processing pip-1.0.1.tar.gz
> Running pip-1.0.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-GHsjHX/pip-1.0.1/egg-dist-tmp-rXjQ7L
> Traceback (most recent call last):
>   File "/autonfs/home/davidk/swift-vm-boot/ve/bin/easy_install", line 8, in
>     load_entry_point('setuptools==0.6c11', 'console_scripts', 'easy_install')()
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 1712, in main
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 1700, in with_ei_usage
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 1716, in
>   File "/soft/python-2.6.1-r1/lib/python2.6/distutils/core.py", line 152, in setup
>     dist.run_commands()
>   File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 975, in run_commands
>     self.run_command(cmd)
>   File "/soft/python-2.6.1-r1/lib/python2.6/distutils/dist.py", line 995, in run_command
>     cmd_obj.run()
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 211, in run
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 446, in easy_install
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 476, in install_item
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 655, in install_eggs
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 930, in build_and_install
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/command/easy_install.py", line 919, in run_setup
>   File "/autonfs/home/davidk/swift-vm-boot/ve/lib/python2.6/site-packages/setuptools-0.6c11-py2.6.egg/setuptools/sandbox.py", line 52, in run_setup
> AttributeError: 'module' object has no attribute '__getstate__'
> ----------------------------------------
> Traceback (most recent call last):
>   File "bin/virtualenv.py", line 1647, in
>     main()
>   File "bin/virtualenv.py", line 558, in main
>     prompt=options.prompt)
>   File "bin/virtualenv.py", line 656, in create_environment
>     install_pip(py_executable)
>   File "bin/virtualenv.py", line 415, in install_pip
>     filter_stdout=_filter_setup)
>   File "bin/virtualenv.py", line 624, in call_subprocess
>     % (cmd_desc, proc.returncode))
> OSError: Command /autonfs/home/davidk/swift-vm-...ython /autonfs/home/davidk/swift-vm-...stall pip failed with error code 1
> Failed to created the needed python virtual environment
>
> On Fri, May 20, 2011 at 7:20 PM, John Bresnahan < bresnaha at mcs.anl.gov > wrote:
>
> Our phone call today left me motivated to show you guys how easy it is to get virtual machines
> for use with swift on FutureGrid.
>
> I made some small scripts around the Nimbus tool cloudinitd. The scripts just make installing
> the software and running it trivial.
> With a single command you can get N VMs from the FutureGrid Nimbus clouds (N can be on the
> order of hundreds). When the tool is done it outputs a line-separated list of hostnames. All of
> these hostnames have root access available via your ~/.ssh/id_rsa keys.
>
> If/when you have FutureGrid credentials, untar the attachment and give it a try. There are a
> few minor configurations needed:
>
> 1) edit the file env.sh and set your FutureGrid security credentials:
>
> % cat env.sh
> export FUTUREGRID_IAAS_ACCESS_KEY=XXXXXXXXXXXXXXXXXX
> export FUTUREGRID_IAAS_SECRET_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
>
> export FUTUREGRID_HOTEL_NODES=2
> export FUTUREGRID_SIERRA_NODES=2
>
> You can also change the value '2' to be whatever number of VMs you want.
>
> 2) install it on your system (this single command downloads and installs everything you need
> under the cwd):
>
> % ./install.sh
>
> 3) boot the VMs
>
> % ./bin/bootit.sh
>
> You will see much status output, but the last several lines will be the hostnames acquired from
> the cloud.
>
> Let me know when you guys are ready to check this out!

From wilde at mcs.anl.gov Wed Jun 8 09:51:36 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 8 Jun 2011 09:51:36 -0500 (CDT)
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To: <4DEF7CC8.7020600@mcs.anl.gov>
Message-ID: <1945198655.17003.1307544696516.JavaMail.root@zimbra.anl.gov>

Hi John, David,

A meeting sounds good. When are you in town? Or we can meet by phone tomorrow or Friday.

Re Demos, we can pick from these to start:

- a simple BLAST or "mock BLAST" demo: split a file, blast each segment against a DB,

- protein-RNA docking with "modFTDock" (simple script, real science)

- MODIS: analyze land use satellite data (has graphics)

Mike

----- Original Message -----
> That sounds great. Perhaps we should have another meeting to discuss
> what we can demo from here.
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From bresnaha at mcs.anl.gov Wed Jun 8 11:46:01 2011
From: bresnaha at mcs.anl.gov (John Bresnahan)
Date: Wed, 08 Jun 2011 06:46:01 -1000
Subject: [Swift-devel] Re: Getting VMs from FG for use with swift
In-Reply-To: <1945198655.17003.1307544696516.JavaMail.root@zimbra.anl.gov>
References: <1945198655.17003.1307544696516.JavaMail.root@zimbra.anl.gov>
Message-ID: <4DEFA749.1090203@mcs.anl.gov>

I am travelling back from HPDC tomorrow, but I could do Friday. I might be able to make it to UC in the morning if that works.
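
[Editorial sketch, not part of the original thread.] Earlier in this thread David describes hostsn.swift as a way to verify that all the VMs are processing work: it runs 'hostname' N times and the files end up in a directory called outdir. A minimal sketch of how one might tally those results for a demo follows. It assumes each regular file directly under outdir holds the output of one 'hostname' call; the file layout and the helper name tally_hostnames are hypothetical and are not taken from the Swift examples directory.

```python
from collections import Counter
from pathlib import Path


def tally_hostnames(outdir):
    """Count how many output files each host produced.

    Assumption (hypothetical layout): every regular file directly under
    `outdir` contains the output of a single `hostname` call.
    """
    counts = Counter()
    for path in sorted(Path(outdir).iterdir()):
        if not path.is_file():
            continue  # ignore subdirectories
        tokens = path.read_text().split()
        if tokens:  # skip empty files from tasks that wrote nothing
            counts[tokens[0]] += 1
    return counts
```

After a run, something like `for host, n in tally_hostnames("outdir").most_common(): print(n, host)` would list each VM with the number of tasks it handled; a VM that never appears did no work.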
On 06/08/2011 04:51 AM, Michael Wilde wrote:
> Hi John, David,
>
> A meeting sounds good. When are you in town? Or we can meet by phone tomorrow or Friday.
>
> Re Demos, we can pick from these to start:
>
> - a simple BLAST or "mock BLAST" demo: split a file, blast each segment against a DB,
>
> - protein-RNA docking with "modFTDock" (simple script, real science)
>
> - MODIS: analyze land use satellite data (has graphics)
>
> Mike

From dsk at ci.uchicago.edu Wed Jun 8 12:12:24 2011
From: dsk at ci.uchicago.edu (Daniel S. Katz)
Date: Wed, 8 Jun 2011 10:12:24 -0700
Subject: [Swift-devel] MTC-like paper
Message-ID: <0F1A0075-10E4-47A6-9593-8927E9E139CA@ci.uchicago.edu>

This paper is in my workshop today - others might find it interesting.

Dan

--
Daniel S. Katz
University of Chicago
(773) 834-7186 (voice)
(773) 834-6818 (fax)
d.katz at ieee.org or dsk at ci.uchicago.edu
http://www.ci.uchicago.edu/~dsk/
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p7.pdf
Type: application/pdf
Size: 730316 bytes
Desc: not available
URL:
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wilde at mcs.anl.gov Wed Jun 8 14:50:14 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 8 Jun 2011 14:50:14 -0500 (CDT)
Subject: [Swift-devel] No Swift Devel meeting today
Message-ID: <1821589442.19482.1307562614361.JavaMail.root@zimbra.anl.gov>

We won't be meeting today. Let's try to resume next week.

- Mike

From davidkelly999 at gmail.com Wed Jun 8 17:04:57 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Wed, 8 Jun 2011 17:04:57 -0500
Subject: [Swift-devel] Persistent coaster shell scripts documentation
Message-ID:

Hello all,

I've attached some documentation for the start-coaster-service and stop-coaster-service scripts for managing persistent coasters. This will be part of a larger document I am writing about coasters, but I thought for now it might be useful for anyone who wants to give it a try. The scripts and the configuration files are checked into trunk.

David
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From skenny at uchicago.edu Sun Jun 12 15:17:42 2011
From: skenny at uchicago.edu (Sarah Kenny)
Date: Sun, 12 Jun 2011 13:17:42 -0700
Subject: [Swift-devel] can't run latest trunk
Message-ID:

hey, i'm wondering if anyone else is having trouble running the latest trunk? i'm not able to run 'hello world'...could this have to do with the recent commits to the mapper code?

[skenny at cosmo bugtests]$ swift first.swift
no sites file specified, setting to default: /home/skenny/builds/cog/modules/swift/dist/swift-svn/etc/sites.xml
Swift svn swift-r4601 cog-r3159

RunID: 20110612-1310-m6jky76e
Progress: time: Sun, 12 Jun 2011 13:10:59 -0700
java.lang.NullPointerException
    at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464)
    at java.lang.Thread.run(Thread.java:662)
Progress: time: Sun, 12 Jun 2011 13:11:00 -0700 Initializing site shared directory:1
java.lang.NullPointerException
    at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464)
    at java.lang.Thread.run(Thread.java:662)
Progress: time: Sun, 12 Jun 2011 13:11:01 -0700 Initializing site shared directory:1
java.lang.NullPointerException
    at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464)
    at java.lang.Thread.run(Thread.java:662)
Execution failed:
java.lang.NullPointerException
    at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464)
    at java.lang.Thread.run(Thread.java:662)
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hategan at mcs.anl.gov Sun Jun 12 15:22:33 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sun, 12 Jun 2011 13:22:33 -0700
Subject: [Swift-devel] can't run latest trunk
In-Reply-To:
References:
Message-ID: <1307910153.27402.0.camel@blabla2.none>

This has to do with my latest changes to the file transfer stuff. Hang on a bit.

On Sun, 2011-06-12 at 13:17 -0700, Sarah Kenny wrote:
> hey, i'm wondering if anyone else is having trouble running the latest
> trunk? i'm not able to run 'hello world'...could this have to do with
> the recent commits to the mapper code?
> > [skenny at cosmo bugtests]$ swift first.swift > no sites file specified, setting to > default: /home/skenny/builds/cog/modules/swift/dist/swift-svn/etc/sites.xml > Swift svn swift-r4601 cog-r3159 > > RunID: 20110612-1310-m6jky76e > Progress: time: Sun, 12 Jun 2011 13:10:59 -0700 > java.lang.NullPointerException > at > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > at java.lang.Thread.run(Thread.java:662) > Progress: time: Sun, 12 Jun 2011 13:11:00 -0700 Initializing site > shared directory:1 > java.lang.NullPointerException > at > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > at java.lang.Thread.run(Thread.java:662) > Progress: time: Sun, 12 Jun 2011 13:11:01 -0700 Initializing site > shared directory:1 > java.lang.NullPointerException > at > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > at java.lang.Thread.run(Thread.java:662) > Execution failed: > java.lang.NullPointerException > at > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > at java.lang.Thread.run(Thread.java:662) > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Sun Jun 12 16:16:50 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sun, 12 Jun 2011 14:16:50 -0700 Subject: [Swift-devel] can't run latest trunk In-Reply-To: <1307910153.27402.0.camel@blabla2.none> References: <1307910153.27402.0.camel@blabla2.none> Message-ID: <1307913410.27402.1.camel@blabla2.none> Well, I fixed it but I can't commit because the sourceforge nameserver seems to have a bit of a problem. I'll let you know when I can... 
On Sun, 2011-06-12 at 13:22 -0700, Mihael Hategan wrote: > This has to do with my latest changes to the file transfer stuff. Hang > on a bit. > > On Sun, 2011-06-12 at 13:17 -0700, Sarah Kenny wrote: > > hey, i'm wondering if anyone else is having trouble running the latest > > trunk? i'm not able to run 'hello world'...could this have to do with > > the recent commits to the mapper code? > > > > [skenny at cosmo bugtests]$ swift first.swift > > no sites file specified, setting to > > default: /home/skenny/builds/cog/modules/swift/dist/swift-svn/etc/sites.xml > > Swift svn swift-r4601 cog-r3159 > > > > RunID: 20110612-1310-m6jky76e > > Progress: time: Sun, 12 Jun 2011 13:10:59 -0700 > > java.lang.NullPointerException > > at > > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > > at java.lang.Thread.run(Thread.java:662) > > Progress: time: Sun, 12 Jun 2011 13:11:00 -0700 Initializing site > > shared directory:1 > > java.lang.NullPointerException > > at > > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > > at java.lang.Thread.run(Thread.java:662) > > Progress: time: Sun, 12 Jun 2011 13:11:01 -0700 Initializing site > > shared directory:1 > > java.lang.NullPointerException > > at > > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > > at java.lang.Thread.run(Thread.java:662) > > Execution failed: > > java.lang.NullPointerException > > at > > org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:464) > > at java.lang.Thread.run(Thread.java:662) > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > _______________________________________________ > Swift-devel mailing list > 
Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Sun Jun 12 16:40:22 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sun, 12 Jun 2011 14:40:22 -0700 Subject: [Swift-devel] can't run latest trunk In-Reply-To: <1307913410.27402.1.camel@blabla2.none> References: <1307910153.27402.0.camel@blabla2.none> <1307913410.27402.1.camel@blabla2.none> Message-ID: <1307914822.5948.0.camel@blabla2.none> On Sun, 2011-06-12 at 14:16 -0700, Mihael Hategan wrote: > Well, I fixed it but I can't commit because the sourceforge nameserver > seems to have a bit of a problem. I'll let you know when I can... Done, but that nameserver seems to still be a bit shaky. From skenny at uchicago.edu Sun Jun 12 18:57:19 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Sun, 12 Jun 2011 16:57:19 -0700 Subject: [Swift-devel] can't run latest trunk In-Reply-To: <1307914822.5948.0.camel@blabla2.none> References: <1307910153.27402.0.camel@blabla2.none> <1307913410.27402.1.camel@blabla2.none> <1307914822.5948.0.camel@blabla2.none> Message-ID: On Sun, Jun 12, 2011 at 2:40 PM, Mihael Hategan wrote: > On Sun, 2011-06-12 at 14:16 -0700, Mihael Hategan wrote: > > Well, I fixed it but I can't commit because the sourceforge nameserver > > seems to have a bit of a problem. I'll let you know when I can... > > Done, but that nameserver seems to still be a bit shaky. > yr right, it's pretty bad...can't even checkout... -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Sun Jun 12 20:23:04 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sun, 12 Jun 2011 18:23:04 -0700 Subject: [Swift-devel] recent trunk changes Message-ID: <1307928184.14284.2.camel@blabla2.none> So I was assuming that comments in bugzilla would be posted on swift-devel, but it seems we may have disabled that. 
I have recently committed some changes to trunk which make use of the cog resources for 3rd party transfers instead of the jglobus UrlCopy. This was one of the things that was deemed likely to benefit Allan's work on OSG by reducing the number of connections to GridFTP servers when using 3rd party transfers. Mihael From wozniak at mcs.anl.gov Mon Jun 13 10:53:57 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Mon, 13 Jun 2011 10:53:57 -0500 (CDT) Subject: [Swift-devel] Re: Work with Justin on performance plotting In-Reply-To: <4DF1193B.4090604@gmail.com> References: <842815804.23526.1307645411336.JavaMail.root@zimbra.anl.gov> <7ED47B8E-BEFB-4F46-8BBA-D360174D8179@gmail.com> <4DF1193B.4090604@gmail.com> Message-ID: Hello I don't have much to add other than the READMEs and link below. I have tested and documented only a few plots; let's continue to add to the README and build up some additional good use cases. Justin On Thu, 9 Jun 2011, ketan wrote: > Jon, > > To give you heads up, the performance plotting tools are located in trunk at: > swift/libexec/log-processing > > The Readme has some information on setup and generating plots. > > More follows on the tools I added. > > Ketan > > On 6/9/11 1:58 PM, Jonathan Monette wrote: >> Sure, I can work on this. Whenever Justin is ready and has time we can >> meet or he can point me to what works. >> On Jun 9, 2011, at 1:50 PM, Michael Wilde wrote: >> >>> Hi Jon, >>> >>> After discussing this with Justin earlier in the week, I suggested that >>> you work with Justin on testing and continuing to enhance his new version >>> of the Swift performance plotting package. >>> >>> Can you work with Justin to locate the package, try the latest, review and >>> update the documents, and start enhancing it under Justin's guidance to >>> get the plots you need for Montage? >>> >>> Ketan, I know that you have similar needs and have worked with Justin in >>> the past on these (or similar) tools. 
Can you serve as an early user of >>> the tools and give Jon feedback for further enhancement? >>> >>> Lets do all discussion of the topic on swift-devel (including the handoff >>> from you, Justin, explaining the state of development and where everything >>> is located). >>> >>> Thanks, >>> >>> Mike >>> >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> > -- Justin M Wozniak From jonmon at utexas.edu Mon Jun 13 15:20:43 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Mon, 13 Jun 2011 15:20:43 -0500 Subject: [Swift-devel] Re: Work with Justin on performance plotting In-Reply-To: References: <842815804.23526.1307645411336.JavaMail.root@zimbra.anl.gov> <7ED47B8E-BEFB-4F46-8BBA-D360174D8179@gmail.com> <4DF1193B.4090604@gmail.com> Message-ID: <21F846DB-1318-4BA4-B38F-361D6E2AF597@gmail.com> Which are the necessary lines to run the sample plotter? I am going to add it to the README. It just says to set log4j.logger.swift=DEBUG. If I just set that I still don't see the CPU lines or the Block lines that need to show up in the logs. On Jun 13, 2011, at 10:53 AM, Justin M Wozniak wrote: > Hello > I don't have much to add other than the READMEs and link below. I have tested and documented only a few plots; let's continue to add to the README and build up some additional good use cases. > Justin > > On Thu, 9 Jun 2011, ketan wrote: > >> Jon, >> >> To give you heads up, the performance plotting tools are located in trunk at: swift/libexec/log-processing >> >> The Readme has some information on setup and generating plots. >> >> More follows on the tools I added. >> >> Ketan >> >> On 6/9/11 1:58 PM, Jonathan Monette wrote: >>> Sure, I can work on this. Whenever Justin is ready and has time we can meet or he can point me to what works. 
>>> On Jun 9, 2011, at 1:50 PM, Michael Wilde wrote: >>>> Hi Jon, >>>> After discussing this with Justin earlier in the week, I suggested that you work with Justin on testing and continuing to enhance his new version of the Swift performance plotting package. >>>> Can you work with Justin to locate the package, try the latest, review and update the documents, and start enhancing it under Justin's guidance to get the plots you need for Montage? >>>> Ketan, I know that you have similar needs and have worked with Justin in the past on these (or similar) tools. Can you serve as an early user of the tools and give Jon feedback for further enhancement? >>>> Lets do all discussion of the topic on swift-devel (including the handoff from you, Justin, explaining the state of development and where everything is located). >>>> Thanks, >>>> Mike >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >> > > -- > Justin M Wozniak From jonmon at utexas.edu Mon Jun 13 15:44:01 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Mon, 13 Jun 2011 15:44:01 -0500 Subject: [Swift-devel] Re: Work with Justin on performance plotting In-Reply-To: References: <842815804.23526.1307645411336.JavaMail.root@zimbra.anl.gov> <7ED47B8E-BEFB-4F46-8BBA-D360174D8179@gmail.com> <4DF1193B.4090604@gmail.com> Message-ID: <9A9FEAFD-D5A6-444B-9E1C-8D0C17C5AB58@gmail.com> Also, looking at the first command in the README for log-processing "make -f libexec/log-processing/makefile.implicit swift-run.plot.norm" can't we take that out and just show users how to use the perl script that normalizes the logs to a time. I don't think running the makefile to call this perl script is very useful in this case and we can cut out the middle man. On Jun 13, 2011, at 10:53 AM, Justin M Wozniak wrote: > Hello > I don't have much to add other than the READMEs and link below. 
I have tested and documented only a few plots; let's continue to add to the README and build up some additional good use cases. > Justin > > On Thu, 9 Jun 2011, ketan wrote: > >> Jon, >> >> To give you heads up, the performance plotting tools are located in trunk at: swift/libexec/log-processing >> >> The Readme has some information on setup and generating plots. >> >> More follows on the tools I added. >> >> Ketan >> >> On 6/9/11 1:58 PM, Jonathan Monette wrote: >>> Sure, I can work on this. Whenever Justin is ready and has time we can meet or he can point me to what works. >>> On Jun 9, 2011, at 1:50 PM, Michael Wilde wrote: >>>> Hi Jon, >>>> After discussing this with Justin earlier in the week, I suggested that you work with Justin on testing and continuing to enhance his new version of the Swift performance plotting package. >>>> Can you work with Justin to locate the package, try the latest, review and update the documents, and start enhancing it under Justin's guidance to get the plots you need for Montage? >>>> Ketan, I know that you have similar needs and have worked with Justin in the past on these (or similar) tools. Can you serve as an early user of the tools and give Jon feedback for further enhancement? >>>> Lets do all discussion of the topic on swift-devel (including the handoff from you, Justin, explaining the state of development and where everything is located). 
>>>> Thanks, >>>> Mike >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >> > > -- > Justin M Wozniak From wozniak at mcs.anl.gov Mon Jun 13 15:55:06 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Mon, 13 Jun 2011 15:55:06 -0500 (CDT) Subject: [Swift-devel] Re: Work with Justin on performance plotting In-Reply-To: <21F846DB-1318-4BA4-B38F-361D6E2AF597@gmail.com> References: <842815804.23526.1307645411336.JavaMail.root@zimbra.anl.gov> <7ED47B8E-BEFB-4F46-8BBA-D360174D8179@gmail.com> <4DF1193B.4090604@gmail.com> <21F846DB-1318-4BA4-B38F-361D6E2AF597@gmail.com> Message-ID: I just added them. Those lines should be on by default. It would be good to document the minimal log4j settings required for each documented plot type. However, that would really clutter the README. On Mon, 13 Jun 2011, Jonathan Monette wrote: > Which are the necessary lines to run the sample plotter? I am going to > add it to the README. It just says to set log4j.logger.swift=DEBUG. > If I just set that I still don't see the CPU lines or the Block lines > that need to show up in the logs. > On Jun 13, 2011, at 10:53 AM, Justin M Wozniak wrote: > >> Hello >> I don't have much to add other than the READMEs and link below. I have tested and documented only a few plots; let's continue to add to the README and build up some additional good use cases. >> Justin >> >> On Thu, 9 Jun 2011, ketan wrote: >> >>> Jon, >>> >>> To give you heads up, the performance plotting tools are located in trunk at: swift/libexec/log-processing >>> >>> The Readme has some information on setup and generating plots. >>> >>> More follows on the tools I added. >>> >>> Ketan >>> >>> On 6/9/11 1:58 PM, Jonathan Monette wrote: >>>> Sure, I can work on this. Whenever Justin is ready and has time we can meet or he can point me to what works. 
>>>> On Jun 9, 2011, at 1:50 PM, Michael Wilde wrote: >>>>> Hi Jon, >>>>> After discussing this with Justin earlier in the week, I suggested that you work with Justin on testing and continuing to enhance his new version of the Swift performance plotting package. >>>>> Can you work with Justin to locate the package, try the latest, review and update the documents, and start enhancing it under Justin's guidance to get the plots you need for Montage? >>>>> Ketan, I know that you have similar needs and have worked with Justin in the past on these (or similar) tools. Can you serve as an early user of the tools and give Jon feedback for further enhancement? >>>>> Lets do all discussion of the topic on swift-devel (including the handoff from you, Justin, explaining the state of development and where everything is located). >>>>> Thanks, >>>>> Mike >>>>> -- >>>>> Michael Wilde >>>>> Computation Institute, University of Chicago >>>>> Mathematics and Computer Science Division >>>>> Argonne National Laboratory >>> >> >> -- >> Justin M Wozniak > > -- Justin M Wozniak From jonmon at utexas.edu Mon Jun 13 16:06:57 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Mon, 13 Jun 2011 16:06:57 -0500 Subject: [Swift-devel] Re: Work with Justin on performance plotting In-Reply-To: References: <842815804.23526.1307645411336.JavaMail.root@zimbra.anl.gov> <7ED47B8E-BEFB-4F46-8BBA-D360174D8179@gmail.com> <4DF1193B.4090604@gmail.com> <21F846DB-1318-4BA4-B38F-361D6E2AF597@gmail.com> Message-ID: those lines are not on by default in the stable branch or the release 0.92. In the trunk log4j.properties file the only line on my default is the Cpu line. I will add the Block line so that they will be on my default. On Jun 13, 2011, at 3:55 PM, Justin M Wozniak wrote: > > I just added them. Those lines should be on by default. > > It would be good to document the minimal log4j settings required for each documented plot type. However, that would really clutter the README. 
> > On Mon, 13 Jun 2011, Jonathan Monette wrote: > >> Which are the necessary lines to run the sample plotter? I am going to add it to the README. It just says to set log4j.logger.swift=DEBUG. If I just set that I still don't see the CPU lines or the Block lines that need to show up in the logs. > >> On Jun 13, 2011, at 10:53 AM, Justin M Wozniak wrote: >> >>> Hello >>> I don't have much to add other than the READMEs and link below. I have tested and documented only a few plots; let's continue to add to the README and build up some additional good use cases. >>> Justin >>> >>> On Thu, 9 Jun 2011, ketan wrote: >>> >>>> Jon, >>>> >>>> To give you heads up, the performance plotting tools are located in trunk at: swift/libexec/log-processing >>>> >>>> The Readme has some information on setup and generating plots. >>>> >>>> More follows on the tools I added. >>>> >>>> Ketan >>>> >>>> On 6/9/11 1:58 PM, Jonathan Monette wrote: >>>>> Sure, I can work on this. Whenever Justin is ready and has time we can meet or he can point me to what works. >>>>> On Jun 9, 2011, at 1:50 PM, Michael Wilde wrote: >>>>>> Hi Jon, >>>>>> After discussing this with Justin earlier in the week, I suggested that you work with Justin on testing and continuing to enhance his new version of the Swift performance plotting package. >>>>>> Can you work with Justin to locate the package, try the latest, review and update the documents, and start enhancing it under Justin's guidance to get the plots you need for Montage? >>>>>> Ketan, I know that you have similar needs and have worked with Justin in the past on these (or similar) tools. Can you serve as an early user of the tools and give Jon feedback for further enhancement? >>>>>> Lets do all discussion of the topic on swift-devel (including the handoff from you, Justin, explaining the state of development and where everything is located). 
>>>>>> Thanks, >>>>>> Mike >>>>>> -- >>>>>> Michael Wilde >>>>>> Computation Institute, University of Chicago >>>>>> Mathematics and Computer Science Division >>>>>> Argonne National Laboratory >>>> >>> >>> -- >>> Justin M Wozniak >> >> > > -- > Justin M Wozniak From ketancmaheshwari at gmail.com Mon Jun 13 16:47:19 2011 From: ketancmaheshwari at gmail.com (ketan) Date: Mon, 13 Jun 2011 16:47:19 -0500 Subject: [Swift-devel] java.lang.IncompatibleClassChangeError when using Trunk Message-ID: <4DF68567.4000404@gmail.com> Hello, I am getting this exception using trunk: Uncaught exception: java.lang.IncompatibleClassChangeError: Expecting non-static method org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; in import @ vdl-sc.k, line: 5 java.lang.IncompatibleClassChangeError: Expecting non-static method org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; at org.globus.cog.karajan.parser.Grammar.load(Grammar.java:26) at org.globus.cog.karajan.parser.Grammar.(Grammar.java:22) at org.globus.cog.karajan.parser.Parser.(Parser.java:40) at org.globus.cog.karajan.translator.KarajanTranslator.getParser(KarajanTranslator.java:33) at org.globus.cog.karajan.translator.KarajanTranslator.translate(KarajanTranslator.java:58) at org.globus.cog.karajan.workflow.nodes.Include.includeFile(Include.java:263) at org.globus.cog.karajan.workflow.nodes.Include.checkArgs(Include.java:61) at org.globus.cog.karajan.workflow.nodes.Include.post(Include.java:73) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at 
org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Exception is: java.lang.IncompatibleClassChangeError: Expecting non-static method org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; Near Karajan line: import @ vdl-sc.k, line: 5 Execution failed: Uncaught exception: java.lang.IncompatibleClassChangeError: Expecting non-static method org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; Java is: [bridled:site_gen]$ java -version java version "1.6.0_23" Java(TM) SE Runtime Environment (build 1.6.0_23-b05) Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode) Attached are the Swift scripts and the sites.xml files. Ketan -------------- next part -------------- A non-text attachment was scrubbed... Name: sites.xml Type: text/xml Size: 19669 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_osg.swift URL: From wozniak at mcs.anl.gov Mon Jun 13 17:00:37 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Mon, 13 Jun 2011 17:00:37 -0500 (CDT) Subject: [Swift-devel] java.lang.IncompatibleClassChangeError when using Trunk In-Reply-To: <4DF68567.4000404@gmail.com> References: <4DF68567.4000404@gmail.com> Message-ID: I think you may need to do an ant clean. 
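For anyone hitting the same error: an IncompatibleClassChangeError of the "Expecting non-static method" kind typically means that some classes on the classpath were compiled against an older signature of the method (one that has since changed between static and non-static), so stale build output is the usual suspect and a clean rebuild the usual remedy. A minimal sketch of that rebuild follows. The `rebuild_swift` helper is hypothetical, the `cog/modules/swift` layout matches the build path seen earlier in this archive, and the default location and the `dist` target are assumptions not confirmed in this thread.

```shell
# Hypothetical helper: clean and rebuild a Swift trunk checkout so that no
# stale .class files compiled against older method signatures remain.
# The default path and the "dist" target are assumptions, not from the thread.
rebuild_swift() {
    src="${1:-$HOME/cog/modules/swift}"
    if [ ! -f "$src/build.xml" ]; then
        echo "rebuild_swift: no build.xml in $src" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    (cd "$src" && ant clean && ant dist)
}
```

Called as `rebuild_swift /path/to/cog/modules/swift`, it refuses to run anywhere that lacks a `build.xml`, which guards against cleaning the wrong directory.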
Justin On Mon, 13 Jun 2011, ketan wrote: > Hello, > > I am getting this exception using trunk: > > Uncaught exception: java.lang.IncompatibleClassChangeError: Expecting > non-static method > org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; > in import @ vdl-sc.k, line: 5 > java.lang.IncompatibleClassChangeError: Expecting non-static method > org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; > at org.globus.cog.karajan.parser.Grammar.load(Grammar.java:26) > at org.globus.cog.karajan.parser.Grammar.(Grammar.java:22) > at org.globus.cog.karajan.parser.Parser.(Parser.java:40) > at > org.globus.cog.karajan.translator.KarajanTranslator.getParser(KarajanTranslator.java:33) > at > org.globus.cog.karajan.translator.KarajanTranslator.translate(KarajanTranslator.java:58) > at > org.globus.cog.karajan.workflow.nodes.Include.includeFile(Include.java:263) > at > org.globus.cog.karajan.workflow.nodes.Include.checkArgs(Include.java:61) > at org.globus.cog.karajan.workflow.nodes.Include.post(Include.java:73) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > Exception is: java.lang.IncompatibleClassChangeError: Expecting non-static > method > org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; > Near Karajan line: import @ vdl-sc.k, line: 5 > Execution failed: > Uncaught exception: java.lang.IncompatibleClassChangeError: Expecting > non-static method > org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; > > Java is: > > [bridled:site_gen]$ java -version > java version "1.6.0_23" > Java(TM) SE Runtime Environment (build 1.6.0_23-b05) > Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode) > > > Attached are the Swift scripts and the sites.xml files. > > Ketan > > -- Justin M Wozniak From ketancmaheshwari at gmail.com Mon Jun 13 17:23:21 2011 From: ketancmaheshwari at gmail.com (ketan) Date: Mon, 13 Jun 2011 17:23:21 -0500 Subject: [Swift-devel] java.lang.IncompatibleClassChangeError when using Trunk In-Reply-To: References: <4DF68567.4000404@gmail.com> Message-ID: <4DF68DD9.5040800@gmail.com> That works, thanks. On 6/13/11 5:00 PM, Justin M Wozniak wrote: > > I think you may need to do an ant clean. 
> Justin > > On Mon, 13 Jun 2011, ketan wrote: > >> Hello, >> >> I am getting this exception using trunk: >> >> Uncaught exception: java.lang.IncompatibleClassChangeError: Expecting >> non-static method >> org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; >> in import @ vdl-sc.k, line: 5 >> java.lang.IncompatibleClassChangeError: Expecting non-static method >> org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; >> at org.globus.cog.karajan.parser.Grammar.load(Grammar.java:26) >> at org.globus.cog.karajan.parser.Grammar.(Grammar.java:22) >> at org.globus.cog.karajan.parser.Parser.(Parser.java:40) >> at >> org.globus.cog.karajan.translator.KarajanTranslator.getParser(KarajanTranslator.java:33) >> at >> org.globus.cog.karajan.translator.KarajanTranslator.translate(KarajanTranslator.java:58) >> at >> org.globus.cog.karajan.workflow.nodes.Include.includeFile(Include.java:263) >> at >> org.globus.cog.karajan.workflow.nodes.Include.checkArgs(Include.java:61) >> at >> org.globus.cog.karajan.workflow.nodes.Include.post(Include.java:73) >> at >> org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) >> at >> org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) >> at >> org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) >> at >> org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) >> at >> org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) >> at >> org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) >> at >> org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) >> at >> org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) >> at >> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) >> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) >> at 
java.util.concurrent.FutureTask.run(FutureTask.java:138) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> at java.lang.Thread.run(Thread.java:662) >> Exception is: java.lang.IncompatibleClassChangeError: Expecting >> non-static method >> org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; >> Near Karajan line: import @ vdl-sc.k, line: 5 >> Execution failed: >> Uncaught exception: java.lang.IncompatibleClassChangeError: >> Expecting non-static method >> org.globus.cog.util.TextFileLoader.loadFromResource(Ljava/lang/String;)Ljava/lang/String; >> >> Java is: >> >> [bridled:site_gen]$ java -version >> java version "1.6.0_23" >> Java(TM) SE Runtime Environment (build 1.6.0_23-b05) >> Java HotSpot(TM) 64-Bit Server VM (build 19.0-b09, mixed mode) >> >> >> Attached are the Swift scripts and the sites.xml files. >> >> Ketan >> >> > From jonmon at utexas.edu Mon Jun 13 20:55:10 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Mon, 13 Jun 2011 20:55:10 -0500 Subject: [Swift-devel] vdl-int.k error Message-ID: <499CD289-04C5-4E18-BD1F-C19E6C5CB0BA@utexas.edu> Hello, I am receiving this error when trying to run using trunk. Execution failed: Illegal extra argument `raw_dir/raw_image_8.fits' to sys:each @ vdl-int.k, line: 447 My scripts run fine and correctly using the 0.92 release. From hategan at mcs.anl.gov Mon Jun 13 21:17:56 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 13 Jun 2011 19:17:56 -0700 Subject: [Swift-devel] Re: vdl-int.k error In-Reply-To: <499CD289-04C5-4E18-BD1F-C19E6C5CB0BA@utexas.edu> References: <499CD289-04C5-4E18-BD1F-C19E6C5CB0BA@utexas.edu> Message-ID: <1308017876.9549.0.camel@blabla2.none> Right. Fixed in trunk. 
This was a problem when setting wrapper.parameter.mode=files On Mon, 2011-06-13 at 20:55 -0500, Jonathan Monette wrote: > Hello, > I am receiving this error when trying to run using trunk. > > Execution failed: > Illegal extra argument `raw_dir/raw_image_8.fits' to sys:each @ vdl-int.k, line: 447 > > My scripts run fine and correctly using the 0.92 release. From jonmon at utexas.edu Mon Jun 13 21:28:28 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Mon, 13 Jun 2011 21:28:28 -0500 Subject: [Swift-devel] trunk error IndexOutOfBounds Message-ID: <7B033C86-6FF0-41DD-9A5E-B1621527872A@utexas.edu> I seem to be getting this error Executing script montage.swift Check /autonfs/gpfs-pads/projects/CI-CCR000013/jonmon/Swift/Montage/m101_tutorial/run.0004 for output and debugging information Swift svn swift-r4612 cog-r3162 RunID: 20110613-2124-ktnceob8 (input): found 10 files null Caused by: java.lang.IndexOutOfBoundsException at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:220) at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:244) at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268) at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314) at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487) at java.lang.Thread.run(Thread.java:662) null Caused by: java.lang.IndexOutOfBoundsException at java.io.FileInputStream.readBytes(Native Method) at java.io.FileInputStream.read(FileInputStream.java:220) at 
org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:244) at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268) at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314) at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46) at 
org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487) at java.lang.Thread.run(Thread.java:662) Failed to write to montage-20110613-2124-ktnceob8/info/l on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/p on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/u on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/s on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/r on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/n on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/o on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/m on pads. Verify that all directories specified in your sites file exist and are writable! Failed to write to montage-20110613-2124-ktnceob8/info/t on pads. Verify that all directories specified in your sites file exist and are writable! all files used for this run are in ~jonmon/PADS/Swift/Montage/m101_tutorial/run.0004 From hategan at mcs.anl.gov Mon Jun 13 21:54:12 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 13 Jun 2011 19:54:12 -0700 Subject: [Swift-devel] Re: trunk error IndexOutOfBounds In-Reply-To: <7B033C86-6FF0-41DD-9A5E-B1621527872A@utexas.edu> References: <7B033C86-6FF0-41DD-9A5E-B1621527872A@utexas.edu> Message-ID: <1308020052.12431.0.camel@blabla2.none> Oops. Fixed in cog trunk r3163. 
Mihael On Mon, 2011-06-13 at 21:28 -0500, Jonathan Monette wrote: > I seem to be getting this error > > Executing script montage.swift > Check /autonfs/gpfs-pads/projects/CI-CCR000013/jonmon/Swift/Montage/m101_tutorial/run.0004 for output and debugging information > Swift svn swift-r4612 cog-r3162 > > RunID: 20110613-2124-ktnceob8 > (input): found 10 files > null > Caused by: java.lang.IndexOutOfBoundsException From jonmon at utexas.edu Mon Jun 13 22:03:55 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Mon, 13 Jun 2011 22:03:55 -0500 Subject: [Swift-devel] Re: trunk error IndexOutOfBounds In-Reply-To: <1308020052.12431.0.camel@blabla2.none> References: <7B033C86-6FF0-41DD-9A5E-B1621527872A@utexas.edu> <1308020052.12431.0.camel@blabla2.none> Message-ID: <9CF51B4C-BD4D-4DF4-BAF8-C4B36848BE62@utexas.edu> Alright. That seemed to fix it. On Jun 13, 2011, at 9:54 PM, Mihael Hategan wrote: > Oops. Fixed in cog trunk r3163. > > Mihael > > On Mon, 2011-06-13 at 21:28 -0500, Jonathan Monette wrote: >> I seem to be getting this error >> >> Executing script montage.swift >> Check /autonfs/gpfs-pads/projects/CI-CCR000013/jonmon/Swift/Montage/m101_tutorial/run.0004 for output and debugging information >> Swift svn swift-r4612 cog-r3162 >> >> RunID: 20110613-2124-ktnceob8 >> (input): found 10 files >> null >> Caused by: java.lang.IndexOutOfBoundsException > > From turam at mcs.anl.gov Tue Jun 14 09:17:05 2011 From: turam at mcs.anl.gov (Thomas Uram) Date: Tue, 14 Jun 2011 09:17:05 -0500 Subject: [Swift-devel] Documentation of sites.xml Message-ID: I know there's been a documentation push, so: Where can I find the best documentation of available sites.xml options? Is there, for example, a gsissh provider? 
Tom From hategan at mcs.anl.gov Tue Jun 14 14:37:39 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 14 Jun 2011 12:37:39 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: Message-ID: <1308080259.27688.1.camel@blabla2.none> On Tue, 2011-06-14 at 09:17 -0500, Thomas Uram wrote: > I know there's been a documentation push, so: Where can I find the > best documentation of available sites.xml options? Is there, for > example, a gsissh provider? Don't know much about the documentation of that, but I know the ssh provider doesn't support gsi authentication. How badly do you need it? From turam at mcs.anl.gov Tue Jun 14 14:52:46 2011 From: turam at mcs.anl.gov (Thomas Uram) Date: Tue, 14 Jun 2011 14:52:46 -0500 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: <1308080259.27688.1.camel@blabla2.none> References: <1308080259.27688.1.camel@blabla2.none> Message-ID: Not badly. It seems like a natural fit for TeraGrid work, so I expected it to be there already. I started with only the question about a gsissh provider, but then came upon another: - can the workdirectory be a relative directory? or must i know the absolute path to the home directory on each compute site to create a pool entry in sites.xml? On Jun 14, 2011, at 2:37 PM, Mihael Hategan wrote: > On Tue, 2011-06-14 at 09:17 -0500, Thomas Uram wrote: >> I know there's been a documentation push, so: Where can I find the >> best documentation of available sites.xml options? Is there, for >> example, a gsissh provider? > > Don't know much about the documentation of that, but I know the ssh > provider doesn't support gsi authentication. How badly do you need it? 
> > From hategan at mcs.anl.gov Tue Jun 14 15:06:03 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 14 Jun 2011 13:06:03 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: <1308080259.27688.1.camel@blabla2.none> Message-ID: <1308081963.28652.4.camel@blabla2.none> On Tue, 2011-06-14 at 14:52 -0500, Thomas Uram wrote: > Not badly. It seems like a natural fit for TeraGrid work, so I expected it to be there already. The ssh library that we use doesn't support it (out of the box), and we figured on TG Globus will suffice. > > I started with only the question about a gsissh provider, but then came upon another: > > - can the workdirectory be a relative directory? or must i know the > absolute path to the home directory on each compute site to create a > pool entry in sites.xml? > there's one way to find out... I don't think we tried it in swift, but I know I tried hard in cog to keep the semantics of non-absolute paths consistent across the board (i.e. they are meant relative to the home directory, whatever that may be on a site). From davidkelly999 at gmail.com Tue Jun 14 15:13:12 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Tue, 14 Jun 2011 15:13:12 -0500 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: Message-ID: Hi Tom, I don't think there is any documentation that exists which describes every sites.xml option.. but there should be. I created a bugzilla ticket for myself to add this. The closest thing right now is the sites.xml entry of the userguide at http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_the_site_catalog_sites_xml, but it is incomplete. David On Tue, Jun 14, 2011 at 9:17 AM, Thomas Uram wrote: > > I know there's been a documentation push, so: Where can I find the best > documentation of available sites.xml options? Is there, for example, a > gsissh provider? 
> > Tom > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > From hategan at mcs.anl.gov Tue Jun 14 15:19:00 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 14 Jun 2011 13:19:00 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: Message-ID: <1308082740.29249.2.camel@blabla2.none> And now that we're revamping things, maybe we should have "filesystem" and "execution" listed first and the deprecated "gridftp" and "jobmanager" listed last or as a footnote. Mihael On Tue, 2011-06-14 at 15:13 -0500, David Kelly wrote: > Hi Tom, > > I don't think there is any documentation that exists which describes > every sites.xml option.. but there should be. I created a bugzilla > ticket for myself to add this. The closest thing right now is the > sites.xml entry of the userguide at > http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_the_site_catalog_sites_xml, but it is incomplete. > > David > > > On Tue, Jun 14, 2011 at 9:17 AM, Thomas Uram > wrote: > > I know there's been a documentation push, so: Where can I find > the best documentation of available sites.xml options? Is > there, for example, a gsissh provider? 
> > Tom From turam at mcs.anl.gov Tue Jun 14 17:07:13 2011 From: turam at mcs.anl.gov (Thomas Uram) Date: Tue, 14 Jun 2011 17:07:13 -0500 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: <1308081963.28652.4.camel@blabla2.none> References: <1308080259.27688.1.camel@blabla2.none> <1308081963.28652.4.camel@blabla2.none> Message-ID: > there's one way to find out... I did try it (before posting) and it failed: 708/wc.file-20110613-1127-z8rf4ugb.log:500 End.] [Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply: 500-Command failed : System error in mkdir: Permission denied 708/wc.file-20110613-1127-z8rf4ugb.log:Caused by: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message: Server refused creating directory (error code 1) [Nested exception message: Custom message: Unexpected reply: 500-Command failed : System error in mkdir: Permission denied Presumably it would have been trying to create the dir in my home directory and it should have succeeded, but I'll look into it and find out. In any case, I was preferring to take a more informed approach and implement according to the docs, hence my question. Tom On Jun 14, 2011, at 3:06 PM, Mihael Hategan wrote: > On Tue, 2011-06-14 at 14:52 -0500, Thomas Uram wrote: >> Not badly. It seems like a natural fit for TeraGrid work, so I expected it to be there already. > > The ssh library that we use doesn't support it (out of the box), and we > figured on TG Globus will suffice. 
> >> >> I started with only the question about a gsissh provider, but then came upon another: >> >> - can the workdirectory be a relative directory? or must i know the >> absolute path to the home directory on each compute site to create a >> pool entry in sites.xml? >> > > there's one way to find out... > > I don't think we tried it in swift, but I know I tried hard in cog to > keep the semantics of non-absolute paths consistent across the board > (i.e. they are meant relative to the home directory, whatever that may > be on a site). > > From hategan at mcs.anl.gov Tue Jun 14 17:11:07 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 14 Jun 2011 15:11:07 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: <1308080259.27688.1.camel@blabla2.none> <1308081963.28652.4.camel@blabla2.none> Message-ID: <1308089467.1830.1.camel@blabla2.none> On Tue, 2011-06-14 at 17:07 -0500, Thomas Uram wrote: > > there's one way to find out... > > > I did try it (before posting) and it failed: > > 708/wc.file-20110613-1127-z8rf4ugb.log:500 End.] [Nested exception is org.globus.ftp.exception.UnexpectedReplyCodeException: Custom message: Unexpected reply: 500-Command failed : System error in mkdir: Permission denied > 708/wc.file-20110613-1127-z8rf4ugb.log:Caused by: org.globus.ftp.exception.ServerException: Server refused performing the request. Custom message: Server refused creating directory (error code 1) [Nested exception message: Custom message: Unexpected reply: 500-Command failed : System error in mkdir: Permission denied > > Presumably it would have been trying to create the dir in my home directory and it should have succeeded, but I'll look into it and find out. > > In any case, I was preferring to take a more informed approach and implement according to the docs, hence my question. Right. The answer is "that's not supported, but it may just work" (although it's obvious by now that it doesn't). 
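Since relative work directories are "not supported" per the reply above, one practical precaution is to check a sites.xml for non-absolute workdirectory values before running. The following is only a rough sketch, not part of Swift: it assumes the site catalog is laid out as pool elements that each carry a handle attribute and a workdirectory child (optionally inside an XML namespace). The function names and the sample handles and paths are made up for illustration.

```python
import posixpath
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def relative_workdirs(sites_xml_text):
    """Return (pool handle, workdirectory) pairs whose path is not absolute."""
    root = ET.fromstring(sites_xml_text)
    flagged = []
    for pool in root.iter():
        if local_name(pool.tag) != "pool":
            continue
        for child in pool:
            if local_name(child.tag) != "workdirectory":
                continue
            path = (child.text or "").strip()
            # Assumption: remote sites use POSIX-style paths.
            if not posixpath.isabs(path):
                flagged.append((pool.get("handle"), path))
    return flagged


# Hypothetical sample catalog; handles and paths are illustrative only.
SAMPLE = """<config>
  <pool handle="siteA">
    <workdirectory>/home/user/swiftwork</workdirectory>
  </pool>
  <pool handle="siteB">
    <workdirectory>swiftwork</workdirectory>
  </pool>
</config>
"""

if __name__ == "__main__":
    for handle, path in relative_workdirs(SAMPLE):
        print(f"{handle}: workdirectory '{path}' is relative")
```

Run against a real catalog by reading the file's text and passing it to relative_workdirs; any pool it reports would need the absolute path filled in by hand for that site.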
From skenny at uchicago.edu Wed Jun 15 01:05:27 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Tue, 14 Jun 2011 23:05:27 -0700 Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found In-Reply-To: <20110611043107.F2349563A9@wind.mcs.anl.gov> References: <20110611043107.F2349563A9@wind.mcs.anl.gov> Message-ID: hey david, i've been trying to replicate this bug, but when i deliberately point to an executable that doesn't exist i get what seems to be an appropriate error: The executable /bin/echoo does not exist i got this on a couple of sites...can you give a little more info on what was happening with your workflow? could be i'm misunderstanding the bug. ~sk On Fri, Jun 10, 2011 at 9:31 PM, wrote: > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=183 > > > David Kelly changed: > > What |Removed |Added > > ---------------------------------------------------------------------------- > CC| |davidkelly999 at gmail.com > Component|SwiftScript language |error messages > AssignedTo|benc at hawaga.org.uk |skenny at uchicago.edu > > > > > --- Comment #1 from David Kelly 2011-06-10 > 23:32:22 --- > This one gets my vote.. I just wasted a bunch of time trying to figure out > what > this error meant and why I was getting it. > > -- > Configure bugmail: > https://bugzilla.mcs.anl.gov/swift/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are the assignee for the bug. From jonmon at utexas.edu Wed Jun 15 09:24:01 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Wed, 15 Jun 2011 09:24:01 -0500 Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found In-Reply-To: References: <20110611043107.F2349563A9@wind.mcs.anl.gov> Message-ID: <2B490651-432D-42DC-9844-571D6D7E80C8@utexas.edu> I don't think that is the problem. 
What if you have the wrong path to the executable not just ? This is the error I got. RunID: 20110615-0909-if0uyvm1 Progress: Progress: Failed but can retry:1 Job failed with an exit code of 254 Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 254 Execution failed: Job failed with an exit code of 254 I used the stable release to reproduce this and this is the error that I got using the trunk. Progress: time: Wed, 15 Jun 2011 09:18:42 -0500 Initializing:1 Progress: time: Wed, 15 Jun 2011 09:18:43 -0500 Failed but can retry:1 null Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 254 org.globus.cog.abstraction.impl.file.FileNotFoundException: File not found: /home/jonmon/test/work/localhost/tc_test-20110615-0918-46b6tgld/jobs/8/hello-80d4plbk/stderr.txt at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:225) at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268) at org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314) at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487) at java.lang.Thread.run(Thread.java:662) org.globus.cog.abstraction.impl.file.FileNotFoundException: File not found: /home/jonmon/test/work/localhost/tc_test-20110615-0918-46b6tgld/jobs/8/hello-80d4plbk/?:string = out.txt - Closed at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.getFile(FileResourceImpl.java:225) at org.globus.cog.abstraction.impl.file.local.FileResourceImpl.putFile(FileResourceImpl.java:268) at 
org.globus.cog.abstraction.impl.file.AbstractFileResource.putFile(AbstractFileResource.java:158) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.doDestination(DelegatedFileTransferHandler.java:314) at org.globus.cog.abstraction.impl.fileTransfer.CachingDelegatedFileTransferHandler.doDestination(CachingDelegatedFileTransferHandler.java:46) at org.globus.cog.abstraction.impl.fileTransfer.DelegatedFileTransferHandler.run(DelegatedFileTransferHandler.java:487) at java.lang.Thread.run(Thread.java:662) Execution failed: Job failed with an exit code of 254 The files used are located in ~jonmon/test on the ci machines. On Jun 15, 2011, at 1:05 AM, Sarah Kenny wrote: > hey david, i've been trying to replicate this bug, but when i deliberately point to an executable that doesn't exist i get what seems to be an appropriate error: > > The executable /bin/echoo does not exist > > i got this on a couple of sites...can you give a little more info on what was happening with your workflow? could be i'm misunderstanding the bug. > > ~sk > > On Fri, Jun 10, 2011 at 9:31 PM, wrote: > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=183 > > > David Kelly changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |davidkelly999 at gmail.com > Component|SwiftScript language |error messages > AssignedTo|benc at hawaga.org.uk |skenny at uchicago.edu > > > > > --- Comment #1 from David Kelly 2011-06-10 23:32:22 --- > This one gets my vote.. I just wasted a bunch of time trying to figure out what > this error meant and why I was getting it. > > -- > Configure bugmail: https://bugzilla.mcs.anl.gov/swift/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are the assignee for the bug. 
From jonmon at utexas.edu Wed Jun 15 11:03:58 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Wed, 15 Jun 2011 11:03:58 -0500 Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found In-Reply-To: <2CB58196-7B17-478C-A074-580792A92504@gmail.com> References: <20110611043107.F2349563A9@wind.mcs.anl.gov> <2B490651-432D-42DC-9844-571D6D7E80C8@utexas.edu> <2CB58196-7B17-478C-A074-580792A92504@gmail.com> Message-ID: From the error you got and the error I got I am thinking that the path to an executable and the executable name in the tc file are different things. /bin/echoo says that the executable echoo does not exist in /bin. Something like /home/jonmon/hereiam/echo, the directory hereiam doesn't exist so it returns a different error. Maybe the path in the tc.data file was supposed to have hereami not hereiam. That is what I meant by "not just". The path to an executable doesn't exist "not just" the executable is missing. On Jun 15, 2011, at 10:58 AM, sarah kenny wrote: > > > On Jun 15, 2011, at 7:24 AM, Jonathan Monette wrote: > >> I don't think that is the problem. What if you have the wrong path to the executable not just ? This is the error I got. > > sorry I don't understand what is meant by "not just ?" I have the wrong path to the executable in my tc.data... From skenny at uchicago.edu Wed Jun 15 11:34:39 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Wed, 15 Jun 2011 09:34:39 -0700 Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found In-Reply-To: References: <20110611043107.F2349563A9@wind.mcs.anl.gov> <2B490651-432D-42DC-9844-571D6D7E80C8@utexas.edu> <2CB58196-7B17-478C-A074-580792A92504@gmail.com> Message-ID: hmm...so in my tc.data file i have the entry: localhost echo /bad/path/echo INSTALLED INTEL64::LINUX null then running 'hello world' i still get what seems like a reasonable error: The executable /bad/path/echo does not exist so apparently it's not just the path/executable name that's giving you guys an error. On Wed, Jun 15, 2011 at 9:03 AM, Jonathan Monette wrote: > From the error you got and the error I got I am thinking that the path to > an executable and the executable name in the tc file are different things. > /bin/echoo says that the executable echoo does not exist in /bin. > Something like /home/jonmon/hereiam/echo, the directory hereiam doesn't > exist so it returns a different error. Maybe the path in the tc.data file > was supposed to have hereami not hereiam. That is what I meant by "not > just". The path to an executable doesn't exist "not just" the executable is > missing. > > On Jun 15, 2011, at 10:58 AM, sarah kenny wrote: > > > > On Jun 15, 2011, at 7:24 AM, Jonathan Monette wrote: > > I don't think that is the problem. What if you have the wrong path to the > executable not just ? This is the error I got. > > > sorry I don't understand what is meant by "not just ?" I have the wrong > path to the executable in my tc.data... 
From davidkelly999 at gmail.com Wed Jun 15 11:37:17 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Wed, 15 Jun 2011 11:37:17 -0500
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID:

Sarah,

I saw this error when I copied configuration files from one machine to another. The application I was trying to run was in my home directory, but I was logged in under a username so the path did not exist. The error message I got was too vague to understand why my script failed - it was just something like "Job failed with an exit code of 254". It didn't say which job, or what that meant. I thought it was something related to the scheduler I was trying to use. I have only tested this with 0.92.1. I did not try it on trunk, so maybe this has already been fixed. I will try it again today to see if I can reproduce it.

David

From skenny at uchicago.edu Wed Jun 15 12:01:10 2011
From: skenny at uchicago.edu (Sarah Kenny)
Date: Wed, 15 Jun 2011 10:01:10 -0700
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID:

yeah, if you're able to reproduce david that might be helpful. i get the same behavior with the stable release and trunk. i also get the correct error if i simply don't have execute permission on the executable.

From jonmon at utexas.edu Wed Jun 15 12:04:28 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Wed, 15 Jun 2011 12:04:28 -0500
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID:

In ~jonmon/test there is a script that reproduces this error I believe. The executable it is trying to use is in ~jonmon. The tc.data file points to ~jonmon/Library/hello.sh You should be able to run the script there in the directory. If not you can copy the files to yours and try it.

From skenny at uchicago.edu Wed Jun 15 13:15:47 2011
From: skenny at uchicago.edu (Sarah Kenny)
Date: Wed, 15 Jun 2011 11:15:47 -0700
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID:

copying over to my own machine and running with stable release:

[skenny at martini missing_app]$ swift -tc.file ./tc.data tc_test.swift
Swift svn swift-r3876 cog-r3007

RunID: 20110615-1113-ef32g0te
Progress:
The executable /home/jonmon/Library/hello.sh does not exist

Progress: Stage in:1
The executable /home/jonmon/Library/hello.sh does not exist

Progress: Stage in:1
The executable /home/jonmon/Library/hello.sh does not exist

Execution failed:
The executable /home/jonmon/Library/hello.sh does not exist

From jonmon at utexas.edu Wed Jun 15 13:24:12 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Wed, 15 Jun 2011 13:24:12 -0500
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID:

In your .swift/swift.properties try adding the line status.mode=provider

From skenny at uchicago.edu Wed Jun 15 13:34:58 2011
From: skenny at uchicago.edu (Sarah Kenny)
Date: Wed, 15 Jun 2011 11:34:58 -0700
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID:

aaaand there it is:

[skenny at martini missing_app]$ swift -config swift.properties tc_test.swift
Swift svn swift-r3876 cog-r3007

RunID: 20110615-1133-yxhqi1k6
Progress:
Job failed with an exit code of 254
Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 254
Progress: Stage in:1
Job failed with an exit code of 254
Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 254
Progress: Stage in:1
Job failed with an exit code of 254
Caused by: org.globus.cog.abstraction.impl.common.execution.JobException: Job failed with an exit code of 254
Execution failed:
Job failed with an exit code of 254

From jonmon at utexas.edu Wed Jun 15 13:37:24 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Wed, 15 Jun 2011 13:37:24 -0500
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To:
References: <20110611043107.F2349563A9@wind.mcs.anl.gov>
Message-ID: <5BBF71F1-D1B7-4B56-A967-4E260EA741C7@utexas.edu>

Yea. Sorry for all that. Just barely dawned on me that it could have been a property that changed the error messages.

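Since the runs in this thread suggest that status.mode=provider leaves only the bare "exit code of 254" message, one possible workaround is to check tc.data before submitting. What follows is a minimal sketch and not part of Swift. It assumes the six-column layout of the tc.data entry quoted earlier in the thread (site, transformation, path, status, platform, profile), treats lines starting with # as comments (an assumption), and is only meaningful for entries whose paths are visible from the machine it runs on, such as localhost. The name check_tc_entries is hypothetical.

```python
import os


def check_tc_entries(tc_text):
    """Return (site, app, path, problem) tuples for tc.data entries whose
    executable is missing or lacks execute permission on this machine.

    Assumed layout (taken from the entry quoted in this thread):
        site  transformation  path  status  platform  profile
    """
    problems = []
    for raw in tc_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines (the comment syntax is an assumption).
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # not enough columns to contain a path
        site, app, path = fields[0], fields[1], fields[2]
        if not os.path.isfile(path):
            problems.append((site, app, path, "does not exist"))
        elif not os.access(path, os.X_OK):
            problems.append((site, app, path, "is not executable"))
    return problems


# Demo using the deliberately wrong entry from earlier in the thread.
sample = (
    "# hypothetical comment line\n"
    "\n"
    "localhost echo /bad/path/echo INSTALLED INTEL64::LINUX null\n"
)
found = check_tc_entries(sample)
for site, app, path, problem in found:
    print(f"{site}: executable for '{app}' ({path}) {problem}")
```

To check a real catalog, read the file and pass its contents in, e.g. check_tc_entries(open("tc.data").read()). Entries for remote sites will be reported as missing when checked from a different host, so filter on the site column first.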
From wilde at mcs.anl.gov Wed Jun 15 13:48:06 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 15 Jun 2011 13:48:06 -0500 (CDT)
Subject: [Swift-devel] Re: [Bug 183] Print better error message when app executable is not found
In-Reply-To: <5BBF71F1-D1B7-4B56-A967-4E260EA741C7@utexas.edu>
Message-ID: <1739859262.13031.1308163686072.JavaMail.root@zimbra.anl.gov>

Related to this, I noticed when debugging Papia's script that adding the tag to the sites pool entry further degrades error reporting as well. The scratch tag tells _swiftwrap to place the jobdir on local disk instead of under the shared workdirectory. But when errors occur in this mode, you get a cryptic "error code 1" error and no info log or debug info returned. I'll file this as a ticket, and we can later gather evidence of any poor error reporting behavior.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From turam at mcs.anl.gov Wed Jun 15 13:55:47 2011
From: turam at mcs.anl.gov (Thomas Uram)
Date: Wed, 15 Jun 2011 13:55:47 -0500
Subject: [Swift-devel] Documentation of sites.xml
In-Reply-To: <1308082740.29249.2.camel@blabla2.none>
References: <1308082740.29249.2.camel@blabla2.none>
Message-ID:

If the spec is deprecated, what's the equivalent spec? I tried and got

Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException: Could not find a suitable service/provider for host TeraGrid-Ranger

This is, admittedly, with swift 0.92.

On Jun 14, 2011, at 3:19 PM, Mihael Hategan wrote:

> And now that we're revamping things, maybe we should have "filesystem"
> and "execution" listed first and the deprecated "gridftp" and
> "jobmanager" listed last or as a footnote.
>
> Mihael
>
> On Tue, 2011-06-14 at 15:13 -0500, David Kelly wrote:
>> Hi Tom,
>>
>> I don't think there is any documentation that exists which describes
>> every sites.xml option.. but there should be. I created a bugzilla
>> ticket for myself to add this. The closest thing right now is the
>> sites.xml entry of the userguide at
>> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_the_site_catalog_sites_xml, but it is incomplete.
>> >> David >> >> >> On Tue, Jun 14, 2011 at 9:17 AM, Thomas Uram >> wrote: >> >> I know there's been a documentation push, so: Where can I find >> the best documentation of available sites.xml options? Is >> there, for example, a gsissh provider? >> >> Tom >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > From ketancmaheshwari at gmail.com Wed Jun 15 13:59:47 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Wed, 15 Jun 2011 13:59:47 -0500 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: <1308082740.29249.2.camel@blabla2.none> Message-ID: should work. On Wed, Jun 15, 2011 at 1:55 PM, Thomas Uram wrote: > If the spec is deprecated, what's the equivalent > spec? I tried > > > > and got > > Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException: > Could not find a suitable service/provider for host TeraGrid-Ranger > > This is, admittedly, with swift 0.92. > > > > > On Jun 14, 2011, at 3:19 PM, Mihael Hategan wrote: > > > And now that we're revamping things, maybe we should have "filesystem" > > and "execution" listed first and the deprecated "gridftp" and > > "jobmanager" listed last or as a footnote. > > > > Mihael > > > > On Tue, 2011-06-14 at 15:13 -0500, David Kelly wrote: > >> Hi Tom, > >> > >> I don't think there is any documentation that exists which describes > >> every sites.xml option.. but there should be. I created a bugzilla > >> ticket for myself to add this. The closest thing right now is the > >> sites.xml entry of the userguide at > >> > http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_the_site_catalog_sites_xml, > but it is incomplete. 
> >> > >> David > >> > >> > >> On Tue, Jun 14, 2011 at 9:17 AM, Thomas Uram > >> wrote: > >> > >> I know there's been a documentation push, so: Where can I find > >> the best documentation of available sites.xml options? Is > >> there, for example, a gsissh provider? > >> > >> Tom > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From turam at mcs.anl.gov Wed Jun 15 14:21:53 2011 From: turam at mcs.anl.gov (Thomas Uram) Date: Wed, 15 Jun 2011 14:21:53 -0500 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: <1308082740.29249.2.camel@blabla2.none> Message-ID: <7CDF4F1E-BE01-4290-BCB4-3581A65C6884@mcs.anl.gov> Yes of course, but Mihael is saying that is deprecated, so I'm looking for the equivalent spec. On Jun 15, 2011, at 1:59 PM, Ketan Maheshwari wrote: > > > should work. > > On Wed, Jun 15, 2011 at 1:55 PM, Thomas Uram wrote: > If the spec is deprecated, what's the equivalent spec? I tried > > > > and got > > Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException: Could not find a suitable service/provider for host TeraGrid-Ranger > > This is, admittedly, with swift 0.92. > > > > > On Jun 14, 2011, at 3:19 PM, Mihael Hategan wrote: > > > And now that we're revamping things, maybe we should have "filesystem" > > and "execution" listed first and the deprecated "gridftp" and > > "jobmanager" listed last or as a footnote. 
> > > > Mihael > > > > On Tue, 2011-06-14 at 15:13 -0500, David Kelly wrote: > >> Hi Tom, > >> > >> I don't think there is any documentation that exists which describes > >> every sites.xml option.. but there should be. I created a bugzilla > >> ticket for myself to add this. The closest thing right now is the > >> sites.xml entry of the userguide at > >> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_the_site_catalog_sites_xml, but it is incomplete. > >> > >> David > >> > >> > >> On Tue, Jun 14, 2011 at 9:17 AM, Thomas Uram > >> wrote: > >> > >> I know there's been a documentation push, so: Where can I find > >> the best documentation of available sites.xml options? Is > >> there, for example, a gsissh provider? > >> > >> Tom > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Wed Jun 15 14:51:34 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 15 Jun 2011 12:51:34 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: References: <1308082740.29249.2.camel@blabla2.none> Message-ID: <1308167494.2105.3.camel@blabla> On Wed, 2011-06-15 at 13:55 -0500, Thomas Uram wrote: > If the spec is deprecated, what's the equivalent spec? I tried > > The correct provider is "gsiftp". Though I was under the impression that "gridftp" should work as well. I'll check that. 
From hategan at mcs.anl.gov Wed Jun 15 15:05:12 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 15 Jun 2011 13:05:12 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: <1308167494.2105.3.camel@blabla> References: <1308082740.29249.2.camel@blabla2.none> <1308167494.2105.3.camel@blabla> Message-ID: <1308168312.2105.4.camel@blabla> On Wed, 2011-06-15 at 12:51 -0700, Mihael Hategan wrote: > On Wed, 2011-06-15 at 13:55 -0500, Thomas Uram wrote: > > If the spec is deprecated, what's the equivalent spec? I tried > > > > > > The correct provider is "gsiftp". > > Though I was under the impression that "gridftp" should work as well. > I'll check that. Right. It turns out that only canonical provider names can be used (aliases cause errors like that). This should be fix-able. From hategan at mcs.anl.gov Wed Jun 15 15:41:11 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 15 Jun 2011 13:41:11 -0700 Subject: [Swift-devel] Documentation of sites.xml In-Reply-To: <1308168312.2105.4.camel@blabla> References: <1308082740.29249.2.camel@blabla2.none> <1308167494.2105.3.camel@blabla> <1308168312.2105.4.camel@blabla> Message-ID: <1308170471.4446.0.camel@blabla> On Wed, 2011-06-15 at 13:05 -0700, Mihael Hategan wrote: > On Wed, 2011-06-15 at 12:51 -0700, Mihael Hategan wrote: > > On Wed, 2011-06-15 at 13:55 -0500, Thomas Uram wrote: > > > If the spec is deprecated, what's the equivalent spec? I tried > > > > > > > > > > The correct provider is "gsiftp". > > > > Though I was under the impression that "gridftp" should work as well. > > I'll check that. Right. Provider aliases are now included in whatever list they needed to be included to make this work. This is trunk. 
From yadudoc1729 at gmail.com Fri Jun 17 09:56:42 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Fri, 17 Jun 2011 20:26:42 +0530 Subject: [Swift-devel] Associative array in Swift [GSoC] Message-ID: Hi, I've been working on implementing Associative arrays in swift for the past couple of weeks. So far I've gotten simple strings to work as valid array subscripts, which means that the following works : int a[ ]; a["zero"] = 100 ; trace ( a["zero"] ) ; I've posted the edits on the google docs page[1] I had shared earlier. I think I should probably host the code somewhere so that its available for review. Comments and suggestions would be greatly appreciated. [1] https://docs.google.com/document/d/1z5UvA2yUM_NjaATn-_YzH1EQUBrJvn2dMCT-nWIffLk/edit?hl=en_US -- Thanks and Regards, Yadu Nand B From wilde at mcs.anl.gov Fri Jun 17 11:20:11 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 17 Jun 2011 11:20:11 -0500 (CDT) Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: Message-ID: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> Yadu, this is excellent. Some of the next steps are: - add tests to the test suite for associative arrays - look into the behavior of the length() function - thoroughly test the trunk code with these changes (and possibly without, if you need to determine whether any problems uncovered by the tests are related to your changes or not) - work with Justin and Mihael and perhaps others on the team to review your changes. - update the user guide to describe the new capability - commit and announce the new capability Then you and Justin can go on to work on some new capabilities built on top of associative arrays, such as map-reduce mechanisms. Nice work on getting to this milestone! - Mike ----- Original Message ----- > Hi, > > I've been working on implementing Associative arrays in swift for the > past couple of weeks. 
So far I've gotten simple strings to work as > valid array subscripts, which means that the following works : > > int a[ ]; > a["zero"] = 100 ; > trace ( a["zero"] ) ; > > I've posted the edits on the google docs page[1] I had shared earlier. > I think I should probably host the code somewhere so that its > available > for review. > > Comments and suggestions would be greatly appreciated. > > [1] > https://docs.google.com/document/d/1z5UvA2yUM_NjaATn-_YzH1EQUBrJvn2dMCT-nWIffLk/edit?hl=en_US > -- > Thanks and Regards, > Yadu Nand B > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wilde at mcs.anl.gov Fri Jun 17 14:04:30 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 17 Jun 2011 14:04:30 -0500 (CDT) Subject: [Swift-devel] Swift log analysis tools In-Reply-To: <1341126804.20938.1308333874067.JavaMail.root@zimbra.anl.gov> Message-ID: <1447910017.21197.1308337470757.JavaMail.root@zimbra.anl.gov> In discussing the SCEC workflow with Ketan, it became clear that a tool to extract useful debugging information from the Swift run .log file would be very useful in diagnosing failures and understanding better how the run is behaving. Ketan is about to make a few log filter scripts to tease out and format this info. Does anyone have any similar tools that would be useful for this purpose? Do the new (or old) log plotting tools have any preprocessing scripts that can be used for this? Should we have common scripts to reformat the log for both human analysis and performance plotting?
- Mike From jonmon at utexas.edu Fri Jun 17 14:15:53 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Fri, 17 Jun 2011 14:15:53 -0500 Subject: [Swift-devel] Re: Swift log analysis tools In-Reply-To: <1447910017.21197.1308337470757.JavaMail.root@zimbra.anl.gov> References: <1447910017.21197.1308337470757.JavaMail.root@zimbra.anl.gov> Message-ID: <76A40E50-E115-4479-B5CA-AC048CB95724@gmail.com> On Jun 17, 2011, at 2:04 PM, Michael Wilde wrote: > > In discussing the SCEC workflow with Ketan, it became clear that a tool to extract useful debugging information from the Swift run .log file would be very useful in diagnosing failures and understanding better how the run is behaving. > > Ketan is about to make a few log filter scripts to tease out and format this info. > > Does anyone have any similar tools that would be useful for this purpose? The only script I have is one that counts the number of times an app has completed. I think it may also check how many times certain files have been staged in/out successfully, but I am not sure about that one. It is just a one-line script wrapping a grep command that I commonly run. > > Do the new (or old) log plotting tools have any preprocessing scripts that can be used for this? I am not sure how well the old tools work, and currently there are not many new tools. > > Should we have common scripts to reformat the log for both human analysis and performance plotting? I think this would be very helpful for both debugging and plotting purposes. I normally just eyeball the log file and see if anything funky is happening, but for large workflows that is very difficult. Scripts that pull out useful information would be helpful. But the logging levels that they need turned on should be well documented in a place that is easily found.
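The grep-based counting described above could be generalized roughly as follows. This is a sketch under stated assumptions: count_events is a hypothetical helper name, and the marker strings you pass must actually appear in your log at the logging levels you have enabled. The ones in the usage note are placeholders, not a documented Swift log format.

```shell
#!/bin/sh
# Sketch: count how many lines of a log file mention each given marker.
# count_events is a hypothetical helper; the markers are supplied by the
# caller and are not assumed to match any particular Swift release.
count_events() {
    log_file=$1; shift
    for marker in "$@"; do
        # grep -c prints the number of matching lines (0 if none);
        # -F treats the marker as a fixed string rather than a regex.
        n=$(grep -c -F -- "$marker" "$log_file" || true)
        printf '%s\t%s\n' "$marker" "$n"
    done
}
```

For example, count_events run.log JOB_START JOB_END would print one tab-separated line per marker, which is easy to eyeball or feed to a plotting script.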
> > - Mike From yadudoc1729 at gmail.com Fri Jun 17 14:20:34 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Sat, 18 Jun 2011 00:50:34 +0530 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> Message-ID: Hi Michael, > Yadu, this is excellent. Thank you :) > Then you and Justin can go on to work on some new capabilities built on top of associative arrays, such as map-reduce mechanisms. I've had some discussions with Justin over the features associative arrays must have in order to be useful for map-reduce. It seems that there must be some way to add multiple associations of values to the same key. Right now, if I try : a[0] = 100 ; a[0] = 200; I get an error saying " a is closed with value 100" Instead the above should probably make the lookup a[0] return a list of values, say [100, 200] . I could really use help on this. I will get started on the tests as soon as we decide on how the associative arrays should behave and whats allowed and not. -- Thanks and Regards, Yadu Nand B From alberto_chavez at live.com Fri Jun 17 14:28:13 2011 From: alberto_chavez at live.com (Alberto Chavez) Date: Fri, 17 Jun 2011 14:28:13 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider. 
Message-ID:

When I run the following SwiftScript using suite.sh, the report shows odd behavior: most of the time it times out, but once in a while it passes. The outcome seems completely random: sometimes the test passes 3 times in a row, and then all of a sudden it fails.

This is my script:

type messagefile;

app (messagefile t) greeting (string s[]) {
    echo s[0] s[1] s[2] stdout=@filename(t);
}

messagefile outfile <"q5out.txt">;
string words[] = ["how","are","you"];
outfile = greeting(words);

Swift.properties contents:

$ cat swift.properties
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=0
lazy.errors=false
status.mode=provider
use.provider.staging=false
provider.staging.pin.swiftfiles=false

Sites.template.xml contents:

$ cat sites.template.xml
127.0.0.1 1000 10000 4 8 1000 1 4 /tmp

-Alberto

From wilde at mcs.anl.gov Fri Jun 17 14:38:53 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 17 Jun 2011 14:38:53 -0500 (CDT)
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
Message-ID: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>

Alberto, how long are you letting it run for, and under what environment? If you are running on your laptop, how much RAM do you have? It's possible that you are seeing paging delays if you are running the Swift Java app with too little memory.

Also, are you running trunk or 0.92.1? You should compare the two.

It's *possible* that this simple test is hanging under recent trunk mods, but it's more likely that this is some kind of resource shortage.

Can you run this on one of the Swift lab machines bridled or communcado, or better yet on the MCS compute servers, or a PADS worker node (which you can get with qsub -I on pads)?

Look at Swift under the "top" command to see if Swift is running and slow, or is hung.

Stop by and we can discuss in more detail.
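For intermittent hangs like this, a small watchdog wrapper can record what the process was doing before it is killed. The following is a sketch, not an existing Swift or test-suite feature: run_with_deadline and DIAG_CMD are hypothetical names. The default diagnostic, jstack -l, is the JDK thread-dump tool, and it is only useful if the PID being watched is the JVM itself. Whether the swift launcher script hands its PID to java is an assumption you would need to check.

```shell
#!/bin/sh
# Sketch: run a command with a deadline (in seconds). If it is still alive
# when the deadline passes, run a diagnostic command against its PID, kill
# it, and return 124. Otherwise return the command's own exit status.
# ASSUMPTION: DIAG_CMD (default "jstack -l") accepts a PID as its last
# argument, and the background PID is the process worth inspecting.
run_with_deadline() {
    deadline=$1; shift
    "$@" &
    pid=$!
    elapsed=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$elapsed" -ge "$deadline" ]; then
            echo "PID $pid still running after ${deadline}s; dumping state" >&2
            ${DIAG_CMD:-jstack -l} "$pid" >&2 || true
            kill "$pid" 2>/dev/null || true
            wait "$pid" 2>/dev/null || true
            return 124
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    wait "$pid"    # propagate the command's exit status
}
```

A loop that calls run_with_deadline repeatedly and stops at the first 124 gives a simple "run until it hangs" harness, with the diagnostic output on stderr ready to attach to a report.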
- Mike ----- Original Message ----- When I run the following SwiftScript using suite.sh, the report shows an odd behavior, most of the time it times out, but once in a while it passes, however this outcome is completely random, since sometimes that test has passed 3 times in a row, and all of the sudden it fails. This is my script: type messagefile; app (messagefile t) greeting (string s[]) { echo s[0] s[1] s[2] stdout=@filename(t); } messagefile outfile <"q5out.txt">; string words[] = ["how","are","you"]; outfile = greeting(words); Swift.properties contents: $ cat swift.properties wrapperlog.always.transfer=true sitedir.keep=true execution.retries=0 lazy.errors=false status.mode=provider use.provider.staging=false provider.staging.pin.swiftfiles=false Sites.template.xml contents: $ cat sites.template.xml 127.0.0.1 1000 10000 4 8 1000 1 4 /tmp -Alberto _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidkelly999 at gmail.com Fri Jun 17 14:48:05 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Fri, 17 Jun 2011 14:48:05 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> References: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> Message-ID: I saw similar things on my laptop (4 gb ram) this weekend when I was testing the galaxy demo scripts using the local provider. I was using trunk. In the output I would see things like "no activity for 10s" and it just would sit there and do nothing until I manually killed it. But most of the time it would work fine. 
I wrote a little shell script that would repeatedly run it until it hung. Then I was talking to Jon about this and he saw something similar with his montage work. He thought it might be related to a configuration issue - that either wrapper.parameter.mode=files or status.mode=provider should be set. I can send my scripts as well if you need some help in tracking this down. David On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde wrote: > Alberto, how long are you letting it run for, and under what environment? > if you are running on your laptop, how much RAM do you have? Its possible > that you are seeing paging delays if you are running the Swift Java app with > too little memory. > > Also, are you running trunk or 0.92.1? You should compare the two. > > Its *possible* that this simple test is hanging under recent trunk mods, > but its more likely that this is some kind of resource shortage. > > Can you run this on one of the Swift lab machines bridled or communcado, or > better yet on the MCS compute servers, or a PADS worker node (which you can > get with qsub -I on pads)? > > Look at Swift under the "top" command to see if Swift is running and slow, > or is hung. > > Stop by and we can discuss in more detail. > > - Mike > > > ------------------------------ > > When I run the following SwiftScript using suite.sh, the report shows an > odd behavior, most of the time it times out, but once in a while it passes, > however this outcome is completely random, since sometimes that test has > passed 3 times in a row, and all of the sudden it fails. 
> This is my script: > > type messagefile; > > app (messagefile t) greeting (string s[]) { > echo s[0] s[1] s[2] stdout=@filename(t); > } > > messagefile outfile <"q5out.txt">; > > string words[] = ["how","are","you"]; > > outfile = greeting(words); > > > > Swift.properties contents: > > $ cat swift.properties > wrapperlog.always.transfer=true > sitedir.keep=true > execution.retries=0 > lazy.errors=false > status.mode=provider > use.provider.staging=false > provider.staging.pin.swiftfiles=false > > Sites.template.xml contents: > > $ cat sites.template.xml > > > > > key="internalHostname">127.0.0.1 > 1000 > 10000 > 4 > 8 > 1000 > 1 > 4 > /tmp > > > > -Alberto > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Fri Jun 17 14:54:32 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 17 Jun 2011 14:54:32 -0500 (CDT) Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: Message-ID: <1483288849.21477.1308340472974.JavaMail.root@zimbra.anl.gov> Yes, please post the scripts, David. This sounds like trunk instability. Is your trunk working copy and build up to date? Does that same script run quickly every time under 0.92.1? - Mike ----- Original Message ----- I saw similar things on my laptop (4 gb ram) this weekend when I was testing the galaxy demo scripts using the local provider. I was using trunk. 
In the output I would see things like "no activity for 10s" and it just would sit there and do nothing until I manually killed it. But most of the time it would work fine. I wrote a little shell script that would repeatedly run it until it hung. Then I was talking to Jon about this and he saw something similar with his montage work. He thought it might be related to a configuration issue - that either wrapper.parameter.mode=files or status.mode=provider should be set. I can send my scripts as well if you need some help in tracking this down. David On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde < wilde at mcs.anl.gov > wrote: Alberto, how long are you letting it run for, and under what environment? if you are running on your laptop, how much RAM do you have? Its possible that you are seeing paging delays if you are running the Swift Java app with too little memory. Also, are you running trunk or 0.92.1? You should compare the two. Its *possible* that this simple test is hanging under recent trunk mods, but its more likely that this is some kind of resource shortage. Can you run this on one of the Swift lab machines bridled or communcado, or better yet on the MCS compute servers, or a PADS worker node (which you can get with qsub -I on pads)? Look at Swift under the "top" command to see if Swift is running and slow, or is hung. Stop by and we can discuss in more detail. - Mike When I run the following SwiftScript using suite.sh, the report shows an odd behavior, most of the time it times out, but once in a while it passes, however this outcome is completely random, since sometimes that test has passed 3 times in a row, and all of the sudden it fails. 
This is my script: type messagefile; app (messagefile t) greeting (string s[]) { echo s[0] s[1] s[2] stdout=@filename(t); } messagefile outfile <"q5out.txt">; string words[] = ["how","are","you"]; outfile = greeting(words); Swift.properties contents: $ cat swift.properties wrapperlog.always.transfer=true sitedir.keep=true execution.retries=0 lazy.errors=false status.mode=provider use.provider.staging=false provider.staging.pin.swiftfiles=false Sites.template.xml contents: $ cat sites.template.xml 127.0.0.1 1000 10000 4 8 1000 1 4 /tmp -Alberto _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Fri Jun 17 14:56:25 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 17 Jun 2011 12:56:25 -0700 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: References: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> Message-ID: <1308340585.14231.0.camel@blabla> do "jstack -l " whenever it happens and send the output. On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > I saw similar things on my laptop (4 gb ram) this weekend when I was > testing the galaxy demo scripts using the local provider. I was using > trunk. In the output I would see things like "no activity for 10s" and > it just would sit there and do nothing until I manually killed it. But > most of the time it would work fine. 
I wrote a little shell script > that would repeatedly run it until it hung. Then I was talking to Jon > about this and he saw something similar with his montage work. He > thought it might be related to a configuration issue - that either > wrapper.parameter.mode=files or status.mode=provider should be set. > > I can send my scripts as well if you need some help in tracking this > down. > > David > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > wrote: > Alberto, how long are you letting it run for, and under what > environment? if you are running on your laptop, how much RAM > do you have? Its possible that you are seeing paging delays > if you are running the Swift Java app with too little memory. > > > Also, are you running trunk or 0.92.1? You should compare the > two. > > > Its *possible* that this simple test is hanging under recent > trunk mods, but its more likely that this is some kind of > resource shortage. > > > Can you run this on one of the Swift lab machines bridled or > communcado, or better yet on the MCS compute servers, or a > PADS worker node (which you can get with qsub -I on pads)? > > > Look at Swift under the "top" command to see if Swift is > running and slow, or is hung. > > > Stop by and we can discuss in more detail. > > > - Mike > > > > ______________________________________________________________ > > When I run the following SwiftScript using suite.sh, > the report shows an odd behavior, most of the time it > times out, but once in a while it passes, however this > outcome is completely random, since sometimes that > test has passed 3 times in a row, and all of the > sudden it fails. 
> This is my script: > > > type messagefile; > > > app (messagefile t) greeting (string s[]) { > echo s[0] s[1] s[2] stdout=@filename(t); > } > > > messagefile outfile <"q5out.txt">; > > > string words[] = ["how","are","you"]; > > > outfile = greeting(words); > > > > > > > > Swift.properties contents: > > > $ cat swift.properties > wrapperlog.always.transfer=true > sitedir.keep=true > execution.retries=0 > lazy.errors=false > status.mode=provider > use.provider.staging=false > provider.staging.pin.swiftfiles=false > > > Sites.template.xml contents: > > > $ cat sites.template.xml > > > > jobmanager="local:local"/> > key="internalHostname">127.0.0.1 > key="jobthrottle">1000 > key="initialScore">10000 > key="jobsPerNode">4 > key="slots">8 > key="maxTime">1000 > key="nodeGranularity">1 > key="maxNodes">4 > /tmp > > > > > -Alberto > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel From wilde at mcs.anl.gov Fri Jun 17 15:03:25 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 17 Jun 2011 15:03:25 -0500 (CDT) Subject: [Swift-devel] Swift unresponsive while using local provider. 
In-Reply-To: <1308340585.14231.0.camel@blabla> Message-ID: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov> This would be a great thing to build into the test suite for any test run that the suite is about to cancel for exceeding its max run time: capture the stack trace, so that if it's a deadlock, we can do some diagnosis right from the test results. Can we add some Swift commands to dump Swift's thread/future status as well? - Mike ----- Original Message ----- > do "jstack -l " whenever it happens and > send > the output. > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > I saw similar things on my laptop (4 gb ram) this weekend when I was > > testing the galaxy demo scripts using the local provider. I was > > using > > trunk. In the output I would see things like "no activity for 10s" > > and > > it just would sit there and do nothing until I manually killed it. > > But > > most of the time it would work fine. I wrote a little shell script > > that would repeatedly run it until it hung. Then I was talking to > > Jon > > about this and he saw something similar with his montage work. He > > thought it might be related to a configuration issue - that either > > wrapper.parameter.mode=files or status.mode=provider should be set. > > > > I can send my scripts as well if you need some help in tracking this > > down. > > > > David > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > wrote: > > Alberto, how long are you letting it run for, and under what > > environment? if you are running on your laptop, how much RAM > > do you have? Its possible that you are seeing paging delays > > if you are running the Swift Java app with too little > > memory. > > > > > > Also, are you running trunk or 0.92.1? You should compare > > the > > two. > > > > > > Its *possible* that this simple test is hanging under recent > > trunk mods, but its more likely that this is some kind of > > resource shortage.
> > > > > > Can you run this on one of the Swift lab machines bridled or > > communicado, or better yet on the MCS compute servers, or a > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > Look at Swift under the "top" command to see if Swift is > > running and slow, or is hung. > > > > > > Stop by and we can discuss in more detail. > > > > > > - Mike > > > > > > > > ______________________________________________________________ > > > > When I run the following SwiftScript using suite.sh, > > the report shows an odd behavior: most of the time > > it > > times out, but once in a while it passes. However, > > this > > outcome is completely random, since sometimes that > > test has passed 3 times in a row, and all of a > > sudden it fails. > > This is my script: > > > > > > type messagefile; > > > > > > app (messagefile t) greeting (string s[]) { > > echo s[0] s[1] s[2] stdout=@filename(t); > > } > > > > > > messagefile outfile <"q5out.txt">; > > > > > > string words[] = ["how","are","you"]; > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > $ cat swift.properties > > wrapperlog.always.transfer=true > > sitedir.keep=true > > execution.retries=0 > > lazy.errors=false > > status.mode=provider > > use.provider.staging=false > > provider.staging.pin.swiftfiles=false > > > > > > Sites.template.xml contents: > > > > > > $ cat sites.template.xml > > > > > > > > > jobmanager="local:local"/> > > > key="internalHostname">127.0.0.1 > > > key="jobthrottle">1000 > > > key="initialScore">10000 > > > key="jobsPerNode">4 > > > key="slots">8 > > > key="maxTime">1000 > > > key="nodeGranularity">1 > > > key="maxNodes">4 > > /tmp > > > > > > > > > > -Alberto > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > -- > > Michael Wilde > > Computation Institute,
University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Fri Jun 17 15:08:32 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 17 Jun 2011 13:08:32 -0700 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov> References: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov> Message-ID: <1308341312.14339.1.camel@blabla> On Fri, 2011-06-17 at 15:03 -0500, Michael Wilde wrote: > This would be a great thing to build into the test suite for any test run that the suite is about to cancel for exceeding its max run time: capture the stack trace, so that if it's a deadlock, we can do some diagnosis right from the test results. > > Can we add some Swift commands to dump Swift's thread/future status as well? The hang checker already does that, but there is a class of deadlocks that has no open futures. From davidkelly999 at gmail.com Fri Jun 17 15:13:39 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Fri, 17 Jun 2011 15:13:39 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: <1308341312.14339.1.camel@blabla> References: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov> <1308341312.14339.1.camel@blabla> Message-ID: I just got a freeze running Alberto's script with the default local sites.xml.
Here is the info: 2011-06-17 15:10:34 Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode): "Attach Listener" daemon prio=10 tid=0x08fd9800 nid=0x237e waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Progress ticker" daemon prio=10 tid=0x08c48800 nid=0x2359 waiting on condition [0x9e05c000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.griphyn.vdl.karajan.lib.RuntimeStats$ProgressTicker.run(RuntimeStats.java:141) Locked ownable synchronizers: - None "Restart Log Sync" daemon prio=10 tid=0x08b80400 nid=0x2358 in Object.wait() [0x9e0ad000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xaed6b160> (a org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) at java.lang.Object.wait(Object.java:485) at org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:47) - locked <0xaed6b160> (a org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) Locked ownable synchronizers: - None "Overloaded Host Monitor" daemon prio=10 tid=0x08b47400 nid=0x2357 waiting on condition [0x9e0fe000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47) Locked ownable synchronizers: - None "Timer-0" daemon prio=10 tid=0x08f14c00 nid=0x2356 in Object.wait() [0x9e676000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xaf280198> (a java.util.TaskQueue) at java.util.TimerThread.mainLoop(Timer.java:509) - locked <0xaf280198> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:462) Locked ownable synchronizers: - None "NBS0" daemon prio=10 tid=0x08b0ec00 nid=0x2355 waiting on condition [0x9e6c7000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking 
to wait for <0xaf280248> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-8" prio=10 tid=0x09008400 nid=0x2354 waiting for monitor entry [0x9e718000] java.lang.Thread.State: BLOCKED (on object monitor) at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) - waiting to lock <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) - locked <0xaebee120> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) - locked <0xaebee120> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at 
org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - <0xaf2805f0> (a 
java.util.concurrent.locks.ReentrantLock$NonfairSync) "pool-1-thread-7" prio=10 tid=0x09006800 nid=0x2353 waiting on condition [0x9e769000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf2806e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-6" prio=10 tid=0x09004c00 nid=0x2352 waiting on condition [0x9e7ba000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf2806e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-5" prio=10 tid=0x09002800 nid=0x2351 waiting for monitor entry [0x9e80b000] java.lang.Thread.State: BLOCKED (on object monitor) at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:575) - waiting to lock <0xaebee120> (a org.griphyn.vdl.mapping.RootArrayDataNode) at 
org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) at org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) - locked <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - <0xaf280d48> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) "pool-1-thread-4" prio=10 tid=0x08ff3400 nid=0x2350 waiting on condition [0x9e85c000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf2806e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-3" prio=10 tid=0x08ff0800 nid=0x234f waiting on condition [0x9e8ad000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf2806e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-2" prio=10 tid=0x08ede000 nid=0x234e waiting on condition [0x9ea98000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf2806e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None 
"pool-1-thread-1" prio=10 tid=0x08ef4000 nid=0x234d waiting on condition [0x9eae9000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf2806e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "Hang checker" prio=10 tid=0x08eda400 nid=0x234c waiting for monitor entry [0x9e8fe000] java.lang.Thread.State: BLOCKED (on object monitor) at org.griphyn.vdl.karajan.Monitor.dumpVariables(Monitor.java:220) - waiting to lock <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.HangChecker.run(HangChecker.java:54) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462) Locked ownable synchronizers: - None "Low Memory Detector" daemon prio=10 tid=0x9f514800 nid=0x234a runnable [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "CompilerThread1" daemon prio=10 tid=0x9f512c00 nid=0x2349 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "CompilerThread0" daemon prio=10 tid=0x9f510800 nid=0x2348 waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Signal Dispatcher" daemon prio=10 tid=0x9f50f000 nid=0x2347 runnable [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Finalizer" daemon prio=10 tid=0x9f500800 nid=0x2346 in Object.wait() [0x9f260000] java.lang.Thread.State: 
WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xaf4341c0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked <0xaf4341c0> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) Locked ownable synchronizers: - None "Reference Handler" daemon prio=10 tid=0x089c3800 nid=0x2345 in Object.wait() [0x9f2b1000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xaf2803a0> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) - locked <0xaf2803a0> (a java.lang.ref.Reference$Lock) Locked ownable synchronizers: - None "main" prio=10 tid=0x08921400 nid=0x233f in Object.wait() [0xb6ad6000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xaf42fe78> (a org.griphyn.vdl.karajan.VDL2ExecutionContext) at java.lang.Object.wait(Object.java:485) at org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:226) - locked <0xaf42fe78> (a org.griphyn.vdl.karajan.VDL2ExecutionContext) at org.griphyn.vdl.karajan.Loader.main(Loader.java:201) Locked ownable synchronizers: - None "VM Thread" prio=10 tid=0x089bfc00 nid=0x2344 runnable "GC task thread#0 (ParallelGC)" prio=10 tid=0x08928800 nid=0x2340 runnable "GC task thread#1 (ParallelGC)" prio=10 tid=0x08929c00 nid=0x2341 runnable "GC task thread#2 (ParallelGC)" prio=10 tid=0x0892b400 nid=0x2342 runnable "GC task thread#3 (ParallelGC)" prio=10 tid=0x0892c800 nid=0x2343 runnable "VM Periodic Task Thread" prio=10 tid=0x9f51e800 nid=0x234b waiting on condition JNI global references: 1370 Found one Java-level deadlock: ============================= "pool-1-thread-8": waiting to lock monitor 0x09042b08 (object 0xaf432670, a 
org.griphyn.vdl.karajan.WrapperMap), which is held by "pool-1-thread-5" "pool-1-thread-5": waiting to lock monitor 0x9f56e630 (object 0xaebee120, a org.griphyn.vdl.mapping.RootArrayDataNode), which is held by "pool-1-thread-8" Java stack information for the threads listed above: =================================================== "pool-1-thread-8": at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) - waiting to lock <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) - locked <0xaebee120> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) - locked <0xaebee120> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at 
org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) "pool-1-thread-5": at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:575) - waiting to lock <0xaebee120> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) at org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) - locked <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) at 
org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Found 1 deadlock. On Fri, Jun 17, 2011 at 3:08 PM, Mihael Hategan wrote: > On Fri, 2011-06-17 at 15:03 -0500, Michael Wilde wrote: > > This would be a great thing to build into the test suite for any test run > that the suite is about to cancel for exceeding its max run time: capture > the stack trace, so that if its a deadlock, we can do some diagnosis right > form the test results. > > > > Can we add some Swift commands to dump Swifts thread/future status as > well? > > The hang checker already does that, but there is a class of deadlocks > that has no open futures. > > -------------- next part -------------- An HTML attachment was scrubbed... 
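[Editorial sketch] The dump above reports a lock-order inversion: "pool-1-thread-8" holds the RootArrayDataNode monitor (taken in VDLFunction.closeShallow) and waits for the WrapperMap monitor in WrapperMap.close, while "pool-1-thread-5" holds the WrapperMap monitor (taken in WrapperMap.addNodeListener) and waits for the RootArrayDataNode monitor in AbstractDataNode.addListener. The Java below is a minimal, self-contained illustration of that pattern under stated assumptions. It is not Swift or Karajan code and not the fix that was committed; the class LockOrderSketch, its methods, and the thread names "closer" and "stager" are hypothetical. It provokes the same kind of inversion with two plain objects, detects it with the standard ThreadMXBean.findMonitorDeadlockedThreads() call, and then shows that taking the two monitors in one fixed order lets both threads finish.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Swift or Karajan.
class LockOrderSketch {

    // Starts a daemon thread that locks `first`, waits until its peer also
    // holds its own first lock, and only then asks for `second`.
    static void startLocker(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Not reached when the two threads use opposite orders.
                }
            }
        }, name);
        t.setDaemon(true); // the JVM may exit although these threads never finish
        t.start();
    }

    // Provokes the inversion and returns the sorted names of the threads
    // that the JVM reports as monitor-deadlocked (empty if none within ~5 s).
    static List<String> provokeInversion() throws InterruptedException {
        Object dataNode = new Object();   // stands in for the data node monitor
        Object wrapperMap = new Object(); // stands in for the wrapper map monitor
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        startLocker("closer", dataNode, wrapperMap, bothHoldFirst);
        startLocker("stager", wrapperMap, dataNode, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> names = new ArrayList<>();
        for (int attempt = 0; attempt < 100 && names.isEmpty(); attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock
            if (ids == null) {
                Thread.sleep(50);
                continue;
            }
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    // Both threads take the monitors in the same order (data node first),
    // so neither can hold one lock while waiting for a lock the other holds.
    static int runWithFixedOrder() throws InterruptedException {
        Object dataNode = new Object();
        Object wrapperMap = new Object();
        AtomicInteger completed = new AtomicInteger();
        Runnable task = () -> {
            synchronized (dataNode) {
                synchronized (wrapperMap) {
                    completed.incrementAndGet();
                }
            }
        };
        Thread a = new Thread(task, "closer");
        Thread b = new Thread(task, "stager");
        a.start();
        b.start();
        a.join();
        b.join();
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("deadlocked threads: " + provokeInversion());
        System.out.println("completed with fixed order: " + runWithFixedOrder());
    }
}
```

A fixed acquisition order is only one common way to remove such a cycle; narrowing one of the synchronized regions so that the second monitor is requested after the first is released is another. Which of these, if either, suits the real WrapperMap and data node code is not something this sketch can settle.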
URL:
From alberto_chavez at live.com Fri Jun 17 15:55:32 2011 From: alberto_chavez at live.com (Alberto Chavez) Date: Fri, 17 Jun 2011 15:55:32 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: <1308341312.14339.1.camel@blabla> References: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov>, <1308341312.14339.1.camel@blabla> Message-ID:
I just checked the report for Bridled and Communicado, and the test failed on both with the same output:

type messagefile;
app (messagefile t) greeting (string s[]) {
    echo s[0] s[1] s[2] stdout=@filename(t);
}
messagefile outfile <"q5out.txt">;
string words[] = ["how","are","you"];
outfile = greeting(words);
--------------------------------------------------------
Swift svn swift-r4629 cog-r3164
RunID: 20110617-1525-c7tjr1mg
Progress: time: Fri, 17 Jun 2011 15:25:25 -0500
No events in 10s.
Progress: time: Fri, 17 Jun 2011 15:25:55 -0500
--------------------------------------------------------
> Subject: Re: [Swift-devel] Swift unresponsive while using local provider. > From: hategan at mcs.anl.gov > To: wilde at mcs.anl.gov > Date: Fri, 17 Jun 2011 13:08:32 -0700 > CC: alan.chavez at live.com.mx; swift-devel at ci.uchicago.edu > > On Fri, 2011-06-17 at 15:03 -0500, Michael Wilde wrote: > > This would be a great thing to build into the test suite for any test run that the suite is about to cancel for exceeding its max run time: capture the stack trace, so that if it's a deadlock, we can do some diagnosis right from the test results. > > > > Can we add some Swift commands to dump Swift's thread/future status as well? > > The hang checker already does that, but there is a class of deadlocks > that has no open futures.
> > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Fri Jun 17 16:19:09 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 17 Jun 2011 14:19:09 -0700 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: References: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov> <1308341312.14339.1.camel@blabla> Message-ID: <1308345549.14529.0.camel@blabla> Thanks. I committed a fix yesterday around a similar issue, but I want to double check that there are no cases that are going to lead to problems. On Fri, 2011-06-17 at 15:13 -0500, David Kelly wrote: > I just got a freeze running Alberto's script with the default local > sites.xml. Here is the info: > > 2011-06-17 15:10:34 > Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode): > > "Attach Listener" daemon prio=10 tid=0x08fd9800 nid=0x237e waiting on > condition [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "Progress ticker" daemon prio=10 tid=0x08c48800 nid=0x2359 waiting on > condition [0x9e05c000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.griphyn.vdl.karajan.lib.RuntimeStats > $ProgressTicker.run(RuntimeStats.java:141) > > Locked ownable synchronizers: > - None > > "Restart Log Sync" daemon prio=10 tid=0x08b80400 nid=0x2358 in > Object.wait() [0x9e0ad000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaed6b160> (a > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) > at java.lang.Object.wait(Object.java:485) > at > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:47) > - locked <0xaed6b160> (a > 
org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) > > Locked ownable synchronizers: > - None > > "Overloaded Host Monitor" daemon prio=10 tid=0x08b47400 nid=0x2357 > waiting on condition [0x9e0fe000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47) > > Locked ownable synchronizers: > - None > > "Timer-0" daemon prio=10 tid=0x08f14c00 nid=0x2356 in Object.wait() > [0x9e676000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaf280198> (a java.util.TaskQueue) > at java.util.TimerThread.mainLoop(Timer.java:509) > - locked <0xaf280198> (a java.util.TaskQueue) > at java.util.TimerThread.run(Timer.java:462) > > Locked ownable synchronizers: > - None > > "NBS0" daemon prio=10 tid=0x08b0ec00 nid=0x2355 waiting on condition > [0x9e6c7000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf280248> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-8" prio=10 tid=0x09008400 nid=0x2354 waiting for > monitor entry [0x9e718000] > java.lang.Thread.State: BLOCKED (on object monitor) > at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) > - waiting to lock <0xaf432670> (a > org.griphyn.vdl.karajan.WrapperMap) > at > 
org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) > - locked <0xaebee120> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) > - locked <0xaebee120> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > 
org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - <0xaf2805f0> (a java.util.concurrent.locks.ReentrantLock > $NonfairSync) > > "pool-1-thread-7" prio=10 tid=0x09006800 nid=0x2353 waiting on > condition [0x9e769000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf2806e0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at 
java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-6" prio=10 tid=0x09004c00 nid=0x2352 waiting on > condition [0x9e7ba000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf2806e0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-5" prio=10 tid=0x09002800 nid=0x2351 waiting for > monitor entry [0x9e80b000] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:575) > - waiting to lock <0xaebee120> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) > at > org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) > - locked <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) > at > org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) > at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > 
org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - <0xaf280d48> (a java.util.concurrent.locks.ReentrantLock > $NonfairSync) > > "pool-1-thread-4" prio=10 tid=0x08ff3400 nid=0x2350 waiting on > condition [0x9e85c000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf2806e0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-3" prio=10 tid=0x08ff0800 nid=0x234f waiting on > condition [0x9e8ad000] > 
java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf2806e0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-2" prio=10 tid=0x08ede000 nid=0x234e waiting on > condition [0x9ea98000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf2806e0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-1" prio=10 tid=0x08ef4000 nid=0x234d waiting on > condition [0x9eae9000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf2806e0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > 
$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "Hang checker" prio=10 tid=0x08eda400 nid=0x234c waiting for monitor > entry [0x9e8fe000] > java.lang.Thread.State: BLOCKED (on object monitor) > at org.griphyn.vdl.karajan.Monitor.dumpVariables(Monitor.java:220) > - waiting to lock <0xaf432670> (a > org.griphyn.vdl.karajan.WrapperMap) > at org.griphyn.vdl.karajan.HangChecker.run(HangChecker.java:54) > at java.util.TimerThread.mainLoop(Timer.java:512) > at java.util.TimerThread.run(Timer.java:462) > > Locked ownable synchronizers: > - None > > "Low Memory Detector" daemon prio=10 tid=0x9f514800 nid=0x234a > runnable [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "CompilerThread1" daemon prio=10 tid=0x9f512c00 nid=0x2349 waiting on > condition [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "CompilerThread0" daemon prio=10 tid=0x9f510800 nid=0x2348 waiting on > condition [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "Signal Dispatcher" daemon prio=10 tid=0x9f50f000 nid=0x2347 runnable > [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "Finalizer" daemon prio=10 tid=0x9f500800 nid=0x2346 in Object.wait() > [0x9f260000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaf4341c0> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) > - locked <0xaf4341c0> (a java.lang.ref.ReferenceQueue$Lock) > at 
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) > at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) > > Locked ownable synchronizers: > - None > > "Reference Handler" daemon prio=10 tid=0x089c3800 nid=0x2345 in > Object.wait() [0x9f2b1000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaf2803a0> (a java.lang.ref.Reference$Lock) > at java.lang.Object.wait(Object.java:485) > at java.lang.ref.Reference > $ReferenceHandler.run(Reference.java:116) > - locked <0xaf2803a0> (a java.lang.ref.Reference$Lock) > > Locked ownable synchronizers: > - None > > "main" prio=10 tid=0x08921400 nid=0x233f in Object.wait() [0xb6ad6000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaf42fe78> (a > org.griphyn.vdl.karajan.VDL2ExecutionContext) > at java.lang.Object.wait(Object.java:485) > at > org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:226) > - locked <0xaf42fe78> (a > org.griphyn.vdl.karajan.VDL2ExecutionContext) > at org.griphyn.vdl.karajan.Loader.main(Loader.java:201) > > Locked ownable synchronizers: > - None > > "VM Thread" prio=10 tid=0x089bfc00 nid=0x2344 runnable > > "GC task thread#0 (ParallelGC)" prio=10 tid=0x08928800 nid=0x2340 > runnable > > "GC task thread#1 (ParallelGC)" prio=10 tid=0x08929c00 nid=0x2341 > runnable > > "GC task thread#2 (ParallelGC)" prio=10 tid=0x0892b400 nid=0x2342 > runnable > > "GC task thread#3 (ParallelGC)" prio=10 tid=0x0892c800 nid=0x2343 > runnable > > "VM Periodic Task Thread" prio=10 tid=0x9f51e800 nid=0x234b waiting on > condition > > JNI global references: 1370 > > > Found one Java-level deadlock: > ============================= > "pool-1-thread-8": > waiting to lock monitor 0x09042b08 (object 0xaf432670, a > org.griphyn.vdl.karajan.WrapperMap), > which is held by "pool-1-thread-5" > "pool-1-thread-5": > waiting to lock monitor 0x9f56e630 (object 
0xaebee120, a > org.griphyn.vdl.mapping.RootArrayDataNode), > which is held by "pool-1-thread-8" > > Java stack information for the threads listed above: > =================================================== > "pool-1-thread-8": > at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) > - waiting to lock <0xaf432670> (a > org.griphyn.vdl.karajan.WrapperMap) > at > org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) > - locked <0xaebee120> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) > - locked <0xaebee120> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > 
org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > "pool-1-thread-5": > at > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:575) > - waiting to lock <0xaebee120> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) > at > org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) > - locked <0xaf432670> (a org.griphyn.vdl.karajan.WrapperMap) > at > 
org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) > at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > Found 1 deadlock. > > > On Fri, Jun 17, 2011 at 3:08 PM, Mihael Hategan > wrote: > On Fri, 2011-06-17 at 15:03 -0500, Michael Wilde wrote: > > This would be a great thing to build into the test suite for > any test run that the suite is about to cancel for exceeding > its max run time: capture the stack trace, so that if its a > deadlock, we can do some diagnosis right form the test > results. > > > > Can we add some Swift commands to dump Swifts thread/future > status as well? > > > The hang checker already does that, but there is a class of > deadlocks > that has no open futures. 
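The deadlock report above shows two pool threads taking the same pair of monitors in opposite order: pool-1-thread-8 holds the RootArrayDataNode monitor and waits for the WrapperMap monitor, while pool-1-thread-5 holds the WrapperMap monitor and waits for RootArrayDataNode. A minimal Java sketch of the usual remedy, one agreed lock order on every code path, might look like the following. It is an illustration under stated assumptions and not the fix that was committed to Swift: the class LockOrderSketch, the two placeholder lock objects, and the methods closePath, stageinPath, runBothPaths and monitorDeadlockDetected are all hypothetical names. The only real API used is the JDK's ThreadMXBean.findMonitorDeadlockedThreads(), which looks for the same kind of monitor cycle that jstack prints.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical sketch: these objects only stand in for the two monitors
// named in the dump; they are not the real Swift classes.
class LockOrderSketch {
    static final Object ARRAY_NODE = new Object();   // stands in for RootArrayDataNode
    static final Object WRAPPER_MAP = new Object();  // stands in for WrapperMap

    // Path resembling closeShallow -> WrapperMap.close:
    // node monitor first, then map monitor.
    static int closePath() {
        synchronized (ARRAY_NODE) {
            synchronized (WRAPPER_MAP) {
                return 1;
            }
        }
    }

    // Path resembling addNodeListener -> addListener. In the dump this path
    // took the map monitor first; here it follows the same order as
    // closePath, so no cycle between the two monitors can form.
    static int stageinPath() {
        synchronized (ARRAY_NODE) {
            synchronized (WRAPPER_MAP) {
                return 1;
            }
        }
    }

    // Runs both paths concurrently and returns the number of completed calls.
    static int runBothPaths(final int rounds) {
        final int[] done = new int[2];
        Thread closer = new Thread(() -> {
            for (int i = 0; i < rounds; i++) done[0] += closePath();
        });
        Thread stager = new Thread(() -> {
            for (int i = 0; i < rounds; i++) done[1] += stageinPath();
        });
        closer.start();
        stager.start();
        try {
            closer.join();
            stager.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return done[0] + done[1];
    }

    // True if the JVM currently sees threads deadlocked on object monitors.
    static boolean monitorDeadlockDetected() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.findMonitorDeadlockedThreads() != null;
    }

    public static void main(String[] args) {
        System.out.println("completed calls: " + runBothPaths(1000));
        System.out.println("monitor deadlock detected: " + monitorDeadlockDetected());
    }
}
```

A check like monitorDeadlockDetected() could in principle be run by a test harness just before it cancels a timed-out run, in the spirit of the suggestion quoted above; whether that fits the Swift test suite is an open question here.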
> >

From benc at hawaga.org.uk Fri Jun 17 16:26:49 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 17 Jun 2011 21:26:49 +0000 (GMT)
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov>
Message-ID:

> must have in order to be useful for map-reduce. It seems that there must
> be some way to add multiple associations of values to the same key.
> Right now, if I try :
> a[0] = 100 ; a[0] = 200;
> I get an error saying " a is closed with value 100"
> Instead the above should probably make the lookup a[0] return a list of
> values, say [100, 200] . I could really use help on this.

As you phrase it above, that seems quite awkward.

The syntax you specify above looks like a mutation: a[0] "starts" empty,
then you say a[0]=100, meaning something like a[0] = a[0] ++ [100]; (where
++ is list concatenation); and then you say a[0] = 200; which means
a[0] = a[0] ++ [200];

Expressing mutating behaviour like that is very wrong in the existing
model.

One practical problem, for example, is if I want to use the value of a[0].
How can I know when that value is fully specified? If I have to wait until
there are no more writes to a[], then a lot of the pipelining stops
working - in swift now, i can use values of a[] as they are written (for
example, in foreach loop), without having to wait for a[] to be fully
populated.

I can see various ways that might work, though.

--

From hategan at mcs.anl.gov Fri Jun 17 16:36:43 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 14:36:43 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov>
Message-ID: <1308346603.15198.1.camel@blabla>

On Fri, 2011-06-17 at 21:26 +0000, Ben Clifford wrote:
> One practical problem, for example, is if I want to use the value of a[0].
> How can I know when that value is fully specified? If I have to wait until
> there are no more writes to a[], then a lot of the pipelining stops
> working - in swift now, i can use values of a[] as they are written (for
> example, in foreach loop), without having to wait for a[] to be fully
> populated.

I think a[0] would be treated like an array over which one could
iterate. But then we can already have arrays of arrays, so I'm not quite
sure what the issue is here.

From yadudoc1729 at gmail.com Fri Jun 17 17:08:50 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Sat, 18 Jun 2011 03:38:50 +0530
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308346603.15198.1.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID:

>> One practical problem, for example, is if I want to use the value of a[0].
>> How can I know when that value is fully specified? If I have to wait until
>> there are no more writes to a[], then a lot of the pipelining stops
>> working - in swift now, i can use values of a[] as they are written (for
>> example, in foreach loop), without having to wait for a[] to be fully
>> populated.
>
> I think a[0] would be treated like an array over which one could
> iterate. But then we can already have arrays of arrays, so I'm not quite
> sure what the issue is here.

I tested the following piece of code, and I think my earlier doubts
are cleared.

int a[ ][ ];
a["hi"][0] = 10;
a["hi"][1] = 20;
a["hi"][2] = 30;
foreach value, i in a["hi"] {
    trace(value);
}

It works :)
Now the only doubt that remains is what happens when you write code
in which we don't know the 2nd index. Say something like this:
a[0][1] = 100;
...
a[0][n] = 10000;
Is such a situation possible? In which case, is the following logic valid?
a[0] [ a[0].length + 1 ] = val ;

--
Thanks and Regards,
Yadu Nand B

From benc at hawaga.org.uk Fri Jun 17 17:32:18 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 17 Jun 2011 22:32:18 +0000 (GMT)
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID:

> In which case, is the following logic valid?
> a[0] [ a[0].length + 1 ] = val ;

not really - the array has an unchanging length - you don't know it before
you've finished assigning all the elements into it. [*]

you don't have to use sequential indices: you can assign elements
1,2,3,8,100,101,102 if you like without needing to assign the elements in
between (or, now, strings).

if you have some other value associated with each assignment, for example,
the index in an input array, then you can use that in your output array
too rather than inventing a new index. Swift won't (/shouldn't) get upset
by the holes.

[*] that's not strictly true, but the obscure semantics are something
that's probably only of interest to me and hategan as a matter of pedantry.

--

From hategan at mcs.anl.gov Fri Jun 17 18:54:35 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 16:54:35 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308345549.14529.0.camel@blabla>
References: <1322347220.21510.1308341005897.JavaMail.root@zimbra.anl.gov> <1308341312.14339.1.camel@blabla> <1308345549.14529.0.camel@blabla>
Message-ID: <1308354875.18067.0.camel@blabla>

On Fri, 2011-06-17 at 14:19 -0700, Mihael Hategan wrote:
> Thanks. I committed a fix yesterday around a similar issue, but I want
> to double check that there are no cases that are going to lead to
> problems.

Seems ok. There is an additional fix now in svn for a class that I
missed yesterday. Please give it a shot.
Mihael

From alberto_chavez at live.com Fri Jun 17 18:55:26 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 18:55:26 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308340585.14231.0.camel@blabla>
References: , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , <1308340585.14231.0.camel@blabla>
Message-ID:

sites.template.xml is producing this error; as soon as I remove the file
from the directory, the error goes away as well.
These are the contents of that file:

127.0.0.1 1000 10000 4 8 1000 1 4 /tmp

-Alberto

> Subject: Re: [Swift-devel] Swift unresponsive while using local provider.
> From: hategan at mcs.anl.gov
> To: davidkelly999 at gmail.com
> Date: Fri, 17 Jun 2011 12:56:25 -0700
> CC: swift-devel at ci.uchicago.edu
>
> do "jstack -l " whenever it happens and send
> the output.
>
>
>
> On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote:
> > I saw similar things on my laptop (4 gb ram) this weekend when I was
> > testing the galaxy demo scripts using the local provider. I was using
> > trunk. In the output I would see things like "no activity for 10s" and
> > it just would sit there and do nothing until I manually killed it. But
> > most of the time it would work fine. I wrote a little shell script
> > that would repeatedly run it until it hung. Then I was talking to Jon
> > about this and he saw something similar with his montage work. He
> > thought it might be related to a configuration issue - that either
> > wrapper.parameter.mode=files or status.mode=provider should be set.
> >
> > I can send my scripts as well if you need some help in tracking this
> > down.
> >
> > David
> >
> > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde
> > wrote:
> > Alberto, how long are you letting it run for, and under what
> > environment? if you are running on your laptop, how much RAM
> > do you have?
Its possible that you are seeing paging delays > > if you are running the Swift Java app with too little memory. > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > two. > > > > > > Its *possible* that this simple test is hanging under recent > > trunk mods, but its more likely that this is some kind of > > resource shortage. > > > > > > Can you run this on one of the Swift lab machines bridled or > > communcado, or better yet on the MCS compute servers, or a > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > Look at Swift under the "top" command to see if Swift is > > running and slow, or is hung. > > > > > > Stop by and we can discuss in more detail. > > > > > > - Mike > > > > > > > > ______________________________________________________________ > > > > When I run the following SwiftScript using suite.sh, > > the report shows an odd behavior, most of the time it > > times out, but once in a while it passes, however this > > outcome is completely random, since sometimes that > > test has passed 3 times in a row, and all of the > > sudden it fails. 
> > This is my script: > > > > > > type messagefile; > > > > > > app (messagefile t) greeting (string s[]) { > > echo s[0] s[1] s[2] stdout=@filename(t); > > } > > > > > > messagefile outfile <"q5out.txt">; > > > > > > string words[] = ["how","are","you"]; > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > $ cat swift.properties > > wrapperlog.always.transfer=true > > sitedir.keep=true > > execution.retries=0 > > lazy.errors=false > > status.mode=provider > > use.provider.staging=false > > provider.staging.pin.swiftfiles=false > > > > > > Sites.template.xml contents: > > > > > > $ cat sites.template.xml > > > > > > > > > jobmanager="local:local"/> > > > key="internalHostname">127.0.0.1 > > > key="jobthrottle">1000 > > > key="initialScore">10000 > > > key="jobsPerNode">4 > > > key="slots">8 > > > key="maxTime">1000 > > > key="nodeGranularity">1 > > > key="maxNodes">4 > > /tmp > > > > > > > > > > -Alberto > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hategan at mcs.anl.gov Fri Jun 17 19:09:52 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 17 Jun 2011 17:09:52 -0700 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: References: ,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> , ,<1308340585.14231.0.camel@blabla> Message-ID: <1308355792.24489.2.camel@blabla> I'm sorry, but I don't follow. Is there a new error? On Fri, 2011-06-17 at 18:55 -0500, Alberto Chavez wrote: > sites.template.xml is producing this error, as soon as I remove the > file from the directory, the error goes away as well. > These are the contents of such file: > > > > > > key="internalHostname">127.0.0.1 > 1000 > 10000 > 4 > 8 > 1000 > 1 > 4 > /tmp > > > > > -Alberto > > > > Subject: Re: [Swift-devel] Swift unresponsive while using local > provider. > > From: hategan at mcs.anl.gov > > To: davidkelly999 at gmail.com > > Date: Fri, 17 Jun 2011 12:56:25 -0700 > > CC: swift-devel at ci.uchicago.edu > > > > do "jstack -l " whenever it happens and > send > > the output. > > > > > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > > I saw similar things on my laptop (4 gb ram) this weekend when I > was > > > testing the galaxy demo scripts using the local provider. I was > using > > > trunk. In the output I would see things like "no activity for 10s" > and > > > it just would sit there and do nothing until I manually killed it. > But > > > most of the time it would work fine. I wrote a little shell script > > > that would repeatedly run it until it hung. Then I was talking to > Jon > > > about this and he saw something similar with his montage work. He > > > thought it might be related to a configuration issue - that either > > > wrapper.parameter.mode=files or status.mode=provider should be > set. > > > > > > I can send my scripts as well if you need some help in tracking > this > > > down. 
> > > > > > David > > > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > > wrote: > > > Alberto, how long are you letting it run for, and under what > > > environment? if you are running on your laptop, how much RAM > > > do you have? Its possible that you are seeing paging delays > > > if you are running the Swift Java app with too little memory. > > > > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > > two. > > > > > > > > > Its *possible* that this simple test is hanging under recent > > > trunk mods, but its more likely that this is some kind of > > > resource shortage. > > > > > > > > > Can you run this on one of the Swift lab machines bridled or > > > communcado, or better yet on the MCS compute servers, or a > > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > > > > Look at Swift under the "top" command to see if Swift is > > > running and slow, or is hung. > > > > > > > > > Stop by and we can discuss in more detail. > > > > > > > > > - Mike > > > > > > > > > > > > ______________________________________________________________ > > > > > > When I run the following SwiftScript using suite.sh, > > > the report shows an odd behavior, most of the time it > > > times out, but once in a while it passes, however this > > > outcome is completely random, since sometimes that > > > test has passed 3 times in a row, and all of the > > > sudden it fails. 
> > > This is my script: > > > > > > > > > type messagefile; > > > > > > > > > app (messagefile t) greeting (string s[]) { > > > echo s[0] s[1] s[2] stdout=@filename(t); > > > } > > > > > > > > > messagefile outfile <"q5out.txt">; > > > > > > > > > string words[] = ["how","are","you"]; > > > > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > > > > $ cat swift.properties > > > wrapperlog.always.transfer=true > > > sitedir.keep=true > > > execution.retries=0 > > > lazy.errors=false > > > status.mode=provider > > > use.provider.staging=false > > > provider.staging.pin.swiftfiles=false > > > > > > > > > Sites.template.xml contents: > > > > > > > > > $ cat sites.template.xml > > > > > > > > > > > > > > jobmanager="local:local"/> > > > > > key="internalHostname">127.0.0.1 > > > > > key="jobthrottle">1000 > > > > > key="initialScore">10000 > > > > > key="jobsPerNode">4 > > > > > key="slots">8 > > > > > key="maxTime">1000 > > > > > key="nodeGranularity">1 > > > > > key="maxNodes">4 > > > /tmp > > > > > > > > > > > > > > > -Alberto > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at 
ci.uchicago.edu
> > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
>

From alberto_chavez at live.com Fri Jun 17 19:13:12 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 19:13:12 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308355792.24489.2.camel@blabla>
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , <1308355792.24489.2.camel@blabla>
Message-ID:

No, it is the same one.
But I had a sites.template.xml file in that directory, which contained that information; as soon as I removed sites.template.xml from my directory, the script worked just fine.

type messagefile;
app (messagefile t) greeting (string s[]) {
echo s[0] s[1] s[2] stdout=@filename(t);
}
messagefile outfile <"q5out.txt">;
string words[] = ["how","are","you"];
outfile = greeting(words);

> Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> From: hategan at mcs.anl.gov
> To: alberto_chavez at live.com
> CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu
> Date: Fri, 17 Jun 2011 17:09:52 -0700
>
> I'm sorry, but I don't follow. Is there a new error?
>
>
> On Fri, 2011-06-17 at 18:55 -0500, Alberto Chavez wrote:
> > sites.template.xml is producing this error, as soon as I remove the
> > file from the directory, the error goes away as well.
> > These are the contents of such file:
> >
> >
> >
> >
> >
> >
> key="internalHostname">127.0.0.1
> > 1000
> > 10000
> > 4
> > 8
> > 1000
> > 1
> > 4
> > /tmp
> >
> >
> >
> >
> > -Alberto
> >
> >
> > > Subject: Re: [Swift-devel] Swift unresponsive while using local
> > provider.
> > > From: hategan at mcs.anl.gov
> > > To: davidkelly999 at gmail.com
> > > Date: Fri, 17 Jun 2011 12:56:25 -0700
> > > CC: swift-devel at ci.uchicago.edu
> > >
> > > do "jstack -l " whenever it happens and
> > send
> > > the output.
> > > > > > > > > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > > > I saw similar things on my laptop (4 gb ram) this weekend when I > > was > > > > testing the galaxy demo scripts using the local provider. I was > > using > > > > trunk. In the output I would see things like "no activity for 10s" > > and > > > > it just would sit there and do nothing until I manually killed it. > > But > > > > most of the time it would work fine. I wrote a little shell script > > > > that would repeatedly run it until it hung. Then I was talking to > > Jon > > > > about this and he saw something similar with his montage work. He > > > > thought it might be related to a configuration issue - that either > > > > wrapper.parameter.mode=files or status.mode=provider should be > > set. > > > > > > > > I can send my scripts as well if you need some help in tracking > > this > > > > down. > > > > > > > > David > > > > > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > > > wrote: > > > > Alberto, how long are you letting it run for, and under what > > > > environment? if you are running on your laptop, how much RAM > > > > do you have? Its possible that you are seeing paging delays > > > > if you are running the Swift Java app with too little memory. > > > > > > > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > > > two. > > > > > > > > > > > > Its *possible* that this simple test is hanging under recent > > > > trunk mods, but its more likely that this is some kind of > > > > resource shortage. > > > > > > > > > > > > Can you run this on one of the Swift lab machines bridled or > > > > communcado, or better yet on the MCS compute servers, or a > > > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > > > > > > > Look at Swift under the "top" command to see if Swift is > > > > running and slow, or is hung. > > > > > > > > > > > > Stop by and we can discuss in more detail. 
> > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > ______________________________________________________________ > > > > > > > > When I run the following SwiftScript using suite.sh, > > > > the report shows an odd behavior, most of the time it > > > > times out, but once in a while it passes, however this > > > > outcome is completely random, since sometimes that > > > > test has passed 3 times in a row, and all of the > > > > sudden it fails. > > > > This is my script: > > > > > > > > > > > > type messagefile; > > > > > > > > > > > > app (messagefile t) greeting (string s[]) { > > > > echo s[0] s[1] s[2] stdout=@filename(t); > > > > } > > > > > > > > > > > > messagefile outfile <"q5out.txt">; > > > > > > > > > > > > string words[] = ["how","are","you"]; > > > > > > > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > > > > > > > $ cat swift.properties > > > > wrapperlog.always.transfer=true > > > > sitedir.keep=true > > > > execution.retries=0 > > > > lazy.errors=false > > > > status.mode=provider > > > > use.provider.staging=false > > > > provider.staging.pin.swiftfiles=false > > > > > > > > > > > > Sites.template.xml contents: > > > > > > > > > > > > $ cat sites.template.xml > > > > > > > > > > > > > > > > > > > jobmanager="local:local"/> > > > > > > > key="internalHostname">127.0.0.1 > > > > > > > key="jobthrottle">1000 > > > > > > > key="initialScore">10000 > > > > > > > key="jobsPerNode">4 > > > > > > > key="slots">8 > > > > > > > key="maxTime">1000 > > > > > > > key="nodeGranularity">1 > > > > > > > key="maxNodes">4 > > > > /tmp > > > > > > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > -- > > > > Michael Wilde > > > > Computation 
Institute, University of Chicago
> > > > Mathematics and Computer Science Division
> > > > Argonne National Laboratory
> > > >

From hategan at mcs.anl.gov Fri Jun 17 19:19:46 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 17:19:46 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
References: ,,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> ,, ,,<1308340585.14231.0.camel@blabla> , ,<1308355792.24489.2.camel@blabla>
Message-ID: <1308356386.24582.2.camel@blabla>

Right. Now run it 100 more times (make a loop in a shell script) and see if none of those deadlock.

Then update to the latest svn code, re-compile and run the script 100 more times. See if it deadlocks then.

On Fri, 2011-06-17 at 19:13 -0500, Alberto Chavez wrote:
> No, is the same one.
> But I had a sites.template.xml file in that directory, which contained
> that information; as soon as I removed sites.template.xml from my
> directory, the script worked just fine.
> > > type messagefile; > app (messagefile t) greeting (string s[]) { > echo s[0] s[1] s[2] stdout=@filename(t); > } > messagefile outfile <"q5out.txt">; > string words[] = ["how","are","you"]; > outfile = greeting(words); > > > > > > > Subject: RE: [Swift-devel] Swift unresponsive while using local > provider. > > From: hategan at mcs.anl.gov > > To: alberto_chavez at live.com > > CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu > > Date: Fri, 17 Jun 2011 17:09:52 -0700 > > > > I'm sorry, but I don't follow. Is there a new error? > > > > > > On Fri, 2011-06-17 at 18:55 -0500, Alberto Chavez wrote: > > > sites.template.xml is producing this error, as soon as I remove > the > > > file from the directory, the error goes away as well. > > > These are the contents of such file: > > > > > > > > > > > > > > > > > > > > key="internalHostname">127.0.0.1 > > > 1000 > > > 10000 > > > 4 > > > 8 > > > 1000 > > > 1 > > > 4 > > > /tmp > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > Subject: Re: [Swift-devel] Swift unresponsive while using local > > > provider. > > > > From: hategan at mcs.anl.gov > > > > To: davidkelly999 at gmail.com > > > > Date: Fri, 17 Jun 2011 12:56:25 -0700 > > > > CC: swift-devel at ci.uchicago.edu > > > > > > > > do "jstack -l " whenever it happens > and > > > send > > > > the output. > > > > > > > > > > > > > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > > > > I saw similar things on my laptop (4 gb ram) this weekend when > I > > > was > > > > > testing the galaxy demo scripts using the local provider. I > was > > > using > > > > > trunk. In the output I would see things like "no activity for > 10s" > > > and > > > > > it just would sit there and do nothing until I manually killed > it. > > > But > > > > > most of the time it would work fine. I wrote a little shell > script > > > > > that would repeatedly run it until it hung. 
Then I was talking > to > > > Jon > > > > > about this and he saw something similar with his montage work. > He > > > > > thought it might be related to a configuration issue - that > either > > > > > wrapper.parameter.mode=files or status.mode=provider should be > > > set. > > > > > > > > > > I can send my scripts as well if you need some help in > tracking > > > this > > > > > down. > > > > > > > > > > David > > > > > > > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > > > > > wrote: > > > > > Alberto, how long are you letting it run for, and under what > > > > > environment? if you are running on your laptop, how much RAM > > > > > do you have? Its possible that you are seeing paging delays > > > > > if you are running the Swift Java app with too little memory. > > > > > > > > > > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > > > > two. > > > > > > > > > > > > > > > Its *possible* that this simple test is hanging under recent > > > > > trunk mods, but its more likely that this is some kind of > > > > > resource shortage. > > > > > > > > > > > > > > > Can you run this on one of the Swift lab machines bridled or > > > > > communcado, or better yet on the MCS compute servers, or a > > > > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > > > > > > > > > > Look at Swift under the "top" command to see if Swift is > > > > > running and slow, or is hung. > > > > > > > > > > > > > > > Stop by and we can discuss in more detail. > > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > > > ______________________________________________________________ > > > > > > > > > > When I run the following SwiftScript using suite.sh, > > > > > the report shows an odd behavior, most of the time it > > > > > times out, but once in a while it passes, however this > > > > > outcome is completely random, since sometimes that > > > > > test has passed 3 times in a row, and all of the > > > > > sudden it fails. 
> > > > > This is my script: > > > > > > > > > > > > > > > type messagefile; > > > > > > > > > > > > > > > app (messagefile t) greeting (string s[]) { > > > > > echo s[0] s[1] s[2] stdout=@filename(t); > > > > > } > > > > > > > > > > > > > > > messagefile outfile <"q5out.txt">; > > > > > > > > > > > > > > > string words[] = ["how","are","you"]; > > > > > > > > > > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > > > > > > > > > > $ cat swift.properties > > > > > wrapperlog.always.transfer=true > > > > > sitedir.keep=true > > > > > execution.retries=0 > > > > > lazy.errors=false > > > > > status.mode=provider > > > > > use.provider.staging=false > > > > > provider.staging.pin.swiftfiles=false > > > > > > > > > > > > > > > Sites.template.xml contents: > > > > > > > > > > > > > > > $ cat sites.template.xml > > > > > > > > > > > > > > > > > > > > > > > > jobmanager="local:local"/> > > > > > > > > > key="internalHostname">127.0.0.1 > > > > > > > > > key="jobthrottle">1000 > > > > > > > > > key="initialScore">10000 > > > > > > > > > key="jobsPerNode">4 > > > > > > > > > key="slots">8 > > > > > > > > > key="maxTime">1000 > > > > > > > > > key="nodeGranularity">1 > > > > > > > > > key="maxNodes">4 > > > > > /tmp > > > > > > > > > > > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > -- > > > > > Michael Wilde > > > > > Computation Institute, University of Chicago > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > 
http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > >

From jonmon at utexas.edu Fri Jun 17 19:19:50 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Fri, 17 Jun 2011 19:19:50 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , <1308355792.24489.2.camel@blabla>
Message-ID:

Have you been running this within the test suite or by itself? What is the command line that was used to execute this script?

On Jun 17, 2011, at 7:13 PM, Alberto Chavez wrote:
> No, is the same one.
> But I had a sites.template.xml file in that directory, which contained that information; as soon as I removed sites.template.xml from my directory, the script worked just fine.
>
> type messagefile;
> app (messagefile t) greeting (string s[]) {
> echo s[0] s[1] s[2] stdout=@filename(t);
> }
> messagefile outfile <"q5out.txt">;
> string words[] = ["how","are","you"];
> outfile = greeting(words);
>
>
> > Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> > From: hategan at mcs.anl.gov
> > To: alberto_chavez at live.com
> > CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu
> > Date: Fri, 17 Jun 2011 17:09:52 -0700
> >
> > I'm sorry, but I don't follow. Is there a new error?
> > > > > > On Fri, 2011-06-17 at 18:55 -0500, Alberto Chavez wrote: > > > sites.template.xml is producing this error, as soon as I remove the > > > file from the directory, the error goes away as well. > > > These are the contents of such file: > > > > > > > > > > > > > > > > > > > > key="internalHostname">127.0.0.1 > > > 1000 > > > 10000 > > > 4 > > > 8 > > > 1000 > > > 1 > > > 4 > > > /tmp > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > Subject: Re: [Swift-devel] Swift unresponsive while using local > > > provider. > > > > From: hategan at mcs.anl.gov > > > > To: davidkelly999 at gmail.com > > > > Date: Fri, 17 Jun 2011 12:56:25 -0700 > > > > CC: swift-devel at ci.uchicago.edu > > > > > > > > do "jstack -l " whenever it happens and > > > send > > > > the output. > > > > > > > > > > > > > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > > > > I saw similar things on my laptop (4 gb ram) this weekend when I > > > was > > > > > testing the galaxy demo scripts using the local provider. I was > > > using > > > > > trunk. In the output I would see things like "no activity for 10s" > > > and > > > > > it just would sit there and do nothing until I manually killed it. > > > But > > > > > most of the time it would work fine. I wrote a little shell script > > > > > that would repeatedly run it until it hung. Then I was talking to > > > Jon > > > > > about this and he saw something similar with his montage work. He > > > > > thought it might be related to a configuration issue - that either > > > > > wrapper.parameter.mode=files or status.mode=provider should be > > > set. > > > > > > > > > > I can send my scripts as well if you need some help in tracking > > > this > > > > > down. > > > > > > > > > > David > > > > > > > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > > > > wrote: > > > > > Alberto, how long are you letting it run for, and under what > > > > > environment? 
if you are running on your laptop, how much RAM > > > > > do you have? Its possible that you are seeing paging delays > > > > > if you are running the Swift Java app with too little memory. > > > > > > > > > > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > > > > two. > > > > > > > > > > > > > > > Its *possible* that this simple test is hanging under recent > > > > > trunk mods, but its more likely that this is some kind of > > > > > resource shortage. > > > > > > > > > > > > > > > Can you run this on one of the Swift lab machines bridled or > > > > > communcado, or better yet on the MCS compute servers, or a > > > > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > > > > > > > > > > Look at Swift under the "top" command to see if Swift is > > > > > running and slow, or is hung. > > > > > > > > > > > > > > > Stop by and we can discuss in more detail. > > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > > > ______________________________________________________________ > > > > > > > > > > When I run the following SwiftScript using suite.sh, > > > > > the report shows an odd behavior, most of the time it > > > > > times out, but once in a while it passes, however this > > > > > outcome is completely random, since sometimes that > > > > > test has passed 3 times in a row, and all of the > > > > > sudden it fails. 
> > > > > This is my script: > > > > > > > > > > > > > > > type messagefile; > > > > > > > > > > > > > > > app (messagefile t) greeting (string s[]) { > > > > > echo s[0] s[1] s[2] stdout=@filename(t); > > > > > } > > > > > > > > > > > > > > > messagefile outfile <"q5out.txt">; > > > > > > > > > > > > > > > string words[] = ["how","are","you"]; > > > > > > > > > > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > > > > > > > > > > $ cat swift.properties > > > > > wrapperlog.always.transfer=true > > > > > sitedir.keep=true > > > > > execution.retries=0 > > > > > lazy.errors=false > > > > > status.mode=provider > > > > > use.provider.staging=false > > > > > provider.staging.pin.swiftfiles=false > > > > > > > > > > > > > > > Sites.template.xml contents: > > > > > > > > > > > > > > > $ cat sites.template.xml > > > > > > > > > > > > > > > > > > > > > > > > jobmanager="local:local"/> > > > > > > > > > key="internalHostname">127.0.0.1 > > > > > > > > > key="jobthrottle">1000 > > > > > > > > > key="initialScore">10000 > > > > > > > > > key="jobsPerNode">4 > > > > > > > > > key="slots">8 > > > > > > > > > key="maxTime">1000 > > > > > > > > > key="nodeGranularity">1 > > > > > > > > > key="maxNodes">4 > > > > > /tmp > > > > > > > > > > > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > -- > > > > > Michael Wilde > > > > > Computation Institute, University of Chicago > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > 
http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel
> > > > >

From alberto_chavez at live.com Fri Jun 17 19:29:16 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 19:29:16 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308356386.24582.2.camel@blabla>
References: , , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , , , <1308340585.14231.0.camel@blabla>, , , , <1308355792.24489.2.camel@blabla>, , <1308356386.24582.2.camel@blabla>
Message-ID:

I already did that with the revision I have, and the tests passed on every iteration; I am going to update my working copies to the latest svn code and run it again.

By the way, I put sites.template.xml back and ran the jstack command. This is the output:

$ jstack -l 2791
2791: Unable to open socket file: target process not responding or HotSpot VM not loaded
The -F option can be used when the target process is not responding
$ jstack -F 2791
Attaching to process ID 2791, please wait...
sun.jvm.hotspot.debugger.NoSuchSymbolException: Could not find symbol "gHotSpotVMTypes" in any of the known library names (libjvm.so, libjvm_g.so, gamma_g)
    at sun.jvm.hotspot.HotSpotTypeDataBase.lookupInProcess(HotSpotTypeDataBase.java:389)
    at sun.jvm.hotspot.HotSpotTypeDataBase.readVMTypes(HotSpotTypeDataBase.java:104)
    at sun.jvm.hotspot.HotSpotTypeDataBase.<init>(HotSpotTypeDataBase.java:85)
    at sun.jvm.hotspot.bugspot.BugSpotAgent.setupVM(BugSpotAgent.java:568)
    at sun.jvm.hotspot.bugspot.BugSpotAgent.go(BugSpotAgent.java:494)
    at sun.jvm.hotspot.bugspot.BugSpotAgent.attach(BugSpotAgent.java:332)
    at sun.jvm.hotspot.tools.Tool.start(Tool.java:163)
    at sun.jvm.hotspot.tools.JStack.main(JStack.java:86)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.tools.jstack.JStack.runJStackTool(JStack.java:118)
    at sun.tools.jstack.JStack.main(JStack.java:84)
Debugger attached successfully.
jstack requires a java VM process/core!

> Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> From: hategan at mcs.anl.gov
> To: alberto_chavez at live.com
> CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu
> Date: Fri, 17 Jun 2011 17:19:46 -0700
>
> Right. Now run it 100 more times (make a loop in a shell script) and see
> if none of those deadlock.
>
> Then update to the latest svn code, re-compile and run the script 100
> more times. See if it deadlocks then.
>
> On Fri, 2011-06-17 at 19:13 -0500, Alberto Chavez wrote:
> > No, is the same one.
> > But I had a sites.template.xml file in that directory, which contained
> > that information; as soon as I removed sites.template.xml from my
> > directory, the script worked just fine.
> > > > > > type messagefile; > > app (messagefile t) greeting (string s[]) { > > echo s[0] s[1] s[2] stdout=@filename(t); > > } > > messagefile outfile <"q5out.txt">; > > string words[] = ["how","are","you"]; > > outfile = greeting(words); > > > > > > > > > > > > > Subject: RE: [Swift-devel] Swift unresponsive while using local > > provider. > > > From: hategan at mcs.anl.gov > > > To: alberto_chavez at live.com > > > CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu > > > Date: Fri, 17 Jun 2011 17:09:52 -0700 > > > > > > I'm sorry, but I don't follow. Is there a new error? > > > > > > > > > On Fri, 2011-06-17 at 18:55 -0500, Alberto Chavez wrote: > > > > sites.template.xml is producing this error, as soon as I remove > > the > > > > file from the directory, the error goes away as well. > > > > These are the contents of such file: > > > > > > > > > > > > > > > > > > > > > > > > > > > key="internalHostname">127.0.0.1 > > > > 1000 > > > > 10000 > > > > 4 > > > > 8 > > > > 1000 > > > > 1 > > > > 4 > > > > /tmp > > > > > > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > > > > Subject: Re: [Swift-devel] Swift unresponsive while using local > > > > provider. > > > > > From: hategan at mcs.anl.gov > > > > > To: davidkelly999 at gmail.com > > > > > Date: Fri, 17 Jun 2011 12:56:25 -0700 > > > > > CC: swift-devel at ci.uchicago.edu > > > > > > > > > > do "jstack -l " whenever it happens > > and > > > > send > > > > > the output. > > > > > > > > > > > > > > > > > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > > > > > I saw similar things on my laptop (4 gb ram) this weekend when > > I > > > > was > > > > > > testing the galaxy demo scripts using the local provider. I > > was > > > > using > > > > > > trunk. In the output I would see things like "no activity for > > 10s" > > > > and > > > > > > it just would sit there and do nothing until I manually killed > > it. > > > > But > > > > > > most of the time it would work fine. 
I wrote a little shell > > script > > > > > > that would repeatedly run it until it hung. Then I was talking > > to > > > > Jon > > > > > > about this and he saw something similar with his montage work. > > He > > > > > > thought it might be related to a configuration issue - that > > either > > > > > > wrapper.parameter.mode=files or status.mode=provider should be > > > > set. > > > > > > > > > > > > I can send my scripts as well if you need some help in > > tracking > > > > this > > > > > > down. > > > > > > > > > > > > David > > > > > > > > > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > > > > > > > wrote: > > > > > > Alberto, how long are you letting it run for, and under what > > > > > > environment? if you are running on your laptop, how much RAM > > > > > > do you have? Its possible that you are seeing paging delays > > > > > > if you are running the Swift Java app with too little memory. > > > > > > > > > > > > > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > > > > > two. > > > > > > > > > > > > > > > > > > Its *possible* that this simple test is hanging under recent > > > > > > trunk mods, but its more likely that this is some kind of > > > > > > resource shortage. > > > > > > > > > > > > > > > > > > Can you run this on one of the Swift lab machines bridled or > > > > > > communcado, or better yet on the MCS compute servers, or a > > > > > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > > > > > > > > > > > > > Look at Swift under the "top" command to see if Swift is > > > > > > running and slow, or is hung. > > > > > > > > > > > > > > > > > > Stop by and we can discuss in more detail. 
> > > > > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > > > > > > > ______________________________________________________________ > > > > > > > > > > > > When I run the following SwiftScript using suite.sh, > > > > > > the report shows an odd behavior, most of the time it > > > > > > times out, but once in a while it passes, however this > > > > > > outcome is completely random, since sometimes that > > > > > > test has passed 3 times in a row, and all of the > > > > > > sudden it fails. > > > > > > This is my script: > > > > > > > > > > > > > > > > > > type messagefile; > > > > > > > > > > > > > > > > > > app (messagefile t) greeting (string s[]) { > > > > > > echo s[0] s[1] s[2] stdout=@filename(t); > > > > > > } > > > > > > > > > > > > > > > > > > messagefile outfile <"q5out.txt">; > > > > > > > > > > > > > > > > > > string words[] = ["how","are","you"]; > > > > > > > > > > > > > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > > > > > > > > > > > > > $ cat swift.properties > > > > > > wrapperlog.always.transfer=true > > > > > > sitedir.keep=true > > > > > > execution.retries=0 > > > > > > lazy.errors=false > > > > > > status.mode=provider > > > > > > use.provider.staging=false > > > > > > provider.staging.pin.swiftfiles=false > > > > > > > > > > > > > > > > > > Sites.template.xml contents: > > > > > > > > > > > > > > > > > > $ cat sites.template.xml > > > > > > > > > > > > > > > > > > > > > > > > > > > > > jobmanager="local:local"/> > > > > > > > > > > > key="internalHostname">127.0.0.1 > > > > > > > > > > > key="jobthrottle">1000 > > > > > > > > > > > key="initialScore">10000 > > > > > > > > > > > key="jobsPerNode">4 > > > > > > > > > > > key="slots">8 > > > > > > > > > > > key="maxTime">1000 > > > > > > > > > > > key="nodeGranularity">1 > > > > > > > > > > > key="maxNodes">4 > > > > > > /tmp > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > Swift-devel mailing list > > > > > > Swift-devel at ci.uchicago.edu > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Michael Wilde > > > > > > Computation Institute, University of Chicago > > > > > > Mathematics and Computer Science Division > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > Swift-devel mailing list > > > > > > Swift-devel at ci.uchicago.edu > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > > Swift-devel mailing list > > > > > > Swift-devel at ci.uchicago.edu > > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alberto_chavez at live.com Fri Jun 17 19:30:26 2011 From: alberto_chavez at live.com (Alberto Chavez) Date: Fri, 17 Jun 2011 19:30:26 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider. 
In-Reply-To:
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , <1308355792.24489.2.camel@blabla> ,
Message-ID:

>Have you been running this within the test suite or by itself?
Within the test suite.

>What is the command line that was used to execute this script?
bash suite.sh -t -o /tmp/chavez documentation/

On Jun 17, 2011, at 7:13 PM, Alberto Chavez wrote:
No, is the same one.
But I had a sites.template.xml file in that directory, which contained that information; as soon as I removed sites.template.xml from my directory, the script worked just fine.

type messagefile;
app (messagefile t) greeting (string s[]) {
echo s[0] s[1] s[2] stdout=@filename(t);
}
messagefile outfile <"q5out.txt">;
string words[] = ["how","are","you"];
outfile = greeting(words);

> Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> From: hategan at mcs.anl.gov
> To: alberto_chavez at live.com
> CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu
> Date: Fri, 17 Jun 2011 17:09:52 -0700
>
> I'm sorry, but I don't follow. Is there a new error?
>
>
> On Fri, 2011-06-17 at 18:55 -0500, Alberto Chavez wrote:
> > sites.template.xml is producing this error, as soon as I remove the
> > file from the directory, the error goes away as well.
> > These are the contents of such file:
> >
> >
> >
> >
> >
> >
> key="internalHostname">127.0.0.1
> > 1000
> > 10000
> > 4
> > 8
> > 1000
> > 1
> > 4
> > /tmp
> >
> >
> >
> >
> > -Alberto
> >
> >
> > > Subject: Re: [Swift-devel] Swift unresponsive while using local
> > provider.
> > > From: hategan at mcs.anl.gov
> > > To: davidkelly999 at gmail.com
> > > Date: Fri, 17 Jun 2011 12:56:25 -0700
> > > CC: swift-devel at ci.uchicago.edu
> > >
> > > do "jstack -l " whenever it happens and
> > send
> > > the output.
> > > > > > > > > > > > On Fri, 2011-06-17 at 14:48 -0500, David Kelly wrote: > > > > I saw similar things on my laptop (4 gb ram) this weekend when I > > was > > > > testing the galaxy demo scripts using the local provider. I was > > using > > > > trunk. In the output I would see things like "no activity for 10s" > > and > > > > it just would sit there and do nothing until I manually killed it. > > But > > > > most of the time it would work fine. I wrote a little shell script > > > > that would repeatedly run it until it hung. Then I was talking to > > Jon > > > > about this and he saw something similar with his montage work. He > > > > thought it might be related to a configuration issue - that either > > > > wrapper.parameter.mode=files or status.mode=provider should be > > set. > > > > > > > > I can send my scripts as well if you need some help in tracking > > this > > > > down. > > > > > > > > David > > > > > > > > On Fri, Jun 17, 2011 at 2:38 PM, Michael Wilde > > > > wrote: > > > > Alberto, how long are you letting it run for, and under what > > > > environment? if you are running on your laptop, how much RAM > > > > do you have? Its possible that you are seeing paging delays > > > > if you are running the Swift Java app with too little memory. > > > > > > > > > > > > Also, are you running trunk or 0.92.1? You should compare the > > > > two. > > > > > > > > > > > > Its *possible* that this simple test is hanging under recent > > > > trunk mods, but its more likely that this is some kind of > > > > resource shortage. > > > > > > > > > > > > Can you run this on one of the Swift lab machines bridled or > > > > communcado, or better yet on the MCS compute servers, or a > > > > PADS worker node (which you can get with qsub -I on pads)? > > > > > > > > > > > > Look at Swift under the "top" command to see if Swift is > > > > running and slow, or is hung. > > > > > > > > > > > > Stop by and we can discuss in more detail. 
> > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > ______________________________________________________________ > > > > > > > > When I run the following SwiftScript using suite.sh, > > > > the report shows an odd behavior, most of the time it > > > > times out, but once in a while it passes, however this > > > > outcome is completely random, since sometimes that > > > > test has passed 3 times in a row, and all of the > > > > sudden it fails. > > > > This is my script: > > > > > > > > > > > > type messagefile; > > > > > > > > > > > > app (messagefile t) greeting (string s[]) { > > > > echo s[0] s[1] s[2] stdout=@filename(t); > > > > } > > > > > > > > > > > > messagefile outfile <"q5out.txt">; > > > > > > > > > > > > string words[] = ["how","are","you"]; > > > > > > > > > > > > outfile = greeting(words); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Swift.properties contents: > > > > > > > > > > > > $ cat swift.properties > > > > wrapperlog.always.transfer=true > > > > sitedir.keep=true > > > > execution.retries=0 > > > > lazy.errors=false > > > > status.mode=provider > > > > use.provider.staging=false > > > > provider.staging.pin.swiftfiles=false > > > > > > > > > > > > Sites.template.xml contents: > > > > > > > > > > > > $ cat sites.template.xml > > > > > > > > > > > > > > > > > > > jobmanager="local:local"/> > > > > > > > key="internalHostname">127.0.0.1 > > > > > > > key="jobthrottle">1000 > > > > > > > key="initialScore">10000 > > > > > > > key="jobsPerNode">4 > > > > > > > key="slots">8 > > > > > > > key="maxTime">1000 > > > > > > > key="nodeGranularity">1 > > > > > > > key="maxNodes">4 > > > > /tmp > > > > > > > > > > > > > > > > > > > > -Alberto > > > > > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > -- > > > > Michael Wilde > > > > Computation 
Institute, University of Chicago > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonmon at utexas.edu Fri Jun 17 19:47:56 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Fri, 17 Jun 2011 19:47:56 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider. In-Reply-To: References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , <1308355792.24489.2.camel@blabla> , Message-ID: I meant what was the swift command line that is executed but if it is run within the test suite I guess you don't know that upfront. The reason I am asking is I wanted to know what sites file the test suite reverted to when you moved the other on out. Since you set the workdirectory to /tmp I thought maybe /tmp was being filled up which was causing the timeouts and hangs since there was no more room in the workdirectory. Not sure if this is the case but it was a thought when I saw that /tmp was the workdirectory. 
From hategan at mcs.anl.gov Fri Jun 17 19:50:46 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 17:50:46 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> , , , , <1308340585.14231.0.camel@blabla> , , <1308355792.24489.2.camel@blabla> , 
Message-ID: <1308358246.24760.6.camel@blabla>

On Fri, 2011-06-17 at 19:47 -0500, Jonathan Monette wrote:
> I meant what was the swift command line that is executed but if it is
> run within the test suite I guess you don't know that upfront.
>
>
> The reason I am asking is I wanted to know what sites file the test
> suite reverted to when you moved the other on out. Since you set the
> workdirectory to /tmp I thought maybe /tmp was being filled up which
> was causing the timeouts and hangs since there was no more room in the
> workdirectory.

$SWIFT_HOME/etc/sites.xml which has local/local as providers.

But now. The deadlock has nothing to do with disk space. The stack trace
shows what the problem is. So I would suggest ignoring the sites file
issue, since I believe it completely uncorrelated with the deadlock we
are talking about.

From hategan at mcs.anl.gov Fri Jun 17 19:54:25 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 17:54:25 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: ,,,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> ,,, ,,,<1308340585.14231.0.camel@blabla> ,, ,,<1308355792.24489.2.camel@blabla> , ,<1308356386.24582.2.camel@blabla>
Message-ID: <1308358465.24760.10.camel@blabla>

On Fri, 2011-06-17 at 19:29 -0500, Alberto Chavez wrote:
> I already did that in the revision I have, and the tests passed every
> iteration; I am going to update my working copies to the latest svn
> code, and I'll run it again.

If you want to help, please hold on a second.

Let me give you some background. This is a bug that doesn't happen on
every run. I may stand on one foot and chant things to the god of rain,
then run the test and have it run ok. Therefore I may conclude that my
chanting solved the problem. But you can see how the argument is bogus.
Clearly the god of rain has no bearing on computer code.

So there are many things you could do after which running a script
works, but the important test here is to run the same script repeatedly
and see the frequency with which the problem happens.

So please run the same script 100 times without updating the code from
svn and report back the results.
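[Editorial sketch] The repeated-run test requested above could be scripted roughly as follows. This is only an illustration under stated assumptions, not anything posted in the thread: the function name repeat_run and the variables RUNS and LIMIT are invented here, and it assumes the GNU coreutils `timeout` command is available (it exits with status 124 when the time limit is hit, which is counted as a hang).

```sh
#!/bin/sh
# Hypothetical harness: run one command RUNS times and count how often
# it passes, fails, or exceeds LIMIT seconds (treated as a hang).
# RUNS, LIMIT and repeat_run are illustrative names, not part of Swift.
RUNS=${RUNS:-100}
LIMIT=${LIMIT:-120}

repeat_run() {
    pass=0; fail=0; hung=0
    i=1
    while [ "$i" -le "$RUNS" ]; do
        rc=0
        # GNU timeout kills the command after LIMIT seconds and returns 124.
        timeout "$LIMIT" "$@" >/dev/null 2>&1 || rc=$?
        case $rc in
            0)   pass=$((pass + 1)) ;;
            124) hung=$((hung + 1)) ;;
            *)   fail=$((fail + 1)) ;;
        esac
        i=$((i + 1))
    done
    echo "runs=$RUNS pass=$pass fail=$fail hung=$hung"
}
```

It would be called as, for example, `repeat_run swift q5.swift`, where q5.swift is an assumed file name for the test script. Running it once before and once after a code update gives two hang frequencies that can be compared, instead of a single lucky or unlucky run.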
> By the way, I put sites.template.xml back, and ran the command jstack
> This is the output:
>
>
> $ jstack -l 2791

And how did you come upon the number 2791?

From hategan at mcs.anl.gov Fri Jun 17 19:55:06 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 17:55:06 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308358246.24760.6.camel@blabla>
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> , , , , <1308340585.14231.0.camel@blabla> , , <1308355792.24489.2.camel@blabla> , <1308358246.24760.6.camel@blabla>
Message-ID: <1308358506.24760.11.camel@blabla>

On Fri, 2011-06-17 at 17:50 -0700, Mihael Hategan wrote:
> On Fri, 2011-06-17 at 19:47 -0500, Jonathan Monette wrote:
> > I meant what was the swift command line that is executed but if it is
> > run within the test suite I guess you don't know that upfront.
> >
> >
> > The reason I am asking is I wanted to know what sites file the test
> > suite reverted to when you moved the other on out. Since you set the
> > workdirectory to /tmp I thought maybe /tmp was being filled up which
> > was causing the timeouts and hangs since there was no more room in the
> > workdirectory.
>
> $SWIFT_HOME/etc/sites.xml which has local/local as providers.
>
> But now. The deadlock has nothing to do with disk space. The stack trace

"But no" ^^

> shows what the problem is. So I would suggest ignoring the sites file
> issue, since I believe it completely uncorrelated with the deadlock we
> are talking about.

From alberto_chavez at live.com Fri Jun 17 21:48:04 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 21:48:04 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308358465.24760.10.camel@blabla>
References: , , , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , , , , , <1308340585.14231.0.camel@blabla>, , , , , , <1308355792.24489.2.camel@blabla>, , , , <1308356386.24582.2.camel@blabla>, , <1308358465.24760.10.camel@blabla>
Message-ID: 

> > I already did that in the revision I have, and the tests passed every
> > iteration; I am going to update my working copies to the latest svn
> > code, and I'll run it again.
>
> If you want to help, please hold on a second.
>
> Let me give you some background. This is a bug that doesn't happen on
> every run. I may stand on one foot and chant things to the god of rain,
> then run the test and have it run ok. Therefore I may conclude that my
> chanting solved the problem. But you can see how the argument is bogus.
> Clearly the god of rain has no bearing on computer code.
>
> So there are many things you could do after which running a script
> works, but the important test here is to run the same script repeatedly
> and see the frequency with which the problem happens.
>
> So please run the same script 100 times without updating the code from
> svn and report back the results.

Got it.

> > By the way, I put sites.template.xml back, and ran the command jstack
> > This is the output:
> >
> >
> > $ jstack -l 2791
>
> And how did you come upon the number 2791?

That's the process ID of swift.

From hategan at mcs.anl.gov Fri Jun 17 21:53:09 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 19:53:09 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: ,,,,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> ,,,, ,,,,<1308340585.14231.0.camel@blabla> ,,, ,,,<1308355792.24489.2.camel@blabla> ,, ,,<1308356386.24582.2.camel@blabla> , ,<1308358465.24760.10.camel@blabla>
Message-ID: <1308365589.25377.2.camel@blabla>

On Fri, 2011-06-17 at 21:48 -0500, Alberto Chavez wrote:
> > > $ jstack -l 2791
> >
> > And how did you come upon the number 2791?
>
> that's the process ID of swift.

There are two such process IDs. One belongs to the launcher script
(bin/swift) and one to the java process. You'll need to use the java
one. From the output you got, it's somewhat apparent that the PID you
used was not that of a java process.

From alberto_chavez at live.com Fri Jun 17 22:22:12 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 22:22:12 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308365589.25377.2.camel@blabla>
References: , , , , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , , , , , , , <1308340585.14231.0.camel@blabla>, , , , , , , , <1308355792.24489.2.camel@blabla>, , , , , , <1308356386.24582.2.camel@blabla>, , , , <1308358465.24760.10.camel@blabla>, , <1308365589.25377.2.camel@blabla>
Message-ID: 

Attached is the output of the java process.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: deadlock.log
Type: text/x-log
Size: 22132 bytes
Desc: not available
URL: 

From alberto_chavez at live.com Fri Jun 17 22:29:08 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 22:29:08 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308358465.24760.10.camel@blabla>
References: , , , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , , , , , <1308340585.14231.0.camel@blabla>, , , , , , <1308355792.24489.2.camel@blabla>, , , , <1308356386.24582.2.camel@blabla>, , <1308358465.24760.10.camel@blabla>
Message-ID: 

Mihael, you were right.
The test hung in the 4th iteration, even though only one hour ago it passed them all, and the sites.template.xml wasn't in the directory.

From hategan at mcs.anl.gov Fri Jun 17 22:31:51 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 20:31:51 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: ,,,,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> ,,,, ,,,,<1308340585.14231.0.camel@blabla> ,,, ,,,<1308355792.24489.2.camel@blabla> ,, ,,<1308356386.24582.2.camel@blabla> , ,<1308358465.24760.10.camel@blabla>
Message-ID: <1308367911.25750.0.camel@blabla>

On Fri, 2011-06-17 at 22:29 -0500, Alberto Chavez wrote:
> Mihael, you were right.
> The test hung in the 4th iteration, even though only one hour ago, it
> passed them all, and the sites.template.xml wasn't in the directory.

Good! Now, update to latest svn (both cog and swift) and repeat.

Mihael

From alberto_chavez at live.com Fri Jun 17 22:39:50 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 22:39:50 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308367911.25750.0.camel@blabla>
References: , , , , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , , , , , , , <1308340585.14231.0.camel@blabla>, , , , , , , , <1308355792.24489.2.camel@blabla>, , , , , , <1308356386.24582.2.camel@blabla>, , , , <1308358465.24760.10.camel@blabla>, , <1308367911.25750.0.camel@blabla>
Message-ID: 

It hung on the 2nd iteration, attached is the output.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: deadlock-svn-update.log
Type: text/x-log
Size: 22132 bytes
Desc: not available
URL: 

From hategan at mcs.anl.gov Fri Jun 17 22:47:29 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 20:47:29 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: ,,,,,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> ,,,,, ,,,,,<1308340585.14231.0.camel@blabla> ,,,, ,,,,<1308355792.24489.2.camel@blabla> ,,, ,,,<1308356386.24582.2.camel@blabla> ,, ,,<1308358465.24760.10.camel@blabla> , ,<1308367911.25750.0.camel@blabla>
Message-ID: <1308368849.25915.3.camel@blabla>

On Fri, 2011-06-17 at 22:39 -0500, Alberto Chavez wrote:
> It hung on the 2nd iteration, attached is the output.

I would then say I don't believe you.
Line 575 of AbstractDataNode is not in addListener() in the latest svn code.

Make sure you have updated the code and re-compiled swift and that you
are running the correct version of swift. If in doubt post the log (it
contains the swift svn revision).

From alberto_chavez at live.com Fri Jun 17 22:48:16 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 22:48:16 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308368849.25915.3.camel@blabla>
References: , , , , , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , , , , , , , , , <1308340585.14231.0.camel@blabla>, , , , , , , , , , <1308355792.24489.2.camel@blabla>, , , , , , , , <1308356386.24582.2.camel@blabla>, , , , , , <1308358465.24760.10.camel@blabla>, , , , <1308367911.25750.0.camel@blabla>, , <1308368849.25915.3.camel@blabla>
Message-ID: 

I forgot to rebuild, sorry.

From ketancmaheshwari at gmail.com Fri Jun 17 22:50:17 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Fri, 17 Jun 2011 22:50:17 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> <1308340585.14231.0.camel@blabla> <1308355792.24489.2.camel@blabla> <1308356386.24582.2.camel@blabla> <1308358465.24760.10.camel@blabla> <1308367911.25750.0.camel@blabla> <1308368849.25915.3.camel@blabla>
Message-ID: 

Alberto,

Just a note: do an "ant clean" before you do "ant dist". A mistake I made last week.

Ketan

On Fri, Jun 17, 2011 at 10:48 PM, Alberto Chavez wrote:
> I forgot to rebuild, sorry.

From alberto_chavez at live.com Fri Jun 17 22:50:58 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 22:50:58 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: 
References: , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , <1308340585.14231.0.camel@blabla>, , <1308355792.24489.2.camel@blabla>, , <1308356386.24582.2.camel@blabla>, , <1308358465.24760.10.camel@blabla>, , <1308367911.25750.0.camel@blabla>, , <1308368849.25915.3.camel@blabla>, , 
Message-ID: 

Oops, too late.
Do I need to do ant clean and ant dist again?

From jonmon at utexas.edu Fri Jun 17 23:02:42 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Fri, 17 Jun 2011 23:02:42 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
References: , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , <1308340585.14231.0.camel@blabla>, , <1308355792.24489.2.camel@blabla>, , <1308356386.24582.2.camel@blabla>, , <1308358465.24760.10.camel@blabla>, , <1308367911.25750.0.camel@blabla>, , <1308368849.25915.3.camel@blabla>, ,
Message-ID: <89F76A4E-9AE8-4455-A2F5-D662A6F53E91@utexas.edu>

I am not sure. The command I always use is ant redist. This will clean and compile the code in one call. Just to be sure :)

On Jun 17, 2011, at 10:50 PM, Alberto Chavez wrote:

> Oops, too late.
> Do I need to do ant clean and ant dist again?
>
> Date: Fri, 17 Jun 2011 22:50:17 -0500
> Subject: Re: [Swift-devel] Swift unresponsive while using local provider.
> From: ketancmaheshwari at gmail.com
> To: alberto_chavez at live.com
> CC: hategan at mcs.anl.gov; swift-devel at ci.uchicago.edu
>
> Alberto,
>
> Just a note: do an "ant clean" before you do "ant dist". I made that mistake last week.
>
> Ketan
>
> On Fri, Jun 17, 2011 at 10:48 PM, Alberto Chavez wrote:
> I forgot to rebuild, sorry.
>
> > Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> > From: hategan at mcs.anl.gov
> > To: alberto_chavez at live.com
> > CC: davidkelly999 at gmail.com; swift-devel at ci.uchicago.edu
> > Date: Fri, 17 Jun 2011 20:47:29 -0700
> >
> > On Fri, 2011-06-17 at 22:39 -0500, Alberto Chavez wrote:
> > > It hung on the 2nd iteration, attached is the output.
> >
> > I would then say I don't believe you. Line 575 of AbstractDataNode is
> > not in addListener() in the latest svn code.
> >
> > Make sure you have updated the code and re-compiled swift and that you
> > are running the correct version of swift. If in doubt post the log (it
> > contains the swift svn revision).
From hategan at mcs.anl.gov Fri Jun 17 23:09:04 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 17 Jun 2011 21:09:04 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
References: ,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> , ,<1308340585.14231.0.camel@blabla> , ,<1308355792.24489.2.camel@blabla> , ,<1308356386.24582.2.camel@blabla> , ,<1308358465.24760.10.camel@blabla> , ,<1308367911.25750.0.camel@blabla> , ,<1308368849.25915.3.camel@blabla> , ,
Message-ID: <1308370144.26103.1.camel@blabla>

On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote:
> Oops, too late.
> Do I need to do ant clean and ant dist again?

You're probably fine in most cases with just "ant dist". But if you want to be sure, do what Jonathan is saying: "ant redist"

From alberto_chavez at live.com Fri Jun 17 23:10:38 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 17 Jun 2011 23:10:38 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308370144.26103.1.camel@blabla>
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , , , <1308355792.24489.2.camel@blabla>, , , , <1308356386.24582.2.camel@blabla>, , , , <1308358465.24760.10.camel@blabla>, , , , <1308367911.25750.0.camel@blabla>, , , , <1308368849.25915.3.camel@blabla>, , , , , , <1308370144.26103.1.camel@blabla>
Message-ID:

ant redist did the trick.

> Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> From: hategan at mcs.anl.gov
> To: alberto_chavez at live.com
> CC: ketancmaheshwari at gmail.com; swift-devel at ci.uchicago.edu
> Date: Fri, 17 Jun 2011 21:09:04 -0700
>
> On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote:
> > Oops, too late.
> > Do I need to do ant clean and ant dist again?
>
> You're probably fine in most cases with just "ant dist". But if you want
> to be sure, do what Jonathan is saying: "ant redist"

From alberto_chavez at live.com Sat Jun 18 00:08:12 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Sat, 18 Jun 2011 00:08:12 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308370144.26103.1.camel@blabla>
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , , , <1308355792.24489.2.camel@blabla>, , , , <1308356386.24582.2.camel@blabla>, , , , <1308358465.24760.10.camel@blabla>, , , , <1308367911.25750.0.camel@blabla>, , , , <1308368849.25915.3.camel@blabla>, , , , , , <1308370144.26103.1.camel@blabla>
Message-ID:

I already did the svn updates for cog and swift, and rebuilt swift with ant redist and with ant clean + ant dist.
The test keeps hanging, but I'm probably missing something:

$ svn update cog
At revision 3167.

$ cd cog/modules/
$ svn update swift
At revision 4632.

I did ant redist, and it was successfully built, but the test is still hanging; now it hung on the 11th iteration. However, I'm not quite sure if the svn was properly updated, since I did

$ jstack -l 11471 | grep addListener
at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583)
at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583)

but Mihael mentioned that addListener is not in AbstractDataNode in the newer version.
Any thoughts on that?

Alberto.
> Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> From: hategan at mcs.anl.gov
> To: alberto_chavez at live.com
> CC: ketancmaheshwari at gmail.com; swift-devel at ci.uchicago.edu
> Date: Fri, 17 Jun 2011 21:09:04 -0700
>
> On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote:
> > Oops, too late.
> > Do I need to do ant clean and ant dist again?
>
> You're probably fine in most cases with just "ant dist". But if you want
> to be sure, do what Jonathan is saying: "ant redist"

From jonmon at utexas.edu Sat Jun 18 00:18:13 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Sat, 18 Jun 2011 00:18:13 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
References: , , <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov>, , , , <1308340585.14231.0.camel@blabla>, , , , <1308355792.24489.2.camel@blabla>, , , , <1308356386.24582.2.camel@blabla>, , , , <1308358465.24760.10.camel@blabla>, , , , <1308367911.25750.0.camel@blabla>, , , , <1308368849.25915.3.camel@blabla>, , , , , , <1308370144.26103.1.camel@blabla>
Message-ID: <8082600D-1A9F-4B33-9A4E-3D320E076857@utexas.edu>

I would say maybe Mihael misspoke. I think he meant to say that line 575 of the source is not in the addListener method. The addListener method is indeed in AbstractDataNode, but line 575 is not part of the method.

On Jun 18, 2011, at 12:08 AM, Alberto Chavez wrote:

> I already did the svn updates for cog and swift, and rebuilt swift with ant redist and with ant clean + ant dist.
> The test keeps hanging, but I'm probably missing something:
>
> $ svn update cog
> At revision 3167.
>
> $ cd cog/modules/
> $ svn update swift
> At revision 4632.
>
> I did ant redist, and it was successfully built, but the test is still hanging; now it hung on the 11th iteration.
> However, I'm not quite sure if the svn was properly updated, since I did
>
> $ jstack -l 11471 | grep addListener
> at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583)
> at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583)
>
> but Mihael mentioned that addListener is not in AbstractDataNode in the newer version.
> Any thoughts on that?
>
> Alberto.
> > Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> > From: hategan at mcs.anl.gov
> > To: alberto_chavez at live.com
> > CC: ketancmaheshwari at gmail.com; swift-devel at ci.uchicago.edu
> > Date: Fri, 17 Jun 2011 21:09:04 -0700
> >
> > On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote:
> > > Oops, too late.
> > > Do I need to do ant clean and ant dist again?
> >
> > You're probably fine in most cases with just "ant dist". But if you want
> > to be sure, do what Jonathan is saying: "ant redist"

From benc at hawaga.org.uk Sat Jun 18 09:48:46 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sat, 18 Jun 2011 14:48:46 +0000 (GMT)
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308346603.15198.1.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID:

> But then we can already have arrays of arrays, so I'm not quite
> sure what the issue is here.

There might be some use in list-like syntax, rather than array-like syntax. Or there might not be.

By that I mean that when you assign to an array, you have to have some value to index it.
If you're doing some arbitrary nested foreach loops to populate an output array, you have to construct an index out of something from each enclosing loop. eg:

foreach i in a {
  foreach j in b {
    output[i+j*100000] = f(i,j)
    // or output[i][j] = f(i,j)
  }
}

If you happen to not care about that (i.e. you care about the collection of values output by f, but not about their indices) then that [i][j] structure or [i+j*100000] structure is a bit artificial.

When you (in Haskell) write:

a = a ++ (f i j)

then you add an element to the collection a without having to contrive an index - in as much as lists are indexed by their position, the new element gets "the next one". "the next one" makes sense in sequential code.

In a Swift context, I could imagine a syntax:

a[!] = f(i,j)

where ! means "some appropriate unique index" (I think the restart code did/does do that for foreach loops, to label variables in nested scopes in a way that could be used here).

So then Yadu's: a[0] = 100; a[0] = 200; example might turn into:

int a[][]; a[0][!] = 100; a[0][!] = 200;

That parallelises still but doesn't require you to explicitly think about the index.

--

From wozniak at mcs.anl.gov Sat Jun 18 10:19:23 2011
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Sat, 18 Jun 2011 10:19:23 -0500 (Central Daylight Time)
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID:

On Sat, 18 Jun 2011, Ben Clifford wrote:

>
>> But then we can already have arrays of arrays, so I'm not quite
>> sure what the issue is here.
>
> In a Swift context, I could imagine a syntax:
>
> a[!] = f(i,j)
>
> where ! means "some appropriate unique index" (I think the restart code
> did/does do that for foreach loops, to label variables in nested scopes in a way
> that could be used here).
>
> So then Yadu's: a[0] = 100; a[0] = 200; example might turn into:
>
> int a[][]; a[0][!] = 100; a[0][!]
> = 200;
>
> That parallelises still but doesn't require you to explicitly think about
> the index.

I think this is a nice next step.

--
Justin M Wozniak

From yadudoc1729 at gmail.com Sat Jun 18 11:37:29 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Sat, 18 Jun 2011 22:07:29 +0530
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID:

>> So then Yadu's: a[0] = 100; a[0] = 200; example might turn into:
>> int a[][]; a[0][!] = 100; a[0][!] = 200;

I think this looks really neat. I'll try to get this working and post results.

--
Thanks and Regards,
Yadu Nand B

From yadudoc1729 at gmail.com Sat Jun 18 12:21:56 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Sat, 18 Jun 2011 22:51:56 +0530
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID:

>> where ! means "some appropriate unique index" (I think the restart code
>> did/does do that foreach loops to label variables in nested scope in a way
>> that could be used here).
>>
>> So then Yadu's: a[0] = 100; a[0] = 200; example might turn into:
>>
>> int a[][]; a[0][!] = 100; a[0][!] = 200;

I've got this almost working. When I specify

int a[ ] [ ] ;
a["hi"]["!"] = 100 ;
a["hi"]["!"] = 200 ;
foreach value, index in a["hi"] {
  trace ( value );
}

I get this :

Swift svn swift-r4525 (swift modified locally) cog-r3116
RunID: 20110618-2237-u7avzkd2
Progress: time: Sat, 18 Jun 2011 22:37:11 +0530
SwiftScript trace: a.[hi][2034564079]:int = 30.0 - Closed
SwiftScript trace: a.[hi][1072406687]:int = 10.0 - Closed
SwiftScript trace: a.[hi][598210806]:int = 20.0 - Closed

A diff at this stage is attached.

As you might have guessed from the output, whenever I see a "!" string as a subscript I used a random function to generate a random key.
This might be bad as there is no guarantee that there will be no collision. I'm trying to replace this with a UUID so that collisions can be avoided.

--
Thanks and Regards,
Yadu Nand B

-------------- next part --------------
A non-text attachment was scrubbed...
Name: second.patch
Type: text/x-patch
Size: 4467 bytes
Desc: not available
URL:

From hategan at mcs.anl.gov Sat Jun 18 14:52:10 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 18 Jun 2011 12:52:10 -0700
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To:
References: ,,<1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> ,, ,,<1308340585.14231.0.camel@blabla> ,, ,,<1308355792.24489.2.camel@blabla> ,, ,,<1308356386.24582.2.camel@blabla> ,, ,,<1308358465.24760.10.camel@blabla> ,, ,,<1308367911.25750.0.camel@blabla> ,, ,,<1308368849.25915.3.camel@blabla> ,, ,, , ,<1308370144.26103.1.camel@blabla>
Message-ID: <1308426730.30334.0.camel@blabla>

Post the entire output of jstack please.

On Sat, 2011-06-18 at 00:08 -0500, Alberto Chavez wrote:
> I already did the svn updates for cog and swift, and rebuilt swift with
> ant redist and with ant clean + ant dist.
> The test keeps hanging, but I'm probably missing something:
>
> $ svn update cog
> At revision 3167.
>
> $ cd cog/modules/
> $ svn update swift
> At revision 4632.
>
> I did ant redist, and it was successfully built, but the test is still
> hanging; now it hung on the 11th iteration. However, I'm not quite sure
> if the svn was properly updated, since I did
>
> $ jstack -l 11471 | grep addListener
> at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583)
> at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583)
>
> but Mihael mentioned that addListener is not in AbstractDataNode in the
> newer version.
> Any thoughts on that?
>
> Alberto.
> > Subject: RE: [Swift-devel] Swift unresponsive while using local provider.
> > From: hategan at mcs.anl.gov
> > To: alberto_chavez at live.com
> > CC: ketancmaheshwari at gmail.com; swift-devel at ci.uchicago.edu
> > Date: Fri, 17 Jun 2011 21:09:04 -0700
> >
> > On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote:
> > > Oops, too late.
> > > Do I need to do ant clean and ant dist again?
> >
> > You're probably fine in most cases with just "ant dist". But if you want
> > to be sure, do what Jonathan is saying: "ant redist"

From hategan at mcs.anl.gov Sat Jun 18 15:04:05 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 18 Jun 2011 13:04:05 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID: <1308427445.30334.8.camel@blabla>

I see no reason* why the addition operator couldn't be overloaded for arrays to mean "append". That, to me, seems less random than a random index symbol ("!").

Of course, I'd rather have this whole issue implemented as list comprehensions in python/karajan style (a = for v, k in b { f(v);}), but I think the changes to swift required to have that, both semantic and syntactic, would be too dramatic.

* Well, there is a reason, as that may conflict with bulk array operations if we ever implement those (i.e. a + 3 meaning add 3 to each element in the array).

On Sat, 2011-06-18 at 14:48 +0000, Ben Clifford wrote:
[...]
> In a Swift context, I could imagine a syntax:
>
> a[!] = f(i,j)
>
> where ! means "some appropriate unique index" (I think the restart code
> did/does do that for foreach loops, to label variables in nested scopes in a way
> that could be used here).
>
> So then Yadu's: a[0] = 100; a[0] = 200; example might turn into:
>
> int a[][]; a[0][!] = 100; a[0][!] = 200;
>
> That parallelises still but doesn't require you to explicitly think about
> the index.
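
The footnote above (overloading "+" could mean either "append" or a bulk element-wise add) can be made concrete with a small sketch in Java, the language Swift itself is implemented in. This is only an illustration: the class and method names (AppendVsBulk, append, addToEach) are hypothetical and are not part of Swift or Karajan. Java has no operator overloading, so the two possible readings of "a + 3" are written as two separately named methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; none of these names exist in Swift or Karajan.
class AppendVsBulk {

    // "Append" reading of a + x: a new list with x added at the end.
    // No index is written down by the caller.
    static List<Integer> append(List<Integer> a, int x) {
        List<Integer> out = new ArrayList<>(a);
        out.add(x);
        return out;
    }

    // "Bulk" reading of a + x: x is added to every element.
    static List<Integer> addToEach(List<Integer> a, int x) {
        return a.stream().map(v -> v + x).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2);
        System.out.println(append(a, 3));    // [1, 2, 3]
        System.out.println(addToEach(a, 3)); // [4, 5]
    }
}
```

Both readings are reasonable for the same surface syntax, which is why a single overloaded operator would have to pick one of them. The stream expression in addToEach is also roughly what the comprehension style would look like: the whole result is defined by one expression and no index is contrived.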
From hategan at mcs.anl.gov Sat Jun 18 15:20:31 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 18 Jun 2011 13:20:31 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla>
Message-ID: <1308428431.30967.1.camel@blabla>

On Sat, 2011-06-18 at 22:51 +0530, Yadu Nand wrote:
> As you might have guessed from the output, whenever I see
> a "!" string as a subscript I used a random function to generate a random key.

Use the size of the array at the time of the append. That's guaranteed to be unique (as long as both the actual append and the size() call are in the same critical section).

> This might be bad as there is no guarantee that there will be no collision.
> I'm trying to replace this with a UUID so that collisions can be avoided.

From benc at hawaga.org.uk Sat Jun 18 19:11:22 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sun, 19 Jun 2011 00:11:22 +0000 (GMT)
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308427445.30334.8.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla>
Message-ID:

> I see no reason* why the addition operator couldn't be overloaded for
> arrays to mean "append".

With the control structures that exist now, you need to specify *some* index because you can extract that index (not just the values of the array) with foreach index,value in a { ... }.

I personally find the idea of an expression that looks like a mutating concatenation, but isn't really, to be more distasteful.

But I agree with you on this for the most part:

> Of course, I'd rather have this whole issue
> implemented as list comprehensions in python/karajan style (a = for v, k
> in b { f(v);}), but I think the changes to swift required to have that,
> both semantic and syntactic, would be too dramatic.
--

From benc at hawaga.org.uk Sat Jun 18 19:15:46 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sun, 19 Jun 2011 00:15:46 +0000 (GMT)
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308428431.30967.1.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308428431.30967.1.camel@blabla>
Message-ID:

> Use the size of the array at the time of the append.

might introduce non-deterministic behaviour when used with restarts when the index is used later (for example, to determine a filename) as the index is then a race-based random number, I think.

That's what I was getting at with my comment about how things were done in the past with restarts: I think that it's possible to make restart-proofed identifiers that look something like a combination of the position in the source file combined with an index for any containing code that causes the same source text to be executed multiple times (eg. each time you have a foreach, stick the index onto the end of the identifier; each time you call a function, stick the source position of the call onto the end of the identifier).

--

From hategan at mcs.anl.gov Sat Jun 18 19:21:46 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 18 Jun 2011 17:21:46 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla>
Message-ID: <1308442906.17227.6.camel@blabla>

On Sun, 2011-06-19 at 00:11 +0000, Ben Clifford wrote:
> > I see no reason* why the addition operator couldn't be overloaded for
> > arrays to mean "append".
>
> With the control structures that exist now, you need to specify *some*
> index because you can extract that index (not just the values of the
> array) with foreach index,value in a { ...
> }.

I wasn't discussing an issue of semantics. I think we agree on that.
I'm simply saying that I prefer one syntax to the other because it is less arbitrary. I mean it's pretty clear why one would say a = a + [1], or a += [1] or append(a, 1), but I'm not quite sure why I should say a[!] = 1 instead of a[$] = 1 or a[_] = 1.

>
> I personally find the idea of an expression that looks like a mutating
> concatenation, but isn't really, to be more distasteful.

Except it is a mutating concatenation. But it's one that is acceptable because there is no way to have code that is nondeterministic because of it. It's a non-destructive mutation.

From benc at hawaga.org.uk Sat Jun 18 19:36:55 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sun, 19 Jun 2011 01:36:55 +0100
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308442906.17227.6.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla>
Message-ID: <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk>

On Jun 19, 2011, at 1:21 AM, Mihael Hategan wrote:

> I wasn't discussing an issue of semantics. I think we agree on that. I'm
> simply saying that I prefer one syntax to the other

ok. I'm not overly enthused about the syntax either way.

>>
>> I personally find the idea of an expression that looks like a mutating
>> concatenation, but isn't really, to be more distasteful.
>
> Except it is a mutating concatenation. But it's one that is acceptable
> because there is no way to have code that is nondeterministic because of
> it. It's a non-destructive mutation.

do you regard the 2nd statement in:

a[0] = 4343;
a[1] = 54354;

to be mutating?

if so, then a += 54354; is also a mutating concatenation in that definition of the word, and I agree with you. Otherwise I disagree with you.
But my perspective is that in a swift run, the 'a' array above has a single value, described as: "the 0th element is 4343, and the 1st element is 54354" and that the value becomes more accurately known as the run progresses until such time as you know its value completely. In that sense, the 2nd statement above, and the concatenation-like operators discussed in this thread are absolutely not mutating: they define the final value and that is all. From that perspective, the ability to pipeline based on partial knowledge of that value is somewhat accidental.

--

From hategan at mcs.anl.gov Sat Jun 18 19:42:26 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 18 Jun 2011 17:42:26 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308428431.30967.1.camel@blabla>
Message-ID: <1308444146.17227.14.camel@blabla>

On Sun, 2011-06-19 at 00:15 +0000, Ben Clifford wrote:
> > Use the size of the array at the time of the append.
>
> might introduce non-deterministic behaviour when used with restarts when
> the index is used later

Fair point.

> (for example, to determine a filename) as the
> index is then a race-based random number, I think. That's what I was
> getting at with my comment about how things were done in the past with
> restarts: I think that it's possible to make restart-proofed identifiers
> that look something like a combination of the position in the source file
> combined with an index for any containing code that causes the same source
> text to be executed multiple times (eg. each time you have a foreach,
> stick the index onto the end of the identifier; each time you call a
> function, stick the source position of the call onto the end of the
> identifier).

So then the index would be the thread index. I think that might work.
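
The difference between the two index schemes discussed above can be sketched in Java, the language Swift itself is implemented in. This is only an illustration under stated assumptions: the class and method names (RestartSafeIndex, appendBySize, pathKey) are hypothetical and are not part of Swift or Karajan, the "L12-0-1" key format is invented for the example, and arrival order is simulated with a plain argument list instead of real threads.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; none of these names exist in Swift or Karajan.
class RestartSafeIndex {

    // Size-based append: the key is the map size at the moment a value
    // arrives, so the key a given value receives depends on arrival order.
    static Map<Integer, String> appendBySize(String... arrivals) {
        Map<Integer, String> array = new TreeMap<>();
        for (String value : arrivals) {
            array.put(array.size(), value);
        }
        return array;
    }

    // Path-based key: source position of the assignment plus the indices of
    // the enclosing foreach iterations. It does not depend on arrival order.
    static String pathKey(int sourceLine, int... loopIndices) {
        StringBuilder key = new StringBuilder("L").append(sourceLine);
        for (int index : loopIndices) {
            key.append('-').append(index);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        // The same two values, completing in two different orders.
        System.out.println(appendBySize("f(0,0)", "f(0,1)")); // {0=f(0,0), 1=f(0,1)}
        System.out.println(appendBySize("f(0,1)", "f(0,0)")); // {0=f(0,1), 1=f(0,0)}
        // Key for iteration (0,1) of an assignment assumed to sit on line 12.
        System.out.println(pathKey(12, 0, 1)); // L12-0-1
    }
}
```

With the size-based scheme the keys are unique within one run, but a restart that completes the same tasks in a different order hands the same value a different key. The path-based key is a function of the program text and loop indices only, so it would be the same in every run, which is the property needed when the index later feeds into something like a filename.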
From hategan at mcs.anl.gov Sat Jun 18 19:57:32 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 18 Jun 2011 17:57:32 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk>
Message-ID: <1308445052.17227.25.camel@blabla>

On Sun, 2011-06-19 at 01:36 +0100, Ben Clifford wrote:
> On Jun 19, 2011, at 1:21 AM, Mihael Hategan wrote:
>
> > I wasn't discussing an issue of semantics. I think we agree on that. I'm
> > simply saying that I prefer one syntax to the other
>
> ok. I'm not overly enthused about the syntax either way.
>
> >>
> >> I personally find the idea of an expression that looks like a mutating
> >> concatenation, but isn't really, to be more distasteful.
> >
> > Except it is a mutating concatenation. But it's one that is acceptable
> > because there is no way to have code that is nondeterministic because of
> > it. It's a non-destructive mutation.
>
> do you regard the 2nd statement in:
>
> a[0] = 4343;
> a[1] = 54354;
>
> to be mutating?

Looks like it: a_without_it != a_with_it (where it = second statement). Though, again, swift doesn't allow you to compare a before with a after. There is no explicit or implicit notion of time or time sequence.

> if so, then a += 54354; is also a mutating concatenation in that
> definition of the word, and I agree with you. Otherwise I disagree
> with you.

Right. The statement above applies equally well.

>
> But my perspective is that in a swift run, the 'a' array above has a
> single value, described as: "the 0th element is 4343, and the 1st
> element is 54354" and that the value becomes more accurately known as
> the run progresses until such time as you know its value completely.
> In that sense, the 2nd statement above, and the concatenation-like
> operators discussed in this thread are absolutely not mutating: they
> define the final value and that is all. From that perspective, the
> ability to pipeline based on partial knowledge of that value is
> somewhat accidental.
>

I agree. Point being that both may look like a mutating operation in some arbitrary non-swift language, but they are what they are in swift. So I don't think that a = a + [1] looks more mutating than a[next_index] = 1. Given that they are equivalent from that perspective, the remaining deciding factors should be unrelated to this issue.

From davidkelly999 at gmail.com Sat Jun 18 21:21:49 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Sat, 18 Jun 2011 21:21:49 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308426730.30334.0.camel@blabla>
References: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> <1308340585.14231.0.camel@blabla> <1308355792.24489.2.camel@blabla> <1308356386.24582.2.camel@blabla> <1308358465.24760.10.camel@blabla> <1308367911.25750.0.camel@blabla> <1308368849.25915.3.camel@blabla> <1308370144.26103.1.camel@blabla> <1308426730.30334.0.camel@blabla>
Message-ID:

Here's one I got with the latest version tonight:

2011-06-18 21:01:34
Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode):

"Attach Listener" daemon prio=10 tid=0x087d6c00 nid=0x882 runnable [0x00000000]
   java.lang.Thread.State: RUNNABLE

   Locked ownable synchronizers:
        - None

"Progress ticker" daemon prio=10 tid=0x9e854400 nid=0x85e waiting on condition [0x9dfad000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.griphyn.vdl.karajan.lib.RuntimeStats$ProgressTicker.run(RuntimeStats.java:141)

   Locked ownable synchronizers:
        - None

"Restart Log Sync" daemon prio=10 tid=0x9e82bc00 nid=0x85d in Object.wait() [0x9dffe000]
   java.lang.Thread.State: WAITING (on object monitor)
        at
        java.lang.Object.wait(Native Method)
        - waiting on <0xaedb4778> (a org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread)
        at java.lang.Object.wait(Object.java:485)
        at org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:47)
        - locked <0xaedb4778> (a org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread)

   Locked ownable synchronizers:
        - None

"Overloaded Host Monitor" daemon prio=10 tid=0x08b19400 nid=0x85c waiting on condition [0x9e15c000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47)

   Locked ownable synchronizers:
        - None

"Timer-0" daemon prio=10 tid=0x08354400 nid=0x85b in Object.wait() [0x9e1ad000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0xaf5e2c38> (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Timer.java:509)
        - locked <0xaf5e2c38> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:462)

   Locked ownable synchronizers:
        - None

"NBS0" daemon prio=10 tid=0x087de000 nid=0x85a waiting on condition [0x9e1fe000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0xaf5e3628> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
        - None

"pool-1-thread-8" prio=10 tid=0x08445c00 nid=0x859 waiting for monitor entry [0x9e369000]
   java.lang.Thread.State:
   BLOCKED (on object monitor)
        at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25)
        - waiting to lock <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap)
        at org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516)
        - locked <0xaed9c108> (a org.griphyn.vdl.mapping.RootArrayDataNode)
        at org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121)
        at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49)
        - locked <0xaed9c108> (a org.griphyn.vdl.mapping.RootArrayDataNode)
        at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67)
        at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
        at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
        at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
        at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48)
        at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
        at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
        at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
        at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71)
        at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
        at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
        at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
        at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28)
        at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194)
        at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214)
        at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58)
        at
org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - <0xaf5e3b10> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) "pool-1-thread-7" prio=10 tid=0x08443000 nid=0x858 waiting on condition [0x9e3ba000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf5ec2b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at 
java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-6" prio=10 tid=0x08441800 nid=0x857 waiting for monitor entry [0x9e40b000] java.lang.Thread.State: BLOCKED (on object monitor) at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) - waiting to lock <0xaed9c108> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) at org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) - locked <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - <0xaf5ec3a0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync) 
"pool-1-thread-5" prio=10 tid=0x085d2000 nid=0x856 waiting on condition [0x9e45c000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf5ec2b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-4" prio=10 tid=0x085d2800 nid=0x855 waiting on condition [0x9e4ad000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf5ec2b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-3" prio=10 tid=0x08839400 nid=0x854 waiting on condition [0x9e65c000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf5ec2b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at 
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-2" prio=10 tid=0x08837800 nid=0x853 waiting on condition [0x9e6ad000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf5ec2b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None "pool-1-thread-1" prio=10 tid=0x9e826c00 nid=0x852 waiting on condition [0x9e6fe000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xaf5ec2b0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987) at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907) at java.lang.Thread.run(Thread.java:662) Locked ownable synchronizers: - None 
"Hang checker" prio=10 tid=0x9e81bc00 nid=0x851 waiting for monitor entry [0x9e4fe000] java.lang.Thread.State: BLOCKED (on object monitor) at org.griphyn.vdl.karajan.Monitor.dumpVariables(Monitor.java:220) - waiting to lock <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.HangChecker.run(HangChecker.java:54) at java.util.TimerThread.mainLoop(Timer.java:512) at java.util.TimerThread.run(Timer.java:462) Locked ownable synchronizers: - None "Low Memory Detector" daemon prio=10 tid=0x08235800 nid=0x84f runnable [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "CompilerThread1" daemon prio=10 tid=0x9f4a9800 nid=0x84e waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "CompilerThread0" daemon prio=10 tid=0x9f4a7800 nid=0x84d waiting on condition [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Signal Dispatcher" daemon prio=10 tid=0x9f4a5c00 nid=0x84c runnable [0x00000000] java.lang.Thread.State: RUNNABLE Locked ownable synchronizers: - None "Finalizer" daemon prio=10 tid=0x9f497400 nid=0x84b in Object.wait() [0x9f194000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xa398ce08> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) - locked <0xa398ce08> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) Locked ownable synchronizers: - None "Reference Handler" daemon prio=10 tid=0x9f496000 nid=0x84a in Object.wait() [0x9f1e5000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xa397b6e8> (a java.lang.ref.Reference$Lock) at java.lang.Object.wait(Object.java:485) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116) - locked 
<0xa397b6e8> (a java.lang.ref.Reference$Lock) Locked ownable synchronizers: - None "main" prio=10 tid=0x08224400 nid=0x844 in Object.wait() [0xb6a06000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xaf50ac30> (a org.griphyn.vdl.karajan.VDL2ExecutionContext) at java.lang.Object.wait(Object.java:485) at org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:226) - locked <0xaf50ac30> (a org.griphyn.vdl.karajan.VDL2ExecutionContext) at org.griphyn.vdl.karajan.Loader.main(Loader.java:201) Locked ownable synchronizers: - None "VM Thread" prio=10 tid=0x9f492400 nid=0x849 runnable "GC task thread#0 (ParallelGC)" prio=10 tid=0x0822b800 nid=0x845 runnable "GC task thread#1 (ParallelGC)" prio=10 tid=0x0822cc00 nid=0x846 runnable "GC task thread#2 (ParallelGC)" prio=10 tid=0x0822e400 nid=0x847 runnable "GC task thread#3 (ParallelGC)" prio=10 tid=0x0822f800 nid=0x848 runnable "VM Periodic Task Thread" prio=10 tid=0x9f4b4000 nid=0x850 waiting on condition JNI global references: 1392 Found one Java-level deadlock: ============================= "pool-1-thread-8": waiting to lock monitor 0x08b1859c (object 0xaf5ed6c8, a org.griphyn.vdl.karajan.WrapperMap), which is held by "pool-1-thread-6" "pool-1-thread-6": waiting to lock monitor 0x9e89178c (object 0xaed9c108, a org.griphyn.vdl.mapping.RootArrayDataNode), which is held by "pool-1-thread-8" Java stack information for the threads listed above: =================================================== "pool-1-thread-8": at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) - waiting to lock <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) - locked <0xaed9c108> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) at org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) - 
locked <0xaed9c108> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at 
org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) "pool-1-thread-6": at org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) - waiting to lock <0xaed9c108> (a org.griphyn.vdl.mapping.RootArrayDataNode) at org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) at org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) - locked <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) at org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at 
org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Found 1 deadlock. On Sat, Jun 18, 2011 at 2:52 PM, Mihael Hategan wrote: > Post the entire output of jstack please. > > On Sat, 2011-06-18 at 00:08 -0500, Alberto Chavez wrote: > > I already did the svn updates for cog, and swift, rebuilt swift with > > ant redist, and ant clean + ant dist, > > the test keeps hanging, but I'm missing probably something: > > > > > > $ svn update cog > > At revision 3167. > > > > > > $ cd cog/modules/ > > $ svn update swift > > At revision 4632. > > > > > > I did ant redist, and it was successfully built, but the test is still > > hanging, now it hung on the 11th iteration. However I'm not quite sure > > if the svn was properly updated, since I did > > > > > > $ jstack -l 11471 | grep addListener > > at > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > at > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > > > > > but Mihael Mentioned that addListener is not AbstractDataNode in the > > newer version. > > Any thoughts on that? > > > > > > Alberto. > > > Subject: RE: [Swift-devel] Swift unresponsive while using local > > provider. > > > From: hategan at mcs.anl.gov > > > To: alberto_chavez at live.com > > > CC: ketancmaheshwari at gmail.com; swift-devel at ci.uchicago.edu > > > Date: Fri, 17 Jun 2011 21:09:04 -0700 > > > > > > On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote: > > > > Oops, Too late. > > > > Do i need to do ant clean , and ant dist again? 
> > >
> > > You're probably fine in most cases with just "ant dist". But if you want
> > > to be sure, do what Jonathan is saying: "ant redist"
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel

From aespinosa at cs.uchicago.edu Sat Jun 18 22:07:09 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Sat, 18 Jun 2011 21:07:09 -0600
Subject: [Swift-devel] Swift log analysis tools
In-Reply-To: <1447910017.21197.1308337470757.JavaMail.root@zimbra.anl.gov>
References: <1341126804.20938.1308333874067.JavaMail.root@zimbra.anl.gov> <1447910017.21197.1308337470757.JavaMail.root@zimbra.anl.gov>
Message-ID:

I indicate individual *.event files from the swift-plot-log command for each type of metric I need. Then I pass these files to a bunch of R scripts for summarizing or plotting that is not normally available in the suite.

For developing new filters, I usually grep out the log4j class entry I need and then write my filter scripts from there.

-Allan

2011/6/17 Michael Wilde :
>
> In discussing the SCEC workflow with Ketan, it became clear that a tool to extract useful debugging information from the Swift run .log file would be very useful in diagnosing failures and understanding better how the run is behaving.
>
> Ketan is about to make a few log filter scripts to tease out and format this info.
>
> Does anyone have any similar tools that would be useful for this purpose?
>
> Do the new (or old) log plotting tools have any preprocessing scripts that can be used for this?
>
> Should we have common scripts to reformat the log for both human analysis and performance plotting?
>
> - Mike

--
Allan M. Espinosa
PhD student, Computer Science
University of Chicago

From tim.g.armstrong at gmail.com Sun Jun 19 01:32:11 2011
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Sat, 18 Jun 2011 23:32:11 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308445052.17227.25.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla>
Message-ID:

Maybe I missed something, but from the discussion I'm not quite clear about what the use case is.

It seems like there are three possibilities for the data structure:

1 Order is completely unimportant (i.e. the data structure is a set or multi-set)
2 Relative order is important (i.e. if the assignment happens in a later iteration of a for loop, it should be later in the list)
3 Absolute order is important - i.e. we're using explicit array indices

It seems like you're proposing 2, which seems a bit odd to me in the context of swift as it adds some kind of sequential dependency between iterations of a for loop when determining array indices.

The idea of building a nested (ragged) array and then flattening it seems more swift-like to me.

I.e.

foreach i in a {
  foreach j in b {
    output[i][j] = f(i,j)
  }
}
file output2[];
output2 = flatten(output)

That is kind of clunky though.
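The nest-then-flatten idea can be sketched outside Swift. This is only a Python illustration under stated assumptions: `f`, `a` and `b` are placeholders, and `flatten` here is a hypothetical helper, not an existing Swift built-in. The point it tries to show is that every cell is addressed by explicit indices, so the fill order does not matter, and the flat order is fixed afterwards by sorting the keys.

```python
def f(i, j):
    # Hypothetical stand-in for the application call in the example.
    return f"f({i},{j})"

a = [0, 1, 2]
b = [0, 1]

# Nested form: output[i][j] = f(i, j). Each cell has an explicit (i, j)
# address, so the iterations could complete in any order.
output = {}
for i in reversed(a):          # deliberately fill in a "wrong" order
    for j in reversed(b):
        output.setdefault(i, {})[j] = f(i, j)

def flatten(nested):
    """Collapse a two-level mapping into one list, visiting keys in sorted order."""
    return [nested[i][j] for i in sorted(nested) for j in sorted(nested[i])]

output2 = flatten(output)
print(output2)
```

Because `flatten` sorts the keys, `output2` comes out the same no matter which order the cells were filled in, which is the property the nested form buys.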
Would it be possible to have some kind of syntax for implicitly flattening arrays, e.g.:

file output2[];
foreach i in a {
  foreach j in b {
    output[i,j] = f(i,j)
  }
}

- Tim

On Sat, Jun 18, 2011 at 5:57 PM, Mihael Hategan wrote:

> On Sun, 2011-06-19 at 01:36 +0100, Ben Clifford wrote:
> > On Jun 19, 2011, at 1:21 AM, Mihael Hategan wrote:
> >
> > > I wasn't discussing an issue of semantics. I think we agree on that. I'm
> > > simply saying that I prefer one syntax to the other
> >
> > ok. I'm not overly enthused about the syntax either way.
> >
> > >>
> > >> I personally find the idea of expressing what looks like a mutating
> > >> concatenation but isn't really to be more distasteful.
> > >
> > > Except it is a mutating concatenation. But it's one that is acceptable
> > > because there is no way to have code that is nondeterministic because of
> > > it. It's a non-destructive mutation.
> >
> > do you regard the 2nd statement in:
> >
> > a[0] = 4343;
> > a[1] = 54354;
> >
> > to be mutating?
>
> Looks like it: a_without_it != a_with_it (where it = second statement),
> Though, again, swift doesn't allow you to compare a before with a after.
> There is no explicit or implicit notion of time or time sequence.
>
> > if so, then a += 54354; is also a mutating concatenation in that
> > definition of the word, and I agree with you. Otherwise I disagree
> > with you.
>
> Right. The statement above applies equally well.
>
> >
> > But my perspective is that in a swift run, the 'a' array above has a
> > single value, described as: "the 0th element is 4343, and the 1st
> > element is 54354" and that the value becomes more accurately known as
> > the run progresses until such time as you know its value completely.
> > In that sense, the 2nd statement above, and the concatenation-like
> > operators discussed in this thread are absolutely not mutating: they
> > define the final value and that is all.
> > From that perspective, the
> > ability to pipeline based on partial knowledge of that value is
> > somewhat accidental.
> >
> I agree. Point being that both may look like a mutating operation in
> some arbitrary non-swift language, but they are what they are in swift.
> So I don't think that a = a + [1] looks more mutating than a[next_index]
> = 1. Given that they are equivalent from that perspective, the remaining
> deciding factors should be unrelated to this issue.

From benc at hawaga.org.uk Sun Jun 19 02:41:16 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sun, 19 Jun 2011 08:41:16 +0100
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla>
Message-ID: <6D8ECEE9-63D8-49DC-A467-393C9630B182@hawaga.org.uk>

On Jun 19, 2011, at 7:32 AM, Tim Armstrong wrote:

> Maybe I missed something, but from the discussion I'm not quite clear about what the use case is.
>
> It seems like there are three possibilities for the data structure:
>
> 1 Order is completely unimportant (i.e. the data structure is a set or multi-set)
> 2 Relative order is important (i.e. if the assignment happens in a later iteration of a for loop, it should be later in the list)
> 3 Absolute order is important - i.e. we're using explicit array indices
>
> It seems like you're proposing 2, which seems a bit odd to me in the context of swift as it adds some kind of sequential dependency between iterations of a for loop when determining array indices.
I was proposing 1 from the above. The key difference from what I think you've seen (and a bit of the basis of what hategan and I were debating about) was in the nature of the fabricated key. My proposal was that the fabricated key is allocated in a loop in some way that does not need any of the other iterations to have executed (that's the thread ID or the program context + loop context stuff that was mentioned in this thread). That's going to look a bit pseudo-random to someone who doesn't know the details of swift's execution internals.

I was, I guess, suggesting a syntax that means:

> That is kind of clunky though. Would it be possible to have some kind of syntax for implicitly flattening arrays, e.g.:
...
> output[i,j] = f(i,j)

"order is completely unimportant for me, so make up this [i,j] for me so that I don't have to think if I've accurately captured the enclosing loop context"

--

From hategan at mcs.anl.gov Sun Jun 19 03:18:07 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sun, 19 Jun 2011 01:18:07 -0700
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla>
Message-ID: <1308471487.19473.51.camel@blabla>

On Sat, 2011-06-18 at 23:32 -0700, Tim Armstrong wrote:
> Maybe I missed something, but from the discussion I'm not quite clear
> about what the use case is.
>
> It seems like there are three possibilities for the data structure:
>
> 1 Order is completely unimportant (i.e. the data structure is a set or
> multi-set)

That is essentially the case. Arrays are maps. However Ben points out the issue of restart logs. So let me explain that a bit.

There can be different dynamic scopes for variables.
Consider the following example:

function g(a) {
  for v in a {
    file x = f(v);
    ...
  }
}

And imagine that some of the iterations have completed invoking f(v) and some haven't. Also picture that there are multiple parallel invocations of g(). The question is how can one record and distinguish which x-es have been computed and which haven't.

One relatively simple answer is that given that all invocations of g() are parallel and all iterations are parallel, one can picture each scope as a node in a tree, and each spawned thread in that scope as a branch. If branches are numbered sequentially then it would be sufficient to record the path in the tree from the root to the scope of the x we're interested in. We call this path a "thread prefix" and its property is that no two different scopes have the same thread prefix. This is something internal to swift.

The issue that can arise however is that if, upon a restart, the correspondence between the thread prefix and the actual iteration value is not kept, then we may end up with mismatched stuff (this is only an issue with iterations, since different invocations of g() are labeled based on their lexical position). So either we make sure that the order of iteration is the same, or alternatively use the actual iteration index to construct the thread prefix. We do the latter.

So normally there is no issue since every new array index is a swift expression which is deterministic. But if we do:

file a[];
for i in [1:2] {
  a = a + [f(i)]; // instead of a[i] = f(i);
}

then the order in which things are put in a is nondeterministic (it is in both cases). In the commented out case one would log that, say, a[1] in thread-prefix 0 (the variable was declared in the global scope) is already calculated. Which is fine. A restart would check that a[1] was calculated and it could only have the value of f(1).
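The difference between the two assignment forms can be sketched in a few lines. This is a Python illustration rather than anything taken from Swift's internals: `f`, `keyed_run` and `appended_run` are hypothetical names, and "completion order" simply stands for whichever order the parallel iterations happen to finish in.

```python
import itertools

def f(i):
    # Hypothetical stand-in for the application call.
    return f"f({i})"

def keyed_run(completion_order):
    """Models a[i] = f(i): the index is the iteration value itself."""
    a = {}
    for i in completion_order:
        a[i] = f(i)
    return a

def appended_run(completion_order):
    """Models a = a + [f(i)]: the index is the next free slot at completion time."""
    a = {}
    for i in completion_order:
        a[len(a)] = f(i)
    return a

# Both possible completion orders for iterations 1 and 2.
orders = list(itertools.permutations([1, 2]))
keyed = [keyed_run(order) for order in orders]
appended = [appended_run(order) for order in orders]

print(keyed[0] == keyed[1])        # the index-to-value mapping ignores timing
print(appended[0] == appended[1])  # the mapping changes with completion order
```

With explicit indices, a log entry saying "a[1] is done" identifies the value on its own. With appending, the same entry would also have to say which f(i) landed in that slot.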
In the non-commented out case, well, it depends on whether you assign based on iteration sequence or completion of f() sequence, but in either case it's nondeterministic, so a[0] could contain either f(1) or f(2). Simply checking that a[0] was calculated is not sufficient to restore the state. The correspondence between a[0] and f(1) or f(2) also needs to be established. That makes things more complicated and we don't want that. Also, it makes swift programs nondeterministic and we don't want that either.

> 2 Relative order is important (i.e. if the assignment happens in a
> later iteration of a for loop, it should be later in the list)
> 3 Absolute order is important - i.e. we're using explicit array
> indices
>
> It seems like you're proposing 2, which seems a bit odd to me in the
> context of swift as it adds some kind of sequential dependency between
> iterations of a for loop when determining array indices.

I guess what's being proposed, and I'm beginning to forget what the overall goal was, is a way to not care about indices. And when you don't care about them, they may as well be sequential since it's straightforward to implement. I mean I don't see why a random sequence is better than a sequence here. The random sequence is essentially a hash of a sequence, so I only see it as introducing nonsense. We're not trying to "secure" the indices.

But looking back at the initial email, I think that we should dig into the assumptions that prompted this discussion. There has been a lot of talking on how to do this, but almost nothing on why we should do it.

>
> The idea of building a nested (ragged) array and then flattening it
> seems more swift-like to me.
>
> Ie.
> foreach i in a {
>   foreach j in b {
>     output[i][j] = f(i,j)
>   }
> }
> file output2[];
> output2 = flatten(output)
>
> That is kind of clunky though.
> Would it be possible to have some kind
> of syntax for implicitly flattening arrays, e.g.:
>
> file output2[];
> foreach i in a {
>   foreach j in b {
>     output[i,j] = f(i,j)
>   }
> }
>

If you have an upper bound n for j, then I guess i*n + j would work. Alternatively if we allow array keys or tuple keys, output[[i, j]]/output[(i, j)] might work. Or we could provide some magic injective f(i, j) to be used as index for the flattened array.

Mihael

From benc at hawaga.org.uk Mon Jun 20 03:02:01 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Mon, 20 Jun 2011 09:02:01 +0100
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308471487.19473.51.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla>
Message-ID:

On Jun 19, 2011, at 9:18 AM, Mihael Hategan wrote:

> But looking back at the initial email, I think that we should dig into
> the assumptions that prompted this discussion. There has been a lot of
> talking on how to do this, but almost nothing on why we should do it.

My understanding is it's the "collect all pairs with same key" bit of the below wikipedia quote. It seems superficially to look quite nice, but I'd be interested in seeing what the reduce bit would look like.

=====
http://en.wikipedia.org/wiki/MapReduce#Logical_view

The Map and Reduce functions of MapReduce are both defined with respect to data structured in (key, value) pairs. Map takes one pair of data with a type in one data domain, and returns a list of pairs in a different domain:

Map(k1,v1) → list(k2,v2)

The Map function is applied in parallel to every item in the input dataset. This produces a list of (k2,v2) pairs for each call.
After that, the MapReduce framework collects all pairs with the same key from all lists and groups them together, thus creating one group for each one of the different generated keys. The Reduce function is then applied in parallel to each group, which in turn produces a collection of values in the same domain: Reduce(k2, list (v2)) → list(v3) Each Reduce call typically produces either one value v3 or an empty return, though one call is allowed to return more than one value. The returns of all calls are collected as the desired result list. ==== From hategan at mcs.anl.gov Mon Jun 20 04:44:45 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 20 Jun 2011 02:44:45 -0700 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> Message-ID: <1308563085.9001.31.camel@blabla> That's some crappy writing, but if I'm reading it right and in Haskell lingo: map_ :: (Tk1, Tv1) -> [(Tk2, Tv2)] There is an automatic collect_ :: [(Tk2, Tv2)] -> (Map Tk2 [Tv2]) reduce_ :: (Tk2, [Tv2]) -> [Tv3] There's an input: (Map Tk1 Tv1) and the system does: intermediate = map (map_) toList input result = foldl (++) [] map (reduce_) toList collect_ intermediate So it does seem like the shapes of the arrays can depend in non-trivial ways on the algorithm. On Mon, 2011-06-20 at 09:02 +0100, Ben Clifford wrote: > On Jun 19, 2011, at 9:18 AM, Mihael Hategan wrote: > > > But looking back at the initial email, I think that we should dig into > > the assumptions that prompted this discussion. There has been a lot of > > talking on how to do this, but almost nothing on why we should do it. > > My understanding is it's the "collect all pairs with same key" bit of the below wikipedia quote.
It seems superficially to look quite nice, but I'd be interested in seeing what the reduce bit would look like. > > ===== http://en.wikipedia.org/wiki/MapReduce#Logical_view > The Map and Reduce functions of MapReduce are both defined with respect to data structured in (key, value) pairs. Map takes one pair of data with a type in one data domain, and returns a list of pairs in a different domain: > Map(k1,v1) → list(k2,v2) > The Map function is applied in parallel to every item in the input dataset. This produces a list of (k2,v2) pairs for each call. After that, the MapReduce framework collects all pairs with the same key from all lists and groups them together, thus creating one group for each one of the different generated keys. > The Reduce function is then applied in parallel to each group, which in turn produces a collection of values in the same domain: > Reduce(k2, list (v2)) → list(v3) > Each Reduce call typically produces either one value v3 or an empty return, though one call is allowed to return more than one value. The returns of all calls are collected as the desired result list. > ==== From yadudoc1729 at gmail.com Mon Jun 20 15:37:29 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Tue, 21 Jun 2011 02:07:29 +0530 Subject: [Swift-devel] Associative arrays issue Message-ID: Hi, I've come across a very curious situation in which array["key"][1] works while array["key"]["id1"] fails. I was trying to create unique ids using UUIDs cast to strings when I came across this. Any help would be appreciated. I've attached the logs to this mail. script: int array[ ][ ]; array["key"]["a"] = 10; array["key"]["b"] = 20; array["key"]["c"] = 30; foreach value, index in array["key"] { trace(value); } Output: Progress: time: Tue, 21 Jun 2011 02:00:33 +0530 Execution failed: For input string: "b" -- Thanks and Regards, Yadu Nand B -------------- next part -------------- A non-text attachment was scrubbed...
Name: assoc_array-20110621-0200-4y90zv88.log Type: text/x-log Size: 12479 bytes Desc: not available URL: From benc at hawaga.org.uk Tue Jun 21 04:11:32 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Tue, 21 Jun 2011 11:11:32 +0200 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: <1308563085.9001.31.camel@blabla> References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> Message-ID: <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> On Jun 20, 2011, at 11:44 AM, Mihael Hategan wrote: > That's some crappy writing, but if I'm reading it right and in haskell > lingo: > > map_ :: (Tk1 Tv1) -> [(Tk2, Tv2)] > There is an automatic collect_:: [(Tk2, Tv2)] -> (Map Tk2 [Tv2]) > reduce_ :: (Tk2 [Tv2]) -> [Tv3] > > There's an input: (Map Tk1 Tv1) and the system does: > intermediate = map (map_) toList input > result = foldl (++) [] map (reduce_) toList collect_ intermediate > > So it does seem like the shapes of the arrays can depend in non-trivial > ways on the algorithm. the above looks ok. I think swift arrays can pretty much handle the shapes of array needed. When Tk1 and Tk2, the key types of the input data and intermediate data, are numeric, then swift already has that. But what would a whole map-reduce like program look like in Swift? Trying to write out a (non-contrived) example of "this is the program that I think I should be able to write, even though Swift can't run it today" would be interesting, perhaps. I found that very interesting when arguing how while loops were useless in Swift all those years ago, and I think there are some interesting issues waiting to be discovered around how you would actually implement a useful map body and reduce body. 
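[Editorial sketch, not part of the original thread.] As a rough illustration of the map_/collect_/reduce_ shapes quoted above, here is a minimal Python sketch. It is not Swift and not anything the list members wrote; the names map_reduce, count_map and count_reduce, and the two sample "files", are assumptions made up for the example. Word counting is used as the map and reduce bodies because it is the usual MapReduce illustration.

```python
from collections import defaultdict
from itertools import chain


def collect(pairs):
    """Group (k2, v2) pairs by key: [(k2, v2)] -> {k2: [v2]}."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def map_reduce(map_fn, reduce_fn, inputs):
    """Run map_fn over every (k1, v1), group the output by k2, reduce each group."""
    intermediate = chain.from_iterable(map_fn(k, v) for k, v in inputs.items())
    groups = collect(intermediate)
    return list(chain.from_iterable(reduce_fn(k, vs) for k, vs in groups.items()))


# Hypothetical map body: emit (word, 1) for every word in one file.
def count_map(filename, text):
    return [(word, 1) for word in text.split()]


# Hypothetical reduce body: sum the counts collected for one word.
def count_reduce(word, counts):
    return [(word, sum(counts))]


# Made-up input standing in for (Map Tk1 Tv1): file name -> file contents.
files = {"a.txt": "to be or not to be", "b.txt": "be quick"}
result = dict(map_reduce(count_map, count_reduce, files))
```

Note that collect() here relies on list append, which is exactly the operation whose place in Swift is under discussion in this thread.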
From hategan at mcs.anl.gov Tue Jun 21 05:05:27 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 21 Jun 2011 03:05:27 -0700 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> Message-ID: <1308650727.10855.32.camel@blabla> On Tue, 2011-06-21 at 11:11 +0200, Ben Clifford wrote: > On Jun 20, 2011, at 11:44 AM, Mihael Hategan wrote: > > > That's some crappy writing, but if I'm reading it right and in haskell > > lingo: > > > > map_ :: (Tk1 Tv1) -> [(Tk2, Tv2)] > > There is an automatic collect_:: [(Tk2, Tv2)] -> (Map Tk2 [Tv2]) > > reduce_ :: (Tk2 [Tv2]) -> [Tv3] > > > > There's an input: (Map Tk1 Tv1) and the system does: > > intermediate = map (map_) toList input > > result = foldl (++) [] map (reduce_) toList collect_ intermediate > > > > So it does seem like the shapes of the arrays can depend in non-trivial > > ways on the algorithm. > > the above looks ok. I think swift arrays can pretty much handle the > shapes of array needed. When Tk1 and Tk2, the key types of the input > data and intermediate data, are numeric, then swift already has that. Right. I'm saying though that I see where Yadu's and Justin's assessment about the need for an array append mechanism is coming from. 
But then, on second thought, since we don't care about indices, we could do (and I'm taking some liberties here with explicitly specifying array key types): type T2 { Tk2 k; Tv2 v; } (Tv2 r[Tk2][]) collect(T2 input[]) { foreach v, i in input { r[v.k][i] = v.v; } } type Tk3 { Tk2 k1; int k2; } (Tv3 r[Tk3]) reduce(Tv2 i[Tk2][]) { foreach v2array, k in i { x = reduce_(k, v2array); foreach y, j in x { Tk3 k3; k3.k1 = k; k3.k2 = j; r[k3] = y; } } } We can probably do without arbitrarily typed keys if all TkX are int, but in the case of word counting (see below), that won't quite be the case. > > But what would a whole map-reduce like program look like in Swift? > Trying to write out a (non-contrived) example of "this is the program > that I think I should be able to write, even though Swift can't run it > today" would be interesting, perhaps. I found that very interesting > when arguing how while loops were useless in Swift all those years > ago, and I think there are some interesting issues waiting to be > discovered around how you would actually implement a useful map body > and reduce body. I think the typical example (if I remember correctly) used in the map/reduce paper is a reasonable case: count the number of occurrences of each word in a set of files. In a sense, this can be easy if we don't insist that this be implemented in the exact same way as with m/r. Then one can say something along the lines of: result = uniq(sort(cat(map(count_words, files)))) (of course, with map being a foreach and so on). But if we need a collect() with the same semantics as the one in m/r, then it's a bit of a different story, I think.
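[Editorial sketch, not part of the original thread.] The collect()/reduce() pseudocode in the message above avoids any append operation by reusing the position in the input as the inner index, and by keying the reduce output on a composite (group key, position) pair. A minimal Python sketch of that idea follows, with nested dicts standing in for sparse Swift arrays; the function names and sample data are assumptions for illustration only.

```python
def collect_by_index(pairs):
    """Mirror of r[v.k][i] = v.v: the inner index is the position in the input,
    so every write targets a distinct slot and nothing is ever appended."""
    r = {}
    for i, (k, v) in enumerate(pairs):
        r.setdefault(k, {})[i] = v
    return r


def reduce_all(groups, reduce_fn):
    """Mirror of r[k3] = y with k3 = (k, j): outputs are keyed by a composite
    of the group key and the position within that group's reduce result."""
    r = {}
    for k, sparse in groups.items():
        for j, y in enumerate(reduce_fn(k, list(sparse.values()))):
            r[(k, j)] = y
    return r


# Made-up (word, count) pairs as a map step might emit them.
pairs = [("be", 1), ("to", 1), ("be", 1)]
groups = collect_by_index(pairs)
totals = reduce_all(groups, lambda key, values: [sum(values)])
```

Here groups comes out as {"be": {0: 1, 2: 1}, "to": {1: 1}}: the inner arrays are sparse, which is harmless as long as nobody cares about the indices, and the keys of totals are tuples, which is where arbitrarily typed keys come in.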
From benc at hawaga.org.uk Tue Jun 21 05:27:32 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Tue, 21 Jun 2011 12:27:32 +0200 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: <1308650727.10855.32.camel@blabla> References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> Message-ID: <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk> > > I think the typical example (if I remember correctly) used in the > map/reduce paper is a reasonable case: count the number of occurrences > of each word in a set of files. > > In a sense, this can be easy if we don't insist that this be implemented > in the exact same way as with m/r. The "can be done but not like mapreduce" is something that hangs around the edge of the map-reduce-in-swift idea the whole time. Is there a reason to want to do it other than fascination-with-google? > Then one can say something along the > lines of: > > result = uniq(sort(cat(map(count_words, files)))) (of course, with map > being a foreach and so on). > > But if we need a collect() with the same semantics as the one in m/r, > then it's a bit of a different story I think. Right. From yadudoc1729 at gmail.com Tue Jun 21 10:16:50 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Tue, 21 Jun 2011 20:46:50 +0530 Subject: [Swift-devel] Re: Associative arrays issue In-Reply-To: References: Message-ID: > I've come across a very curious situation in which > array["key"][1] works while array["key"]["id1"] fails. I've found that the problem comes not from the arrays but from the foreach construct I was using.
Foreach expects an int for the index, so it fails in this case, where the index is a string. I'm trying to fix this. -- Thanks and Regards, Yadu Nand B From jonmon at utexas.edu Tue Jun 21 15:34:32 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Tue, 21 Jun 2011 15:34:32 -0500 Subject: [Swift-devel] error in Swift Message-ID: <166AA9C7-07BA-4B6B-A3DE-EA4F2736681F@utexas.edu> Hello, I found an error when Swift creates the command line script it uses to execute. Here is the script: #!/bin/bash "/home/jonmon/Library/Montage/bin/mProject" "[-X, raw_dir/raw_image_4.fits, proj_dir/proj_raw_image_4.fits, header.hdr]" 1>"stdout.txt" 2>"stderr.txt" That script was generated by Swift. The "[", "]", and "," are not supposed to be there. Here are the config file and sites file: execution.retries=0 sitedir.keep=true status.mode=provider wrapper.log.always.transfer=true foreach.maxthreads=1024 wrapper.parameter.mode=files use.provider.staging=true provider.staging.pin.swiftfiles=false /gpfs/pads/swift/jonmon/Swift/work/localhost .05 KEEP CI-CCR000013 /gpfs/pads/swift/jonmon/Swift/work/pads 3600 1 500 1 1 fast 5 10000 KEEP Attached is the log file. The error that was reported came from the app call. It reported wrong usage because the command line had those extra symbols in it. I have a feeling it has something to do with the last two lines in the config file, but I am basing that completely on the fact that if I remove those two lines the scripts run to completion. -------------- next part -------------- A non-text attachment was scrubbed... Name: montage.log Type: application/octet-stream Size: 55741 bytes Desc: not available URL: -------------- next part -------------- From hategan at mcs.anl.gov Thu Jun 23 13:28:33 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 23 Jun 2011 13:28:33 -0500 Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: References: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> <1308340585.14231.0.camel@blabla> <1308355792.24489.2.camel@blabla> <1308356386.24582.2.camel@blabla> <1308358465.24760.10.camel@blabla> <1308367911.25750.0.camel@blabla> <1308368849.25915.3.camel@blabla> <1308370144.26103.1.camel@blabla> <1308426730.30334.0.camel@blabla> Message-ID: <1308853713.17296.0.camel@blabla> I committed a tentative fix to svn. swift trunk r4666. On Sat, 2011-06-18 at 21:21 -0500, David Kelly wrote: > Here's one I got with the latest version tonight: > > 2011-06-18 21:01:34 > Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode): > > "Attach Listener" daemon prio=10 tid=0x087d6c00 nid=0x882 runnable > [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "Progress ticker" daemon prio=10 tid=0x9e854400 nid=0x85e waiting on > condition [0x9dfad000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.griphyn.vdl.karajan.lib.RuntimeStats > $ProgressTicker.run(RuntimeStats.java:141) > > Locked ownable synchronizers: > - None > > "Restart Log Sync" daemon prio=10 tid=0x9e82bc00 nid=0x85d in > Object.wait() [0x9dffe000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaedb4778> (a > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) > at java.lang.Object.wait(Object.java:485) > at > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:47) > - locked <0xaedb4778> (a > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) > > Locked ownable synchronizers: > - None > > "Overloaded Host Monitor" daemon prio=10 tid=0x08b19400 nid=0x85c > waiting on condition [0x9e15c000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at > org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47) 
> > Locked ownable synchronizers: > - None > > "Timer-0" daemon prio=10 tid=0x08354400 nid=0x85b in Object.wait() > [0x9e1ad000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaf5e2c38> (a java.util.TaskQueue) > at java.util.TimerThread.mainLoop(Timer.java:509) > - locked <0xaf5e2c38> (a java.util.TaskQueue) > at java.util.TimerThread.run(Timer.java:462) > > Locked ownable synchronizers: > - None > > "NBS0" daemon prio=10 tid=0x087de000 nid=0x85a waiting on condition > [0x9e1fe000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5e3628> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-8" prio=10 tid=0x08445c00 nid=0x859 waiting for monitor > entry [0x9e369000] > java.lang.Thread.State: BLOCKED (on object monitor) > at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) > - waiting to lock <0xaf5ed6c8> (a > org.griphyn.vdl.karajan.WrapperMap) > at > org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) > - locked <0xaed9c108> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) > - locked <0xaed9c108> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > 
org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > 
org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - <0xaf5e3b10> (a java.util.concurrent.locks.ReentrantLock > $NonfairSync) > > "pool-1-thread-7" prio=10 tid=0x08443000 nid=0x858 waiting on > condition [0x9e3ba000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5ec2b0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-6" prio=10 tid=0x08441800 nid=0x857 waiting for monitor > entry [0x9e40b000] > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > - waiting to lock <0xaed9c108> (a > 
org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.DSHandleFutureWrapper.<init>(DSHandleFutureWrapper.java:24) > at > org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) > - locked <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) > at > org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) > at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - <0xaf5ec3a0> (a java.util.concurrent.locks.ReentrantLock > $NonfairSync) > > "pool-1-thread-5" prio=10 tid=0x085d2000 nid=0x856 waiting on > condition [0x9e45c000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5ec2b0> (a >
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-4" prio=10 tid=0x085d2800 nid=0x855 waiting on > condition [0x9e4ad000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5ec2b0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-3" prio=10 tid=0x08839400 nid=0x854 waiting on > condition [0x9e65c000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5ec2b0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) 
> at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-2" prio=10 tid=0x08837800 nid=0x853 waiting on > condition [0x9e6ad000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5ec2b0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "pool-1-thread-1" prio=10 tid=0x9e826c00 nid=0x852 waiting on > condition [0x9e6fe000] > java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xaf5ec2b0> (a > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > at > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > at java.util.concurrent.locks.AbstractQueuedSynchronizer > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > at > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > at > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:907) > at java.lang.Thread.run(Thread.java:662) > > Locked ownable synchronizers: > - None > > "Hang checker" prio=10 tid=0x9e81bc00 nid=0x851 waiting for monitor > entry 
[0x9e4fe000] > java.lang.Thread.State: BLOCKED (on object monitor) > at org.griphyn.vdl.karajan.Monitor.dumpVariables(Monitor.java:220) > - waiting to lock <0xaf5ed6c8> (a > org.griphyn.vdl.karajan.WrapperMap) > at org.griphyn.vdl.karajan.HangChecker.run(HangChecker.java:54) > at java.util.TimerThread.mainLoop(Timer.java:512) > at java.util.TimerThread.run(Timer.java:462) > > Locked ownable synchronizers: > - None > > "Low Memory Detector" daemon prio=10 tid=0x08235800 nid=0x84f runnable > [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "CompilerThread1" daemon prio=10 tid=0x9f4a9800 nid=0x84e waiting on > condition [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "CompilerThread0" daemon prio=10 tid=0x9f4a7800 nid=0x84d waiting on > condition [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "Signal Dispatcher" daemon prio=10 tid=0x9f4a5c00 nid=0x84c runnable > [0x00000000] > java.lang.Thread.State: RUNNABLE > > Locked ownable synchronizers: > - None > > "Finalizer" daemon prio=10 tid=0x9f497400 nid=0x84b in Object.wait() > [0x9f194000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xa398ce08> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) > - locked <0xa398ce08> (a java.lang.ref.ReferenceQueue$Lock) > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) > at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) > > Locked ownable synchronizers: > - None > > "Reference Handler" daemon prio=10 tid=0x9f496000 nid=0x84a in > Object.wait() [0x9f1e5000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xa397b6e8> (a java.lang.ref.Reference$Lock) > at java.lang.Object.wait(Object.java:485) > at java.lang.ref.Reference > 
$ReferenceHandler.run(Reference.java:116) > - locked <0xa397b6e8> (a java.lang.ref.Reference$Lock) > > Locked ownable synchronizers: > - None > > "main" prio=10 tid=0x08224400 nid=0x844 in Object.wait() [0xb6a06000] > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > - waiting on <0xaf50ac30> (a > org.griphyn.vdl.karajan.VDL2ExecutionContext) > at java.lang.Object.wait(Object.java:485) > at > org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:226) > - locked <0xaf50ac30> (a > org.griphyn.vdl.karajan.VDL2ExecutionContext) > at org.griphyn.vdl.karajan.Loader.main(Loader.java:201) > > Locked ownable synchronizers: > - None > > "VM Thread" prio=10 tid=0x9f492400 nid=0x849 runnable > > "GC task thread#0 (ParallelGC)" prio=10 tid=0x0822b800 nid=0x845 > runnable > > "GC task thread#1 (ParallelGC)" prio=10 tid=0x0822cc00 nid=0x846 > runnable > > "GC task thread#2 (ParallelGC)" prio=10 tid=0x0822e400 nid=0x847 > runnable > > "GC task thread#3 (ParallelGC)" prio=10 tid=0x0822f800 nid=0x848 > runnable > > "VM Periodic Task Thread" prio=10 tid=0x9f4b4000 nid=0x850 waiting on > condition > > JNI global references: 1392 > > > Found one Java-level deadlock: > ============================= > "pool-1-thread-8": > waiting to lock monitor 0x08b1859c (object 0xaf5ed6c8, a > org.griphyn.vdl.karajan.WrapperMap), > which is held by "pool-1-thread-6" > "pool-1-thread-6": > waiting to lock monitor 0x9e89178c (object 0xaed9c108, a > org.griphyn.vdl.mapping.RootArrayDataNode), > which is held by "pool-1-thread-8" > > Java stack information for the threads listed above: > =================================================== > "pool-1-thread-8": > at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) > - waiting to lock <0xaf5ed6c8> (a > org.griphyn.vdl.karajan.WrapperMap) > at > org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) > - locked <0xaed9c108> (a > 
org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) > at > org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) > - locked <0xaed9c108> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at 
> org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > "pool-1-thread-6": > at > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > - waiting to lock <0xaed9c108> (a > org.griphyn.vdl.mapping.RootArrayDataNode) > at > org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) > at > org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) > - locked <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) > at > org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) > at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) > at > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > at > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at > 
org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors > $RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask > $Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor > $Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor > $Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > > Found 1 deadlock. > > > On Sat, Jun 18, 2011 at 2:52 PM, Mihael Hategan > wrote: > Post the entire output of jstack please. > > > On Sat, 2011-06-18 at 00:08 -0500, Alberto Chavez wrote: > > I already did the svn updates for cog, and swift, rebuilt > swift with > > ant redist, and ant clean + ant dist, > > the test keeps hanging, but I'm missing probably something: > > > > > > $ svn update cog > > At revision 3167. > > > > > > $ cd cog/modules/ > > $ svn update swift > > At revision 4632. > > > > > > I did ant redist, and it was successfully built, but the > test is still > > hanging, now it hung on the 11th iteration. However I'm not > quite sure > > if the svn was properly updated, since I did > > > > > > $ jstack -l 11471 | grep addListener > > at > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > at > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > > > > > but Mihael Mentioned that addListener is not > AbstractDataNode in the > > newer version. > > Any thoughts on that? > > > > > > Alberto. 
> > > Subject: RE: [Swift-devel] Swift unresponsive while using > local > > provider. > > > From: hategan at mcs.anl.gov > > > To: alberto_chavez at live.com > > > CC: ketancmaheshwari at gmail.com; > swift-devel at ci.uchicago.edu > > > Date: Fri, 17 Jun 2011 21:09:04 -0700 > > > > > > On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote: > > > > Oops, Too late. > > > > Do i need to do ant clean , and ant dist again? > > > > > > You're probably fine in most cases with just "ant dist". > But if you > > want > > > to be sure, do what Jonathan is saying: "ant redist" > > > > > > > > > > > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > From hategan at mcs.anl.gov Thu Jun 23 13:29:47 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 23 Jun 2011 13:29:47 -0500 Subject: [Swift-devel] error in Swift In-Reply-To: <166AA9C7-07BA-4B6B-A3DE-EA4F2736681F@utexas.edu> References: <166AA9C7-07BA-4B6B-A3DE-EA4F2736681F@utexas.edu> Message-ID: <1308853787.17296.1.camel@blabla> This was present when using provider staging. Fixed in r4667. On Tue, 2011-06-21 at 15:34 -0500, Jonathan Monette wrote: > Hello, > I found an error when Swift creates the command line script it uses to execute. Here is the script: > > #!/bin/bash > "/home/jonmon/Library/Montage/bin/mProject" "[-X, raw_dir/raw_image_4.fits, proj_dir/proj_raw_image_4.fits, header.hdr]" 1>"stdout.txt" 2>"stderr.txt" > > > That script was generated by Swift. The "[", "]", and "," are not supposed to be there. 
> > Here is the config file and sites file: > execution.retries=0 > sitedir.keep=true > status.mode=provider > wrapper.log.always.transfer=true > foreach.maxthreads=1024 > wrapper.parameter.mode=files > use.provider.staging=true > provider.staging.pin.swiftfiles=false > > > > > > /gpfs/pads/swift/jonmon/Swift/work/localhost > > .05 > > KEEP > > > > > CI-CCR000013 > /gpfs/pads/swift/jonmon/Swift/work/pads > > 3600 > 1 > 500 > 1 > 1 > fast > > 5 > 10000 > > KEEP > > > > attached is the log file. The error that was reported came from the app call. It said the wrong usage because the command line had those extra symbols in them. I have a feeling it has something to do with the last two lines in the config file but I am basing that completely on the fact that if I remove those two line the scripts run to completion. > > From jonmon at utexas.edu Thu Jun 23 15:23:05 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Thu, 23 Jun 2011 15:23:05 -0500 Subject: [Swift-devel] error in Swift In-Reply-To: <1308853787.17296.1.camel@blabla> References: <166AA9C7-07BA-4B6B-A3DE-EA4F2736681F@utexas.edu> <1308853787.17296.1.camel@blabla> Message-ID: <17E2A6FC-A37D-4D32-8D46-8E4A8C4EFAF3@utexas.edu> Alright. I will give it a try today. On Jun 23, 2011, at 1:29 PM, Mihael Hategan wrote: > This was present when using provider staging. Fixed in r4667. > > On Tue, 2011-06-21 at 15:34 -0500, Jonathan Monette wrote: >> Hello, >> I found an error when Swift creates the command line script it uses to execute. Here is the script: >> >> #!/bin/bash >> "/home/jonmon/Library/Montage/bin/mProject" "[-X, raw_dir/raw_image_4.fits, proj_dir/proj_raw_image_4.fits, header.hdr]" 1>"stdout.txt" 2>"stderr.txt" >> >> >> That script was generated by Swift. The "[", "]", and "," are not supposed to be there. 
>> >> Here is the config file and sites file: >> execution.retries=0 >> sitedir.keep=true >> status.mode=provider >> wrapper.log.always.transfer=true >> foreach.maxthreads=1024 >> wrapper.parameter.mode=files >> use.provider.staging=true >> provider.staging.pin.swiftfiles=false >> >> >> >> >> >> /gpfs/pads/swift/jonmon/Swift/work/localhost >> >> .05 >> >> KEEP >> >> >> >> >> CI-CCR000013 >> /gpfs/pads/swift/jonmon/Swift/work/pads >> >> 3600 >> 1 >> 500 >> 1 >> 1 >> fast >> >> 5 >> 10000 >> >> KEEP >> >> >> >> attached is the log file. The error that was reported came from the app call. It said the wrong usage because the command line had those extra symbols in them. I have a feeling it has something to do with the last two lines in the config file but I am basing that completely on the fact that if I remove those two line the scripts run to completion. >> >> > > From davidkelly999 at gmail.com Thu Jun 23 20:12:02 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Thu, 23 Jun 2011 20:12:02 -0500 Subject: [Swift-devel] Swift Documentation Message-ID: Hello all, Just wanted to send a quick update on the status of the swift documentation. Starting with trunk, all website documentation is kept in the swift/docs directory. Everything is in asciidoc format and will get converted to HTML and PDF. More information about how to use asciidoc can be found at http://www.methods.co.nz/asciidoc. Currently in this directory are the userguide, tutorial, cookbook and something I created called the newuser guide. The newuser guide should probably be renamed because it's slightly confusing - it's not a replacement for the userguide, it's a guide for new users. It's still a work in progress and needs to be put together, but there are bits and pieces there. It will contain information on how to set up Swift in specific environments. Right now there is info for how to run on PADS, Fusion, and Beagle. 
It gives information on how to request an account, which queues to use, how to set your projects, how to set up sites.xml and use gensites, and any other specific information about how to get Swift up and running in that environment.

To build the guides, run:

$ swift/docs/build_docs.sh

This script assumes that you have asciidoc already installed. You can edit the build_docs.sh script to change information about ownership and file permissions of the HTML and PDF output files as needed.

When you want to make a change to the documentation, make your changes directly to the files in your swift/docs directory. The guides are split into chapters. Chapters are located in the same directory as the guide but do not have a file extension. Please build your documents and take a look at the output before committing.

As of tonight, there is a cron job running under my account that will update the website nightly. It runs the build_docs.sh script every night at 7pm on communicado. If there is an important change that needs to be updated sooner, you can also manually update the website. From one of the CI machines, run:

$ swift/docs/build_docs.sh /ci/www/projects/swift/guides

Manual updates require that you be a member of the vdl2-svn group. If you are not a member of this group, you can request access by mailing support at ci.uchicago.edu.

I'll write a more detailed guide about this at some point (perhaps an update to the maintaining swift web content on the wiki), but just wanted to get this info out for now. Let me know if you have any questions or run into issues with it.

Thanks,
David

From hategan at mcs.anl.gov Thu Jun 23 22:59:26 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 23 Jun 2011 22:59:26 -0500
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk>
Message-ID: <1308887966.21540.5.camel@blabla>

On Tue, 2011-06-21 at 12:27 +0200, Ben Clifford wrote:
> >
> > I think the typical example (if I remember correctly) used in the
> > map/reduce paper is a reasonable case: count the number of occurrences
> > of each word in a set of files.
> >
> > In a sense, this can be easy if we don't insist that this be implemented
> > in the exact same way as with m/r.
>
> The "can be done but not like mapreduce" is something that hangs
> around the edge of the map-reduce-in-swift idea the whole time. Is
> there a reason to want to do it other than fascination-with-google?
>

I'm inclined to side with your view. But I can't stop thinking that the real question is whether swift is able to meaningfully tie together two particular applications which happen to behave like implementations of map and reduce as used by the google framework.
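
For readers following the thread, the word-count case mentioned above can be sketched as two small functions that behave like a map application and a reduce application. This is only an illustration in Python. It is not Swift code, and it does not claim to show how Swift, Hadoop, or the google framework implement anything. The names map_words, reduce_counts and word_count are made up for this example.

```python
from collections import Counter
from functools import reduce


def map_words(text):
    """Map step (hypothetical helper): count the words in one document."""
    return Counter(text.lower().split())


def reduce_counts(left, right):
    """Reduce step (hypothetical helper): merge two partial counts."""
    return left + right


def word_count(documents):
    """Run the map step on every document, then fold the partial results."""
    partial_counts = [map_words(doc) for doc in documents]
    return reduce(reduce_counts, partial_counts, Counter())


if __name__ == "__main__":
    docs = ["swift runs tasks", "swift stages files", "tasks read files"]
    for word, count in sorted(word_count(docs).items()):
        print(word, count)
```

Each map_words call depends only on its own document, so in principle every call could be a separate application invocation on a separate file. Merging counts is associative, so the partial results can be combined in any order.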
From yadudoc1729 at gmail.com Fri Jun 24 01:17:36 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Fri, 24 Jun 2011 11:47:36 +0530
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308887966.21540.5.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk> <1308887966.21540.5.camel@blabla>
Message-ID:

Hi,

>> > I think the typical example (if I remember correctly) used in the
>> > map/reduce paper is a reasonable case: count the number of occurrences
>> > of each word in a set of files.

As far as I know and understand (which is very little), MapReduce as used in Hadoop works because it runs on top of HDFS. It is not the ability to compute in a distributed fashion that gives the advantage, but the idea of the operation going to the system which holds the data. This is handled by HDFS doing replication and other stuff.

It is the huge size of the data repositories, and their distributed storage structure, that allowed MapReduce to work its magic. Do we have that? Or could we have that on swift?

--
Thanks and Regards,
Yadu Nand B

From hategan at mcs.anl.gov Fri Jun 24 01:32:43 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 24 Jun 2011 01:32:43 -0500
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To:
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk> <1308887966.21540.5.camel@blabla>
Message-ID: <1308897163.25370.4.camel@blabla>

On Fri, 2011-06-24 at 11:47 +0530, Yadu Nand wrote:
> Hi,
>
> >> > I think the typical example (if I remember correctly) used in the
> >> > map/reduce paper is a reasonable case: count the number of occurrences
> >> > of each word in a set of files.
>
> As far as I know and understand (which is very little), MapReduce
> as used in Hadoop works because it runs on top of HDFS. It is not
> the ability to compute in a distributed fashion that gives advantage
> but the idea of the operation going to the system which holds the
> data. This is handled by HDFS doing replication and other stuff.
>
> It is the huge size of the data repositories, and their distributed
> storage structure that allowed MapReduce to work its magic. Do
> we have that ? Or could we have that on swift ?
>

I'd say many things conspired to make m/r work: came from google, simple concept, efficiently distributable, etc. But yes, for I/O-bound things, minimizing data movement is an important aspect. However, that is a scheduling issue which is somewhat orthogonal to the language issue. It is not currently done by swift. We previously explored data location biasing for site selection, but we never got to actually writing committable code for it.

From benc at hawaga.org.uk Fri Jun 24 01:44:43 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 24 Jun 2011 08:44:43 +0200
Subject: [Swift-devel] Associative array in Swift [GSoC]
In-Reply-To: <1308897163.25370.4.camel@blabla>
References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk> <1308887966.21540.5.camel@blabla> <1308897163.25370.4.camel@blabla>
Message-ID:

On Jun 24, 2011, at 8:32 AM, Mihael Hategan wrote:
>
> I'd say many things conspired to make m/r work: came from google, simple
> concept, efficiently distribut-able, etc. But yes, for I/O bound things
> minimizing data movement is an important aspect. However, that is a
> scheduling issue which is somewhat orthogonal to the language issue. It
> is not currently done by swift, but we previously explored data location
> biasing for site selection, but we never got to actually writing
> committable code for it.

Something even less implemented was talk of interfacing to some replica management system which would be aware that the same data could exist in multiple places and do useful things based on that. Useful things might be only running on sites that already had the data, and staging from that site's store (which integrates replica location into the pick site / stage data / run job process in at least two places).

From davidkelly999 at gmail.com Fri Jun 24 01:56:56 2011
From: davidkelly999 at gmail.com (David Kelly)
Date: Fri, 24 Jun 2011 01:56:56 -0500
Subject: [Swift-devel] Swift unresponsive while using local provider.
In-Reply-To: <1308853713.17296.0.camel@blabla>
References: <1057363198.21380.1308339533392.JavaMail.root@zimbra.anl.gov> <1308340585.14231.0.camel@blabla> <1308355792.24489.2.camel@blabla> <1308356386.24582.2.camel@blabla> <1308358465.24760.10.camel@blabla> <1308367911.25750.0.camel@blabla> <1308368849.25915.3.camel@blabla> <1308370144.26103.1.camel@blabla> <1308426730.30334.0.camel@blabla> <1308853713.17296.0.camel@blabla>
Message-ID:

Here is the jstack output from the most recent version.

On Thu, Jun 23, 2011 at 1:28 PM, Mihael Hategan wrote:
> I committed a tentative fix to svn. swift trunk r4666.
> > On Sat, 2011-06-18 at 21:21 -0500, David Kelly wrote: > > Here's one I got with the latest version tonight: > > > > 2011-06-18 21:01:34 > > Full thread dump Java HotSpot(TM) Server VM (19.1-b02 mixed mode): > > > > "Attach Listener" daemon prio=10 tid=0x087d6c00 nid=0x882 runnable > > [0x00000000] > > java.lang.Thread.State: RUNNABLE > > > > Locked ownable synchronizers: > > - None > > > > "Progress ticker" daemon prio=10 tid=0x9e854400 nid=0x85e waiting on > > condition [0x9dfad000] > > java.lang.Thread.State: TIMED_WAITING (sleeping) > > at java.lang.Thread.sleep(Native Method) > > at org.griphyn.vdl.karajan.lib.RuntimeStats > > $ProgressTicker.run(RuntimeStats.java:141) > > > > Locked ownable synchronizers: > > - None > > > > "Restart Log Sync" daemon prio=10 tid=0x9e82bc00 nid=0x85d in > > Object.wait() [0x9dffe000] > > java.lang.Thread.State: WAITING (on object monitor) > > at java.lang.Object.wait(Native Method) > > - waiting on <0xaedb4778> (a > > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) > > at java.lang.Object.wait(Object.java:485) > > at > > > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread.run(SyncThread.java:47) > > - locked <0xaedb4778> (a > > org.globus.cog.karajan.workflow.nodes.restartLog.SyncThread) > > > > Locked ownable synchronizers: > > - None > > > > "Overloaded Host Monitor" daemon prio=10 tid=0x08b19400 nid=0x85c > > waiting on condition [0x9e15c000] > > java.lang.Thread.State: TIMED_WAITING (sleeping) > > at java.lang.Thread.sleep(Native Method) > > at > > > org.globus.cog.karajan.scheduler.OverloadedHostMonitor.run(OverloadedHostMonitor.java:47) > > > > Locked ownable synchronizers: > > - None > > > > "Timer-0" daemon prio=10 tid=0x08354400 nid=0x85b in Object.wait() > > [0x9e1ad000] > > java.lang.Thread.State: TIMED_WAITING (on object monitor) > > at java.lang.Object.wait(Native Method) > > - waiting on <0xaf5e2c38> (a java.util.TaskQueue) > > at java.util.TimerThread.mainLoop(Timer.java:509) > > - 
locked <0xaf5e2c38> (a java.util.TaskQueue) > > at java.util.TimerThread.run(Timer.java:462) > > > > Locked ownable synchronizers: > > - None > > > > "NBS0" daemon prio=10 tid=0x087de000 nid=0x85a waiting on condition > > [0x9e1fe000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5e3628> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "pool-1-thread-8" prio=10 tid=0x08445c00 nid=0x859 waiting for monitor > > entry [0x9e369000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) > > - waiting to lock <0xaf5ed6c8> (a > > org.griphyn.vdl.karajan.WrapperMap) > > at > > > org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) > > - locked <0xaed9c108> (a > > org.griphyn.vdl.mapping.RootArrayDataNode) > > at > > > org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) > > at > > org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) > > - locked <0xaed9c108> (a > > org.griphyn.vdl.mapping.RootArrayDataNode) > > at > > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > 
org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > > org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > > at > > > 
org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > > at > > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > > at > > > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > > at java.util.concurrent.Executors > > $RunnableAdapter.call(Executors.java:441) > > at java.util.concurrent.FutureTask > > $Sync.innerRun(FutureTask.java:303) > > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.runTask(ThreadPoolExecutor.java:886) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:908) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - <0xaf5e3b10> (a java.util.concurrent.locks.ReentrantLock > > $NonfairSync) > > > > "pool-1-thread-7" prio=10 tid=0x08443000 nid=0x858 waiting on > > condition [0x9e3ba000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5ec2b0> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "pool-1-thread-6" prio=10 tid=0x08441800 nid=0x857 waiting for monitor > > entry [0x9e40b000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > at > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > - waiting to lock 
<0xaed9c108> (a > > org.griphyn.vdl.mapping.RootArrayDataNode) > > at > > > org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) > > at > > org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) > > - locked <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) > > at > > > org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) > > at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) > > at > > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > > at > > > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > > at > > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > > at > > > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > > at java.util.concurrent.Executors > > $RunnableAdapter.call(Executors.java:441) > > at java.util.concurrent.FutureTask > > $Sync.innerRun(FutureTask.java:303) > > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.runTask(ThreadPoolExecutor.java:886) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:908) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - <0xaf5ec3a0> (a java.util.concurrent.locks.ReentrantLock > > $NonfairSync) > > > > "pool-1-thread-5" prio=10 tid=0x085d2000 nid=0x856 waiting on > > condition [0x9e45c000] > > java.lang.Thread.State: WAITING (parking) > > 
at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5ec2b0> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "pool-1-thread-4" prio=10 tid=0x085d2800 nid=0x855 waiting on > > condition [0x9e4ad000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5ec2b0> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "pool-1-thread-3" prio=10 tid=0x08839400 nid=0x854 waiting on > > condition [0x9e65c000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5ec2b0> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at 
java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "pool-1-thread-2" prio=10 tid=0x08837800 nid=0x853 waiting on > > condition [0x9e6ad000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5ec2b0> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "pool-1-thread-1" prio=10 tid=0x9e826c00 nid=0x852 waiting on > > condition [0x9e6fe000] > > java.lang.Thread.State: WAITING (parking) > > at sun.misc.Unsafe.park(Native Method) > > - parking to wait for <0xaf5ec2b0> (a > > java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) > > at > > java.util.concurrent.locks.LockSupport.park(LockSupport.java:158) > > at java.util.concurrent.locks.AbstractQueuedSynchronizer > > $ConditionObject.await(AbstractQueuedSynchronizer.java:1987) > > at > > > java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399) > > at > > > 
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:907) > > at java.lang.Thread.run(Thread.java:662) > > > > Locked ownable synchronizers: > > - None > > > > "Hang checker" prio=10 tid=0x9e81bc00 nid=0x851 waiting for monitor > > entry [0x9e4fe000] > > java.lang.Thread.State: BLOCKED (on object monitor) > > at org.griphyn.vdl.karajan.Monitor.dumpVariables(Monitor.java:220) > > - waiting to lock <0xaf5ed6c8> (a > > org.griphyn.vdl.karajan.WrapperMap) > > at org.griphyn.vdl.karajan.HangChecker.run(HangChecker.java:54) > > at java.util.TimerThread.mainLoop(Timer.java:512) > > at java.util.TimerThread.run(Timer.java:462) > > > > Locked ownable synchronizers: > > - None > > > > "Low Memory Detector" daemon prio=10 tid=0x08235800 nid=0x84f runnable > > [0x00000000] > > java.lang.Thread.State: RUNNABLE > > > > Locked ownable synchronizers: > > - None > > > > "CompilerThread1" daemon prio=10 tid=0x9f4a9800 nid=0x84e waiting on > > condition [0x00000000] > > java.lang.Thread.State: RUNNABLE > > > > Locked ownable synchronizers: > > - None > > > > "CompilerThread0" daemon prio=10 tid=0x9f4a7800 nid=0x84d waiting on > > condition [0x00000000] > > java.lang.Thread.State: RUNNABLE > > > > Locked ownable synchronizers: > > - None > > > > "Signal Dispatcher" daemon prio=10 tid=0x9f4a5c00 nid=0x84c runnable > > [0x00000000] > > java.lang.Thread.State: RUNNABLE > > > > Locked ownable synchronizers: > > - None > > > > "Finalizer" daemon prio=10 tid=0x9f497400 nid=0x84b in Object.wait() > > [0x9f194000] > > java.lang.Thread.State: WAITING (on object monitor) > > at java.lang.Object.wait(Native Method) > > - waiting on <0xa398ce08> (a java.lang.ref.ReferenceQueue$Lock) > > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:118) > > - locked <0xa398ce08> (a java.lang.ref.ReferenceQueue$Lock) > > at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:134) > > at 
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159) > > > > Locked ownable synchronizers: > > - None > > > > "Reference Handler" daemon prio=10 tid=0x9f496000 nid=0x84a in > > Object.wait() [0x9f1e5000] > > java.lang.Thread.State: WAITING (on object monitor) > > at java.lang.Object.wait(Native Method) > > - waiting on <0xa397b6e8> (a java.lang.ref.Reference$Lock) > > at java.lang.Object.wait(Object.java:485) > > at java.lang.ref.Reference > > $ReferenceHandler.run(Reference.java:116) > > - locked <0xa397b6e8> (a java.lang.ref.Reference$Lock) > > > > Locked ownable synchronizers: > > - None > > > > "main" prio=10 tid=0x08224400 nid=0x844 in Object.wait() [0xb6a06000] > > java.lang.Thread.State: WAITING (on object monitor) > > at java.lang.Object.wait(Native Method) > > - waiting on <0xaf50ac30> (a > > org.griphyn.vdl.karajan.VDL2ExecutionContext) > > at java.lang.Object.wait(Object.java:485) > > at > > > org.globus.cog.karajan.workflow.ExecutionContext.waitFor(ExecutionContext.java:226) > > - locked <0xaf50ac30> (a > > org.griphyn.vdl.karajan.VDL2ExecutionContext) > > at org.griphyn.vdl.karajan.Loader.main(Loader.java:201) > > > > Locked ownable synchronizers: > > - None > > > > "VM Thread" prio=10 tid=0x9f492400 nid=0x849 runnable > > > > "GC task thread#0 (ParallelGC)" prio=10 tid=0x0822b800 nid=0x845 > > runnable > > > > "GC task thread#1 (ParallelGC)" prio=10 tid=0x0822cc00 nid=0x846 > > runnable > > > > "GC task thread#2 (ParallelGC)" prio=10 tid=0x0822e400 nid=0x847 > > runnable > > > > "GC task thread#3 (ParallelGC)" prio=10 tid=0x0822f800 nid=0x848 > > runnable > > > > "VM Periodic Task Thread" prio=10 tid=0x9f4b4000 nid=0x850 waiting on > > condition > > > > JNI global references: 1392 > > > > > > Found one Java-level deadlock: > > ============================= > > "pool-1-thread-8": > > waiting to lock monitor 0x08b1859c (object 0xaf5ed6c8, a > > org.griphyn.vdl.karajan.WrapperMap), > > which is held by "pool-1-thread-6" > > "pool-1-thread-6": 
> > waiting to lock monitor 0x9e89178c (object 0xaed9c108, a > > org.griphyn.vdl.mapping.RootArrayDataNode), > > which is held by "pool-1-thread-8" > > > > Java stack information for the threads listed above: > > =================================================== > > "pool-1-thread-8": > > at org.griphyn.vdl.karajan.WrapperMap.close(WrapperMap.java:25) > > - waiting to lock <0xaf5ed6c8> (a > > org.griphyn.vdl.karajan.WrapperMap) > > at > > > org.griphyn.vdl.karajan.lib.VDLFunction.closeShallow(VDLFunction.java:516) > > - locked <0xaed9c108> (a > > org.griphyn.vdl.mapping.RootArrayDataNode) > > at > > > org.griphyn.vdl.karajan.lib.SetFieldValue.deepCopy(SetFieldValue.java:121) > > at > > org.griphyn.vdl.karajan.lib.SetFieldValue.function(SetFieldValue.java:49) > > - locked <0xaed9c108> (a > > org.griphyn.vdl.mapping.RootArrayDataNode) > > at > > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > > org.globus.cog.karajan.workflow.nodes.functions.Argument.post(Argument.java:48) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:71) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > 
org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > > at > > > org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > > at > > > org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > > at > > > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > > at > > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > > at > > > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > > at java.util.concurrent.Executors > > $RunnableAdapter.call(Executors.java:441) > > at java.util.concurrent.FutureTask > > $Sync.innerRun(FutureTask.java:303) > > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.runTask(ThreadPoolExecutor.java:886) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:908) > > at java.lang.Thread.run(Thread.java:662) > > "pool-1-thread-6": > > at > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > - waiting to lock <0xaed9c108> (a > > 
org.griphyn.vdl.mapping.RootArrayDataNode) > > at > > > org.griphyn.vdl.karajan.DSHandleFutureWrapper.(DSHandleFutureWrapper.java:24) > > at > > org.griphyn.vdl.karajan.WrapperMap.addNodeListener(WrapperMap.java:61) > > - locked <0xaf5ed6c8> (a org.griphyn.vdl.karajan.WrapperMap) > > at > > > org.griphyn.vdl.karajan.lib.VDLFunction.addFutureListener(VDLFunction.java:523) > > at org.griphyn.vdl.karajan.lib.Stagein.function(Stagein.java:88) > > at > > org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:67) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > > at > > > org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > > at > > > org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > > at > > org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > > at > > > org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > > at > > org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > > at > > > org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > > at java.util.concurrent.Executors > > $RunnableAdapter.call(Executors.java:441) > > at java.util.concurrent.FutureTask > > $Sync.innerRun(FutureTask.java:303) > > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.runTask(ThreadPoolExecutor.java:886) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:908) > > at java.lang.Thread.run(Thread.java:662) > > > > Found 1 deadlock. > > > > > > On Sat, Jun 18, 2011 at 2:52 PM, Mihael Hategan > > wrote: > > Post the entire output of jstack please. 
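[Editor's note] The jstack report above shows a lock-order inversion: "pool-1-thread-8" holds the RootArrayDataNode monitor and waits for the WrapperMap monitor, while "pool-1-thread-6" holds the WrapperMap monitor and waits for the RootArrayDataNode monitor. One common remedy, sketched below in Java, is to make every code path take the two monitors in the same order. This is only an illustration under stated assumptions: the class LockOrderSketch and its methods are hypothetical stand-ins, not code from Swift or CoG, and this is not necessarily how the problem was addressed in trunk.

```java
// Hypothetical sketch; names are illustrative and not taken from the Swift/CoG source.
class LockOrderSketch {
    private final Object mapLock = new Object();   // stands in for the WrapperMap monitor
    private final Object nodeLock = new Object();  // stands in for the RootArrayDataNode monitor
    private int closes = 0;
    private int listeners = 0;

    // "Close" path. In the report this side took the node monitor first;
    // here it takes the map monitor first, like every other path.
    void close() {
        synchronized (mapLock) {
            synchronized (nodeLock) {
                closes++;
            }
        }
    }

    // "Add listener" path; same acquisition order as close().
    void addListener() {
        synchronized (mapLock) {
            synchronized (nodeLock) {
                listeners++;
            }
        }
    }

    int total() {
        synchronized (mapLock) {
            synchronized (nodeLock) {
                return closes + listeners;
            }
        }
    }

    // Runs both paths concurrently and returns the number of completed operations.
    static int demo(final int iterations) {
        final LockOrderSketch s = new LockOrderSketch();
        Thread t1 = new Thread(() -> { for (int i = 0; i < iterations; i++) s.close(); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < iterations; i++) s.addListener(); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return s.total();
    }
}
```

Because both paths agree on the order (map monitor, then node monitor), neither thread can hold one monitor while waiting for a thread that holds the other. Swapping the two synchronized blocks in only one of the methods would reintroduce the cycle that jstack reported.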
> > > > > > On Sat, 2011-06-18 at 00:08 -0500, Alberto Chavez wrote: > > > I already did the svn updates for cog, and swift, rebuilt > > swift with > > > ant redist, and ant clean + ant dist, > > > the test keeps hanging, but I'm missing probably something: > > > > > > > > > $ svn update cog > > > At revision 3167. > > > > > > > > > $ cd cog/modules/ > > > $ svn update swift > > > At revision 4632. > > > > > > > > > I did ant redist, and it was successfully built, but the > > test is still > > > hanging, now it hung on the 11th iteration. However I'm not > > quite sure > > > if the svn was properly updated, since I did > > > > > > > > > $ jstack -l 11471 | grep addListener > > > at > > > > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > > at > > > > > > org.griphyn.vdl.mapping.AbstractDataNode.addListener(AbstractDataNode.java:583) > > > > > > > > > but Mihael Mentioned that addListener is not > > AbstractDataNode in the > > > newer version. > > > Any thoughts on that? > > > > > > > > > Alberto. > > > > Subject: RE: [Swift-devel] Swift unresponsive while using > > local > > > provider. > > > > From: hategan at mcs.anl.gov > > > > To: alberto_chavez at live.com > > > > CC: ketancmaheshwari at gmail.com; > > swift-devel at ci.uchicago.edu > > > > Date: Fri, 17 Jun 2011 21:09:04 -0700 > > > > > > > > On Fri, 2011-06-17 at 22:50 -0500, Alberto Chavez wrote: > > > > > Oops, Too late. > > > > > Do i need to do ant clean , and ant dist again? > > > > > > > > You're probably fine in most cases with just "ant dist". > > But if you > > > want > > > > to be sure, do what Jonathan is saying: "ant redist" > > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-devel > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: jstack.log Type: text/x-log Size: 21446 bytes Desc: not available URL: From yadudoc1729 at gmail.com Fri Jun 24 04:51:11 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Fri, 24 Jun 2011 15:21:11 +0530 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: <1308897163.25370.4.camel@blabla> References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk> <1308887966.21540.5.camel@blabla> <1308897163.25370.4.camel@blabla> Message-ID: > I'd say many things conspired to make m/r work: came from google, simple > concept, efficiently distribut-able, etc. But yes, for I/O bound things > minimizing data movement is an important aspect. However, that is a > scheduling issue which is somewhat orthogonal to the language issue. It > is not currently done by swift, but we previously explored data location > biasing for site selection, but we never got to actually writing > committable code for it. Should we look into data location biasing again ? As Michael had mentioned earlier, the ability to do a preprocessing of intermediate data from the map stage before going into reduce is interesting. Hadoop already does that in the (optional) [1] combine step. 
[1] http://wiki.apache.org/hadoop/HadoopMapReduce -- Thanks and Regards, Yadu Nand B From yadudoc1729 at gmail.com Fri Jun 24 05:02:55 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Fri, 24 Jun 2011 15:32:55 +0530 Subject: [Swift-devel] Associative array in Swift [GSoC] In-Reply-To: References: <1330405204.20349.1308327611730.JavaMail.root@zimbra.anl.gov> <1308346603.15198.1.camel@blabla> <1308427445.30334.8.camel@blabla> <1308442906.17227.6.camel@blabla> <52893FDE-3A39-4D6C-BE8E-2A7C8FA738B9@hawaga.org.uk> <1308445052.17227.25.camel@blabla> <1308471487.19473.51.camel@blabla> <1308563085.9001.31.camel@blabla> <9990D2B0-3D75-4078-9A38-5E9117952070@hawaga.org.uk> <1308650727.10855.32.camel@blabla> <66433A35-B988-4AAA-B994-5E3D1F69AFC1@hawaga.org.uk> <1308887966.21540.5.camel@blabla> <1308897163.25370.4.camel@blabla> Message-ID: > Something even less implemented was talk of interfacing to some replica management system which would be aware that the same data could exist in multiple places and do useful things based on that. Useful things might be only running on sites that already had the data, and staging from that site's store (which integrates replica location into the pick site / stage data / run job process at at least two places) How else is fault tolerance handled ? Do we send the compute to some other site when one fails ? If we are aware of data-locality we could probably send the compute to sites holding the relevant data. But all this is meaningful, if we already have a distributed data storage situation ( I haven't seen any other situations ). -- Thanks and Regards, Yadu Nand B From wilde at mcs.anl.gov Fri Jun 24 10:09:19 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 24 Jun 2011 10:09:19 -0500 (CDT) Subject: [Swift-devel] Swift Documentation In-Reply-To: Message-ID: <904097453.40741.1308928159952.JavaMail.root@zimbra.anl.gov> David, thanks, this is great. 
I tried build_docs.sh and noted a few things (which I and others should fix):

- We need an "under construction" disclaimer on the cookbook.
- The cookbook isn't getting section numbers, and has a format problem in the Beagle section, step 4.
- build_docs.sh should leave top-level links in the top-level directory below the destination dir. Currently all the doc roots are left deep below a trunk/ subdir placed in the dest dir.
- The build process seems to need dblatex for making PDFs.
- We need to note and point to an install package for the needed tools.

Nice work!

- Mike

----- Original Message -----

Hello all,

Just wanted to send a quick update on the status of the swift documentation. Starting with trunk, all website documentation is kept in the swift/docs directory. Everything is in asciidoc format and will get converted to HTML and PDF. More information about how to use asciidoc can be found at http://www.methods.co.nz/asciidoc .

Currently in this directory are the userguide, tutorial, cookbook and something I created called the newuser guide. The newuser guide should probably be renamed because it's slightly confusing - it's not a replacement for the userguide, it's a guide for new users. It's still a work in progress and needs to be put together, but there are bits and pieces there. It will contain information on how to set up Swift in specific environments. Right now there is info for how to run on PADS, Fusion, and Beagle. It gives information on how to request an account, which queues to use, setting your projects, how to set up sites.xml, use gensites, and any other specific information about how to get Swift up and running in that environment.

To build the guides, run:

$ swift/docs/build_docs.sh

This script assumes that you have asciidoc already installed. You can edit the build_docs.sh script to change information about ownership and file permissions of the HTML and PDF output files as needed.
When you want to make a change to the documentation, make your changes directly to the files in your swift/docs directory. The guides are split into chapters. Chapters are located in the same directory as the guide but do not have a file extension. Please build your documents and take a look at the output before committing.

As of tonight, there is a cron job running under my account that will update the website nightly. It runs the build_docs.sh script every night at 7pm on communicado. If there is an important change that needs to be updated sooner, you can also manually update the website. From one of the CI machines, run:

$ swift/docs/build_docs.sh /ci/www/projects/swift/guides

Manual updates require that you are a member of the vdl2-svn group. If you are not a member of this group, you can request access by mailing support at ci.uchicago.edu.

I'll write a more detailed guide about this at some point (perhaps an update to the maintaining swift web content on the wiki), but just wanted to get this info out for now. Let me know if you have any questions or run into issues with it.

Thanks,
David

_______________________________________________
Swift-devel mailing list
Swift-devel at ci.uchicago.edu
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yadudoc1729 at gmail.com Fri Jun 24 13:39:03 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Sat, 25 Jun 2011 00:09:03 +0530
Subject: [Swift-devel] Associative arrays: foreach index issue.
Message-ID:

Hi,

After discussing with Justin I used java.util.UUIDs as indices for cases in which we need something close to appending the value associated with a particular key.
Now, array["key"]["!"] = 1 ; array["key"]["!"] = 2 ;
works alright, except for the part in which we use foreach to retrieve the values, as the index is actually a UUID cast to string.

Justin asked me to propose a way to fix this, including a new associative array declaration syntax.

1. We may try using a different declaration style to denote non-int subscripts.
     int array [$] [$] ( where $ represents strings )
2. Or simply modify the foreach syntax to accommodate string indices. Leaving the following style
     foreach value, string:index in array["key"] {
        // do work
     }

I can't think of anything simpler that also makes more sense. Please help!
Of the 2 I mentioned above, I think the 2nd method is the easiest to implement.

--
Thanks and Regards,
Yadu Nand B

From hategan at mcs.anl.gov Fri Jun 24 13:51:53 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 24 Jun 2011 13:51:53 -0500
Subject: [Swift-devel] Associative arrays: foreach index issue.
In-Reply-To:
References:
Message-ID: <1308941513.27422.5.camel@blabla>

Ok, so slow down a bit.

On Sat, 2011-06-25 at 00:09 +0530, Yadu Nand wrote:
> Hi,
>
> After discussing with Justin I used java.util.UUIDs as indices for
> cases in which we need something close to appending the value
> associated with a particular key.

We've established that any random id is as bad as a sequential id. So use the thread index instead if you really want this. But I think we also established that we don't really need implicit indices.

>
> Now, array["key"]["!"] = 1 ; array["key"]["!"] = 2 ;
> works alright, except for the part in which we use foreach to retrieve
> the values as the index is actually a UUID cast to string.
>
> Justin asked me to propose a way to fix this, including a new
> associative array declaration syntax.
>
> 1. We may try using a different declaration style to denote non-int
> subscripts.
> int array [$] [$] ( where $ represents strings )

That's random and ugly.
We should use the exact type for declarations:
int array[string].

> 2. Or simply modify the foreach syntax to accommodate string
> indices. Leaving the following style
> foreach value, string:index in array["key"] {
> // do work
> }

And what happens if there is a mismatch between the foreach index type and the array type?

Point being that we can use type inference to figure out the type of the index since we know the type of the array key.

>
> I can't think of anything simpler that also makes more sense.
> Please help!
> Of the 2 I mentioned above, I think the 2nd method is the easiest
> to implement.
>

From yadudoc1729 at gmail.com Fri Jun 24 14:20:28 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Sat, 25 Jun 2011 00:50:28 +0530
Subject: [Swift-devel] Associative arrays: foreach index issue.
In-Reply-To: <1308941513.27422.5.camel@blabla>
References: <1308941513.27422.5.camel@blabla>
Message-ID:

Hi Mihael,

> We've established that any random id is as bad as a sequential id. So
> use the thread index instead if you really want this. But I think we
> also established that we don't really need implicit indices.

So, the UUID approach is not right/preferable. I'm sorry I didn't understand much from the earlier mail thread, and went on thinking it's OK as long as the index is unique.
From the earlier thread you proposed using thread prefixes for array indices, should I try that? Or is there a way to avoid indices altogether?

> That's random and ugly. We should use the exact type for declarations:
> int array[string].
>
>> 2. Or simply modify the foreach syntax to accommodate string
>>      indices. Leaving the following style
>>      foreach value, string:index in array["key"] {
>>         // do work
>>      }
> And what happens if there is a mismatch between the foreach index type
> and the array type?

I don't understand, why would that matter as long as we know the types of each?
> Point being that we can use type inference to figure out the type of the > index since we know the type of the array key. -- Thanks and Regards, Yadu Nand B From benc at hawaga.org.uk Fri Jun 24 14:30:31 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Fri, 24 Jun 2011 21:30:31 +0200 Subject: [Swift-devel] Associative arrays: foreach index issue. In-Reply-To: <1308941513.27422.5.camel@blabla> References: <1308941513.27422.5.camel@blabla> Message-ID: On Jun 24, 2011, at 8:51 PM, Mihael Hategan wrote: > Ok, so slow down a bit. > > On Sat, 2011-06-25 at 00:09 +0530, Yadu Nand wrote: >> Hi, >> >> After discussing with Justin I used java.util.UUIDs as indices for >> cases in which we need something close to appending the value >> associated with a particular key. > > We've established that any random id is as bad as a sequential id. So > use the thread index instead if you really want this. agree. > But I think we > also established that we don't really need implicit indices. > disagree. I think they can make coding simpler by allowing Swift to figure out stuff for you rather than you having to express boilerplate indices that you can get wrong. I think the ["!"] syntax is awful - I suggested it only for the purposes of my original message for the sake of having *something*. >> >> 1. We may try using a different declaration style to denote non-int >> subscripts. >> int array [$] [$] ( where $ represents strings ) > > That's random an ugly. We should use the exact type for declarations: > > int array[string]. agree. (Part of my justification: conceptually (even if it is never implemented) any data type that has a sensible equality relation can be used as an index. Other examples are booleans, and structs composed of other types with equality.) > >> 2. Or simply modify the foreach syntax to accommodate string >> indices. 
Leaving the following style >> foreach value, string:index in array["key"] { >> // do work >> } > > And what happens if there is a mismatch between the foreach index type > and the array type? > > Point being that we can use type inference to figure out the type of the > index since we know the type of the array key. agree. originally foreach had a type spec for 'value' too, which was replaced by inference - it could never contain any useful information. From hategan at mcs.anl.gov Fri Jun 24 14:48:56 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 24 Jun 2011 14:48:56 -0500 Subject: [Swift-devel] Associative arrays: foreach index issue. In-Reply-To: References: <1308941513.27422.5.camel@blabla> Message-ID: <1308944936.27759.1.camel@blabla> On Fri, 2011-06-24 at 21:30 +0200, Ben Clifford wrote: > > But I think we > > also established that we don't really need implicit indices. > > > > disagree. I think they can make coding simpler by allowing Swift to > figure out stuff for you rather than you having to express boilerplate > indices that you can get wrong. > Right. I'm only trying to say that we don't strictly need this, not that it wouldn't be nice or useful if it's conceptually sound. From yadudoc1729 at gmail.com Fri Jun 24 14:49:15 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Sat, 25 Jun 2011 01:19:15 +0530 Subject: [Swift-devel] Associative arrays: foreach index issue. In-Reply-To: References: <1308941513.27422.5.camel@blabla> Message-ID: On Sat, Jun 25, 2011 at 1:00 AM, Ben Clifford wrote: > > On Jun 24, 2011, at 8:51 PM, Mihael Hategan wrote: > >> Ok, so slow down a bit. >> >> On Sat, 2011-06-25 at 00:09 +0530, Yadu Nand wrote: >>> Hi, >>> >>> After discussing with Justin I used java.util.UUIDs as indices for >>> cases in which we need something close to appending the value >>> associated with a particular key. >> >> We've established that any random id is as bad as a sequential id. 
So
>> use the thread index instead if you really want this.
>
> agree.

Okay, I'll try to put in the index as the thread index. If this happens to be an int, we don't need to make changes anywhere else. I'm not sure what the type of the thread index is.

>> But I think we
>> also established that we don't really need implicit indices.
>>
>
> disagree. I think they can make coding simpler by allowing Swift to figure out stuff for you rather than you having to express boilerplate indices that you can get wrong.
>
> I think the ["!"] syntax is awful - I suggested it only for the purposes of my original message for the sake of having *something*.

I tried the ["!"] because it was easy to implement.
int array[ ][ ];
append ( array["key"] , )
Or we'll have to try what Mihael said which is,
array["key"] +=

>>>
>>> 1. We may try using a different declaration style to denote non-int
>>>     subscripts.
>>>     int array [$] [$]   ( where $ represents strings )
>>
>> That's random and ugly. We should use the exact type for declarations:
>>
>> int array[string].
>
> agree.
>
> (Part of my justification: conceptually (even if it is never implemented) any data type that has a sensible equality relation can be used as an index. Other examples are booleans, and structs composed of other types with equality.)
>
>>
>>> 2. Or simply modify the foreach syntax to accommodate string
>>>     indices. Leaving the following style
>>>     foreach value, string:index in array["key"] {
>>>        // do work
>>>     }
>>
>> And what happens if there is a mismatch between the foreach index type
>> and the array type?
>>
>> Point being that we can use type inference to figure out the type of the
>> index since we know the type of the array key.
>
> agree. originally foreach had a type spec for 'value' too, which was replaced by inference - it could never contain any useful information.

So, I'll have to look into handling type inference while handling the foreach indices.
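[Editor's note] Two ideas from this thread can be sketched in Java, the language Swift itself is implemented in: the index type of a foreach can be inferred from the declared key type of the array, and any type with a sensible equality relation (including a struct) can serve as a key. The sketch below is an analogy only. KeySketch, Pair, and the method names are hypothetical, and nothing here reflects how Swift's associative arrays were actually implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical illustration only; not Swift's implementation.
class KeySketch {
    // A struct-like key whose equality is defined field by field.
    static final class Pair {
        final String name;
        final int rank;

        Pair(String name, int rank) {
            this.name = name;
            this.rank = rank;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Pair)) {
                return false;
            }
            Pair p = (Pair) o;
            return rank == p.rank && name.equals(p.name);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, rank);
        }
    }

    // Rough analogue of declaring "int array[string]" and iterating over it.
    static String joinStringKeys() {
        Map<String, Integer> array = new LinkedHashMap<>();
        array.put("a", 1);
        array.put("b", 2);
        StringBuilder out = new StringBuilder();
        // No types are written on the lambda parameters: the compiler infers
        // String for index and Integer for value from the map's declaration.
        array.forEach((index, value) -> out.append(index).append('=').append(value).append(';'));
        return out.toString();
    }

    // A distinct but equal key object finds the same entry.
    static int lookupByStructKey() {
        Map<Pair, Integer> array = new LinkedHashMap<>();
        array.put(new Pair("x", 1), 10);
        return array.get(new Pair("x", 1));
    }
}
```

In this analogy a mismatch between a declared index type and the array's key type cannot arise, because the index type is never written at the iteration site. That is the point made above about relying on inference instead of a `string:index` annotation.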
-- Thanks and Regards, Yadu Nand B From hategan at mcs.anl.gov Fri Jun 24 14:51:28 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Fri, 24 Jun 2011 14:51:28 -0500 Subject: [Swift-devel] Associative arrays: foreach index issue. In-Reply-To: References: <1308941513.27422.5.camel@blabla> Message-ID: <1308945088.27759.3.camel@blabla> On Fri, 2011-06-24 at 21:30 +0200, Ben Clifford wrote: > > > > int array[string]. > > agree. > > (Part of my justification: conceptually (even if it is never > implemented) any data type that has a sensible equality relation can > be used as an index. Other examples are booleans, and structs composed > of other types with equality.) Right. I was thinking about that and I believe it's straightforward to implement equality for structs. In one of the m/r "translations" i was talking about earlier there was a need for a struct key. From wilde at mcs.anl.gov Fri Jun 24 18:54:34 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 24 Jun 2011 18:54:34 -0500 (CDT) Subject: [Swift-devel] coaster termination problems cause large runs to hang In-Reply-To: <10437909.42587.1308955029827.JavaMail.root@zimbra.anl.gov> Message-ID: <1220179917.42705.1308959674417.JavaMail.root@zimbra.anl.gov> Mihael, Papia is running large sweeps of the DSSAT land use model on PADS, and getting failures, it seems, when the coasters time out. Her script is attempting about 120K model invocations, each taking about 60 seconds to run. She gets between 30K and 60K of these done before it fails. Can you look at the example below, on the CI network in /home/papia/dssat/run01 (which I will copy to ~wilde/dssat.run01 on the CI net)? The swift.out file shows the run progressing nicely until the first coaster worker timeout occurs. The run was started with ./RunSweep.sh: time swift -tc.file tc -sites.file sites.xml -config cf RunDssat.swift >& swift.out The run id is RunID: 20110624-1333-r17fczk0 Swift is 0.92.1. 
Thanks,
Mike

login2$ head swift.out
Swift svn swift-r4371 cog-r3096
RunID: 20110624-1333-r17fczk0
Progress:
Progress: uninitialized:2
Progress: Selecting site:36 Stage in:53 Submitting:1 Submitted:10
Progress: Selecting site:36 Stage in:8 Submitting:2 Submitted:54
Progress: Selecting site:36 Submitted:64
Progress: Selecting site:36 Submitted:64
Progress: Selecting site:36 Submitted:63 Active:1
login2$ ls -l *zk0.log
-rw-r--r-- 1 papia ci-users 161039247 Jun 24 17:25 RunDssat-20110624-1333-r17fczk0.log
login2$ pwd
/home/papia/dssat/run01
login2$

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From alberto_chavez at live.com Fri Jun 24 19:37:57 2011
From: alberto_chavez at live.com (Alberto Chavez)
Date: Fri, 24 Jun 2011 19:37:57 -0500
Subject: [Swift-devel] writeData() output
Message-ID:

Hi,
I have been using suite.sh to run all available tests for swift, and there are a couple of them that use writeData() to generate output; however the output seems odd to me.
For instance, the following script:

type file;
type S { string l; string c; string r; }
S s[];
file f <"writeDataStructArray2.out">;
f=writeData(s);
s[2].l = "baz";
s[2].c = "BAZ";
s[2].r = "Baz";
s[3].l = "qux";
s[3].c = "QUX";
s[3].r = "Qux";
s[0].l = "foo";
s[0].c = "FOO";
s[0].r = "Foo";
s[1].l = "bar";
s[1].c = "BAR";
s[1].r = "Bar";
s[4].l = "frrrr";
s[4].c = "FRRRR";
s[4].r = "Frrrr";

Swift 0.92.1 generates writeDataStructArray2.out:

r c l
Foo FOO foo
Bar BAR bar
Baz BAZ baz
Qux QUX qux
Frrrr FRRRR frrrr

But, Trunk generates writeDataStructArray2.out

r c l
s[0].r:string = Foo - Closed s[0].c:string = FOO - Closed s[0].l:string = foo - Closed
s[1].r:string = Bar - Closed s[1].c:string = BAR - Closed s[1].l:string = bar - Closed
s[2].r:string = Baz - Closed s[2].c:string = BAZ - Closed s[2].l:string = baz - Closed
s[3].r:string = Qux - Closed s[3].c:string = QUX - Closed s[3].l:string = qux - Closed
s[4].r:string = Frrrr - Closed s[4].c:string = FRRRR - Closed s[4].l:string = frrrr - Closed

and I am uncertain whether to consider trunk output as correct or not.

Alberto.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hategan at mcs.anl.gov Mon Jun 27 12:31:28 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 27 Jun 2011 12:31:28 -0500
Subject: [Swift-devel] writeData() output
In-Reply-To:
References:
Message-ID: <1309195888.17934.0.camel@blabla>

Fixed in swift/trunk/r4696.

On Fri, 2011-06-24 at 19:37 -0500, Alberto Chavez wrote:
> Hi,
> I have been using suite.sh to run all available tests for swift, and
> there are a couple of them that use writeData() to generate output;
> however the output seems odd to me.
> For instance, the following script: > > > type file; > type S { string l; string c; string r; } > S s[]; > file f <"writeDataStructArray2.out">; > f=writeData(s); > s[2].l = "baz"; > s[2].c = "BAZ"; > s[2].r = "Baz"; > s[3].l = "qux"; > s[3].c = "QUX"; > s[3].r = "Qux"; > s[0].l = "foo"; > s[0].c = "FOO"; > s[0].r = "Foo"; > s[1].l = "bar"; > s[1].c = "BAR"; > s[1].r = "Bar"; > s[4].l = "frrrr"; > s[4].c = "FRRRR"; > s[4].r = "Frrrr"; > Swift 0.92.1 generates writeDataStructArray2.out: > r c l > Foo FOO foo > Bar BAR bar > Baz BAZ baz > Qux QUX qux > Frrrr FRRRR frrrr > > But, Trunk generates writeDataStructArray2.out > r c l > s[0].r:string = Foo - Closed s[0].c:string = FOO - Closed s[0].l:string = foo - Closed > s[1].r:string = Bar - Closed s[1].c:string = BAR - Closed s[1].l:string = bar - Closed > s[2].r:string = Baz - Closed s[2].c:string = BAZ - Closed s[2].l:string = baz - Closed > s[3].r:string = Qux - Closed s[3].c:string = QUX - Closed s[3].l:string = qux - Closed > s[4].r:string = Frrrr - Closed s[4].c:string = FRRRR - Closed s[4].l:string = frrrr - Closed > > > and I am uncertain whether to consider trunk output as correct or > not. > > > Alberto. 
> _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Mon Jun 27 13:40:23 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 27 Jun 2011 13:40:23 -0500 Subject: [Swift-devel] coaster termination problems cause large runs to hang In-Reply-To: <1220179917.42705.1308959674417.JavaMail.root@zimbra.anl.gov> References: <1220179917.42705.1308959674417.JavaMail.root@zimbra.anl.gov> Message-ID: <1309200023.17934.1.camel@blabla> [hategan at login ~]$ cd /home/papia/dssat/run01 -bash: cd: /home/papia/dssat/run01: Permission denied On Fri, 2011-06-24 at 18:54 -0500, Michael Wilde wrote: > Mihael, > > Papia is running large sweeps of the DSSAT land use model on PADS, and getting failures, it seems, when the coasters time out. Her script is attempting about 120K model invocations, each taking about 60 seconds to run. She gets between 30K and 60K of these done before it fails. > > Can you look at the example below, on the CI network in /home/papia/dssat/run01 > (which I will copy to ~wilde/dssat.run01 on the CI net)? > > The swift.out file shows the run progressing nicely until the first coaster worker timeout occurs. > > The run was started with ./RunSweep.sh: > time swift -tc.file tc -sites.file sites.xml -config cf RunDssat.swift >& swift.out > > The run id is RunID: 20110624-1333-r17fczk0 > Swift is 0.92.1. 
> > Thanks, > > Mike > > > login2$ head swift.out > Swift svn swift-r4371 cog-r3096 > > RunID: 20110624-1333-r17fczk0 > Progress: > Progress: uninitialized:2 > Progress: Selecting site:36 Stage in:53 Submitting:1 Submitted:10 > Progress: Selecting site:36 Stage in:8 Submitting:2 Submitted:54 > Progress: Selecting site:36 Submitted:64 > Progress: Selecting site:36 Submitted:64 > Progress: Selecting site:36 Submitted:63 Active:1 > login2$ ls -l *zk0.log > -rw-r--r-- 1 papia ci-users 161039247 Jun 24 17:25 RunDssat-20110624-1333-r17fczk0.log > login2$ pwd > /home/papia/dssat/run01 > login2$ > > From wilde at mcs.anl.gov Mon Jun 27 13:48:50 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 27 Jun 2011 13:48:50 -0500 (CDT) Subject: [Swift-devel] coaster termination problems cause large runs to hang In-Reply-To: <1309200023.17934.1.camel@blabla> Message-ID: <666843584.46021.1309200530906.JavaMail.root@zimbra.anl.gov> > (which I will copy to ~wilde/dssat.run01 on the CI net)? bri$ pwd /home/wilde bri$ ls -ld dssat.run01/ drwxr-xr-x 2 wilde ci-users 4096 Jun 24 19:01 dssat.run01// bri$ ----- Original Message ----- > [hategan at login ~]$ cd /home/papia/dssat/run01 > -bash: cd: /home/papia/dssat/run01: Permission denied > > > On Fri, 2011-06-24 at 18:54 -0500, Michael Wilde wrote: > > Mihael, > > > > Papia is running large sweeps of the DSSAT land use model on PADS, > > and getting failures, it seems, when the coasters time out. Her > > script is attempting about 120K model invocations, each taking about > > 60 seconds to run. She gets between 30K and 60K of these done before > > it fails. > > > > Can you look at the example below, on the CI network in > > /home/papia/dssat/run01 > > (which I will copy to ~wilde/dssat.run01 on the CI net)? > > > > The swift.out file shows the run progressing nicely until the first > > coaster worker timeout occurs. 
> > > > The run was started with ./RunSweep.sh: > > time swift -tc.file tc -sites.file sites.xml -config cf > > RunDssat.swift >& swift.out > > > > The run id is RunID: 20110624-1333-r17fczk0 > > Swift is 0.92.1. > > > > Thanks, > > > > Mike > > > > > > login2$ head swift.out > > Swift svn swift-r4371 cog-r3096 > > > > RunID: 20110624-1333-r17fczk0 > > Progress: > > Progress: uninitialized:2 > > Progress: Selecting site:36 Stage in:53 Submitting:1 Submitted:10 > > Progress: Selecting site:36 Stage in:8 Submitting:2 Submitted:54 > > Progress: Selecting site:36 Submitted:64 > > Progress: Selecting site:36 Submitted:64 > > Progress: Selecting site:36 Submitted:63 Active:1 > > login2$ ls -l *zk0.log > > -rw-r--r-- 1 papia ci-users 161039247 Jun 24 17:25 > > RunDssat-20110624-1333-r17fczk0.log > > login2$ pwd > > /home/papia/dssat/run01 > > login2$ > > > > -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Mon Jun 27 13:56:18 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 27 Jun 2011 13:56:18 -0500 Subject: [Swift-devel] coaster termination problems cause large runs to hang In-Reply-To: <666843584.46021.1309200530906.JavaMail.root@zimbra.anl.gov> References: <666843584.46021.1309200530906.JavaMail.root@zimbra.anl.gov> Message-ID: <1309200978.18649.0.camel@blabla> Ah, sorry. On Mon, 2011-06-27 at 13:48 -0500, Michael Wilde wrote: > > (which I will copy to ~wilde/dssat.run01 on the CI net)? 
> > bri$ pwd > /home/wilde > bri$ ls -ld dssat.run01/ > drwxr-xr-x 2 wilde ci-users 4096 Jun 24 19:01 dssat.run01// > bri$ > > > ----- Original Message ----- > > [hategan at login ~]$ cd /home/papia/dssat/run01 > > -bash: cd: /home/papia/dssat/run01: Permission denied > > > > > > On Fri, 2011-06-24 at 18:54 -0500, Michael Wilde wrote: > > > Mihael, > > > > > > Papia is running large sweeps of the DSSAT land use model on PADS, > > > and getting failures, it seems, when the coasters time out. Her > > > script is attempting about 120K model invocations, each taking about > > > 60 seconds to run. She gets between 30K and 60K of these done before > > > it fails. > > > > > > Can you look at the example below, on the CI network in > > > /home/papia/dssat/run01 > > > (which I will copy to ~wilde/dssat.run01 on the CI net)? > > > > > > The swift.out file shows the run progressing nicely until the first > > > coaster worker timeout occurs. > > > > > > The run was started with ./RunSweep.sh: > > > time swift -tc.file tc -sites.file sites.xml -config cf > > > RunDssat.swift >& swift.out > > > > > > The run id is RunID: 20110624-1333-r17fczk0 > > > Swift is 0.92.1. 
> > >
> > > Thanks,
> > >
> > > Mike
> > >
> > >
> > > login2$ head swift.out
> > > Swift svn swift-r4371 cog-r3096
> > >
> > > RunID: 20110624-1333-r17fczk0
> > > Progress:
> > > Progress: uninitialized:2
> > > Progress: Selecting site:36 Stage in:53 Submitting:1 Submitted:10
> > > Progress: Selecting site:36 Stage in:8 Submitting:2 Submitted:54
> > > Progress: Selecting site:36 Submitted:64
> > > Progress: Selecting site:36 Submitted:64
> > > Progress: Selecting site:36 Submitted:63 Active:1
> > > login2$ ls -l *zk0.log
> > > -rw-r--r-- 1 papia ci-users 161039247 Jun 24 17:25
> > > RunDssat-20110624-1333-r17fczk0.log
> > > login2$ pwd
> > > /home/papia/dssat/run01
> > > login2$
> > >
> >
>

From ketancmaheshwari at gmail.com Mon Jun 27 14:00:19 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 27 Jun 2011 14:00:19 -0500
Subject: [Swift-devel] using Swift Trunk for pbs/beagle: doesn't scale
Message-ID:

I was trying trunk on beagle today. It seems that when I try to scale up the number of app tasks, the run hangs. To be more precise, I found that above a threshold of 1600 app tasks, the run hangs with the following output appearing periodically in std.out:

No events in 10s.

Registered futures:
string[] str_roots Closed, 500 elements, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
int n Closed, no listeners
----

Waiting threads:
0-2
----

When the number of app tasks is <=1600, it seems to work. The SWIFT_MAX_HEAP is 7000MB.

--
Ketan

Sent from my typewriter
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From yadudoc1729 at gmail.com Tue Jun 28 08:40:43 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Tue, 28 Jun 2011 19:10:43 +0530
Subject: [Swift-devel] New syntax for appending to arrays
Message-ID:

Hi,

As proposed earlier, a new syntax for appending values to arrays has been implemented. An example is given below.

int array[ ] [ ] ;
array ["key1"] += 10;
array ["key1"] += 20;
foreach value, index in array["key1"] {
    trace ( array["key1"][index] ) ;
}

Please note that we can only append to an array, so to associate keys with multiple values, (I think) we will need two (or more) dimensional arrays.

Please let me know if I need to make any changes. I've attached a diff to this mail. Implementation of the type definition for array subscripts at declaration is still pending. I'm working on that now.

--
Thanks and Regards,
Yadu Nand B
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fourth.patch
Type: text/x-patch
Size: 7203 bytes
Desc: not available
URL:

From benc at hawaga.org.uk Tue Jun 28 08:54:01 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Tue, 28 Jun 2011 13:54:01 +0000 (GMT)
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To:
References:
Message-ID:

> As proposed earlier, a new syntax for appending values to
> arrays have been implemented. Examples are given below.

I think I have been won over to the += syntax.

> Please note that, we still need to make sure that when we can
> only append to an array, so to associate keys and multiple
> values, (i think) we will need two(or more) dimensional arrays.

yes. but there are already multi-dimensional arrays in swift.
also, a one dimensional array should still work, like this:

 int a[];
 a += 232;
 a += 111;

--

From wilde at mcs.anl.gov Tue Jun 28 09:04:35 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 28 Jun 2011 09:04:35 -0500 (CDT)
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To:
Message-ID: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov>

This sounds good. I assume one should be able to append to any dimension of a multi-dimensional array with +=.

int a[][];

a[5] += 1;
a[5] += 2;

sets a[5][?] = 1
     a[5][?] = 2

where ? is the index value chosen by +=

Further, as you work out the typing rules for array subscripts, should the following work?

int a[][];
int b[];
b += 100;
b += 200;
a += b; # <=== should this append a new array to a?

- Mike

----- Original Message -----
> > As proposed earlier, a new syntax for appending values to
> > arrays have been implemented. Examples are given below.
>
> I think I have been won over to the += syntax.
>
> > Please note that, we still need to make sure that when we can
> > only append to an array, so to associate keys and multiple
> > values, (i think) we will need two(or more) dimensional arrays.
>
> yes. but there are already multi-dimensional arrays in swift.
>
> also, a one dimensional array should still work, like this:
>
> int a[];
> a += 232;
> a += 111;
>
> --

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From foster at anl.gov Tue Jun 28 09:08:30 2011
From: foster at anl.gov (Ian Foster)
Date: Tue, 28 Jun 2011 09:08:30 -0500
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov>
References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov>
Message-ID: <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov>

$0,02 from me -- given that Swift has a C-like syntax, this syntax will surely confuse people??
On Jun 28, 2011, at 9:04 AM, Michael Wilde wrote: > This sound good. I assume one should be able to append to any dimension of a multi-dimensional array with +=. > > int a[][]; > > a[5] += 1 > a[5] += 2 > > sets a[5][?] = 1 > a[5][?] = 2 > > where ? is the index values chosen by += > > Further, as you work out the typing rules for array subscripts, should the following work? > > int a[][[]; > int b[]; > b += 100; > b += 200; > a += b; # <=== should this append a new array to a? > > - Mike > > ----- Original Message ----- >>> As proposed earlier, a new syntax for appending values to >>> arrays have been implemented. Examples are given below. >> >> I think I have been won over to the += syntax. >> >>> Please note that, we still need to make sure that when we can >>> only append to an array, so to associate keys and multiple >>> values, (i think) we will need two(or more) dimensional arrays. >> >> yes. but there are already multi-dimensional arrays in swift. >> >> also, a one dimensional array should still work, like this: >> >> int a[]; >> a += 232; >> a += 111; >> >> -- > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From wilde at mcs.anl.gov Tue Jun 28 09:16:21 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 28 Jun 2011 09:16:21 -0500 (CDT) Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> Message-ID: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> Ah, you mean specifically using the operator += conflicts with its use as an increment operator in C? I wasn't thinking of that - I was just commenting that "append will be useful". I agree, now that you raise the issue. 
I need to study this lengthy thread - I've not been following it closely enough. Were any other symbols proposed for the operator other than the original "!"?

Would "<<" be more appropriate, suggestive of various stream-append operators? Or append() as a built-in function rather than an operator?

a << 123;

- Mike

----- Original Message -----
> $0,02 from me -- given that Swift has a C-like syntax, this syntax
> will surely confuse people??
>
> On Jun 28, 2011, at 9:04 AM, Michael Wilde wrote:
>
> > This sound good. I assume one should be able to append to any
> > dimension of a multi-dimensional array with +=.
> >
> > int a[][];
> >
> > a[5] += 1
> > a[5] += 2
> >
> > sets a[5][?] = 1
> > a[5][?] = 2
> >
> > where ? is the index values chosen by +=
> >
> > Further, as you work out the typing rules for array subscripts,
> > should the following work?
> >
> > int a[][[];
> > int b[];
> > b += 100;
> > b += 200;
> > a += b; # <=== should this append a new array to a?
> >
> > - Mike
> >
> > ----- Original Message -----
> >>> As proposed earlier, a new syntax for appending values to
> >>> arrays have been implemented. Examples are given below.
> >>
> >> I think I have been won over to the += syntax.
> >>
> >>> Please note that, we still need to make sure that when we can
> >>> only append to an array, so to associate keys and multiple
> >>> values, (i think) we will need two(or more) dimensional arrays.
> >>
> >> yes. but there are already multi-dimensional arrays in swift.
> >>
> >> also, a one dimensional array should still work, like this:
> >>
> >> int a[];
> >> a += 232;
> >> a += 111;
> >>
> >> --
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From yadudoc1729 at gmail.com Tue Jun 28 09:17:37 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Tue, 28 Jun 2011 19:47:37 +0530
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To:
References:
Message-ID:

>> Please note that, we still need to make sure that when we can
>> only append to an array, so to associate keys and multiple
>> values, (i think) we will need two(or more) dimensional arrays.
>
> yes. but there are already multi-dimensional arrays in swift.
>
> also, a one dimensional array should still work, like this:
>
>  int a[];
>  a += 232;
>  a += 111;

Yes. This works fine. I checked with the following piece of code:

int array[ ];
array += 10;
array += 20;
array += 30;
foreach value, index in array {
    trace ( array[index] );
}

--
Thanks and Regards,
Yadu Nand B

From benc at hawaga.org.uk Tue Jun 28 09:19:19 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Tue, 28 Jun 2011 16:19:19 +0200
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov>
References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov>
Message-ID: <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk>

my opinion on the matter is that: the meaning of the operator has no direct analogue in C, Java or Haskell which I think are the three main influences.
So whatever is chosen is going to look weird. So pick one, any one, perhaps using a pseudorandom number generator, or a single round election, and move on ;)

From yadudoc1729 at gmail.com Tue Jun 28 09:24:49 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Tue, 28 Jun 2011 19:54:49 +0530
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov>
References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov>
Message-ID:

> This sound good. I assume one should be able to append to any dimension of a multi-dimensional array with +=.
>
> int a[][];
>
> a[5] += 1
> a[5] += 2
>
> sets a[5][?] = 1
>      a[5][?] = 2
>
> where ? is the index values chosen by +=

Yes, where ? is the thread prefix id (usually this is just 0, 1, 2, ...).

> Further, as you work out the typing rules for array subscripts, should the following work?
>
> int a[][[];
> int b[];
> b += 100;
> b += 200;
> a += b; # <=== should this append a new array to a?

Right now, a += b works. += (APPEND) is a variation on the usual ASSIGN, so everything that is meaningful to the compiler in ASSIGN works here as well. Whether this is supposed to work, I don't really know.

--
Thanks and Regards,
Yadu Nand B

From yadudoc1729 at gmail.com Tue Jun 28 09:31:44 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Tue, 28 Jun 2011 20:01:44 +0530
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To: <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk>
References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk>
Message-ID:

> my opinion on the matter is that: the meaning of the operator has no direct analogue in C, Java or Haskell which I think are the three main influences. So whatever is chosen is going to look weird.
So pick one, any one, perhaps using a pseudorandom number generator, or a single round election, and move on ;) I'm still working on doing the other proposed syntax which is append ( array, ). Though this clears the issue of "+= " resembling the increment and assign operator, it does not bear any similarity with other functional languages. If it is just the choice of the "+=" symbol for the APPEND operation, maybe we could choose another ? -- Thanks and Regards, Yadu Nand B From davidkelly999 at gmail.com Tue Jun 28 10:00:52 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Tue, 28 Jun 2011 10:00:52 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> References: <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> Message-ID: How about: a[5] .= 1; a[5] .= 2; That seems very appendish to me, similar to some operators in PHP and Perl. On Tue, Jun 28, 2011 at 9:16 AM, Michael Wilde wrote: > Ah, you mean specifically using the operator += conflicts with its use as > an increment operator in C? I wasn't thinking of that - I was just > commenting that "append will be useful". > > I agree, now that you raise the issue. I need to study this length thread - > Ive not been following it close enough. Were any other symbols proposed for > the operator other than the original "!"? > > Would "<<" be more appropriate, suggestive of various stream-append > operators? Or append() as a built-in function rather than an operator? > > a << 123; > > - Mike > > ----- Original Message ----- > > $0,02 from me -- given that Swift has a C-like syntax, this syntax > > will surely confuse people?? > > > > On Jun 28, 2011, at 9:04 AM, Michael Wilde wrote: > > > > > This sound good. I assume one should be able to append to any > > > dimension of a multi-dimensional array with +=. 
> > > > > > int a[][]; > > > > > > a[5] += 1 > > > a[5] += 2 > > > > > > sets a[5][?] = 1 > > > a[5][?] = 2 > > > > > > where ? is the index values chosen by += > > > > > > Further, as you work out the typing rules for array subscripts, > > > should the following work? > > > > > > int a[][[]; > > > int b[]; > > > b += 100; > > > b += 200; > > > a += b; # <=== should this append a new array to a? > > > > > > - Mike > > > > > > ----- Original Message ----- > > >>> As proposed earlier, a new syntax for appending values to > > >>> arrays have been implemented. Examples are given below. > > >> > > >> I think I have been won over to the += syntax. > > >> > > >>> Please note that, we still need to make sure that when we can > > >>> only append to an array, so to associate keys and multiple > > >>> values, (i think) we will need two(or more) dimensional arrays. > > >> > > >> yes. but there are already multi-dimensional arrays in swift. > > >> > > >> also, a one dimensional array should still work, like this: > > >> > > >> int a[]; > > >> a += 232; > > >> a += 111; > > >> > > >> -- > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From ketancmaheshwari at gmail.com Tue Jun 28 11:36:05 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Tue, 28 Jun 2011 11:36:05 -0500
Subject: [Swift-devel] using Swift Trunk for pbs/beagle: doesn't scale
In-Reply-To:
References:
Message-ID:

It seems that at 1600 app tasks the run stalls after a while. However, this is not reflected in std.err, which still seems to think tasks are active, while std.out reports the waiting thread status and the job queues die down.

On Mon, Jun 27, 2011 at 2:00 PM, Ketan Maheshwari <ketancmaheshwari at gmail.com> wrote:

> I was trying trunk on beagle today. Seems when I try to scale up the number
> of app tasks, it hangs. To be more precise, I found that above a threshold
> of 1600 app tasks, the run hangs with following periodic std.outs:
>
> No events in 10s.
>
> Registered futures:
> string[] str_roots Closed, 500 elements, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> int n Closed, no listeners
> ----
>
> Waiting threads:
> 0-2
> ----
>
> When, the threshold is <=1600 app tasks, it seems to work. The
> SWIFT_MAX_HEAP is 7000MB.
>
>
> --
> Ketan
>
>

--
Ketan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From ketancmaheshwari at gmail.com Tue Jun 28 12:06:01 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Tue, 28 Jun 2011 12:06:01 -0500 Subject: [Swift-devel] using Swift Trunk for pbs/beagle: doesn't scale In-Reply-To: References: Message-ID: A log of the run is here: /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110627-1307-5uq66e5a.log On Tue, Jun 28, 2011 at 11:36 AM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > It seems at 1600 app task rate, after a while the run stalls. However, this > does not get reflected on the std.err which seems to think tasks are still > active. While std.out reports the waiting thread status and job queues die > down. > > > On Mon, Jun 27, 2011 at 2:00 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com> wrote: > >> I was trying trunk on beagle today. Seems when I try to scale up the >> number of app tasks, it hangs. To be more precise, I found that above a >> threshold of 1600 app tasks, the run hangs with following periodic std.outs: >> >> No events in 10s. >> >> Registered futures: >> string[] str_roots Closed, 500 elements, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> int n Closed, no listeners >> ---- >> >> Waiting threads: >> 0-2 >> ---- >> >> When, the threshold is <=1600 app tasks, it seems to work. The >> SWIFT_MAX_HEAP is 7000MB. >> >> >> -- >> Ketan >> >> > > > -- > Ketan > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hategan at mcs.anl.gov Tue Jun 28 12:17:12 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 12:17:12 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> Message-ID: <1309281432.21830.2.camel@blabla> On Tue, 2011-06-28 at 09:04 -0500, Michael Wilde wrote: > int a[][[]; > int b[]; > b += 100; > b += 200; > a += b; # <=== should this append a new array to a? It should append something to a. And the type of a is int array which matches b's type, so I don't see why not. From hategan at mcs.anl.gov Tue Jun 28 12:21:21 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 12:21:21 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> Message-ID: <1309281681.21830.6.camel@blabla> On Tue, 2011-06-28 at 20:01 +0530, Yadu Nand wrote: > > my opinion on the matter is that: the meaning of the operator has no direct analogue in C, Java or Haskell which I think are the three main influences. So whatever is chosen is going to look weird. So pick one, any one, perhaps using a pseudorandom number generator, or a single round election, and move on ;) > > I'm still working on doing the other proposed syntax which is > append ( array, ). There have been some schools of thought in language design that said that there should be one and only one way to do a certain thing. 
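The typing argument about `a += b` above can be made concrete with a small Python sketch. This is an analogy only, not Swift code and not how Swift implements it: it models `+=` as adding exactly one element whose type must match the array's element type, so an int array can be appended to an array of int arrays but not to an array of ints. The class `TypedArray` and the helpers `is_int` and `is_int_array` are hypothetical names made up for this illustration.

```python
class TypedArray:
    """Toy model of an append-only typed array (hypothetical, for illustration)."""

    def __init__(self, elem_check):
        # elem_check: predicate that decides whether a value has the element type
        self.elem_check = elem_check
        self.items = []

    def append(self, value):
        # Models `a += value`: adds exactly one element and never flattens it
        if not self.elem_check(value):
            raise TypeError("value does not match the array's element type")
        self.items.append(value)


def is_int(v):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, int) and not isinstance(v, bool)


def is_int_array(v):
    return isinstance(v, list) and all(is_int(x) for x in v)


# int b[]; b += 100; b += 200;
b = TypedArray(is_int)
b.append(100)
b.append(200)

# int a[][]; a += b;  -> b becomes one new row of a
a = TypedArray(is_int_array)
a.append(b.items)

# int c[]; c += b;  -> rejected, because b is an int array and not an int
c = TypedArray(is_int)
try:
    c.append(b.items)
    rejected = False
except TypeError:
    rejected = True

print(a.items)   # [[100, 200]]
print(rejected)  # True
```

Under this reading there is a single rule (the appended value must have the element type), and joining two arrays of the same type element by element would be a separate operation.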
From jonmon at utexas.edu Tue Jun 28 12:21:56 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Tue, 28 Jun 2011 12:21:56 -0500
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To: <1309281432.21830.2.camel@blabla>
References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <1309281432.21830.2.camel@blabla>
Message-ID:

I guess my question is: what would a look like? What if this were run:

int a[];
int b[];
a=[1,2,3];
b=[4,5,6];

a+=b;

Saying append makes me think that a+=b would result in something like:

[1,2,3,[4,5,6]]

Is this the case? Or would it be [1,2,3,4,5,6]? In Python this is called extend while the first is called append.

So are we talking about an extend operation or an append operation?

On Jun 28, 2011, at 12:17 PM, Mihael Hategan wrote:

> On Tue, 2011-06-28 at 09:04 -0500, Michael Wilde wrote:
>
>> int a[][[];
>> int b[];
>> b += 100;
>> b += 200;
>> a += b; # <=== should this append a new array to a?
>
> It should append something to a. And the type of a is int array which
> matches b's type, so I don't see why not.
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From hategan at mcs.anl.gov Tue Jun 28 12:30:33 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 28 Jun 2011 12:30:33 -0500
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To:
References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <1309281432.21830.2.camel@blabla>
Message-ID: <1309282233.21830.9.camel@blabla>

On Tue, 2011-06-28 at 12:21 -0500, Jonathan Monette wrote:
> I guess my question would be what would a look like? What if this was called.
>
> int a[];
> int b[];
> a=[1,2,3];
> b=[4,5,6];
>
> a+=b;
>
> Saying append makes me think that a+=b would result in something like:
>
> [1,2,3[4,5,6]]

But a is an array of ints. b is not an int.

>
> Is this the case?
Or would it be [1,2.3,4,5,6]. In python this is called extend while the first is called append.

No.

>
> So are we talking about an extend operation or append operation?

So far we're only talking about:
t a[]; a += t;

From jonmon at utexas.edu Tue Jun 28 12:35:13 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Tue, 28 Jun 2011 12:35:13 -0500
Subject: [Swift-devel] New syntax for appending to arrays
In-Reply-To: <1309282233.21830.9.camel@blabla>
References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <1309281432.21830.2.camel@blabla> <1309282233.21830.9.camel@blabla>
Message-ID: <06E89C25-44B5-4AA0-837D-3EF18914FF53@utexas.edu>

Ok. I thought this was a general append where basically you can append anything as long as it had the same type (int, float, string, user-defined, ...). That was my confusion. Thanks.

On Jun 28, 2011, at 12:30 PM, Mihael Hategan wrote:

> On Tue, 2011-06-28 at 12:21 -0500, Jonathan Monette wrote:
>> I guess my question would be what would a look like? What if this was called.
>>
>> int a[];
>> int b[];
>> a=[1,2,3];
>> b=[4,5,6];
>>
>> a+=b;
>>
>> Saying append makes me think that a+=b would result in something like:
>>
>> [1,2,3[4,5,6]]
>
> But a is an array of ints. b is not an int.
>
>>
>> Is this the case? Or would it be [1,2.3,4,5,6]. In python this is called extend while the first is called append.
>
> No.
>
>>
>> So are we talking about an extend operation or append operation?
> > So far we're only talking about: > t a[]; a += t; > > From yadudoc1729 at gmail.com Tue Jun 28 12:37:34 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Tue, 28 Jun 2011 23:07:34 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1309281681.21830.6.camel@blabla> References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> Message-ID: Hi Mihael, > There have been some schools of thought in language design that said > that there should be one and only one way to do a certain thing. I planned on implementing both methods, the += operator as well as append(), so that once it's done, it'd be easier to decide which is better. Should I drop the implementation of the append() style? I'm OK with that. I'd like to point out one issue that came up in a discussion with Justin today. I am using the earlier idea of the append being an assignment with the array subscript being the thread prefix. That is, array["key"] += 10 is converted [1] to array["key"]["!"] = 10 by the parser. There is still the possibility that the user may inadvertently use the string "!" as a subscript and create weird situations. So instead of "!" we could use a string such as "_SWIFT_AUTO_INCREMENT", which would be used only internally. [1] Well, not converted exactly, but the kml generated is the same.
-- Thanks and Regards, Yadu Nand B From hategan at mcs.anl.gov Tue Jun 28 12:50:02 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 12:50:02 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> Message-ID: <1309283402.22240.2.camel@blabla> On Tue, 2011-06-28 at 23:07 +0530, Yadu Nand wrote: > Hi Mihael, > > > There have been some schools of thought in language design that said > > that there should be one and only one way to do a certain thing. > I planned on implementing both methods, += operator as well as append() > so that once its done, it'd be easier to decide which is better. > Should I drop implementation for append() style ? I'm ok with that. > > I'd like to point out one issue that came up between a discussion with > Justin today. I am using the earlier idea of the append being an assign > with the array subscript being the thread prefix. > Meaning array["key"] += 10 is [1]converted to array["key"]["!"] = 10 by > the parser. > There is still the possibility that the user may inadvertently use the string > "!" as subscript and create weird situations. So instead of "!" we could use > a string say "_SWIFT_AUTO_INCREMENT" . This being something used > internally. > > [1] Well not converted exactly, but the kml generated is the same. Well, don't. Make it generate something else, like path=getThreadIndex() instead of path="_SWIFT_AUTO_INCREMENT". 
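To illustrate the point above, here is a minimal Python sketch (not Swift or Karajan code) of why a magic string is fragile as an append marker. It assumes an array can be modelled as a dict keyed by subscript; SENTINEL, assign_with_sentinel, append_with_index_call, assign, and the thread index value "0-1" are hypothetical names and values chosen only for this illustration.

```python
# Hypothetical model: a Swift array is a dict keyed by subscript.
SENTINEL = "_SWIFT_AUTO_INCREMENT"  # magic key proposed in the thread


def assign_with_sentinel(array, key, value, thread_index):
    """Sentinel approach: one code path serves both user keys and appends."""
    if key == SENTINEL:
        key = thread_index  # the magic string is silently rewritten
    array[key] = value


def append_with_index_call(array, value, get_thread_index):
    """Suggested approach: the generated code asks for the index directly."""
    array[get_thread_index()] = value


def assign(array, key, value):
    """Plain assignment: user keys are never reinterpreted."""
    array[key] = value


# With the sentinel, a user who really wants that literal key loses it.
a = {}
assign_with_sentinel(a, SENTINEL, 10, thread_index="0-1")
lost_user_key = SENTINEL not in a

# With a dedicated index call, appends and user keys cannot be confused.
b = {}
append_with_index_call(b, 10, get_thread_index=lambda: "0-1")
assign(b, SENTINEL, 20)
```

In the second variant the append path never inspects the key, so no string value needs to be reserved.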
From benc at hawaga.org.uk Tue Jun 28 13:34:52 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Tue, 28 Jun 2011 20:34:52 +0200 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> Message-ID: <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> On Jun 28, 2011, at 7:37 PM, Yadu Nand wrote: > > I'd like to point out one issue that came up between a discussion with > Justin today. I am using the earlier idea of the append being an assign > with the array subscript being the thread prefix. > Meaning array["key"] += 10 is [1]converted to array["key"]["!"] = 10 by > the parser. > There is still the possibility that the user may inadvertently use the string > "!" as subscript and create weird situations. So instead of "!" we could use > a string say "_SWIFT_AUTO_INCREMENT" . This being something used > internally. You could approach this at the type level too, if you're putting types for indices. A user can only specify ["!"] as an index if the array takes a string in that position. You could have a different type (for the sake of argument called 'auto' (cue another naming discussion)) for automatically assigned indices (which in the implementation under question are thread IDs, but that's an accident of implementation). So if I am going to say array[2] += 10; then array must have been declared with type: int array[int][auto]; and if I'm going to say array[2]["!"] = 10; (meaning, use key "!", not use an automatically assigned variable) then array must be declared as: int array[int][string]. In that approach, it is a type checking error to either use += on a non-auto array, or specify an explicit string or int index on an auto array. You could then have variables with type auto, which you could extract auto values into through foreach, but not otherwise be able to specify auto values.
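The type-level rule described above can be sketched as a tiny checker. This is an illustrative Python model, not the Swift type checker: the index type names "int", "string" and "auto" follow the message, while index_type_of, check_assign and check_append are hypothetical helpers, and an array declaration such as int array[int][auto] is modelled simply as the list ["int", "auto"].

```python
def index_type_of(value):
    """Map a Python subscript value to a (hypothetical) Swift index type."""
    if isinstance(value, bool):
        raise TypeError("bool is not a valid index")
    if isinstance(value, int):
        return "int"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"unsupported index: {value!r}")


def check_assign(index_types, subscripts):
    """array[s1][s2]... = v: every position needs an explicit, matching index."""
    if len(subscripts) != len(index_types):
        return False
    # An explicit subscript is never allowed in an 'auto' position.
    return all(t != "auto" and index_type_of(s) == t
               for t, s in zip(index_types, subscripts))


def check_append(index_types, subscripts):
    """array[s1]... += v: the one position left unspecified must be 'auto'."""
    if len(subscripts) != len(index_types) - 1:
        return False
    if index_types[-1] != "auto":
        return False
    return check_assign(index_types[:-1], subscripts)
```

Under this model, array[2] += 10 checks against ["int", "auto"] but not against ["int", "string"], and array[2]["!"] = 10 checks only against ["int", "string"].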
-- From yadudoc1729 at gmail.com Tue Jun 28 13:51:44 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Wed, 29 Jun 2011 00:21:44 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> Message-ID: > You could approach this at the type level too, if you're putting types for indices. > A user can only specify ["!"] as an index if the array takes a string in that position. > You could have a different type (for the sake of argument called 'auto' (cue another naming discussion)) for automatically assigned indices. (which in the implementation under question are thread IDs - but that's an accident of implementation). > So if I am going to say array[2] += 10; then array must have been declared with type: > int array[int][auto]; > and if I'm going to say > array[2]["!"] = 10; (meaning, use key "!", not use an automatically assigned variable) > then array must be declared as: > int array[int][string]. > In that approach, it is a type checking error to either use += on a non-auto array, or specify an explicit string or int index on an auto array. > You could then have variables with type auto, which you could extract auto values into through foreach, but not otherwise being able to specify auto values. This is great. This would solve other problems as well. But it involves changing the array definition as well as a lot of code changes, almost every place arrays are handled. I'll try to get this done.
-- Thanks and Regards, Yadu Nand B From hategan at mcs.anl.gov Tue Jun 28 13:53:02 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 13:53:02 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <1309283402.22240.2.camel@blabla> Message-ID: <1309287182.22843.5.camel@blabla> On Wed, 2011-06-29 at 00:00 +0530, Yadu Nand wrote: > > Well, don't. Make it generate something else, like path=getThreadIndex() > > instead of path="_SWIFT_AUTO_INCREMENT". > I don't understand this clearly. I am generating kml with say, _AUTO_INC > just so that, function parsePath in VDLFuntion.java can identify and add > the thread prefix id to the end of the path instead of _AUTO_INC . Right. I'm not sure how exactly this ends up looking like in kml, but I suspect something like: ... a _AUTO_INC This is the standard translation. I suggest: a Where is a karajan function that returns the thread index. Let me know if you need help implementing , but you should be able to figure it out if you take a look at libexec/vdl-lib.xml From hategan at mcs.anl.gov Tue Jun 28 13:56:55 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 13:56:55 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> Message-ID: <1309287415.22843.8.camel@blabla> I very much agree with this, except I wouldn't say that one could approach this from a type system perspective, but it is necessary to do so. Using auto-indices on an int array would break things if the auto-indices are string (which they are). 
In other words, the following code would pass type checking but is essentially incorrect: int a[]; a += 1; a += 2; foreach v, k in a { trace(k + 1); } On Tue, 2011-06-28 at 20:34 +0200, Ben Clifford wrote: > On Jun 28, 2011, at 7:37 PM, Yadu Nand wrote: > > > > I'd like to point out one issue that came up between a discussion with > > Justin today. I am using the earlier idea of the append being an assign > > with the array subscript being the thread prefix. > > Meaning array["key"] += 10 is [1]converted to array["key"]["!"] = 10 by > > the parser. > > There is still the possibility that the user may inadvertently use the string > > "!" as subscript and create weird situations. So instead of "!" we could use > > a string say "_SWIFT_AUTO_INCREMENT" . This being something used > > internally. > > You could approach this at the type level too, if you're putting types for indices. > > A user can only specify ["!"] as an index if the array takes a string in that position. > > You could have a different type (for the sake of argument called 'auto' (cue another naming discussion)) for automatically assigned indices. (which in the implementation under question are thread IDs - but that's an accident of implementation). > > So if I am going to say array[2] += 10; then array must have been declared with type: > > int array[int][auto]; > > and if I'm going to say > > array[2]["!"] = 10; (meaning, use key "!", not use an automatically assigned variable) > > then array must be declared as: > > int array[int][string]. > > In that approach, it is a type checking error to either use += on a non-auto array, or specify an explicit string or int index on an auto array. > > You could then have variables with type auto, which you could extra auto values into through foreach, but not other wise being able to specify auto values. 
> From yadudoc1729 at gmail.com Tue Jun 28 14:01:57 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Wed, 29 Jun 2011 00:31:57 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1309287415.22843.8.camel@blabla> References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> Message-ID: On Wed, Jun 29, 2011 at 12:26 AM, Mihael Hategan wrote: > I very much agree with this, except I wouldn't say that one could > approach this from a type system perspective, but it is necessary to do > so. Using auto-indices on an int array would break things if the > auto-indices are string (which they are). In other words, the following > code would pass type checking but is essentially incorrect: > int a[]; > a += 1; a += 2; > > foreach v, k in a { > trace(k + 1); > } This doesn't fail. a += 1; a += 2; eventually becomes a[x] = 1; a[y] = 2 (where x and y are thread prefixes which turn out to be integers). -- Thanks and Regards, Yadu Nand B From hategan at mcs.anl.gov Tue Jun 28 14:12:34 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 14:12:34 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> Message-ID: <1309288354.23076.2.camel@blabla> On Wed, 2011-06-29 at 00:31 +0530, Yadu Nand wrote: > On Wed, Jun 29, 2011 at 12:26 AM, Mihael Hategan wrote: > > I very much agree with this, except I wouldn't say that one could > > approach this from a type system perspective, but it is necessary to do > > so.
Using auto-indices on an int array would break things if the > > auto-indices are string (which they are). In other words, the following > > code would pass type checking but is essentially incorrect: > > int a[]; > > a += 1; a += 2; > > > > foreach v, k in a { > > trace(k + 1); > > } > This doesn't fail. > a += 1 ; a += 2 ; > Eventually comes to a[x] = 1 ; a[y] = 2 (where x and y are thread prefixes > which turn out to be integers). Only in special cases. Try this: ... foreach dummy in [1..2] { int a[]; a += 1; a += 2; foreach v, k in a { trace(k + 1); } } From hategan at mcs.anl.gov Tue Jun 28 14:29:21 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 14:29:21 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> Message-ID: <1309289361.23076.9.camel@blabla> On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: > $0,02 from me -- given that Swift has a C-like syntax, this syntax will surely confuse people?? "+=" is a bit confusing, but then I don't think it's more so than using "+" as a string concatenation operator. So please vote or propose an alternative: 0. a += 1; 1. a ++= 1; 2. a ++ 1; 3. a << 1; 4. a[] = 1; 5. a .= 1; (that would make more sense if "." meant concatenation in swift). From jonmon at utexas.edu Tue Jun 28 14:33:10 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Tue, 28 Jun 2011 14:33:10 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1309289361.23076.9.camel@blabla> References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> <1309289361.23076.9.camel@blabla> Message-ID: I vote for 3. 
On Jun 28, 2011, at 2:29 PM, Mihael Hategan wrote: > On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: >> $0,02 from me -- given that Swift has a C-like syntax, this syntax will surely confuse people?? > > "+=" is a bit confusing, but then I don't think it's more so than using > "+" as a string concatenation operator. > > So please vote or propose an alternative: > > 0. a += 1; > 1. a ++= 1; > 2. a ++ 1; > 3. a << 1; > 4. a[] = 1; > 5. a .= 1; (that would make more sense if "." meant concatenation in > swift). > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From wozniak at mcs.anl.gov Tue Jun 28 14:51:21 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Tue, 28 Jun 2011 14:51:21 -0500 (Central Daylight Time) Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> <1309289361.23076.9.camel@blabla> Message-ID: I vote for 0. On Tue, 28 Jun 2011, Jonathan Monette wrote: > I vote for 3. > > On Jun 28, 2011, at 2:29 PM, Mihael Hategan wrote: > >> On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: >>> $0,02 from me -- given that Swift has a C-like syntax, this syntax will surely confuse people?? >> >> "+=" is a bit confusing, but then I don't think it's more so than using >> "+" as a string concatenation operator. >> >> So please vote or propose an alternative: >> >> 0. a += 1; >> 1. a ++= 1; >> 2. a ++ 1; >> 3. a << 1; >> 4. a[] = 1; >> 5. a .= 1; (that would make more sense if "." meant concatenation in >> swift). 
>> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -- Justin M Wozniak From wilde at mcs.anl.gov Tue Jun 28 14:56:37 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 28 Jun 2011 14:56:37 -0500 (CDT) Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: Message-ID: <487016554.50320.1309290997162.JavaMail.root@zimbra.anl.gov> I vote for 3 ----- Original Message ----- > I vote for 0. > > On Tue, 28 Jun 2011, Jonathan Monette wrote: > > > I vote for 3. > > > > On Jun 28, 2011, at 2:29 PM, Mihael Hategan wrote: > > > >> On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: > >>> $0,02 from me -- given that Swift has a C-like syntax, this syntax > >>> will surely confuse people?? > >> > >> "+=" is a bit confusing, but then I don't think it's more so than > >> using > >> "+" as a string concatenation operator. > >> > >> So please vote or propose an alternative: > >> > >> 0. a += 1; > >> 1. a ++= 1; > >> 2. a ++ 1; > >> 3. a << 1; > >> 4. a[] = 1; > >> 5. a .= 1; (that would make more sense if "." meant concatenation > >> in > >> swift). 
> >> > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Justin M Wozniak > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From davidkelly999 at gmail.com Tue Jun 28 15:14:11 2011 From: davidkelly999 at gmail.com (David Kelly) Date: Tue, 28 Jun 2011 15:14:11 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1309289361.23076.9.camel@blabla> References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> <1309289361.23076.9.camel@blabla> Message-ID: 5 On Tue, Jun 28, 2011 at 2:29 PM, Mihael Hategan wrote: > On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: > > $0,02 from me -- given that Swift has a C-like syntax, this syntax will > surely confuse people?? > > "+=" is a bit confusing, but then I don't think it's more so than using > "+" as a string concatenation operator. > > So please vote or propose an alternative: > > 0. a += 1; > 1. a ++= 1; > 2. a ++ 1; > 3. a << 1; > 4. a[] = 1; > 5. a .= 1; (that would make more sense if "." meant concatenation in > swift). > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ketancmaheshwari at gmail.com Tue Jun 28 15:17:51 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Tue, 28 Jun 2011 15:17:51 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> <1309289361.23076.9.camel@blabla> Message-ID: 3. or 1 >> a (following the UNIX append-like syntax "echo 1" >> a.txt) On Tue, Jun 28, 2011 at 3:14 PM, David Kelly wrote: > 5 > > > On Tue, Jun 28, 2011 at 2:29 PM, Mihael Hategan wrote: > >> On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: >> > $0,02 from me -- given that Swift has a C-like syntax, this syntax will >> surely confuse people?? >> >> "+=" is a bit confusing, but then I don't think it's more so than using >> "+" as a string concatenation operator. >> >> So please vote or propose an alternative: >> >> 0. a += 1; >> 1. a ++= 1; >> 2. a ++ 1; >> 3. a << 1; >> 4. a[] = 1; >> 5. a .= 1; (that would make more sense if "." meant concatenation in >> swift). >> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alberto_chavez at live.com Tue Jun 28 15:18:27 2011 From: alberto_chavez at live.com (Alberto Chavez) Date: Tue, 28 Jun 2011 15:18:27 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov>, <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov>, <1309289361.23076.9.camel@blabla>, , Message-ID: 3 -------------- next part -------------- An HTML attachment was scrubbed...
URL: From hategan at mcs.anl.gov Tue Jun 28 15:44:55 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 15:44:55 -0500 Subject: [Swift-devel] [Fwd: Re: New syntax for appending to arrays] Message-ID: <1309293895.23406.0.camel@blabla> -------- Forwarded Message -------- From: Ben Clifford To: Mihael Hategan Subject: Re: [Swift-devel] New syntax for appending to arrays Date: Tue, 28 Jun 2011 19:37:13 +0000 (GMT) > So please vote or propose an alternative: Any of these three, with option 0 most preferred. > 0. a += 1; > 5. a .= 1; (that would make more sense if "." meant concatenation in > swift). > 1. a ++= 1; -- From hategan at mcs.anl.gov Tue Jun 28 15:48:05 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 28 Jun 2011 15:48:05 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1309289361.23076.9.camel@blabla> References: <1564902491.48379.1309269875632.JavaMail.root@zimbra.anl.gov> <3535F1E8-5CB2-4586-8A1A-547B45D50553@anl.gov> <1309289361.23076.9.camel@blabla> Message-ID: <1309294085.23406.1.camel@blabla> So far we have: 3, 0, 3, 5, 3, 3, (0, 1, 5) Anybody else? On Tue, 2011-06-28 at 14:29 -0500, Mihael Hategan wrote: > On Tue, 2011-06-28 at 09:08 -0500, Ian Foster wrote: > > $0,02 from me -- given that Swift has a C-like syntax, this syntax will surely confuse people?? > > "+=" is a bit confusing, but then I don't think it's more so than using > "+" as a string concatenation operator. > > So please vote or propose an alternative: > > 0. a += 1; > 1. a ++= 1; > 2. a ++ 1; > 3. a << 1; > 4. a[] = 1; > 5. a .= 1; (that would make more sense if "." meant concatenation in > swift). 
From yadudoc1729 at gmail.com Wed Jun 29 08:28:51 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Wed, 29 Jun 2011 18:58:51 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: <1309288354.23076.2.camel@blabla> References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: Hi Mihael, > foreach dummy in [1..2] { > int a[]; > a += 1; a += 2; > foreach v, k in a { > trace(k + 1); > } > } Thanks for pointing this out. I didn't consider cases where the append isn't a top-level statement. I am making changes to the parser to account for this. My bad. -- Thanks and Regards, Yadu Nand B From benc at hawaga.org.uk Wed Jun 29 08:53:24 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 29 Jun 2011 15:53:24 +0200 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: Something that may need a bunch of testing is any of the mappers that make use of an array index in their generated file name - traditionally those have done parsing and formatting of numbers. Making arrays use other types might upset them, or require more documentation/type checking to make sure they are only used with numeric types.
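The concern above can be made concrete with a small Python sketch. It does not reproduce any actual Swift mapper: numeric_mapper, try_map, and the prefix and padding parameters are hypothetical, and stand in for a mapper that builds file names by parsing and formatting the array index as a number. The key "0-1-3" is only an example of a non-numeric index.

```python
def numeric_mapper(prefix, index, padding=4):
    """Hypothetical mapper: file name from a zero-padded numeric index."""
    # int() raises ValueError for keys that are not plain integers.
    return f"{prefix}{int(index):0{padding}d}.dat"


def try_map(prefix, index):
    """Return the mapped file name, or None if the index is not numeric."""
    try:
        return numeric_mapper(prefix, index)
    except ValueError:
        return None


ok = try_map("out", "7")          # plain integer key
nested = try_map("out", "0-1-3")  # non-numeric key, cannot be formatted
```

A mapper written this way works for the first key and has no file name to offer for the second, which is the kind of case that would need either a type check or a documented restriction.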
From yadudoc1729 at gmail.com Wed Jun 29 09:06:06 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Wed, 29 Jun 2011 19:36:06 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: On Wed, Jun 29, 2011 at 7:23 PM, Ben Clifford wrote: > something that maybe needs a bunch of testing is any of the mappers that make use of an array index in their generated file name - traditionally those have done parsing and formatting of numbers. making arrays use other types might upset them, or require more documentation/type checking making sure they are only used with numeric types. Making arrays use other types, as in for subscripts? Where should I look for the mapper stuff? I've been trying to avoid doing subscript type specification, but that isn't going to work. I see that inside nested loops the thread prefix is of the form (number - )* number, so that basically means I can't assume it's an int, and writing a function to generate a number from the prefix doesn't look like the right way to go about this. -- Thanks and Regards, Yadu Nand B From benc at hawaga.org.uk Wed Jun 29 09:09:30 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 29 Jun 2011 14:09:30 +0000 (GMT) Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: > I've been trying to avoid doing subscript type specification, but > that isn't going to work.
I see that inside nested loops the thread > prefix is of the form (number - )* number , so that basically means > I can't assume it's an int and writing a function to generate a number > from the prefix doesn't look like the right way to go about this. I thought you were already allowing strings as subscripts? (as in the example a["foo"] += "bar" ) Probably you should treat the thread ID as opaque and only use it in ways that rely on two IDs either comparing equal or not. If you hash it in some way and use that hashed value, then you're going to end up with obscure bugs caused by collisions. -- From wilde at mcs.anl.gov Wed Jun 29 09:15:49 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 29 Jun 2011 09:15:49 -0500 (CDT) Subject: [Swift-devel] coaster termination problems cause large runs to hang In-Reply-To: <666843584.46021.1309200530906.JavaMail.root@zimbra.anl.gov> Message-ID: <360734726.52653.1309356949795.JavaMail.root@zimbra.anl.gov> Mihael, did you have a chance to assess the log of Papia's failing DSSAT run? I think it shut down after approx 30K of 120K app invocations. It looked like the run went bad when coaster workers shut down at the end of their maxtime slot. Papia, in the meantime, can you change the following parameters and re-run? * in cf, from: execution.retries=0 lazy.errors=false * to: execution.retries=2 lazy.errors=true * in sites.xml, from: 00:02:00 * to: 00:10:00 The cf change will make Swift retry any app that fails, e.g., because it may have run longer than its maxwalltime estimate. And Swift will continue to execute runnable app calls even if other app calls have failed. The sites change gives a longer time estimate for all app calls on the PADS site, so that Swift won't start an app on any coaster that has less than that amount of time (10 mins vs 2 mins) remaining in its run time allocation. Finally, Papia, could you try this on both PADS and Beagle, first on 0.92.1 and then on the latest trunk (as of this morning)?
Thanks, - Mike ----- Original Message ----- > > (which I will copy to ~wilde/dssat.run01 on the CI net)? > > bri$ pwd > /home/wilde > bri$ ls -ld dssat.run01/ > drwxr-xr-x 2 wilde ci-users 4096 Jun 24 19:01 dssat.run01// > bri$ > > > ----- Original Message ----- > > [hategan at login ~]$ cd /home/papia/dssat/run01 > > -bash: cd: /home/papia/dssat/run01: Permission denied > > > > > > On Fri, 2011-06-24 at 18:54 -0500, Michael Wilde wrote: > > > Mihael, > > > > > > Papia is running large sweeps of the DSSAT land use model on PADS, > > > and getting failures, it seems, when the coasters time out. Her > > > script is attempting about 120K model invocations, each taking > > > about > > > 60 seconds to run. She gets between 30K and 60K of these done > > > before > > > it fails. > > > > > > Can you look at the example below, on the CI network in > > > /home/papia/dssat/run01 > > > (which I will copy to ~wilde/dssat.run01 on the CI net)? > > > > > > The swift.out file shows the run progressing nicely until the > > > first > > > coaster worker timeout occurs. > > > > > > The run was started with ./RunSweep.sh: > > > time swift -tc.file tc -sites.file sites.xml -config cf > > > RunDssat.swift >& swift.out > > > > > > The run id is RunID: 20110624-1333-r17fczk0 > > > Swift is 0.92.1. 
> > > > > > Thanks, > > > > > > Mike > > > > > > > > > login2$ head swift.out > > > Swift svn swift-r4371 cog-r3096 > > > > > > RunID: 20110624-1333-r17fczk0 > > > Progress: > > > Progress: uninitialized:2 > > > Progress: Selecting site:36 Stage in:53 Submitting:1 Submitted:10 > > > Progress: Selecting site:36 Stage in:8 Submitting:2 Submitted:54 > > > Progress: Selecting site:36 Submitted:64 > > > Progress: Selecting site:36 Submitted:64 > > > Progress: Selecting site:36 Submitted:63 Active:1 > > > login2$ ls -l *zk0.log > > > -rw-r--r-- 1 papia ci-users 161039247 Jun 24 17:25 > > > RunDssat-20110624-1333-r17fczk0.log > > > login2$ pwd > > > /home/papia/dssat/run01 > > > login2$ > > > > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From yadudoc1729 at gmail.com Wed Jun 29 09:25:42 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Wed, 29 Jun 2011 19:55:42 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: > I thought you were already ?allowing strings as subscripts? > > (as in the example a["foo"] += "bar" ) Yes, but foreach loops don't allow indices to be non-int. I still haven't fixed that. > Porbably you shoudl treat the thread ID as opaque and only use it in ways > that rely on two IDs either comparing equal or not. If you hash it in some > way and used that hashed value, then you're going to end up with obscure > bugs caused by collisions. 
Yes. When there was no nesting and appends were done as top-level statements, the thread IDs were numbers like "2" and "0". But when I modified the parser to allow appends inside loop bodies, the thread IDs turned out to be like "0-1-3", "1-0-3-2-1", etc., and key collisions are happening. Weren't thread prefixes supposed to be unique? -- Thanks and Regards, Yadu Nand B From benc at hawaga.org.uk Wed Jun 29 09:32:43 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 29 Jun 2011 14:32:43 +0000 (GMT) Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: > > (as in the example a["foo"] += "bar" ) > Yes, but foreach loops don't allow indices to be non-int. I still haven't > fixed that. ok > > Probably you should treat the thread ID as opaque and only use it in ways > > that rely on two IDs either comparing equal or not. If you hash it in some > > way and use that hashed value, then you're going to end up with obscure > > bugs caused by collisions. > > Yes. When there was no nesting and appends were done as top-level > statements, the thread IDs were numbers like "2" and "0". But when I modified the > parser to allow appends inside loop bodies, the thread IDs > turned out to be like "0-1-3", "1-0-3-2-1", etc., and key collisions are > happening. Weren't thread prefixes supposed to be unique? Yes. You shouldn't be seeing multiple threads with the same thread ID. For example, if you do this:

foreach dummy in [1] {
  foreach i in [1:10] {
    a += i; // or whatever syntax was chosen to be? a << i ?
  }

  foreach value, key in a {
    trace(key);
  }
}

it should give 10 unique thread IDs as traced values. Those IDs won't be integers in general, though.
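The advice to treat thread IDs as opaque can be illustrated with a small sketch. This is not Swift or Karajan code: the IDs below only imitate the "0-1-3" shape quoted in this thread, and `lossy_key` is a hypothetical reduction invented for the example, standing in for any scheme that derives a shorter key from the ID.

```python
# Hypothetical thread IDs, shaped like the ones quoted in the thread.
thread_ids = ["0-1-3", "1-0-3", "3-1-0", "0-1-3-2"]


def lossy_key(thread_id):
    """A deliberately bad reduction: sum the numeric components of the ID."""
    return sum(int(part) for part in thread_id.split("-"))


def append_by(key_fn, ids):
    """Store one value per thread under key_fn(thread_id)."""
    array = {}
    for value, tid in enumerate(ids):
        array[key_fn(tid)] = value  # a colliding key silently overwrites
    return array


opaque = append_by(lambda tid: tid, thread_ids)  # full ID string as the key
lossy = append_by(lossy_key, thread_ids)         # reduced value as the key

print(len(opaque), len(lossy))  # prints "4 2"
```

With the full ID string as the key, all four appends survive. With the reduced key, three of the four IDs map to the same value and overwrite each other, which is the kind of obscure collision bug described above.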
-- From hategan at mcs.anl.gov Wed Jun 29 11:17:16 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 29 Jun 2011 11:17:16 -0500 Subject: [Swift-devel] coaster termination problems cause large runs to hang In-Reply-To: <360734726.52653.1309356949795.JavaMail.root@zimbra.anl.gov> References: <360734726.52653.1309356949795.JavaMail.root@zimbra.anl.gov> Message-ID: <1309364236.27619.1.camel@blabla> Yes, but I can't tell why the service isn't shutting down. I do see however that qdel doesn't work properly there. As to why jobs are failing in the first place, I don't know. What's the error on stdout? On Wed, 2011-06-29 at 09:15 -0500, Michael Wilde wrote: > Mihael, did you have a chance to assess the log of Papia's failing DSSAT run? > > I think it shut down after approx 30K or 120K app invocations. > > Looked like the run went bad when coaster workers shut down at the end of their maxtime slot. > > Papia, in the meantime, can you change the following parameters and re-run? > > * in cf, from: > > execution.retries=0 > lazy.errors=false > > * to: > > execution.retries=2 > lazy.errors=true > > * in sites.xml, from: > 00:02:00 > * to: > 00:10:00 > > The cf change will make Swift retry any app that fails, eg, because it may have run longer than its maxwalltime estimate. ANd Swift will continue to execute runnable app calls even if other app calls have failed. > > The sites change gives a longer time estimate for all app calls on the PADS site, so that Swift wont start an app on any coaster that has less than the amount of time (10 mins vs 2 mins) remaining in its run time allocation. > > Finally, Papia, could you try this on both PADS and Beagle, first on 0.92.1 and then on the latest trunk (as of this morning)? > > Thanks, > > - Mike > > > > ----- Original Message ----- > > > (which I will copy to ~wilde/dssat.run01 on the CI net)? 
> > > > bri$ pwd > > /home/wilde > > bri$ ls -ld dssat.run01/ > > drwxr-xr-x 2 wilde ci-users 4096 Jun 24 19:01 dssat.run01// > > bri$ > > > > > > ----- Original Message ----- > > > [hategan at login ~]$ cd /home/papia/dssat/run01 > > > -bash: cd: /home/papia/dssat/run01: Permission denied > > > > > > > > > On Fri, 2011-06-24 at 18:54 -0500, Michael Wilde wrote: > > > > Mihael, > > > > > > > > Papia is running large sweeps of the DSSAT land use model on PADS, > > > > and getting failures, it seems, when the coasters time out. Her > > > > script is attempting about 120K model invocations, each taking > > > > about > > > > 60 seconds to run. She gets between 30K and 60K of these done > > > > before > > > > it fails. > > > > > > > > Can you look at the example below, on the CI network in > > > > /home/papia/dssat/run01 > > > > (which I will copy to ~wilde/dssat.run01 on the CI net)? > > > > > > > > The swift.out file shows the run progressing nicely until the > > > > first > > > > coaster worker timeout occurs. > > > > > > > > The run was started with ./RunSweep.sh: > > > > time swift -tc.file tc -sites.file sites.xml -config cf > > > > RunDssat.swift >& swift.out > > > > > > > > The run id is RunID: 20110624-1333-r17fczk0 > > > > Swift is 0.92.1. 
> > > > > > > > Thanks, > > > > > > > > Mike > > > > > > > > > > > > login2$ head swift.out > > > > Swift svn swift-r4371 cog-r3096 > > > > > > > > RunID: 20110624-1333-r17fczk0 > > > > Progress: > > > > Progress: uninitialized:2 > > > > Progress: Selecting site:36 Stage in:53 Submitting:1 Submitted:10 > > > > Progress: Selecting site:36 Stage in:8 Submitting:2 Submitted:54 > > > > Progress: Selecting site:36 Submitted:64 > > > > Progress: Selecting site:36 Submitted:64 > > > > Progress: Selecting site:36 Submitted:63 Active:1 > > > > login2$ ls -l *zk0.log > > > > -rw-r--r-- 1 papia ci-users 161039247 Jun 24 17:25 > > > > RunDssat-20110624-1333-r17fczk0.log > > > > login2$ pwd > > > > /home/papia/dssat/run01 > > > > login2$ > > > > > > > > > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > From ketancmaheshwari at gmail.com Wed Jun 29 11:36:48 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Wed, 29 Jun 2011 11:36:48 -0500 Subject: [Swift-devel] Hangchecker tweak Message-ID: Mihael, All, Continuing with my experiments with Swift trunk pbs coaster provider on Beagle, somehow, it seems that as soon as the hangchecker kicks in, it prevents jobs from getting submitted. In support of this hypothesis, I have about 12 runs with different job throttle values from 100 - 1000 where, I observe that when there is no activity for 10s while the stageins are being done, the hangchecker thread gets invoked and no further job submissions takes place after that. In one of the longer experiments, I also observed this after a long time while jobs were running and when the hangchecker gets invoked Swift does not submit any more jobs. 
Following are 2 instances of log where the said phenomena occurs: /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110629-1020-x1la8psc.log /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110629-1021-y43tid39.log Following is a log where it does not occur: /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110629-1023-rq2eosp5.log Further notes: 1. In cases where jobs do not get submitted, no submit files are created in ~/.globus/scripts 2. I went through the log files, however, it seems the log file records things that have happened, for instance vdl:execute2 lines for jobs that have submitted and absence of them in case they were not. I could not find any error messages in the log that could indicate what has been happening. To confirm the hypothesis, could you indicate how could I disable the hangchecker or increase the time period before it gets invoked. Any other help you can offer to resolve this would be very useful. Regards, -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Wed Jun 29 11:46:31 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 29 Jun 2011 11:46:31 -0500 Subject: [Swift-devel] Hangchecker tweak In-Reply-To: References: Message-ID: <1309365991.28552.0.camel@blabla> Can you copy those logs to a ci machine? On Wed, 2011-06-29 at 11:36 -0500, Ketan Maheshwari wrote: > Mihael, All, > > Continuing with my experiments with Swift trunk pbs coaster provider > on Beagle, somehow, it seems that as soon as the hangchecker kicks in, > it prevents jobs from getting submitted. > > In support of this hypothesis, I have about 12 runs with different job > throttle values from 100 - 1000 where, I observe that when there is no > activity for 10s while the stageins are being done, the hangchecker > thread gets invoked and no further job submissions takes place after > that. 
> > In one of the longer experiments, I also observed this after a long > time while jobs were running and when the hangchecker gets invoked > Swift does not submit any more jobs. > > Following are 2 instances of log where the said phenomena occurs: > > /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110629-1020-x1la8psc.log > /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110629-1021-y43tid39.log > > Following is a log where it does not occur: > > /lustre/beagle/ketan/labs/modftdock/production/campaign5/ftdock-20110629-1023-rq2eosp5.log > > > Further notes: > 1. In cases where jobs do not get submitted, no submit files are > created in ~/.globus/scripts > 2. I went through the log files, however, it seems the log file > records things that have happened, for instance vdl:execute2 lines for > jobs that have submitted and absence of them in case they were not. I > could not find any error messages in the log that could indicate what > has been happening. > > To confirm the hypothesis, could you indicate how could I disable the > hangchecker or increase the time period before it gets invoked. > > Any other help you can offer to resolve this would be very useful. > > Regards, > -- > Ketan > > From hategan at mcs.anl.gov Wed Jun 29 11:48:22 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 29 Jun 2011 11:48:22 -0500 Subject: [Swift-devel] Hangchecker tweak In-Reply-To: References: Message-ID: <1309366102.28552.1.camel@blabla> On Wed, 2011-06-29 at 11:36 -0500, Ketan Maheshwari wrote: > To confirm the hypothesis, could you indicate how could I disable the > hangchecker or increase the time period before it gets invoked. in Loader.main(), comment out the 'new HangChecker(stack).start()' line. 
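The question above has two parts: switching the checker off, and lengthening the idle period before it fires. The thread only answers the first. As a rough illustration of what those two knobs mean, here is a minimal idle-detector sketch. It is hypothetical and does not reflect how Swift's HangChecker is implemented; the class name, the `idle_limit_s` and `enabled` parameters, and the 10 s default (taken from the idle time reported earlier in the thread) are all assumptions.

```python
class IdleWatchdog:
    """Hypothetical idle detector; not Swift's HangChecker."""

    def __init__(self, idle_limit_s=10.0, enabled=True):
        self.idle_limit_s = idle_limit_s
        self.enabled = enabled
        self.last_progress = 0.0

    def record_progress(self, now):
        """Call whenever the workflow makes progress (e.g. a job is submitted)."""
        self.last_progress = now

    def is_idle(self, now):
        """True when enabled and nothing has progressed for idle_limit_s seconds."""
        if not self.enabled:
            return False
        return now - self.last_progress >= self.idle_limit_s


default = IdleWatchdog()                   # fires after 10 s without progress
patient = IdleWatchdog(idle_limit_s=60.0)  # longer period before it fires
disabled = IdleWatchdog(enabled=False)     # never fires

for dog in (default, patient, disabled):
    dog.record_progress(now=100.0)

# 12 s after the last recorded progress:
print(default.is_idle(now=112.0), patient.is_idle(now=112.0), disabled.is_idle(now=112.0))
```

Timestamps are passed in explicitly so the behavior can be checked without real waiting.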
From ketancmaheshwari at gmail.com Wed Jun 29 11:52:17 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Wed, 29 Jun 2011 11:52:17 -0500 Subject: [Swift-devel] Hangchecker tweak In-Reply-To: <1309366102.28552.1.camel@blabla> References: <1309366102.28552.1.camel@blabla> Message-ID: Thanks. You can find the logs here: /home/ketan/labs/modftdock/logs On Wed, Jun 29, 2011 at 11:48 AM, Mihael Hategan wrote: > On Wed, 2011-06-29 at 11:36 -0500, Ketan Maheshwari wrote: > > To confirm the hypothesis, could you indicate how could I disable the > > hangchecker or increase the time period before it gets invoked. > > in Loader.main(), comment out the 'new HangChecker(stack).start()' line. > > -- Ketan From ketancmaheshwari at gmail.com Wed Jun 29 13:02:46 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Wed, 29 Jun 2011 13:02:46 -0500 Subject: [Swift-devel] Hangchecker tweak In-Reply-To: <1309366102.28552.1.camel@blabla> References: <1309366102.28552.1.camel@blabla> Message-ID: I built Swift with this change and submitted a run with throttle value of 3600 app tasks. It seems to be working. I see 3600 PBS jobs have been submitted to Beagle. On Wed, Jun 29, 2011 at 11:48 AM, Mihael Hategan wrote: > On Wed, 2011-06-29 at 11:36 -0500, Ketan Maheshwari wrote: > > To confirm the hypothesis, could you indicate how could I disable the > > hangchecker or increase the time period before it gets invoked. > > in Loader.main(), comment out the 'new HangChecker(stack).start()' line. > > -- Ketan
From hategan at mcs.anl.gov Wed Jun 29 13:32:15 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 29 Jun 2011 13:32:15 -0500 Subject: [Swift-devel] Hangchecker tweak In-Reply-To: References: <1309366102.28552.1.camel@blabla> Message-ID: <1309372335.29757.3.camel@blabla> I strongly suspect that the hangs are not due to the hang checker. On Wed, 2011-06-29 at 13:02 -0500, Ketan Maheshwari wrote: > > I built Swift with this change and submitted a run with throttle value > of 3600 app tasks. It seems to be working. I see 3600 PBS jobs have > been submitted to Beagle. > > > On Wed, Jun 29, 2011 at 11:48 AM, Mihael Hategan > wrote: > On Wed, 2011-06-29 at 11:36 -0500, Ketan Maheshwari wrote: > > > To confirm the hypothesis, could you indicate how could I > disable the > > hangchecker or increase the time period before it gets > invoked. > > > in Loader.main(), comment out the 'new > HangChecker(stack).start()' line. > > > > > -- > Ketan > > From jonmon at utexas.edu Wed Jun 29 13:37:44 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Wed, 29 Jun 2011 13:37:44 -0500 Subject: [Swift-devel] give name to PBS job Message-ID: Hello, I think I mentioned this before but is there a way to assign a name to a PBS job in Swift? Something like: some.name and then some.name would show up on qstat? This would be useful to see which workers are from what Swift run when checking qstat. From yadudoc1729 at gmail.com Wed Jun 29 13:49:20 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Thu, 30 Jun 2011 00:19:20 +0530 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: > You shouldn't be seeing multiple threads with the same thread id.
I was working with this assumption. I tried

int array[ ];
foreach outer in [1:2] {
  foreach inner in [1:10] {
    array += inner;
    trace(inner);
  }
}

This works fine.

> For example, if you do this:
> foreach dummy in [1] {
>   foreach i in [1:10] {
>     a += i; // or whatever syntax was chosen to be? a << i ?
>   }
>
>   foreach value, key in a {
>     trace(key);
>   }
> }
> should give 10 unique thread IDs as traced values. Those IDs won't be
> integers in general, though.

I am having some issue with array appends in nested loops. I'm pasting the code here, I have attached the output, logs and kml generated. Any help will be appreciated.

int array[ ][ ];
foreach dummy in [1:2] {
  array["key"] += 1;
  array["key"] += 2;
  foreach v in array["key"] {
    trace (v);
  }
}

-- Thanks and Regards, Yadu Nand B -------------- next part -------------- A non-text attachment was scrubbed... Name: t.output Type: application/octet-stream Size: 903 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: t.kml Type: application/vnd.google-earth.kml+xml Size: 7356 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: t-20110630-0013-f75upjc9.log Type: text/x-log Size: 6899 bytes Desc: not available URL:
From wozniak at mcs.anl.gov Wed Jun 29 13:56:08 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Wed, 29 Jun 2011 13:56:08 -0500 (CDT) Subject: [Swift-devel] give name to PBS job In-Reply-To: References: Message-ID: I looked at this a little bit. One thing is that Coasters generates its own block names when it decides to get an allocation, and these names correspond to useful Coasters identifiers. Another is that there is a 15-character limit. However, it still could be done by modifying PBSExecutor.makeName(). Justin On Wed, 29 Jun 2011, Jonathan Monette wrote: > Hello, > I think I mentioned this before but is there a way to assign a name to > a PBS job in Swift?
Something like: > > some.name > > and then some.name would show up on qstat? This would be useful to see which workers are from what Swift run when checking qstat. > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -- Justin M Wozniak From jonmon at utexas.edu Wed Jun 29 13:58:23 2011 From: jonmon at utexas.edu (Jonathan Monette) Date: Wed, 29 Jun 2011 13:58:23 -0500 Subject: [Swift-devel] give name to PBS job In-Reply-To: References: Message-ID: I'll take a look at that method then. Mike just mentioned that he thought the ability to name your coaster workers was already implemented so I emailed the list. On Jun 29, 2011, at 1:56 PM, Justin M Wozniak wrote: > > I looked at this a little bit. One thing is that Coasters generates its own block names when it decides to get an allocation, and these names correspond to useful Coasters identifiers. Another is that there is a 15-character limit. However, it still could be done by modifying PBSExecutor.makeName(). > Justin > > On Wed, 29 Jun 2011, Jonathan Monette wrote: > >> Hello, > >> I think I mentioned this before but is there a way to assign a name to a PBS job in Swift? Something like: >> >> some.name >> >> and then some.name would show up on qstat? This would be useful to see which workers are from what Swift run when checking qstat. 
>> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > -- > Justin M Wozniak From wilde at mcs.anl.gov Wed Jun 29 14:14:09 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 29 Jun 2011 14:14:09 -0500 (CDT) Subject: [Swift-devel] give name to PBS job In-Reply-To: Message-ID: <719312289.54276.1309374849685.JavaMail.root@zimbra.anl.gov> I was referring to this enhancement from Sarah: --- From: "Sarah Kenny" To: "Swift Devel" Sent: Tuesday, October 5, 2010 12:26:18 PM Subject: [Swift-devel] new commits hi all, i have a fix to swift that allows the jobName parameter to be passed to GRAM (so the job name actually shows up in the queue). this will be my first commit to the cog/swift source so i want to know where the best place is to do this? i'd like it to be in the stable release so that it's there when we update. my user group (HNL) has been using the stable branch with this fix added for several months so it's pretty well-tested. i guess the question is would it be best to commit this to the development code (which as i understand is cogkit/trunk/current) to be migrated to the stable code (cogkit/branches/4.1.7) in the near future when we decide to update the stable branch? ----- Original Message ----- > I'll take a look at that method then. Mike just mentioned that he > thought the ability to name your coaster workers was already > implemented so I emailed the list. > > On Jun 29, 2011, at 1:56 PM, Justin M Wozniak wrote: > > > > > I looked at this a little bit. One thing is that Coasters generates > > its own block names when it decides to get an allocation, and these > > names correspond to useful Coasters identifiers. Another is that > > there is a 15-character limit. However, it still could be done by > > modifying PBSExecutor.makeName(). 
> > Justin > > > > On Wed, 29 Jun 2011, Jonathan Monette wrote: > > > >> Hello, > > > >> I think I mentioned this before but is there a way to assign a > >> name to a PBS job in Swift? Something like: > >> > >> some.name > >> > >> and then some.name would show up on qstat? This would be useful to > >> see which workers are from what Swift run when checking qstat. > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >> > > > > -- > > Justin M Wozniak > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From skenny at uchicago.edu Wed Jun 29 14:30:42 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Wed, 29 Jun 2011 12:30:42 -0700 Subject: [Swift-devel] give name to PBS job In-Reply-To: <719312289.54276.1309374849685.JavaMail.root@zimbra.anl.gov> References: <719312289.54276.1309374849685.JavaMail.root@zimbra.anl.gov> Message-ID: apparently this didn't get implemented because i *think* what was happening is that when mihael added in my fix it then caused problems in gram2 if you tried to submit something and failed/forgot to set that parameter: On Tue, 2010-10-05 at 14:31 -0500, Sarah Kenny wrote: > On Tue, Oct 5, 2010 at 12:58 PM, Mihael Hategan > wrote: > I think I committed some changes based on your patch. One > problem was > that gram2 doesn't seem to support a job name. Did you find a > way around > that? > > hmm, it is working on abe (gram2) and queenbee (gram5)...were you > getting an error running it elsewhere or did it just not pick up the > name? 
i'm thinking if it works for some gram versions and not others > it might be ok to commit unless it actually causes an error in > versions where the name is not passed (?) That was the problem. It caused an error. On Wed, Jun 29, 2011 at 12:14 PM, Michael Wilde wrote: > I was referring to this enhancement from Sarah: > --- > From: "Sarah Kenny" > To: "Swift Devel" > Sent: Tuesday, October 5, 2010 12:26:18 PM > Subject: [Swift-devel] new commits > > hi all, i have a fix to swift that allows the jobName parameter to be > passed to GRAM (so the job name actually shows up in the queue). this will > be my first commit to the cog/swift source so i want to know where the best > place is to do this? i'd like it to be in the stable release so that it's > there when we update. my user group (HNL) has been using the stable branch > with this fix added for several months so it's pretty well-tested. > > i guess the question is would it be best to commit this to the development > code (which as i understand is cogkit/trunk/current) to be migrated to the > stable code (cogkit/branches/4.1.7) in the near future when we decide to > update the stable branch? > > > ----- Original Message ----- > > I'll take a look at that method then. Mike just mentioned that he > > thought the ability to name your coaster workers was already > > implemented so I emailed the list. > > > > On Jun 29, 2011, at 1:56 PM, Justin M Wozniak wrote: > > > > > > > > I looked at this a little bit. One thing is that Coasters generates > > > its own block names when it decides to get an allocation, and these > > > names correspond to useful Coasters identifiers. Another is that > > > there is a 15-character limit. However, it still could be done by > > > modifying PBSExecutor.makeName(). > > > Justin > > > > > > On Wed, 29 Jun 2011, Jonathan Monette wrote: > > > > > >> Hello, > > > > > >> I think I mentioned this before but is there a way to assign a > > >> name to a PBS job in Swift? 
Something like: > > >> > > >> some.name > > >> > > >> and then some.name would show up on qstat? This would be useful to > > >> see which workers are from what Swift run when checking qstat. > > >> _______________________________________________ > > >> Swift-devel mailing list > > >> Swift-devel at ci.uchicago.edu > > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > >> > > > > > > -- > > > Justin M Wozniak > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Wed Jun 29 15:18:29 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 29 Jun 2011 15:18:29 -0500 Subject: [Swift-devel] New syntax for appending to arrays In-Reply-To: References: <507899029.48454.1309270581804.JavaMail.root@zimbra.anl.gov> <4F4372A1-8573-4408-B413-831B1CEFF321@hawaga.org.uk> <1309281681.21830.6.camel@blabla> <44A9D76A-339D-4A4F-8D32-42F2B79EE1B5@hawaga.org.uk> <1309287415.22843.8.camel@blabla> <1309288354.23076.2.camel@blabla> Message-ID: <1309378709.31845.2.camel@blabla> On Thu, 2011-06-30 at 00:19 +0530, Yadu Nand wrote: > I am having some issue with array appends in nested loops. I'm pasting > the code here, I have attached the output, logs and kml generated. > Any help will be appreciated. 
> > int array[ ][ ]; > foreach dummy in [1:2] { > array["key"] += 1; > array["key"] += 2; > foreach v in array["key"]{ > trace (v); > } > } > So you are appending 1 and 2 twice, which should give you [1, 2, 1, 2]. And then you are printing that array twice, so you should have 8 lines of output from trace. Which checks out. And then the consumer inside the producer loop, I don't think that the compiler flow detector can deal with that, but that's another issue. From skenny at uchicago.edu Wed Jun 29 15:50:35 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Wed, 29 Jun 2011 13:50:35 -0700 Subject: [Swift-devel] Fwd: [Bug 453] New: Swift accepts bad sites parameters without warning to user In-Reply-To: References: Message-ID: hey all, i've added a check for some of the incorrect parameters for local execution provider but need to add for the other providers....this is perhaps a naively optimistic question :) but is there a comprehensive list of which parameters are accepted by which provider? just thought i'd check before continuing to dig around for them. thanks ~sk ---------- Forwarded message ---------- From: Date: Fri, Jun 24, 2011 at 10:29 AM Subject: [Bug 453] New: Swift accepts bad sites parameters without warning to user To: skenny at uchicago.edu https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=453 Summary: Swift accepts bad sites parameters without warning to user Product: Swift Version: 0.92 Platform: All OS/Version: All Status: ASSIGNED Severity: major Priority: P1 Component: SwiftScript language AssignedTo: skenny at uchicago.edu ReportedBy: wilde at mcs.anl.gov In the newuser-guide pads-quickstart, we tell the user to use the following sites file: 0 _QUEUE_ _WORK_ which is included from the test/ directory. This file is half coasters, half plain pbs, and as a result happily runs the users jobs on localhost - without any indication of problems. 
Swift should warn the user (or give a fatal error) if parameters specified in the sites file dont apply to the selected provider(s). -- Configure bugmail: https://bugzilla.mcs.anl.gov/swift/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Wed Jun 29 15:57:06 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 29 Jun 2011 15:57:06 -0500 (CDT) Subject: [Swift-devel] Fwd: [Bug 453] New: Swift accepts bad sites parameters without warning to user In-Reply-To: Message-ID: <1501953336.54980.1309381026366.JavaMail.root@zimbra.anl.gov> Sarah, The only detailed list I know of is for the coaster provider, in the User Guide. And even that is likely not 100% in sync with the code. Anything you can do to create such a list as part of this bug fix would be great. - Mike ----- Original Message ----- hey all, i've added a check for some of the incorrect parameters for local execution provider but need to add for the other providers....this is perhaps a naively optimistic question :) but is there a comprehensive list of which parameters are accepted by which provider? just thought i'd check before continuing to dig around for them. 
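The check that Bug 453 asks for could look roughly like the sketch below. It is only an illustration: as noted above, no complete per-provider list exists yet, so the `ALLOWED` table and every parameter name in it are placeholders rather than Swift's real settings, and `unknown_params` is a hypothetical helper.

```python
# Placeholder table: provider name -> parameter names it understands.
# These names are invented for the example, not taken from Swift.
ALLOWED = {
    "local": {"workdir"},
    "pbs": {"workdir", "queue"},
    "coaster": {"workdir", "queue", "slots", "maxtime"},
}


def unknown_params(provider, params):
    """Return the sorted parameter names that `provider` does not accept."""
    if provider not in ALLOWED:
        raise ValueError("unknown provider: " + provider)
    return sorted(set(params) - ALLOWED[provider])


# A site entry that mixes coaster-style settings into a plain local provider.
site = {
    "provider": "local",
    "params": {"workdir": "/tmp/w", "queue": "fast", "slots": "4"},
}

for name in unknown_params(site["provider"], site["params"]):
    print("warning: parameter '%s' is not used by provider '%s'"
          % (name, site["provider"]))
```

Whether an unknown parameter should produce a warning or a fatal error is the open design choice raised in the bug report; the helper only reports the names, so the caller can do either.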
thanks ~sk ---------- Forwarded message ---------- From: < bugzilla-daemon at mcs.anl.gov > Date: Fri, Jun 24, 2011 at 10:29 AM Subject: [Bug 453] New: Swift accepts bad sites parameters without warning to user To: skenny at uchicago.edu https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=453 Summary: Swift accepts bad sites parameters without warning to user Product: Swift Version: 0.92 Platform: All OS/Version: All Status: ASSIGNED Severity: major Priority: P1 Component: SwiftScript language AssignedTo: skenny at uchicago.edu ReportedBy: wilde at mcs.anl.gov In the newuser-guide pads-quickstart, we tell the user to use the following sites file: 0 _QUEUE_ _WORK_ which is included from the test/ directory. This file is half coasters, half plain pbs, and as a result happily runs the users jobs on localhost - without any indication of problems. Swift should warn the user (or give a fatal error) if parameters specified in the sites file dont apply to the selected provider(s). -- Configure bugmail: https://bugzilla.mcs.anl.gov/swift/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are the assignee for the bug. _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From yadudoc1729 at gmail.com Wed Jun 29 16:43:48 2011 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Thu, 30 Jun 2011 03:13:48 +0530 Subject: [Swift-devel] Explicit subscript types at declaration Message-ID: Hi, In an earlier mail regarding issues in foreach where a type mismatch between the default int type for indices and array keys, Mihael suggested using subscript types at array declaration. 
This involves saying , int assoc_array[string][int][ ] (where empty brackets implicity mean int ) I've edited the parser to accept definitions of type [ type? ] instead of the current [ ]. int array[ ][ int ] [ string ] is converted to => Perhaps I should stick to empty brackets when it isn't defined to avoid breaking existing code. I'm stuck at translating to kml. How is this supposed to look in kml ? Please let me know if I need to make any changes or if you have any suggestions/advice on this. It looks like, a lot of changes will be need to be made (almost everywhere since the type itself now different). -- Thanks and Regards, Yadu Nand B From skenny at uchicago.edu Wed Jun 29 16:59:55 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Wed, 29 Jun 2011 14:59:55 -0700 Subject: [Swift-devel] Fwd: [Bug 453] New: Swift accepts bad sites parameters without warning to user In-Reply-To: <1501953336.54980.1309381026366.JavaMail.root@zimbra.anl.gov> References: <1501953336.54980.1309381026366.JavaMail.root@zimbra.anl.gov> Message-ID: ok, that's what i figured, but just wanted to make sure i wasn't missing anything. i'll work on generating the list. ~sk On Wed, Jun 29, 2011 at 1:57 PM, Michael Wilde wrote: > Sarah, > > The only detailed list I know of is for the coaster provider, in the User > Guide. And even that is likely not 100% in sync with the code. > > Anything you can do to create such a list as part of this bug fix would be > great. > > - Mike > > > > ------------------------------ > > hey all, i've added a check for some of the incorrect parameters for local > execution provider but need to add for the other providers....this is > perhaps a naively optimistic question :) but is there a comprehensive list > of which parameters are accepted by which provider? just thought i'd check > before continuing to dig around for them. 
> > thanks > ~sk > > ---------- Forwarded message ---------- > From: > Date: Fri, Jun 24, 2011 at 10:29 AM > Subject: [Bug 453] New: Swift accepts bad sites parameters without warning > to user > To: skenny at uchicago.edu > > > https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=453 > > Summary: Swift accepts bad sites parameters without warning to > user > Product: Swift > Version: 0.92 > Platform: All > OS/Version: All > Status: ASSIGNED > Severity: major > Priority: P1 > Component: SwiftScript language > AssignedTo: skenny at uchicago.edu > ReportedBy: wilde at mcs.anl.gov > > > In the newuser-guide pads-quickstart, we tell the user to use the following > sites file: > > > > > > > 0 > _QUEUE_ > _WORK_ > > > > which is included from the test/ directory. > > This file is half coasters, half plain pbs, and as a result happily runs > the > users jobs on localhost - without any indication of problems. > > Swift should warn the user (or give a fatal error) if parameters specified > in > the sites file dont apply to the selected provider(s). > > -- > Configure bugmail: > https://bugzilla.mcs.anl.gov/swift/userprefs.cgi?tab=email > ------- You are receiving this mail because: ------- > You are the assignee for the bug. > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wozniak at mcs.anl.gov Wed Jun 29 17:04:25 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Wed, 29 Jun 2011 17:04:25 -0500 (CDT) Subject: [Swift-devel] Fwd: [Bug 453] New: Swift accepts bad sites parameters without warning to user In-Reply-To: References: <1501953336.54980.1309381026366.JavaMail.root@zimbra.anl.gov> Message-ID: I started on this a while ago but did not get very far: https://sites.google.com/site/swiftdevel/internals/job-attributes On Wed, 29 Jun 2011, Sarah Kenny wrote: > ok, that's what i figured, but just wanted to make sure i wasn't missing > anything. i'll work on generating the list. > > ~sk > > On Wed, Jun 29, 2011 at 1:57 PM, Michael Wilde wrote: > >> Sarah, >> >> The only detailed list I know of is for the coaster provider, in the User >> Guide. And even that is likely not 100% in sync with the code. >> >> Anything you can do to create such a list as part of this bug fix would be >> great. >> >> - Mike >> >> >> >> ------------------------------ >> >> hey all, i've added a check for some of the incorrect parameters for local >> execution provider but need to add for the other providers....this is >> perhaps a naively optimistic question :) but is there a comprehensive list >> of which parameters are accepted by which provider? just thought i'd check >> before continuing to dig around for them. 
-- Justin M Wozniak

From ketancmaheshwari at gmail.com Thu Jun 30 10:04:45 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 30 Jun 2011 10:04:45 -0500
Subject: [Swift-devel] finding the execution sites from swift logs
Message-ID:

Hello,

Does anyone know how to find, from the swift log, how many jobs executed on a given site when there is a mix of localhost and osg sites?

I have logs of many runs, each with about 3400 app tasks run on localhost + osg sites.

Looking into the log, I find that there are multiple messages corresponding to staging in-out, run and other events (change in score, thread associations, etc.).

What should I be looking for to identify each job uniquely?

Thanks for any help on this.

Regards,
--
Ketan

From wilde at mcs.anl.gov Thu Jun 30 10:24:03 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 30 Jun 2011 10:24:03 -0500 (CDT)
Subject: [Swift-devel] finding the execution sites from swift logs
In-Reply-To:
Message-ID: <1340435119.56959.1309447443437.JavaMail.root@zimbra.anl.gov>

Ben started a writeup on the plot logs back in 2008:

http://lists.ci.uchicago.edu/pipermail/swift-devel/2008-October/003950.html

The files he referenced are now in:

/gpfs/pads/projects/swift/benc-home/benc/public_html/tmp/plot-tour

Can you post that somewhere (adjusting links as needed)? I suspect this will be useful to all working on or using the plot tools or new log tools, as well as the -tui mechanism.

I think that you can also glean a lot of info from Ben's many postings to swift-devel on log analysis topics.
I found this by searching my mail archives for "benc log execute2". It would be useful to test and post some google searches that neatly search the swift-devel and -user archives.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From aespinosa at cs.uchicago.edu Thu Jun 30 11:45:20 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Thu, 30 Jun 2011 10:45:20 -0600
Subject: [Swift-devel] finding the execution sites from swift logs
In-Reply-To:
References:
Message-ID:

Hi Ketan,

What I do is match the job names with the sites using the execute2 log information. I have a set of R and ruby scripts in ~aespinosa/Documents/swift that analyze the number of transfers per site. You can modify them to look for job execution.

Among the makefile targets in libexec/log-processing, you can find files with 'color' in the name. Passing these targets to swift-plot-log colors the plot per site.

To just explore the general statistics, I obtain the execute2.event file from the log using

$ swift-plot-log logfile execute2.event

And then just use R to analyze the *.event file.
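For a quick count without the plot tools, the per-site matching described above can be roughed out in a few lines of Python. This is only a sketch under assumptions: it relies on the JOB_START lines ending in host=<site> that come up later in this thread, and the jobid= field used to avoid counting a job twice is a guess at the log layout, not something confirmed here. The name jobs_per_site is made up for illustration.

```python
import re
from collections import Counter

# Assumed line shape (not confirmed by the thread):
#   ... JOB_START jobid=<id> ... host=<site>
HOST_RE = re.compile(r"JOB_START.*\bhost=(\S+)")
JOBID_RE = re.compile(r"\bjobid=(\S+)")


def jobs_per_site(lines):
    """Count JOB_START events per host, counting each job id only once."""
    seen = set()
    counts = Counter()
    for line in lines:
        host = HOST_RE.search(line)
        if host is None:
            continue  # not a JOB_START line, or no host= field on it
        jobid = JOBID_RE.search(line)
        # Fall back to the whole line as the key if no jobid= is present.
        key = jobid.group(1) if jobid else line
        if key in seen:
            continue
        seen.add(key)
        counts[host.group(1)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python jobs_per_site.py run.log
    with open(sys.argv[1]) as log:
        for site, n in jobs_per_site(log).most_common():
            print(site, n)
```

If a log has no JOB_START lines at all, the same loop can be pointed at whatever marker the execute2 entries carry by changing the first pattern.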
--
Allan M. Espinosa
PhD student, Computer Science
University of Chicago

From jonmon at utexas.edu Thu Jun 30 13:40:20 2011
From: jonmon at utexas.edu (Jonathan Monette)
Date: Thu, 30 Jun 2011 13:40:20 -0500
Subject: [Swift-devel] readData2
Message-ID: <6BE5DDCC-CD3D-482E-B258-01CCC0E0E64E@utexas.edu>

Hello,

I am going to rename readData2 to something more meaningful. I am thinking maybe readStructured? I will keep readData2 as an alias to the renamed function so as not to break anyone's current code, but the user guide will reflect the renamed function. Any suggestions on the name, or is readStructured good?

From yadudoc1729 at gmail.com Thu Jun 30 13:44:14 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Fri, 1 Jul 2011 00:14:14 +0530
Subject: [Swift-devel] Explicit subscript types at declaration
In-Reply-To:
References:
Message-ID:

Hi,

> This involves saying,
> int assoc_array[string][int][ ]
> (where empty brackets implicitly mean int)

This is hard. It would mean a lot of effort in terms of code, from what I understand. Furthermore, it also means less flexibility for the user in terms of the type of subscript.
Is it wrong for the user to say,

array[5] = 10;
array["key"] = 25;

If this is dealt with at the foreach level, by doing type inference of the index, I think this would be easier. This would mean we would need to change the type of the index in the template for foreach to say, _UNDEF_

--
Thanks and Regards,
Yadu Nand B

From hategan at mcs.anl.gov Thu Jun 30 13:57:02 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jun 2011 13:57:02 -0500
Subject: [Swift-devel] Explicit subscript types at declaration
In-Reply-To:
References:
Message-ID: <1309460222.3783.4.camel@blabla>

On Fri, 2011-07-01 at 00:14 +0530, Yadu Nand wrote:
> Hi,
>
> > This involves saying,
> > int assoc_array[string][int][ ]
> > (where empty brackets implicitly mean int)
>
> This is hard. It would mean a lot of effort in terms of code from what
> I understand.

Yes, it's not trivial. But what fun would it be if it was?

> Furthermore, it also means less flexibility for the user
> in terms of the type of subscript.

Maybe.

> Is it wrong for the user to say,
> array[5] = 10;
> array["key"] = 25;
>
> If this is dealt with at the foreach level, by doing type inference of the
> index, I think this would be easier. This would mean we would need
> to change the type of the index in the template for foreach to say,
> _UNDEF_

Well, swift doesn't have a universal type (i.e. Object, (void *)), so a mix between int and string doesn't really have a proper type. It follows that one should either have array[5] or array["key"] but not both for the same array.

From ketancmaheshwari at gmail.com Thu Jun 30 14:04:27 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 30 Jun 2011 14:04:27 -0500
Subject: [Swift-devel] swift on ranger
Message-ID:

Hi,

I am trying to run a simple catsn swift workflow on ranger/teragrid. However, it seems the SGE job does not get created on the ranger host.

The staging-in of input files does happen from the communicado host to ranger.
The stdout on Swift shows status as job submitted, however qsub on ranger does not show any jobs being submitted.

I am using the GRAM coaster provider: gt2:gt2:SGE

The logs and configuration files (cf, tc, sites.xml) for this run can be found on the CI network here:

/home/ketan/osg-tg-effort/ranger-catsn

Any help or tips to debug this further would be very useful.

--
Ketan

From yadudoc1729 at gmail.com Thu Jun 30 14:04:57 2011
From: yadudoc1729 at gmail.com (Yadu Nand)
Date: Fri, 1 Jul 2011 00:34:57 +0530
Subject: [Swift-devel] Explicit subscript types at declaration
In-Reply-To: <1309460222.3783.4.camel@blabla>
References: <1309460222.3783.4.camel@blabla>
Message-ID:

>> > This involves saying,
>> > int assoc_array[string][int][ ]
>> > (where empty brackets implicitly mean int)
>>
>> This is hard. It would mean a lot of effort in terms of code from what
>> I understand.
>
> Yes, it's not trivial. But what fun would it be if it was?
>
>> Furthermore, it also means less flexibility for the user
>> in terms of the type of subscript.
>
> Maybe.
>
>> Is it wrong for the user to say,
>> array[5] = 10;
>> array["key"] = 25;
>>
>> If this is dealt with at the foreach level, by doing type inference of the
>> index, I think this would be easier. This would mean we would need
>> to change the type of the index in the template for foreach to say,
>> _UNDEF_
>
> Well, swift doesn't have a universal type (i.e. Object, (void *)), so a
> mix between int and string doesn't really have a proper type. It follows
> that one should either have array[5] or array["key"] but not both for
> the same array.

Okay. Then I'll have it done so that empty brackets are still implicitly int.
-- Thanks and Regards, Yadu Nand B From skenny at uchicago.edu Thu Jun 30 14:07:44 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Thu, 30 Jun 2011 12:07:44 -0700 Subject: [Swift-devel] swift on ranger In-Reply-To: References: Message-ID: have you looked in the gram log? On Thu, Jun 30, 2011 at 12:04 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Hi, > > I am trying to run a simple catsn swift workflow on ranger/teragrid. > However, seems the SGE job does not get created on ranger host. > > The staging-in of input files do happen from the communicado host to > ranger. The stdout on Swift shows status as job submitted, however qsub on > ranger does not show any jobs being submitted. > > I am using the GRAM coaster provider: gt2:gt2:SGE > > The logs and configuration files (cf, tc, sites.xml) for this run can be > found on CI network here: > > /home/ketan/osg-tg-effort/ranger-catsn > > > Any help or tips to debug this further would be very useful. > > -- > Ketan > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Thu Jun 30 14:08:09 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 30 Jun 2011 14:08:09 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: References: Message-ID: <1309460889.3915.1.camel@blabla> You may want to start by looking at the gram job manager logs which are in ~/ on ranger and should be named in an obvious manner. You may also want to try do some sanity checks by submitting a plain gt2:sge job to ranger. On Thu, 2011-06-30 at 14:04 -0500, Ketan Maheshwari wrote: > Hi, > > I am trying to run a simple catsn swift workflow on ranger/teragrid. > However, seems the SGE job does not get created on ranger host. 
> > The staging-in of input files do happen from the communicado host to > ranger. The stdout on Swift shows status as job submitted, however > qsub on ranger does not show any jobs being submitted. > > I am using the GRAM coaster provider: gt2:gt2:SGE > > The logs and configuration files (cf, tc, sites.xml) for this run can > be found on CI network here: > > /home/ketan/osg-tg-effort/ranger-catsn > > > Any help or tips to debug this further would be very useful. > > -- > Ketan > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From ketancmaheshwari at gmail.com Thu Jun 30 15:08:40 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 15:08:40 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: <1309460889.3915.1.camel@blabla> References: <1309460889.3915.1.camel@blabla> Message-ID: On Thu, Jun 30, 2011 at 2:08 PM, Mihael Hategan wrote: > You may want to start by looking at the gram job manager logs which are > in ~/ on ranger and should be named in an obvious manner. > > I tried to look into these logs. They do not show any error except these lines: /teragrid/gram5-5.0.2-r2/tmp/gram_job_state/job.gridftp1.ranger.tacc.utexas.edu.16145744013955882 016.1760 6392074284836744 > > msg="State file not owned by me" status=-121 errno=0 > > reason="Success" Which, according to documentation of Gram are not really errors. You may also want to try do some sanity checks by submitting a plain > gt2:sge job to ranger. 
> Tried doing this from ranger login nodes but getting this: login3$ sh run.sh Swift svn swift-r4371 cog-r3096 RunID: 20110630-1507-3kzljlsf Progress: Progress: Stage in:1 Progress: Submitting:1 Progress: Submitted:1 Could not submit job Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not submit job Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Could not start coaster service Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Task ended before registration was received. STDOUT: Failed to download cog-abstraction-common-2.4.jar: java.net.ConnectException: Connection refused Failed to transfer wrapper log from catsn-20110630-1507-3kzljlsf/info/m on RANGER > On Thu, 2011-06-30 at 14:04 -0500, Ketan Maheshwari wrote: > > Hi, > > > > I am trying to run a simple catsn swift workflow on ranger/teragrid. > > However, seems the SGE job does not get created on ranger host. > > > > The staging-in of input files do happen from the communicado host to > > ranger. The stdout on Swift shows status as job submitted, however > > qsub on ranger does not show any jobs being submitted. > > > > I am using the GRAM coaster provider: gt2:gt2:SGE > > > > The logs and configuration files (cf, tc, sites.xml) for this run can > > be found on CI network here: > > > > /home/ketan/osg-tg-effort/ranger-catsn > > > > > > Any help or tips to debug this further would be very useful. > > > > -- > > Ketan > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ketancmaheshwari at gmail.com Thu Jun 30 16:18:02 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 16:18:02 -0500 Subject: [Swift-devel] finding the execution sites from swift logs In-Reply-To: References: Message-ID: Thanks Allan. In some log files, I do not see any execute2 lines. However, I do see JOB_START lines ending with host=. I confirmed that the logs indeed belong to successful complete runs. While in other logs corresponding to other successful runs, I do see the execute2 lines. All the runs were carried out using the same version of Swift. Am I missing something here? Ketan On Thu, Jun 30, 2011 at 11:45 AM, Allan Espinosa wrote: > Hi Ketan, > > What I do is match the jobnames with the sites by matching execute2 > log information. I have a set of R and ruby scripts in > ~aespinosa/Documents/swift that analyzes the number of transfers per > site. You can modify them to look for job execution. > > If you look at one of the makefile targets in libexec/log-processing, > you can find files with the name 'color'. these targets to swift plot > logs colors the plot per site. > > To just explore the general statistics, I obtain the execute2.event > file from the log using > > $swift-plot-log logfile execute2.event > > And then just use R to analyze the *.event file. > > 2011/6/30 Ketan Maheshwari : > > Hello, > > > > Does anyone knows from swift log, how to find how many jobs executed on a > > given site when there is a mix of localhost and osg sites? > > > > I have logs of many runs each with about 3400 app tasks ran on localhost > + > > osg sites. > > > > Trying to look into log and find that there are multiple messages > > corresponding to staging in-out, run and other events (change in score, > > thread associations, etc.). > > > > What should I be looking to identify each job uniquely? > > > > Thanks for any help on this. 
> > > > Regards, > > -- > > Ketan > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > -- > Allan M. Espinosa > PhD student, Computer Science > University of Chicago > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Thu Jun 30 16:21:35 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 30 Jun 2011 16:21:35 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: References: <1309460889.3915.1.camel@blabla> Message-ID: <1309468895.4978.0.camel@blabla> On Thu, 2011-06-30 at 15:08 -0500, Ketan Maheshwari wrote: > > > On Thu, Jun 30, 2011 at 2:08 PM, Mihael Hategan > wrote: > You may want to start by looking at the gram job manager logs > which are > in ~/ on ranger and should be named in an obvious manner. > > I tried to look into these logs. They do not show any error except > these lines: > /teragrid/gram5-5.0.2-r2/tmp/gram_job_state/job.gridftp1.ranger.tacc.utexas.edu.16145744013955882016.17606392074284836744 > > > msg="State file not owned by me" status=-121 errno=0 > > > reason="Success" > > > Which, according to documentation of Gram are not really errors. > > > > You may also want to try do some sanity checks by submitting a > plain > gt2:sge job to ranger. > > Tried doing this from ranger login nodes but getting this: I mean plain globus. Like globusrun. Or swift without coasters. From ketancmaheshwari at gmail.com Thu Jun 30 16:22:34 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 16:22:34 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: <1309468895.4978.0.camel@blabla> References: <1309460889.3915.1.camel@blabla> <1309468895.4978.0.camel@blabla> Message-ID: Yes, I can confirm that plain globus-job-run works on ranger. 
On Thu, Jun 30, 2011 at 4:21 PM, Mihael Hategan wrote:

> On Thu, 2011-06-30 at 15:08 -0500, Ketan Maheshwari wrote:
> >
> > On Thu, Jun 30, 2011 at 2:08 PM, Mihael Hategan wrote:
> >         You may want to start by looking at the gram job manager logs
> >         which are in ~/ on ranger and should be named in an obvious manner.
> >
> > I tried to look into these logs. They do not show any error except
> > these lines:
> >
> > /teragrid/gram5-5.0.2-r2/tmp/gram_job_state/job.gridftp1.ranger.tacc.utexas.edu.16145744013955882016.17606392074284836744
> > > > msg="State file not owned by me" status=-121 errno=0
> > > > reason="Success"
> >
> > Which, according to documentation of Gram are not really errors.
> >
> >         You may also want to try do some sanity checks by submitting a
> >         plain gt2:sge job to ranger.
> >
> > Tried doing this from ranger login nodes but getting this:
>
> I mean plain globus. Like globusrun. Or swift without coasters.

--
Ketan

From aespinosa at cs.uchicago.edu Thu Jun 30 16:31:02 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Thu, 30 Jun 2011 15:31:02 -0600
Subject: [Swift-devel] finding the execution sites from swift logs
In-Reply-To:
References:
Message-ID:

Make sure you have log4j.logger.swift=DEBUG when you run jobs.

-Allan

2011/6/30 Ketan Maheshwari :
> Thanks Allan.
>
> In some log files, I do not see any execute2 lines. However, I do see
> JOB_START lines ending with host=. I confirmed that the logs
> indeed belong to successful complete runs. While in other logs corresponding
> to other successful runs, I do see the execute2 lines.
>
> All the runs were carried out using the same version of Swift. Am I missing
> something here?
>
> Ketan
>
> On Thu, Jun 30, 2011 at 11:45 AM, Allan Espinosa wrote:
>>
>> Hi Ketan,
>>
>> What I do is match the jobnames with the sites by matching execute2
>> log information.
?I have a set of R and ruby scripts in >> ~aespinosa/Documents/swift that analyzes the number of transfers per >> site. ?You can modify them to look for job execution. >> >> If you look at one of the makefile targets in libexec/log-processing, >> you can find files with the name 'color'. ?these targets to swift plot >> logs colors the plot per site. >> >> To just explore the general statistics, I obtain the execute2.event >> file from the log using >> >> $swift-plot-log logfile execute2.event >> >> And then just use R to analyze the *.event file. >> >> 2011/6/30 Ketan Maheshwari : >> > Hello, >> > >> > Does anyone knows from swift log, how to find how many jobs executed on >> > a >> > given site when there is a mix of localhost and osg sites? >> > >> > I have logs of many runs each with about 3400 app tasks ran on localhost >> > + >> > osg sites. >> > >> > Trying to look into log and find that there are multiple messages >> > corresponding to staging in-out, run and other events (change in score, >> > thread associations, etc.). >> > >> > What should I be looking to identify each job uniquely? >> > >> > Thanks for any help on this. >> > >> > Regards, >> > -- >> > Ketan >> > >> > >> > >> > _______________________________________________ >> > Swift-devel mailing list >> > Swift-devel at ci.uchicago.edu >> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > >> > >> >> >> >> -- >> Allan M. Espinosa >> PhD student, Computer Science >> University of Chicago > > > > -- > Ketan > > > -- Allan M. Espinosa PhD student, Computer Science University of Chicago From ketancmaheshwari at gmail.com Thu Jun 30 16:33:15 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 16:33:15 -0500 Subject: [Swift-devel] finding the execution sites from swift logs In-Reply-To: References: Message-ID: Yes, that is enabled. On Thu, Jun 30, 2011 at 4:31 PM, Allan Espinosa wrote: > Make sure you have log4j.logger.swift=DEBUG when you run jobs. 
> >> > > >> > Regards, > >> > -- > >> > Ketan > >> > > >> > > >> > > >> > _______________________________________________ > >> > Swift-devel mailing list > >> > Swift-devel at ci.uchicago.edu > >> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >> > > >> > > >> > >> > >> > >> -- > >> Allan M. Espinosa > >> PhD student, Computer Science > >> University of Chicago > > > > > > > > -- > > Ketan > > > > > > > > > > -- > Allan M. Espinosa > PhD student, Computer Science > University of Chicago > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Thu Jun 30 16:48:02 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 30 Jun 2011 16:48:02 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: References: Message-ID: <1309470482.6216.0.camel@blabla> You should probably enable debugging for the sge provider. The package is: org.globus.cog.abstraction.impl.scheduler Then repost logs from runs with that enabled. Might show relevant stuff. On Thu, 2011-06-30 at 14:04 -0500, Ketan Maheshwari wrote: > Hi, > > I am trying to run a simple catsn swift workflow on ranger/teragrid. > However, seems the SGE job does not get created on ranger host. > > The staging-in of input files do happen from the communicado host to > ranger. The stdout on Swift shows status as job submitted, however > qsub on ranger does not show any jobs being submitted. > > I am using the GRAM coaster provider: gt2:gt2:SGE > > The logs and configuration files (cf, tc, sites.xml) for this run can > be found on CI network here: > > /home/ketan/osg-tg-effort/ranger-catsn > > > Any help or tips to debug this further would be very useful. 
> > -- > Ketan > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From ketancmaheshwari at gmail.com Thu Jun 30 16:52:39 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 16:52:39 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: <1309470482.6216.0.camel@blabla> References: <1309470482.6216.0.camel@blabla> Message-ID: On Thu, Jun 30, 2011 at 4:48 PM, Mihael Hategan wrote: > You should probably enable debugging for the sge provider. The package > is: > > org.globus.cog.abstraction.impl.scheduler > I do not know how to do this. Should I put a line in log4j.properties: org.globus.cog.abstraction.impl.scheduler=DEBUG or make changes in the code? > > Then repost logs from runs with that enabled. Might show relevant stuff. > > On Thu, 2011-06-30 at 14:04 -0500, Ketan Maheshwari wrote: > > Hi, > > > > I am trying to run a simple catsn swift workflow on ranger/teragrid. > > However, seems the SGE job does not get created on ranger host. > > > > The staging-in of input files do happen from the communicado host to > > ranger. The stdout on Swift shows status as job submitted, however > > qsub on ranger does not show any jobs being submitted. > > > > I am using the GRAM coaster provider: gt2:gt2:SGE > > > > The logs and configuration files (cf, tc, sites.xml) for this run can > > be found on CI network here: > > > > /home/ketan/osg-tg-effort/ranger-catsn > > > > > > Any help or tips to debug this further would be very useful. > > > > -- > > Ketan > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
From hategan at mcs.anl.gov Thu Jun 30 17:01:55 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jun 2011 17:01:55 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To:
References: <1309470482.6216.0.camel@blabla>
Message-ID: <1309471315.8163.1.camel@blabla>

On Thu, 2011-06-30 at 16:52 -0500, Ketan Maheshwari wrote:
> On Thu, Jun 30, 2011 at 4:48 PM, Mihael Hategan wrote:
>         You should probably enable debugging for the sge provider. The
>         package is:
>
>         org.globus.cog.abstraction.impl.scheduler
>
> I do not know how to do this. Should I put a line in log4j.properties:
>
> org.globus.cog.abstraction.impl.scheduler=DEBUG

Something like that, except all other lines start with log4j.logger and
that seems to work, so you should have:

log4j.logger.org.globus.cog.abstraction.impl.scheduler=DEBUG

From ketancmaheshwari at gmail.com Thu Jun 30 17:14:43 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 30 Jun 2011 17:14:43 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To: <1309470482.6216.0.camel@blabla>
References: <1309470482.6216.0.camel@blabla>
Message-ID:

> You should probably enable debugging for the sge provider. The package
> is:
>
> org.globus.cog.abstraction.impl.scheduler
>
> Then repost logs from runs with that enabled. Might show relevant stuff.

/home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1703-0jfq9ix4.log

From hategan at mcs.anl.gov Thu Jun 30 17:19:51 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jun 2011 17:19:51 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To:
References: <1309470482.6216.0.camel@blabla>
Message-ID: <1309472391.8399.0.camel@blabla>

On Thu, 2011-06-30 at 17:14 -0500, Ketan Maheshwari wrote:
>         You should probably enable debugging for the sge provider.
>         The package is:
>
>         org.globus.cog.abstraction.impl.scheduler
>
>         Then repost logs from runs with that enabled. Might show
>         relevant stuff.
>
> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1703-0jfq9ix4.log

That doesn't seem to have worked. You should change the log4j in
swift/dist/swift*/etc.

From ketancmaheshwari at gmail.com Thu Jun 30 17:30:49 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 30 Jun 2011 17:30:49 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To: <1309472391.8399.0.camel@blabla>
References: <1309470482.6216.0.camel@blabla> <1309472391.8399.0.camel@blabla>
Message-ID:

> That doesn't seem to have worked. You should change the log4j in
> swift/dist/swift*/etc.

sorry, now done.

/home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log

--
Ketan

From aespinosa at cs.uchicago.edu Thu Jun 30 17:39:05 2011
From: aespinosa at cs.uchicago.edu (Allan Espinosa)
Date: Thu, 30 Jun 2011 16:39:05 -0600
Subject: [Swift-devel] finding the execution sites from swift logs
In-Reply-To:
References:
Message-ID:

I think JOB_START are the vdl:execute2 ones if i remember correctly.

2011/6/30 Ketan Maheshwari :
> Thanks Allan.
>
> In some log files, I do not see any execute2 lines. However, I do see
> JOB_START lines ending with host=. I confirmed that the logs
> indeed belong to successful complete runs. While in other logs
> corresponding to other successful runs, I do see the execute2 lines.
>
> All the runs were carried out using the same version of Swift. Am I
> missing something here?
>
> Ketan
>
>
> On Thu, Jun 30, 2011 at 11:45 AM, Allan Espinosa
> wrote:
>>
>> Hi Ketan,
>>
>> What I do is match the jobnames with the sites by matching execute2
>> log information. I have a set of R and ruby scripts in
>> ~aespinosa/Documents/swift that analyzes the number of transfers per
>> site.
>> You can modify them to look for job execution.
>>
>> If you look at one of the makefile targets in libexec/log-processing,
>> you can find files with the name 'color'. these targets to swift plot
>> logs colors the plot per site.
>>
>> To just explore the general statistics, I obtain the execute2.event
>> file from the log using
>>
>> $swift-plot-log logfile execute2.event
>>
>> And then just use R to analyze the *.event file.
>>
>> 2011/6/30 Ketan Maheshwari :
>> > Hello,
>> >
>> > Does anyone knows from swift log, how to find how many jobs executed
>> > on a given site when there is a mix of localhost and osg sites?
>> >
>> > I have logs of many runs each with about 3400 app tasks ran on
>> > localhost + osg sites.
>> >
>> > Trying to look into log and find that there are multiple messages
>> > corresponding to staging in-out, run and other events (change in score,
>> > thread associations, etc.).
>> >
>> > What should I be looking to identify each job uniquely?
>> >
>> > Thanks for any help on this.
>> >
>> > Regards,
>> > --
>> > Ketan
>>
>> --
>> Allan M. Espinosa
>> PhD student, Computer Science
>> University of Chicago
>
> --
> Ketan

--
Allan M. Espinosa
PhD student, Computer Science
University of Chicago

From ketancmaheshwari at gmail.com Thu Jun 30 17:43:11 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 30 Jun 2011 17:43:11 -0500
Subject: [Swift-devel] finding the execution sites from swift logs
In-Reply-To:
References:
Message-ID:

In addition to JOB_START, vdl:execute2 is associated with many events:

THREAD_ASSOCIATION
APPLICATION_EXCEPTION
STAGING_OUT
..

Ketan

On Thu, Jun 30, 2011 at 5:39 PM, Allan Espinosa wrote:
> I think JOB_START are the vdl:execute2 ones if i remember correctly.
> > 2011/6/30 Ketan Maheshwari : > > Thanks Allan. > > > > In some log files, I do not see any execute2 lines. However, I do see > > JOB_START lines ending with host=. I confirmed that the logs > > indeed belong to successful complete runs. While in other logs > corresponding > > to other successful runs, I do see the execute2 lines. > > > > All the runs were carried out using the same version of Swift. Am I > missing > > something here? > > > > Ketan > > > > > > On Thu, Jun 30, 2011 at 11:45 AM, Allan Espinosa < > aespinosa at cs.uchicago.edu> > > wrote: > >> > >> Hi Ketan, > >> > >> What I do is match the jobnames with the sites by matching execute2 > >> log information. I have a set of R and ruby scripts in > >> ~aespinosa/Documents/swift that analyzes the number of transfers per > >> site. You can modify them to look for job execution. > >> > >> If you look at one of the makefile targets in libexec/log-processing, > >> you can find files with the name 'color'. these targets to swift plot > >> logs colors the plot per site. > >> > >> To just explore the general statistics, I obtain the execute2.event > >> file from the log using > >> > >> $swift-plot-log logfile execute2.event > >> > >> And then just use R to analyze the *.event file. > >> > >> 2011/6/30 Ketan Maheshwari : > >> > Hello, > >> > > >> > Does anyone knows from swift log, how to find how many jobs executed > on > >> > a > >> > given site when there is a mix of localhost and osg sites? > >> > > >> > I have logs of many runs each with about 3400 app tasks ran on > localhost > >> > + > >> > osg sites. > >> > > >> > Trying to look into log and find that there are multiple messages > >> > corresponding to staging in-out, run and other events (change in > score, > >> > thread associations, etc.). > >> > > >> > What should I be looking to identify each job uniquely? > >> > > >> > Thanks for any help on this. 
> >> > > >> > Regards, > >> > -- > >> > Ketan > >> > > >> > > >> > > >> > _______________________________________________ > >> > Swift-devel mailing list > >> > Swift-devel at ci.uchicago.edu > >> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >> > > >> > > >> > >> > >> > >> -- > >> Allan M. Espinosa > >> PhD student, Computer Science > >> University of Chicago > > > > > > > > -- > > Ketan > > > > > > > > > > -- > Allan M. Espinosa > PhD student, Computer Science > University of Chicago > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Thu Jun 30 17:59:22 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 30 Jun 2011 17:59:22 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: References: <1309470482.6216.0.camel@blabla> <1309472391.8399.0.camel@blabla> Message-ID: <1309474762.8546.0.camel@blabla> Still nothing. Are you running with gt2:gt2:sge or gt2:sge? On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: > That doesn't seem to have worked. You should change the log4j in > swift/dist/swift*/etc. > > > sorry, now done. > > > /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log > > > -- > Ketan > > From wilde at mcs.anl.gov Thu Jun 30 18:35:29 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 30 Jun 2011 18:35:29 -0500 (CDT) Subject: [Swift-devel] swift on ranger In-Reply-To: <1309474762.8546.0.camel@blabla> Message-ID: <1519288819.59157.1309476929219.JavaMail.root@zimbra.anl.gov> If you are running under gt2 anything, I dont think you're going through the SGE provider, are you? So I would not expect logs from it. I think we have a problem here with some GT2 setting in the pool, Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or an attributing that is causing bad RSL? - Mike ----- Original Message ----- > Still nothing. Are you running with gt2:gt2:sge or gt2:sge? 
> > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: > > That doesn't seem to have worked. You should change the log4j in > > swift/dist/swift*/etc. > > > > > > sorry, now done. > > > > > > /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log > > > > > > -- > > Ketan > > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wilde at mcs.anl.gov Thu Jun 30 18:37:57 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 30 Jun 2011 18:37:57 -0500 (CDT) Subject: [Swift-devel] swift on ranger In-Reply-To: <1519288819.59157.1309476929219.JavaMail.root@zimbra.anl.gov> Message-ID: <964949295.59168.1309477077485.JavaMail.root@zimbra.anl.gov> Ketan, Can you try a single catsn job with just the GT2 provider, first to the Ranger fork job manager and then to the SGE jobmanager? - Mike ----- Original Message ----- > If you are running under gt2 anything, I dont think you're going > through the SGE provider, are you? So I would not expect logs from it. > I think we have a problem here with some GT2 setting in the pool, > Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or > an attributing that is causing bad RSL? > > - Mike > > > ----- Original Message ----- > > Still nothing. Are you running with gt2:gt2:sge or gt2:sge? > > > > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: > > > That doesn't seem to have worked. You should change the log4j in > > > swift/dist/swift*/etc. > > > > > > > > > sorry, now done. 
> > > > > > > > > /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log > > > > > > > > > -- > > > Ketan > > > > > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From ketancmaheshwari at gmail.com Thu Jun 30 19:44:25 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 19:44:25 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: <964949295.59168.1309477077485.JavaMail.root@zimbra.anl.gov> References: <1519288819.59157.1309476929219.JavaMail.root@zimbra.anl.gov> <964949295.59168.1309477077485.JavaMail.root@zimbra.anl.gov> Message-ID: Mike, All, gt2:fork works. gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... as with gt2:gt2:SGE On Thu, Jun 30, 2011 at 6:37 PM, Michael Wilde wrote: > Ketan, > > Can you try a single catsn job with just the GT2 provider, first to the > Ranger fork job manager and then to the SGE jobmanager? > > - Mike > > ----- Original Message ----- > > If you are running under gt2 anything, I dont think you're going > > through the SGE provider, are you? So I would not expect logs from it. > > I think we have a problem here with some GT2 setting in the pool, > > Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or > > an attributing that is causing bad RSL? > > > > - Mike > > > > > > ----- Original Message ----- > > > Still nothing. 
Are you running with gt2:gt2:sge or gt2:sge? > > > > > > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: > > > > That doesn't seem to have worked. You should change the log4j in > > > > swift/dist/swift*/etc. > > > > > > > > > > > > sorry, now done. > > > > > > > > > > > > > /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log > > > > > > > > > > > > -- > > > > Ketan > > > > > > > > > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aespinosa at cs.uchicago.edu Thu Jun 30 19:52:22 2011 From: aespinosa at cs.uchicago.edu (Allan Espinosa) Date: Thu, 30 Jun 2011 18:52:22 -0600 Subject: [Swift-devel] finding the execution sites from swift logs In-Reply-To: References: Message-ID: Right. the execute2.event target in swift-plot-log gets the time duration between JOB_START and JOB_END (correct me if i'm wrong Ben). With that you could get how many jobs were sent to each site. 
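
For a quick per-site tally without swift-plot-log, something along these lines might do. It is only a minimal Python sketch under stated assumptions, not part of Swift: it assumes each started job produces one log line that contains the text JOB_START and a host=NAME token (the thread only says such lines end with host=), and the function name count_jobs_per_site is made up for illustration. Retried jobs would presumably be counted once per attempt.

```python
import re
from collections import Counter

# Assumed token shape: "host=" followed by a site name without spaces.
HOST_RE = re.compile(r"\bhost=(\S+)")


def count_jobs_per_site(lines):
    """Count JOB_START log lines per value of the host= token."""
    counts = Counter()
    for line in lines:
        # Skip every other event type (JOB_END, THREAD_ASSOCIATION, ...).
        if "JOB_START" not in line:
            continue
        match = HOST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Usage sketch (the log file name here is a placeholder):
#   with open("run.log") as log:
#       for site, n in count_jobs_per_site(log).most_common():
#           print(site, n)
```

If the real log lines use a different layout, only HOST_RE and the JOB_START check should need adjusting.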
-Allan

2011/6/30 Ketan Maheshwari :
> In addition to JOB_START, vdl:execute2 is associated with many events:
>
> THREAD_ASSOCIATION
> APPLICATION_EXCEPTION
> STAGING_OUT
> ..
>
> Ketan
>
> On Thu, Jun 30, 2011 at 5:39 PM, Allan Espinosa
> wrote:
>>
>> I think JOB_START are the vdl:execute2 ones if i remember correctly.
>>
>> 2011/6/30 Ketan Maheshwari :
>> > Thanks Allan.
>> >
>> > In some log files, I do not see any execute2 lines. However, I do see
>> > JOB_START lines ending with host=. I confirmed that the logs
>> > indeed belong to successful complete runs. While in other logs
>> > corresponding to other successful runs, I do see the execute2 lines.
>> >
>> > All the runs were carried out using the same version of Swift. Am I
>> > missing something here?
>> >
>> > Ketan
>> >
>> >
>> > On Thu, Jun 30, 2011 at 11:45 AM, Allan Espinosa
>> > wrote:
>> >>
>> >> Hi Ketan,
>> >>
>> >> What I do is match the jobnames with the sites by matching execute2
>> >> log information. I have a set of R and ruby scripts in
>> >> ~aespinosa/Documents/swift that analyzes the number of transfers per
>> >> site. You can modify them to look for job execution.
>> >>
>> >> If you look at one of the makefile targets in libexec/log-processing,
>> >> you can find files with the name 'color'. these targets to swift plot
>> >> logs colors the plot per site.
>> >>
>> >> To just explore the general statistics, I obtain the execute2.event
>> >> file from the log using
>> >>
>> >> $swift-plot-log logfile execute2.event
>> >>
>> >> And then just use R to analyze the *.event file.
>> >>
>> >> 2011/6/30 Ketan Maheshwari :
>> >> > Hello,
>> >> >
>> >> > Does anyone knows from swift log, how to find how many jobs executed
>> >> > on a given site when there is a mix of localhost and osg sites?
>> >> >
>> >> > I have logs of many runs each with about 3400 app tasks ran on
>> >> > localhost + osg sites.
>> >> > >> >> > Trying to look into log and find that there are multiple messages >> >> > corresponding to staging in-out, run and other events (change in >> >> > score, >> >> > thread associations, etc.). >> >> > >> >> > What should I be looking to identify each job uniquely? >> >> > >> >> > Thanks for any help on this. >> >> > >> >> > Regards, >> >> > -- >> >> > Ketan >> >> > >> >> > >> >> > >> >> > _______________________________________________ >> >> > Swift-devel mailing list >> >> > Swift-devel at ci.uchicago.edu >> >> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> > >> >> > >> >> >> >> >> >> >> >> -- >> >> Allan M. Espinosa >> >> PhD student, Computer Science >> >> University of Chicago >> > >> > >> > >> > -- >> > Ketan >> > >> > >> > >> >> >> >> -- >> Allan M. Espinosa >> PhD student, Computer Science >> University of Chicago > > > > -- > Ketan > > > -- Allan M. Espinosa PhD student, Computer Science University of Chicago From wilde at mcs.anl.gov Thu Jun 30 20:05:03 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 30 Jun 2011 20:05:03 -0500 (CDT) Subject: [Swift-devel] swift on ranger In-Reply-To: Message-ID: <1180827525.59292.1309482303335.JavaMail.root@zimbra.anl.gov> Can you send the Ranger-side GRAM log from the run, as well as a pointer to the log and site file? I meant, by the way, a non-coaster run with the GT2 provider and jobmanager SGE. A Condor-G test to both the fork and SGE jobmanagers would be helpful too. - Mike ----- Original Message ----- Mike, All, gt2:fork works. gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... as with gt2:gt2:SGE On Thu, Jun 30, 2011 at 6:37 PM, Michael Wilde < wilde at mcs.anl.gov > wrote: Ketan, Can you try a single catsn job with just the GT2 provider, first to the Ranger fork job manager and then to the SGE jobmanager? - Mike ----- Original Message ----- > If you are running under gt2 anything, I dont think you're going > through the SGE provider, are you? 
So I would not expect logs from it. > I think we have a problem here with some GT2 setting in the pool, > Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or > an attributing that is causing bad RSL? > > - Mike > > > ----- Original Message ----- > > Still nothing. Are you running with gt2:gt2:sge or gt2:sge? > > > > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: > > > That doesn't seem to have worked. You should change the log4j in > > > swift/dist/swift*/etc. > > > > > > > > > sorry, now done. > > > > > > > > > /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log > > > > > > > > > -- > > > Ketan > > > > > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory _______________________________________________ Swift-devel mailing list Swift-devel at ci.uchicago.edu https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Ketan -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... 
From ketancmaheshwari at gmail.com Thu Jun 30 20:37:19 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 30 Jun 2011 20:37:19 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To: <1180827525.59292.1309482303335.JavaMail.root@zimbra.anl.gov>
References: <1180827525.59292.1309482303335.JavaMail.root@zimbra.anl.gov>
Message-ID:

I tried provider=gt2 and gt4 but get following:

case: provider=gt2, jobManager="gt2:fork"

RunID: 20110630-2018-sjjtfjs8

Progress:
Progress: Stage in:1
Invalid GSSCredentials
Caused by: org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid GSSCredentials
Caused by: GSSException: Invalid name provided [Caused by: [JGLOBUS-112] Malformed name, "=" missing in "fork"]
Failed to transfer wrapper log from catsn-20110630-2018-sjjtfjs8/info/c on RANGER
Progress: Stage in:1
Invalid GSSCredentials
Caused by: org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid GSSCredentials
Caused by: GSSException: Invalid name provided [Caused by: [JGLOBUS-112] Malformed name, "=" missing in "fork"]
Failed to transfer wrapper log from catsn-20110630-2018-sjjtfjs8/info/e on RANGER
Progress: Failed:1
Exception in cat:
Arguments: [data.txt]
Host: RANGER
Directory: catsn-20110630-2018-sjjtfjs8/jobs/e/cat-e1g55cck
stderr.txt:

stdout.txt:

log is at :
/home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-2033-l3zo296g.log

no gram log was created.

========================

case: provider=gt4, jobManager="gt2:fork"

RunID: 20110630-2019-tn0n27kb

Progress:
Progress: Stage in:1
Progress: Submitted:1
Cannot submit job: The AXIS engine could not find a target service to invoke! targetService is ManagedJobFactoryService
Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot submit job: The AXIS engine could not find a target service to invoke! targetService is ManagedJobFactoryService
Caused by: The AXIS engine could not find a target service to invoke!
targetService is ManagedJobFactoryService Failed to transfer wrapper log from catsn-20110630-2019-tn0n27kb/info/5 on RANGER Progress: Stage in:1 Cannot submit job: The AXIS engine could not find a target service to invoke! targetService is ManagedJobFactoryService Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot submit job: The AXIS engine could not find a target service to invoke! targetService is ManagedJobFactoryService Caused by: The AXIS engine could not find a target service to invoke! targetService is ManagedJobFactoryService the log is: /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-2036-zcqs7td6.log On Thu, Jun 30, 2011 at 8:05 PM, Michael Wilde wrote: > Can you send the Ranger-side GRAM log from the run, as well as a pointer to > the log and site file? > > I meant, by the way, a non-coaster run with the GT2 provider and jobmanager > SGE. > > A Condor-G test to both the fork and SGE jobmanagers would be helpful too. > > - Mike > > > ------------------------------ > > Mike, All, > > gt2:fork works. > > gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... as with > gt2:gt2:SGE > > > On Thu, Jun 30, 2011 at 6:37 PM, Michael Wilde wrote: > >> Ketan, >> >> Can you try a single catsn job with just the GT2 provider, first to the >> Ranger fork job manager and then to the SGE jobmanager? >> >> - Mike >> >> ----- Original Message ----- >> > If you are running under gt2 anything, I dont think you're going >> > through the SGE provider, are you? So I would not expect logs from it. >> > I think we have a problem here with some GT2 setting in the pool, >> > Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or >> > an attributing that is causing bad RSL? >> > >> > - Mike >> > >> > >> > ----- Original Message ----- >> > > Still nothing. Are you running with gt2:gt2:sge or gt2:sge? >> > > >> > > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: >> > > > That doesn't seem to have worked. 
You should change the log4j in >> > > > swift/dist/swift*/etc. >> > > > >> > > > >> > > > sorry, now done. >> > > > >> > > > >> > > > >> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log >> > > > >> > > > >> > > > -- >> > > > Ketan >> > > > >> > > > >> > > >> > > >> > > _______________________________________________ >> > > Swift-devel mailing list >> > > Swift-devel at ci.uchicago.edu >> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > >> > -- >> > Michael Wilde >> > Computation Institute, University of Chicago >> > Mathematics and Computer Science Division >> > Argonne National Laboratory >> > >> > _______________________________________________ >> > Swift-devel mailing list >> > Swift-devel at ci.uchicago.edu >> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > > > -- > Ketan > > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
From skenny at uchicago.edu Thu Jun 30 21:33:34 2011
From: skenny at uchicago.edu (Sarah Kenny)
Date: Thu, 30 Jun 2011 19:33:34 -0700
Subject: [Swift-devel] swift on ranger
In-Reply-To:
References: <1180827525.59292.1309482303335.JavaMail.root@zimbra.anl.gov>
Message-ID:

i'm able to get jobs into the queue with this sites file:

config>
1
10
TG-DBS080004N
240
normal
/work/00043/tg457040/sidgrid_out/{username}

using gt2 for the exec provider...but yeah, can't get in the queue with
coasters

On Thu, Jun 30, 2011 at 6:37 PM, Ketan Maheshwari <
ketancmaheshwari at gmail.com> wrote:

> I tried provider=gt2 and gt4 but get following:
>
> case: provider=gt2, jobManager="gt2:fork"
>
> RunID: 20110630-2018-sjjtfjs8
>
> Progress:
> Progress: Stage in:1
> Invalid GSSCredentials
> Caused by: org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid GSSCredentials
> Caused by: GSSException: Invalid name provided [Caused by: [JGLOBUS-112] Malformed name, "=" missing in "fork"]
> Failed to transfer wrapper log from catsn-20110630-2018-sjjtfjs8/info/c on RANGER
> Progress: Stage in:1
> Invalid GSSCredentials
> Caused by: org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException: Invalid GSSCredentials
> Caused by: GSSException: Invalid name provided [Caused by: [JGLOBUS-112] Malformed name, "=" missing in "fork"]
> Failed to transfer wrapper log from catsn-20110630-2018-sjjtfjs8/info/e on RANGER
> Progress: Failed:1
> Exception in cat:
> Arguments: [data.txt]
> Host: RANGER
> Directory: catsn-20110630-2018-sjjtfjs8/jobs/e/cat-e1g55cck
> stderr.txt:
>
> stdout.txt:
>
> log is at :
> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-2033-l3zo296g.log
>
> no gram log was created.
> > ======================== > > case: provider=gt4, jobManager="gt2:fork" > > RunID: 20110630-2019-tn0n27kb > > Progress: > Progress: Stage in:1 > Progress: Submitted:1 > Cannot submit job: The AXIS engine could not find a target service to > invoke! targetService is ManagedJobFactoryService > Caused by: > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot > submit job: The AXIS engine could not find a target service to invoke! > targetService is ManagedJobFactoryService > Caused by: The AXIS engine could not find a target service to invoke! > targetService is ManagedJobFactoryService > Failed to transfer wrapper log from catsn-20110630-2019-tn0n27kb/info/5 on > RANGER > Progress: Stage in:1 > Cannot submit job: The AXIS engine could not find a target service to > invoke! targetService is ManagedJobFactoryService > Caused by: > org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot > submit job: The AXIS engine could not find a target service to invoke! > targetService is ManagedJobFactoryService > Caused by: The AXIS engine could not find a target service to invoke! > targetService is ManagedJobFactoryService > > the log is: > /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-2036-zcqs7td6.log > > > > > On Thu, Jun 30, 2011 at 8:05 PM, Michael Wilde wrote: > >> Can you send the Ranger-side GRAM log from the run, as well as a pointer >> to the log and site file? >> >> I meant, by the way, a non-coaster run with the GT2 provider and >> jobmanager SGE. >> >> A Condor-G test to both the fork and SGE jobmanagers would be helpful too. >> >> - Mike >> >> >> ------------------------------ >> >> Mike, All, >> >> gt2:fork works. >> >> gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... 
as with >> gt2:gt2:SGE >> >> >> On Thu, Jun 30, 2011 at 6:37 PM, Michael Wilde wrote: >> >>> Ketan, >>> >>> Can you try a single catsn job with just the GT2 provider, first to the >>> Ranger fork job manager and then to the SGE jobmanager? >>> >>> - Mike >>> >>> ----- Original Message ----- >>> > If you are running under gt2 anything, I dont think you're going >>> > through the SGE provider, are you? So I would not expect logs from it. >>> > I think we have a problem here with some GT2 setting in the pool, >>> > Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or >>> > an attributing that is causing bad RSL? >>> > >>> > - Mike >>> > >>> > >>> > ----- Original Message ----- >>> > > Still nothing. Are you running with gt2:gt2:sge or gt2:sge? >>> > > >>> > > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote: >>> > > > That doesn't seem to have worked. You should change the log4j in >>> > > > swift/dist/swift*/etc. >>> > > > >>> > > > >>> > > > sorry, now done. >>> > > > >>> > > > >>> > > > >>> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log >>> > > > >>> > > > >>> > > > -- >>> > > > Ketan >>> > > > >>> > > > >>> > > >>> > > >>> > > _______________________________________________ >>> > > Swift-devel mailing list >>> > > Swift-devel at ci.uchicago.edu >>> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> > >>> > -- >>> > Michael Wilde >>> > Computation Institute, University of Chicago >>> > Mathematics and Computer Science Division >>> > Argonne National Laboratory >>> > >>> > _______________________________________________ >>> > Swift-devel mailing list >>> > Swift-devel at ci.uchicago.edu >>> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >>> _______________________________________________ >>> Swift-devel mailing 
list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >> >> >> >> -- >> Ketan >> >> >> >> >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> > > > -- > Ketan > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Thu Jun 30 21:49:48 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 30 Jun 2011 21:49:48 -0500 Subject: [Swift-devel] swift on ranger In-Reply-To: References: <1180827525.59292.1309482303335.JavaMail.root@zimbra.anl.gov> Message-ID: On Thu, Jun 30, 2011 at 9:33 PM, Sarah Kenny wrote: > i'm able to get jobs into the queue with this sites file: > > config> > > > > > > > > 1 > 10 > TG-DBS080004N > 240 > normal > > > > /work/00043/tg457040/sidgrid_out/{username} > > > > using gt2 for the exec provider...but yeah, can't get in the queue with > coasters The above works for me too. 
>
>
> On Thu, Jun 30, 2011 at 6:37 PM, Ketan Maheshwari <
> ketancmaheshwari at gmail.com> wrote:
>
>> I tried provider=gt2 and gt4 but get following:
>>
>> case: provider=gt2, jobManager="gt2:fork"
>>
>> RunID: 20110630-2018-sjjtfjs8
>>
>> Progress:
>> Progress: Stage in:1
>> Invalid GSSCredentials
>> Caused by:
>> org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException:
>> Invalid GSSCredentials
>> Caused by: GSSException: Invalid name provided [Caused by: [JGLOBUS-112]
>> Malformed name, "=" missing in "fork"]
>> Failed to transfer wrapper log from catsn-20110630-2018-sjjtfjs8/info/c on
>> RANGER
>> Progress: Stage in:1
>> Invalid GSSCredentials
>> Caused by:
>> org.globus.cog.abstraction.impl.common.task.InvalidSecurityContextException:
>> Invalid GSSCredentials
>> Caused by: GSSException: Invalid name provided [Caused by: [JGLOBUS-112]
>> Malformed name, "=" missing in "fork"]
>> Failed to transfer wrapper log from catsn-20110630-2018-sjjtfjs8/info/e on
>> RANGER
>> Progress: Failed:1
>> Exception in cat:
>> Arguments: [data.txt]
>> Host: RANGER
>> Directory: catsn-20110630-2018-sjjtfjs8/jobs/e/cat-e1g55cck
>> stderr.txt:
>>
>> stdout.txt:
>>
>> log is at :
>> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-2033-l3zo296g.log
>>
>> no gram log was created.
>>
>> ========================
>>
>> case: provider=gt4, jobManager="gt2:fork"
>>
>> RunID: 20110630-2019-tn0n27kb
>>
>> Progress:
>> Progress: Stage in:1
>> Progress: Submitted:1
>> Cannot submit job: The AXIS engine could not find a target service to
>> invoke! targetService is ManagedJobFactoryService
>> Caused by:
>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot
>> submit job: The AXIS engine could not find a target service to invoke!
>> targetService is ManagedJobFactoryService
>> Caused by: The AXIS engine could not find a target service to invoke!
>> targetService is ManagedJobFactoryService
>> Failed to transfer wrapper log from catsn-20110630-2019-tn0n27kb/info/5 on
>> RANGER
>> Progress: Stage in:1
>> Cannot submit job: The AXIS engine could not find a target service to
>> invoke! targetService is ManagedJobFactoryService
>> Caused by:
>> org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: Cannot
>> submit job: The AXIS engine could not find a target service to invoke!
>> targetService is ManagedJobFactoryService
>> Caused by: The AXIS engine could not find a target service to invoke!
>> targetService is ManagedJobFactoryService
>>
>> the log is:
>> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-2036-zcqs7td6.log
>>
>>
>>
>>
>> On Thu, Jun 30, 2011 at 8:05 PM, Michael Wilde wrote:
>>
>>> Can you send the Ranger-side GRAM log from the run, as well as a pointer
>>> to the log and site file?
>>>
>>> I meant, by the way, a non-coaster run with the GT2 provider and
>>> jobmanager SGE.
>>>
>>> A Condor-G test to both the fork and SGE jobmanagers would be helpful
>>> too.
>>>
>>> - Mike
>>>
>>>
>>> ------------------------------
>>>
>>> Mike, All,
>>>
>>> gt2:fork works.
>>>
>>> gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... as with
>>> gt2:gt2:SGE
>>>
>>>
>>> On Thu, Jun 30, 2011 at 6:37 PM, Michael Wilde wrote:
>>>
>>>> Ketan,
>>>>
>>>> Can you try a single catsn job with just the GT2 provider, first to the
>>>> Ranger fork job manager and then to the SGE jobmanager?
>>>>
>>>> - Mike
>>>>
>>>> ----- Original Message -----
>>>> > If you are running under gt2 anything, I don't think you're going
>>>> > through the SGE provider, are you? So I would not expect logs from it.
>>>> > I think we have a problem here with some GT2 setting in the pool,
>>>> > Swift client, or environment (eg port range; GLOBUS_HOSTNAME, etc). Or
>>>> > an attribute that is causing bad RSL?
>>>> >
>>>> > - Mike
>>>> >
>>>> >
>>>> > ----- Original Message -----
>>>> > > Still nothing. Are you running with gt2:gt2:sge or gt2:sge?
>>>> > >
>>>> > > On Thu, 2011-06-30 at 17:30 -0500, Ketan Maheshwari wrote:
>>>> > > > That doesn't seem to have worked. You should change the log4j in
>>>> > > > swift/dist/swift*/etc.
>>>> > > >
>>>> > > >
>>>> > > > sorry, now done.
>>>> > > >
>>>> > > >
>>>> > > >
>>>> /home/ketan/osg-tg-effort/ranger-catsn/catsn-20110630-1722-15y2s6n6.log
>>>> > > >
>>>> > > >
>>>> > > > --
>>>> > > > Ketan
>>>> > > >
>>>> > > >
>>>> > >
>>>> > >
>>>> > > _______________________________________________
>>>> > > Swift-devel mailing list
>>>> > > Swift-devel at ci.uchicago.edu
>>>> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>>> >
>>>> > --
>>>> > Michael Wilde
>>>> > Computation Institute, University of Chicago
>>>> > Mathematics and Computer Science Division
>>>> > Argonne National Laboratory
>>>> >
>>>> > _______________________________________________
>>>> > Swift-devel mailing list
>>>> > Swift-devel at ci.uchicago.edu
>>>> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>>>
>>>> --
>>>> Michael Wilde
>>>> Computation Institute, University of Chicago
>>>> Mathematics and Computer Science Division
>>>> Argonne National Laboratory
>>>>
>>>> _______________________________________________
>>>> Swift-devel mailing list
>>>> Swift-devel at ci.uchicago.edu
>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>>>
>>>
>>>
>>>
>>> --
>>> Ketan
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Michael Wilde
>>> Computation Institute, University of Chicago
>>> Mathematics and Computer Science Division
>>> Argonne National Laboratory
>>>
>>>
>>
>>
>> --
>> Ketan
>>
>>
>>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>
>>
>

--
Ketan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hategan at mcs.anl.gov Thu Jun 30 22:26:06 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jun 2011 22:26:06 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To:
References: <1519288819.59157.1309476929219.JavaMail.root@zimbra.anl.gov> <964949295.59168.1309477077485.JavaMail.root@zimbra.anl.gov>
Message-ID: <1309490766.10647.1.camel@blabla>

On Thu, 2011-06-30 at 19:44 -0500, Ketan Maheshwari wrote:
> Mike, All,
>
> gt2:fork works.
>
> gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... as
> with gt2:gt2:SGE

Yes, but gt2:SGE should send the blocks through the SGE provider and
give you logs. Except you'd have to hack the coaster log4j.properties in
cog/modules/provider-coaster/resources and re-compile swift.

If you ran this locally from the ranger login node with local:sge, it
would be sufficient to edit the swift/etc log4j file.

From wilde at mcs.anl.gov Thu Jun 30 22:34:31 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 30 Jun 2011 22:34:31 -0500 (CDT)
Subject: [Swift-devel] swift on ranger
In-Reply-To: <1309490766.10647.1.camel@blabla>
Message-ID: <662780010.59438.1309491271389.JavaMail.root@zimbra.anl.gov>

Does debug=true in etc/provider-sge work, in the same manner it works for provider-pbs?

----- Original Message -----
> On Thu, 2011-06-30 at 19:44 -0500, Ketan Maheshwari wrote:
> > Mike, All,
> >
> > gt2:fork works.
> >
> > gt2:SGE doesn't, I get similar stage-in, submitted, submitted ... as
> > with gt2:gt2:SGE
>
> Yes, but gt2:SGE should send the blocks through the SGE provider and
> give you logs. Except you'd have to hack the coaster log4j.properties in
> cog/modules/provider-coaster/resources and re-compile swift.
>
> If you ran this locally from the ranger login node with local:sge, it
> would be sufficient to edit the swift/etc log4j file.

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Thu Jun 30 22:58:22 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jun 2011 22:58:22 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To: <662780010.59438.1309491271389.JavaMail.root@zimbra.anl.gov>
References: <662780010.59438.1309491271389.JavaMail.root@zimbra.anl.gov>
Message-ID: <1309492750.11862.0.camel@blabla>

On Thu, 2011-06-30 at 22:34 -0500, Michael Wilde wrote:
> Does debug=true in etc/provider-sge work, in the same manner it works for provider-pbs?

What I see in the trunk code is that logger.isDebugEnabled() is used as
a test that decides whether the script is deleted at the end or not.

This is in the abstract classes that are subclassed by both the pbs and
sge providers, so it should apply to both.

From hategan at mcs.anl.gov Thu Jun 30 22:58:34 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 30 Jun 2011 22:58:34 -0500
Subject: [Swift-devel] swift on ranger
In-Reply-To: <662780010.59438.1309491271389.JavaMail.root@zimbra.anl.gov>
References: <662780010.59438.1309491271389.JavaMail.root@zimbra.anl.gov>
Message-ID: <1309492750.11862.1.camel@blabla>

On Thu, 2011-06-30 at 22:34 -0500, Michael Wilde wrote:
> Does debug=true in etc/provider-sge work, in the same manner it works for provider-pbs?

What I see in the trunk code is that logger.isDebugEnabled() is used as
a test that decides whether the script is deleted at the end or not.

This is in the abstract classes that are subclassed by both the pbs and
sge providers, so it should apply to both.
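
[Illustration] The behaviour Mihael describes above (the submit script survives only when debug logging is on) can be sketched roughly as follows. This is not the CoG/Swift provider code, which is Java and uses log4j's logger.isDebugEnabled(); it is a minimal Python analogue using the standard logging module. The logger name "provider.sge", the function run_submit_script and the ".submit" suffix are hypothetical names chosen for the example.

```python
import logging
import os
import tempfile

# Hypothetical logger name; the real providers use their own log4j categories.
logger = logging.getLogger("provider.sge")


def run_submit_script(script_text, submit):
    """Write a submit script, pass its path to `submit`, then clean up.

    The script file is left on disk only when DEBUG logging is enabled
    for `logger`, mirroring the isDebugEnabled() test described above.
    """
    fd, path = tempfile.mkstemp(suffix=".submit")
    with os.fdopen(fd, "w") as handle:
        handle.write(script_text)
    try:
        return submit(path)
    finally:
        if logger.isEnabledFor(logging.DEBUG):
            # Debugging: keep the script so it can be inspected afterwards.
            logger.debug("keeping submit script %s", path)
        else:
            os.remove(path)
```

Under this sketch, raising the logger to DEBUG is what leaves the generated scripts behind for inspection, which would be consistent with retained files such as those later listed under ~/.globus/scripts.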

From wilde at mcs.anl.gov Thu Jun 30 23:02:46 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 30 Jun 2011 23:02:46 -0500 (CDT)
Subject: [Swift-devel] swift on ranger
In-Reply-To: <1309492750.11862.1.camel@blabla>
Message-ID: <466860430.59465.1309492966084.JavaMail.root@zimbra.anl.gov>

With 0.92.1 I am able to run coasters local:sge on Ranger:

login3$ swift -config cf -tc.file tc -sites.file sites06.xml catsn.swift -n=1
Swift svn swift-r4371 cog-r3096

RunID: 20110630-2250-pn3dfi67
Progress:
Progress: Submitted:1
Progress: Submitted:1
Progress: Submitted:1
Progress: Submitted:1
Progress: Submitted:1
Progress: Active:1
Final status: Finished successfully:1
Canceling job 2013431

login3$ cat sites06.xml
1 16 16 16 16way 1800 00:15:00 TG-CHE110004 development 10000 .15 /share/home/00306/tg455797/swiftwork

login3$ cat cf
wrapperlog.always.transfer=true
sitedir.keep=true
execution.retries=0
lazy.errors=false
status.mode=provider
use.provider.staging=false
provider.staging.pin.swiftfiles=false

login3$ cat tc
localhost sh /bin/sh null null null
localhost cat /bin/cat null null null
pbs cat /bin/cat null null null
mcs cat /bin/cat null null null
localhost catnap /home/wilde/swift/lab/catnap.sh null null GLOBUS::maxwalltime="00:01:00"

login3$ ls ~/.globus/scripts
OLD                       SGE14622.submit.exitcode  SGE32272.submit.exitcode  SGE982.submit.exitcode.0   SGE982.submit.exitcode.4
SGE13929.submit.exitcode  SGE14622.submit.stderr    SGE32272.submit.stderr    SGE982.submit.exitcode.1   SGE982.submit.exitcode.5
SGE13929.submit.stderr    SGE14622.submit.stdout    SGE32272.submit.stdout    SGE982.submit.exitcode.10  SGE982.submit.exitcode.6
SGE13929.submit.stdout    SGE15887.submit           SGE32273.submit.exitcode  SGE982.submit.exitcode.11  SGE982.submit.exitcode.7
SGE13930.submit.exitcode  SGE233.submit.stderr      SGE32273.submit.stderr    SGE982.submit.exitcode.12  SGE982.submit.exitcode.8
SGE13930.submit.stderr    SGE233.submit.stdout      SGE32273.submit.stdout    SGE982.submit.exitcode.13  SGE982.submit.exitcode.9
SGE13930.submit.stdout    SGE234.submit.exitcode    SGE45187.submit           SGE982.submit.exitcode.14  SGE982.submit.stderr
SGE14621.submit.exitcode  SGE234.submit.stderr      SGE60909.submit.exitcode  SGE982.submit.exitcode.15  SGE982.submit.stdout
SGE14621.submit.stderr    SGE234.submit.stdout      SGE60909.submit.stderr    SGE982.submit.exitcode.2
SGE14621.submit.stdout    SGE26260.submit           SGE60909.submit.stdout    SGE982.submit.exitcode.3

login3$ cd ~/.globus/scripts
login3$ ls -lt
total 172
-rw-r--r-- 1 tg455797 G-80243 2540 Jun 30 22:52 SGE982.submit.stdout
-rw-r--r-- 1 tg455797 G-80243   96 Jun 30 22:52 SGE982.submit.stderr
-rw-r--r-- 1 tg455797 G-80243    2 Jun 30 22:52 SGE982.submit.exitcode.0

----- Original Message -----
> On Thu, 2011-06-30 at 22:34 -0500, Michael Wilde wrote:
> > Does debug=true in etc/provider-sge work, in the same manner it
> > works for provider-pbs?
>
> What I see in the trunk code is that logger.isDebugEnabled() is used as
> a test that decides whether the script is deleted at the end or not.
>
> This is in the abstract classes that are subclassed by both the pbs and
> sge providers, so it should apply to both.

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From pmrich at gmail.com Wed Jun 22 14:39:12 2011
From: pmrich at gmail.com (Paul Rich)
Date: Wed, 22 Jun 2011 19:39:12 -0000
Subject: [Swift-devel] [alcf-support #60887] Can Cobalt command-line bug on Eureka be fixed?
In-Reply-To: <1715669240.52741.1294795830606.JavaMail.root@zimbra.anl.gov>
Message-ID: <750558427.34664.1308771542583.JavaMail.root@zimbra.anl.gov>

Michael,

I wanted to let you know that a recent patch to Cobalt on Eureka should
allow you to pass command-line arguments into the program supplied to the
Cobalt job. Let us know if you encounter any further difficulties, and I
am sorry that this took so long to deploy.

Thank you for your patience,
--
Paul Rich
ALCF Operations -- AIG
richp at alcf.anl.gov

----- Original Message -----
From: "Michael Wilde"
To: "Paul M. Rich", "Andrew Cherry"
Cc: "swift-devel", "Robert Jacob", support at alcf.anl.gov
Sent: Tuesday, January 11, 2011 7:30:30 PM
Subject: Re: [alcf-support #60887] Can Cobalt command-line bug on Eureka be fixed?

Paul, Andrew,

What I think we're going to do on this from the Swift side is temporarily
try to use Eureka in a mode where we manually start Swift workers on the
cluster using a batch job. We'll wait on testing the Swift Cobalt
interface (which is different than the above) until we hear from you that
the bug is fixed and ready for testing.

So even though it may be many weeks or more away, we'd like to put in our
vote for fixing this issue (realizing that you have many other priorities :)

Thanks,

Mike

----- Original Message -----
> Thanks, Rich and Andrew, for the very fast responses.
>
> We'll try the work-around, then.
>
> Regards,
>
> - Mike
>
>
> ----- Original Message -----
> > Michael,
> >
> > Unfortunately a fix for this will, at this point in time, take a minimum
> > of four weeks to deploy to a production resource like Eureka, due to our
> > testing, upgrade and maintenance procedures.
> >
> > As a workaround for this on Eureka, since every job effectively runs in
> > script mode, you should be able to set environment variables within the
> > script that you submit to Cobalt.
> >
> > We apologize for the inconvenience. Let us know if you have any other
> > questions.
> >
> > --
> > Paul Rich
> > ALCF Operations -- AIG
> > richp at alcf.anl.gov
> >
> >
> > On 1/11/11 4:48 PM, Michael Wilde wrote:
> > > User info for wilde at mcs.anl.gov
> > > =================================
> > > Username: wilde
> > > Full Name: Michael Wilde
> > > Projects:
> > > HTCScienceApps,JGI-Pilot,MTCScienceApps,OOPS,PTMAP,pilot-wilde
> > > ('*' denotes INCITE projects)
> > > =================================
> > >
> > >
> > > Hi ALCF Team,
> > >
> > > The following known issue in Cobalt is currently preventing us from
> > > running Swift on Eureka:
> > >
> > > http://trac.mcs.anl.gov/projects/cobalt/ticket/462
> > >
> > > With some additional development effort we can work around this, but
> > > it would be much cleaner and better if this were fixed in Cobalt,
> > > instead, as suggested in ticket 462 above.
> > >
> > > Is there any chance that can be done in the next few days?
> > > If not, please let me know, and we will implement the work-around
> > > instead.
> > >
> > > This is holding up work on the DOE ParVis project (Rob Jacob, PI)
> > > and we've had to move some work we want to run on Eureka to other
> > > platforms in the meantime.
> > >
> > > Thanks very much,
> > >
> > > Mike
> > >
> > > 462 is:
> > >
> > > Ticket #462 (new defect)
> > > Opened 7 months ago
> > > Cobalt on clusters ignores job script arguments
> > >
> > > Reported by: acherry
> > > Priority: major
> > > Component: clients
> > >
> > > Description
> > >
> > > It appears that cobalt-launcher.py does not support running a job
> > > script or executable with command arguments, even though qsub will
> > > accept the arguments, and the man page and help for qsub indicates
> > > that arguments are accepted.
> > >
> > > I'm filing this as a bug rather than a feature request, since the
> > > behavior isn't consistent with the documentation. But I'd rather the
> > > fix for this to be adding support for args, rather than changing the
> > > docs to say they aren't accepted. :-)
> > >
> > >
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
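
[Illustration] The workaround suggested in the Cobalt thread above (put everything the job needs inside the submitted script, since arguments given on the command line were being ignored) could be sketched as below. This is only an assumption about how one might generate such a script; it is not Swift's Cobalt provider and does not call any Cobalt command. The function name make_wrapper_script and the example variable name are hypothetical.

```python
import shlex


def make_wrapper_script(program, args, env=None):
    """Return the text of a /bin/sh wrapper with arguments baked in.

    Environment variables are exported inside the script, and the program
    is exec'd with its arguments written out literally, so nothing has to
    be passed on the scheduler's command line.
    """
    lines = ["#!/bin/sh"]
    for name, value in sorted((env or {}).items()):
        # Quote values so spaces and shell metacharacters survive intact.
        lines.append("export {}={}".format(name, shlex.quote(value)))
    command = " ".join(shlex.quote(part) for part in [program, *args])
    lines.append("exec " + command)
    return "\n".join(lines) + "\n"
```

For example, make_wrapper_script("/bin/cat", ["data.txt"], {"WORKER_ID": "w 1"}) yields a three-line script whose last line is "exec /bin/cat data.txt"; the resulting file would then be submitted on its own, with no trailing arguments.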