From hategan at mcs.anl.gov Fri Jul 5 20:24:43 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 05 Jul 2013 18:24:43 -0700
Subject: [Swift-devel] faster branch merge
Message-ID: <1373073883.6243.2.camel@echo>

A lot of files have been moved around in the faster branch. I'm thinking
that instead of merging the faster branch into trunk and then dealing
with all the conflicts and weirdness, I would rather manually merge all
the changes from trunk and 0.94 into the faster branch and then move
trunk out of the way and then rename the faster branch to trunk. I might
take this opportunity to remove the "current" part from the path.

Any objections? Am I missing something that says I shouldn't do this?

Mihael

From hategan at mcs.anl.gov Sat Jul 6 01:43:53 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 05 Jul 2013 23:43:53 -0700
Subject: [Swift-devel] condor leave_in_queue
Message-ID: <1373093033.10417.6.camel@echo>

This is in regards to http://sourceforge.net/p/cogkit/svn/3671/

The reason why leave_in_queue was set to TRUE was in order to get the
exit code from the job (and therefore figure whether it failed or not).
If the job is automatically removed from the queue by condor when the
job is done, that information is lost. Instead, the queue poller, after
it figures out that a job is done and it reads the exit code, sets
leave_in_queue to FALSE and removes the job from the queue.

I'm guessing that was broken somehow, but I'd like to get more details
before I can like the change (or before I merge it into the faster
branch).

Mihael

From davidk at ci.uchicago.edu Sat Jul 6 03:08:51 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Sat, 6 Jul 2013 03:08:51 -0500 (CDT)
Subject: [Swift-devel] condor leave_in_queue
In-Reply-To: <1373093033.10417.6.camel@echo>
Message-ID: <1441876150.10111427.1373098131118.JavaMail.root@ci.uchicago.edu>

Mihael,

Thanks for the info.

The problem we were seeing was that condor jobs were not being removed.
They would complete, but remain visible from condor_q forever until
manually removed by the user with condor_rm.

At the suggestion of the uc3 admins, I tried testing with leave_in_queue
set to false. Jobs are being removed now, and I just ran a quick test
(uc3 /home/davidk/test4/run003) to verify exit codes still being read
correctly, but perhaps there is a better fix?

David

----- Original Message -----
From: "Mihael Hategan"
To: "Swift Devel"
Sent: Saturday, July 6, 2013 1:43:53 AM
Subject: [Swift-devel] condor leave_in_queue

This is in regards to http://sourceforge.net/p/cogkit/svn/3671/

The reason why leave_in_queue was set to TRUE was in order to get the
exit code from the job (and therefore figure whether it failed or not).
If the job is automatically removed from the queue by condor when the
job is done, that information is lost. Instead, the queue poller, after
it figures out that a job is done and it reads the exit code, sets
leave_in_queue to FALSE and removes the job from the queue.

I'm guessing that was broken somehow, but I'd like to get more details
before I can like the change (or before I merge it into the faster
branch).

Mihael
_______________________________________________
Swift-devel mailing list
Swift-devel at ci.uchicago.edu
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From hategan at mcs.anl.gov Sat Jul 6 03:35:12 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 06 Jul 2013 01:35:12 -0700
Subject: [Swift-devel] condor leave_in_queue
In-Reply-To: <1441876150.10111427.1373098131118.JavaMail.root@ci.uchicago.edu>
References: <1441876150.10111427.1373098131118.JavaMail.root@ci.uchicago.edu>
Message-ID: <1373099712.11863.5.camel@echo>

On Sat, 2013-07-06 at 03:08 -0500, David Kelly wrote:
> Mihael,
>
> Thanks for the info.
>
> The problem we were seeing was that condor jobs were not being
> removed.
> They would complete, but remain visible from condor_q forever until
> manually removed by the user with condor_rm.

We should find out why the removal isn't working. But I agree that there
is no fail-safe in the current system and there should be.

> At the suggestion of the uc3 admins, I tried testing with
> leave_in_queue set to false. Jobs are being removed now, and I just
> ran a quick test (uc3 /home/davidk/test4/run003) to verify exit codes
> still being read correctly, but perhaps there is a better fix?

I believe there might be. If I remember correctly, values can be
expressions and they can be expressions that depend on time, such as
leave_in_queue = (now() - jobEndTime < some_interval). Maybe. It would
be good if that were true.

Mihael

From wilde at mcs.anl.gov Sat Jul 6 10:09:19 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 6 Jul 2013 10:09:19 -0500 (CDT)
Subject: [Swift-devel] faster branch merge
In-Reply-To: <1373073883.6243.2.camel@echo>
Message-ID: <1462368575.7835703.1373123359749.JavaMail.root@mcs.anl.gov>

> A lot of files have been moved around in the faster branch. I'm
> thinking that instead of merging the faster branch into trunk and then
> dealing with all the conflicts and weirdness, I would rather manually
> merge all the changes from trunk and 0.94 into the faster branch and
> then move trunk out of the way and then rename the faster branch to
> trunk. I might take this opportunity to remove the "current" part from
> the path.
>
> Any objections? Am I missing something that says I shouldn't do this?

This sounds good to me.

Minor related issue: is there any way to move the parts of CoG we don't
use out of our view while developing? E.g., an optional but more
selective checkout, perhaps just a set of excludes on the example
checkout instructions?

Also some day: is there a good point to change vdl2 to swift, before 1.0?
- Mike

From lpesce at uchicago.edu Sat Jul 6 11:22:46 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Sat, 6 Jul 2013 11:22:46 -0500
Subject: [Swift-devel] faster branch merge
In-Reply-To: <1462368575.7835703.1373123359749.JavaMail.root@mcs.anl.gov>
References: <1462368575.7835703.1373123359749.JavaMail.root@mcs.anl.gov>
Message-ID: <1C61ACC4-4018-472B-988A-47C1A8F68F2F@uchicago.edu>

From my selfish perspective, let me know when I can start trying to use
it on our pipelines on Beagle. The genomics pipeline will soon be put to
work with relatively heavy loads (millions of core hours) and it is
based on 0.94. It would be nice to test it on faster + trunk.

On Jul 6, 2013, at 10:09 AM, Michael Wilde wrote:

>
>> A lot of files have been moved around in the faster branch. I'm
>> thinking that instead of merging the faster branch into trunk and then
>> dealing with all the conflicts and weirdness, I would rather manually
>> merge all the changes from trunk and 0.94 into the faster branch and
>> then move trunk out of the way and then rename the faster branch to
>> trunk. I might take this opportunity to remove the "current" part from
>> the path.
>>
>> Any objections? Am I missing something that says I shouldn't do this?
>
> This sounds good to me.
>
> Minor related issue: is there any way to move the parts of CoG we don't
> use out of our view while developing? E.g., an optional but more
> selective checkout, perhaps just a set of excludes on the example
> checkout instructions?
>
> Also some day: is there a good point to change vdl2 to swift, before 1.0?
>
> - Mike

From wilde at mcs.anl.gov Mon Jul 8 15:08:56 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 8 Jul 2013 15:08:56 -0500 (CDT)
Subject: [Swift-devel] sourceforge SVN changes
In-Reply-To: <1370227449.30798.4.camel@echo>
Message-ID: <1826803620.8185343.1373314136101.JavaMail.root@mcs.anl.gov>

Mihael pointed out that:

> Sourceforge has migrated the svn repos. If you have an existing
> checkout you should:
>
> 1. make sure you have your public key in your account:
> https://sourceforge.net/account/ssh
>
> 2. do a 'svn relocate "svn+ssh://@svn.code.sf.net/p/cogkit/svn/"'
>
> Also, we should update our checkout instructions.

The old checkout line for CoG on the swift-lang download pages is:

$ svn co https://cogkit.svn.sourceforge.net/svnroot/cogkit/branches/4.1.10/src/cog

This still works for checkout, but gives you a link you can't commit to.

Is the right checkout documented somewhere for developers? I.e.:

$ svn co svn+ssh://YourSourceForgeUserName@svn.code.sf.net/p/cogkit/svn/branches/4.1.10/src/cog

Can users do an svn+ssh:// checkout using some public, anonymous login
name, which we could specify instead of https:// on the download page?
(as it seems faster) This is a minor issue and not worth any follow-up
unless there is an obvious solution.

- Mike

From hategan at mcs.anl.gov Mon Jul 8 16:03:22 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 08 Jul 2013 14:03:22 -0700
Subject: [Swift-devel] sourceforge SVN changes
In-Reply-To: <1826803620.8185343.1373314136101.JavaMail.root@mcs.anl.gov>
References: <1826803620.8185343.1373314136101.JavaMail.root@mcs.anl.gov>
Message-ID: <1373317402.19339.1.camel@echo>

> Can users do an svn+ssh:// checkout using some public, anonymous login
> name, which we could specify instead of https:// on the download page?
> (as it seems faster) This is a minor issue and not worth any follow-up
> unless there is an obvious solution.
>

I do not think so. If they don't have an account, the following page
gives two alternative methods for checking the code out:
http://sourceforge.net/p/cogkit/svn/3706/tree/trunk/current/

Namely:

svn checkout svn://svn.code.sf.net/p/cogkit/svn/trunk/current

and

svn checkout http://svn.code.sf.net/p/cogkit/svn/trunk/current

Mihael

From hategan at mcs.anl.gov Mon Jul 8 19:50:10 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 08 Jul 2013 17:50:10 -0700
Subject: [Swift-devel] faster branch merge
Message-ID: <1373331010.6365.3.camel@echo>

Hello,

I merged all changes from trunk and 0.94 into the faster branch and ran
some tests. Seems ok as far as I'm concerned.

That means that I'm ready to rename trunk to trunk.old and faster to
trunk. Which probably means that you might need clean checkouts and that
from now on what used to be the faster branch will become trunk.

This is the last opportunity to veto this move. I'll wait one hour for
replies, after which I'm going ahead with the switch.

Mihael

From hategan at mcs.anl.gov Mon Jul 8 21:20:05 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 08 Jul 2013 19:20:05 -0700
Subject: [Swift-devel] faster branch merge
In-Reply-To: <1373331010.6365.3.camel@echo>
References: <1373331010.6365.3.camel@echo>
Message-ID: <1373336405.3989.2.camel@echo>

Done.

I have also removed the "current" part in cog. So a checkout now would
go something like this:

svn checkout http://svn.code.sf.net/p/cogkit/svn/trunk/src/cog
cd cog/modules
...

Mihael

On Mon, 2013-07-08 at 17:50 -0700, Mihael Hategan wrote:
> Hello,
>
> I merged all changes from trunk and 0.94 into the faster branch and ran
> some tests. Seems ok as far as I'm concerned.
>
> That means that I'm ready to rename trunk to trunk.old and faster to
> trunk.
> Which probably means that you might need clean checkouts and that
> from now on what used to be the faster branch will become trunk.
>
> This is the last opportunity to veto this move. I'll wait one hour for
> replies, after which I'm going ahead with the switch.
>
> Mihael

From wilde at mcs.anl.gov Tue Jul 9 09:59:09 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 9 Jul 2013 09:59:09 -0500 (CDT)
Subject: [Swift-devel] Finding number of cores and sockets on a node
Message-ID: <192586869.8315058.1373381949639.JavaMail.root@mcs.anl.gov>

I found this article helpful for interpreting /proc/cpuinfo:

https://www.ibm.com/developerworks/community/blogs/brian/entry/linux_show_the_number_of_cpu_cores_on_your_system17?lang=en

- Mike

From wilde at mcs.anl.gov Wed Jul 10 09:31:37 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 09:31:37 -0500 (CDT)
Subject: [Swift-devel] Suggested test for multiple local coaster pools
In-Reply-To: <43768212.8721033.1373466515004.JavaMail.root@mcs.anl.gov>
Message-ID: <1149271161.8721425.1373466697322.JavaMail.root@mcs.anl.gov>

Yadu, this bug could use a test (and User Guide clarification): ensure
that two pools on the same local cluster can each have unique attributes
(most notably, jobsPerNode).

This could use the same techniques we discussed to ensure that
jobsPerNode itself is working properly.
- Mike

----- Forwarded Message -----
From: "Michael Wilde"
To: swift-support at ci.uchicago.edu
Sent: Wednesday, July 10, 2013 9:28:35 AM
Subject: Re: [Swift Support #23172] Issue with using multiple pools

This symptom was discussed in Swift bug report 869:

https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=869

As I understand Mihael's resolution to this bug, in order to use
multiple pools per host you should add the "url=" option to each pool,
like this:

For pool 1:

For pool 2:

Could you try that and let us know if it works or not?

Thanks,

- Mike

----- Original Message -----
> From: "Mike Wilde"
> Sent: Wednesday, July 10, 2013 9:21:48 AM
> Subject: Re: [Swift Support #23172] Issue with using multiple pools
>
> The following addresses are receiving this ticket:
> pittjj at uchicago.edu swift-support at ci.uchicago.edu
> lpesce at ci.uchicago.edu
>
> Jason, I think we have seen cases where two pools running with the
> same coaster service can only support one value of jobsPerNode.
>
> I recall a fix provided by Mihael to force coasters to use one
> service per pool by specifying unique URLs for each pool. (By
> default I think it uses one service per host/site). I'll try to hunt
> that down to see if it explains (and fixes) what you are seeing.
>
> Mihael, or anyone familiar with this problem, can you clarify?
>
> Thanks,
>
> - Mike
>
> ----- Original Message -----
> > From: "Jason J. Pitt"
> > Sent: Wednesday, July 10, 2013 1:09:24 AM
> > Subject: [Swift Support #23172] Issue with using multiple pools
> >
> > Wed Jul 10 01:09:23 2013: Request 23172 was acted upon.
> > Transaction: Ticket created by pittjj at uchicago.edu
> > Queue: swift-support
> > Subject: Issue with using multiple pools
> > Owner: Nobody
> > Requestors: pittjj at uchicago.edu
> > Status: new
> > Ticket URL: https://rt.ci.uchicago.edu/Ticket/Display.html?id=23172
> >
> > Hi Everyone,
> >
> > I'm having issues with using multiple pools within my Swift script.
> > Without going into the details, as I believed we have discussed why
> > multiple pools are necessary in this forum before, I need to run one
> > of my apps as 1 per node (handle=one) and the remainder can run 8
> > per node (handle=pbs). Right now we are using Beagle for the
> > computations.
> >
> > The issue is that even if I specify the one pool to run as one per
> > node in the .tc and .xml files, it seems to be running as multiples
> > on the same node (jobs failing at that app with OMM and bad
> > allocation errors). Notably, if I specify the pbs pool (which is
> > usually run at 4-8 per node) to only run as 1 job per node the code
> > runs to completion with no errors. Below I have copied my .xml file
> > and part of my .tc file. Is there anything that I may be missing in
> > order to make the pooling occur properly? Since this is a very brief
> > introduction to the issue Lorenzo and I welcome any questions.
> >
> > .xml file
> > ------------------------------------------------
> >
> > ${PROJECT}
> >
> > key="providerAttributes">${provider_attributes}
> > $queueLine
> >
> > key="jobsPerNode">${jobsPerNode}
> > ${walltime}
> > key="maxwalltime">${apptime}
> > key="lowOverallocation">100
> > key="highOverallocation">100
> >
> > ${numnodes}
> > 1
> > 1
> >
> > key="jobThrottle">${jobThrottle}
> > 10000
> >
> > ${swiftworkdir}
> >
> > ${PROJECT}
> >
> > key="providerAttributes">${provider_attributes}
> > $queueLine
> >
> > 1
> > ${walltime}
> > key="maxwalltime">${apptime}
> > key="lowOverallocation">100
> > key="highOverallocation">100
> >
> > ${numnodes}
> > 1
> > 1
> >
> > key="jobThrottle">${jobThrottle}
> > 10000
> >
> > ${swiftworkdir}
> >
> > --------------------------------------------
> >
> > .tc file
> > _______________________________
> > # sitename transformation path
> > pbs echo /bin/echo
> > pbs cat /bin/cat
> > pbs ls /bin/ls
> > pbs grep /bin/grep
> > pbs sort /bin/sort
> > pbs paste /bin/paste
> > pbs cp /bin/cp
> > pbs touch /bin/touch
> > pbs wc /usr/bin/wc
> >
> > # custom entries
> > pbs flagstatWrapper ${flagstatWrapper} INSTALLED AMD64::LINUX ENV::TMP="$LUSTRE_TMP";GLOBUS::maxwalltime="5:00:00"
> > pbs coverageBedWrapper ${coverageBedWrapper} INSTALLED AMD64::LINUX ENV::TMP="$LUSTRE_TMP";GLOBUS::maxwalltime="5:00:00"
> > one mosaikAlnBam2FastqWrapper ${mosaikAlnBam2FastqWrapper} INSTALLED AMD64::LINUX ENV::TMP="$LUSTRE_TMP";GLOBUS::maxwalltime="17:40:00"
> > -----------------------------------------------
> >
> > Also, I'm not sure if it is worth noting, but the test runs were
> > completed using a 20 node reservation on Beagle. numnodes=20 (slots)
> > was given for both pools. I'm not sure if this would have anything
> > to do with the behavior we're seeing.
> >
> > Thanks for your help!
> >
> > Best,
> >
> > Jason
> >

From wilde at mcs.anl.gov Wed Jul 10 12:53:44 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 12:53:44 -0500 (CDT)
Subject: [Swift-devel] Initial tests of new (faster) trunk
In-Reply-To: <1361306774.891.1.camel@echo>
Message-ID: <1668136297.8844433.1373478824678.JavaMail.root@mcs.anl.gov>

Initial tests of the new trunk are working for me, but I'm seeing three
odd things:

1. [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the
declaration of element 'config'. (as reported in the email below)

2. {env.HOME} is not interpreted in sites.xml as in prior versions.
Swift created a workdir named "$PWD/{env.HOME}/swiftwork"

3. Progress ticker lines seem to be defaulting to one per second, even
with no status changing.

But so far so good - a first test script using the new code is running
nicely on SGE.
- Mike

----- Original Message -----
> From: "Mihael Hategan"
> To: "David Kelly"
> Cc: swift-support at ci.uchicago.edu, "Michael Wilde"
> Sent: Tuesday, February 19, 2013 2:46:14 PM
> Subject: Re: [Swift Support #22699] Fwd: [Swift-devel] First tests with swift faster
>
> Yeah. The validation fails. You can ignore it for now. I'll fix in the
> future. Basically there is code to validate the sites file against the
> XML schema, but it fails. It's not a fatal issue though, and parsing
> still happens.
>
> Mihael
>
> On Tue, 2013-02-19 at 14:22 -0600, David Kelly wrote:
> > I tried updating from svn and running with the added url tags:
> >
> > url="localhost"/>
> > 1
> > 100
> > 100
> > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24
> > 4000
> > 00:05:00
> > key="disableIdleBlockCleanup">true
> > 1
> > 1
> > 1
> > batch
> > 8.00
> > 10000
> >
> > /lustre/beagle/davidk
> >
> > I am seeing this error:
> >
> > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the
> > declaration of element 'config'.
> >
> > ----- Original Message -----
> >
> > From: "Mike Wilde"
> > Sent: Tuesday, February 19, 2013 1:50:15 PM
> > Subject: [Swift Support #22699] Fwd: [Swift-devel] First tests with
> > swift faster
> >
> > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon.
> > Transaction: Ticket created by wilde at mcs.anl.gov
> > Queue: swift-support
> > Subject: Fwd: [Swift-devel] First tests with swift faster
> > Owner: Nobody
> > Requestors: wilde at ci.uchicago.edu
> > Status: new
> > Ticket URL: https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699
> >
> > David, Mihael, Yadu: could one of you try this on Beagle on the
> > faster branch?
> >
> > Does the faster branch include the PBS support for Beagle?
> >
> > It shouldn't be too hard to see what part of the PBS pool def it
> > doesn't like.
> >
> > - Mike
> >
> > ----- Forwarded Message -----
> > From: "Lorenzo Pesce"
> > To: "Swift Devel"
> > Sent: Tuesday, February 19, 2013 1:26:20 PM
> > Subject: [Swift-devel] First tests with swift faster
> >
> > This is the content of the file where we have the first complaint
> > from swift (see attached):
> >
> > CI-DEB000002
> >
> > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24
> >
> > 24
> > 172800
> > 0:10:00
> > 100
> > 100
> >
> > 200
> > 1
> > 1
> >
> > 47.99
> > 10000
> >
> > /lustre/beagle/samseaver/GS/swift.workdir
> >
> > Any ideas?
> >
> > Begin forwarded message:
> >
> > From: Sam Seaver < samseaver at gmail.com >
> > Date: February 19, 2013 1:16:28 PM CST
> > To: Lorenzo Pesce < lpesce at uchicago.edu >
> > Subject: Re: How are things going?
> >
> > I got this error. I suspect using the new SWIFT_HOME directory
> > means that there's possibly a missing parameter someplace:
> >
> > should we resume a previous calculation? [y/N] y
> > rlog files displayed in reverse time order
> > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n]
> > y
> > Using GS-20130203-0717-jgeppt98.0.rlog
> > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the declaration of
> > element 'config'.
> >
> > Execution failed:
> > Failed to parse site catalog
> > swift:siteCatalog @ scheduler.k, line: 31
> > Caused by: Invalid pool entry 'pbs':
> > swift:siteCatalog @ scheduler.k, line: 31
> > Caused by: java.lang.IllegalArgumentException: Missing URL
> > at org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173)
> > at org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100)
> > at org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60)
> > at org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48)
> > at org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38)
> > at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154)
> > at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87)
> > at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147)
> > at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87)
> > at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147)
> > at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87)
> > at org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63)
> > at org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269)
> > at org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154)
> > at org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87)
> > at org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63)
> > at org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79)
> > at k.thr.LWThread.run(LWThread.java:243)
> > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> > at java.lang.Thread.run(Thread.java:722)
> >
> > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < samseaver at gmail.com > wrote:
> >
> > OK, it got to the point where it really did hang. I'm retrying, but
> > with your suggestions. The other three finished fine!
> >
> > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 Selecting site:18147 Submitted:174 Active:96 Failed:2 Finished successfully:132323 Failed but can retry:183
> > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 Selecting site:18147 Submitted:174 Active:96 Failed:2 Finished successfully:132323 Failed but can retry:183
> > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 Selecting site:18147 Submitted:174 Active:96 Failed:2 Finished successfully:132323 Failed but can retry:183
> > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 Selecting site:18147 Submitted:174 Active:96 Failed:2 Finished successfully:132323 Failed but can retry:183
> > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 Selecting site:18147 Submitted:174 Active:96 Failed:2 Finished successfully:132323 Failed but can retry:183
> > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 Selecting site:18147 Submitted:174 Active:96 Failed:2 Finished successfully:132323 Failed but can retry:183
> >
> > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < lpesce at uchicago.edu > wrote:
> >
> > Hmm...
> >
> > foreach.max.threads=100
> >
> > maybe you should increase this number a bit and see what happens.
> >
> > Also, I would try to replace
> >
> > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552
> >
> > with
> >
> > SWIFT_HOME=/soft/swift/fast
> >
> > Keep me posted. Let's get this rolling.
> >
> > if it doesn't work, I can redo the packing.
> >
> > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote:
> >
> > Actually, the ten agents job does seem to be stuck in a strange
> > loop. It is incrementing the number of jobs that have finished
> > successfully, and at a fast pace, but the number of jobs it's
> > starting is decrementing much more slowly; it's almost as if it's
> > repeatedly attempting the same set of parameters multiple times...
> >
> > I'll see what it's doing in the morning
> > S
> >
> > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < samseaver at gmail.com > wrote:
> >
> > Seems to have worked overall this time!
> >
> > I resumed four jobs, each for a different number of agents
> > (10,100,1000,10000); it made it easier for me to decide on the app
> > time. Two of them have already finished, i.e.:
> >
> > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 Active:4 Checking status:1 Finished in previous run:148098 Finished successfully:37897
> > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 Active:2 Checking status:1 Finished in previous run:148098 Finished successfully:37899
> > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished in previous run:148098 Finished successfully:37902
> >
> > and the only one that is showing any failure (50/110000) is the
> > ten agents version, which is so short I can understand why, but it's
> > still actively trying to run jobs and is actively finishing jobs,
> > so that's good.
> >
> > Yay!
> >
> > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < lpesce at uchicago.edu > wrote:
> >
> > Good. Keep me posted, I would really like to solve your problems in
> > running on Beagle this week, I wish that Swift would have been
> > friendlier.
> >
> > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote:
> >
> > I just resumed the jobs that I'd killed before the system went
> > down, let's see how it does.
> > I always did a mini-review of the data I've got and it seems to be
> > working as expected.
> >
> > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < lpesce at uchicago.edu > wrote:
> >
> > I have lost track a bit of what's up. I am happy to try and go over
> > it with you when you are ready.
> >
> > Some of the problems of swift might have improved with a new
> > version and the new system.
> >
> > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote:
> >
> > They're not, I've not looked since Beagle came back up. Will do so
> > later today.
> > S
> >
> > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < lpesce at uchicago.edu > wrote:
> >
> > --
> > Postdoctoral Fellow
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
> > 9700 S. Cass Avenue
> > Argonne, IL 60439
> >
> > http://www.linkedin.com/pub/sam-seaver/0/412/168
> > samseaver at gmail.com
> > (773) 796-7144
> >
> > "We shall not cease from exploration
> > And the end of all our exploring
> > Will be to arrive where we started
> > And know the place for the first time."
> > --T. S. Eliot
> >
> > I tried updating from svn and running with the added url tags:
> >
> > url="localhost"/>
> > 1
> > key="lowOverAllocation">100
> > key="highOverAllocation">100
> > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24
> > 4000
> > key="maxWallTime">00:05:00
> > key="disableIdleBlockCleanup">true
> > 1
> > 1
> > 1
> > batch
> > 8.00
> > 10000
> >
> > /lustre/beagle/davidk
> >
> > I am seeing this error:
> >
> > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the
> > declaration of element 'config'.
> >
> > ______________________________________________________________________
> > From: "Mike Wilde"
> > Sent: Tuesday, February 19, 2013 1:50:15 PM
> > Subject: [Swift Support #22699] Fwd: [Swift-devel] First tests
> > with swift faster
> >
> > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon.
> > Transaction: Ticket created by wilde at mcs.anl.gov
> > Queue: swift-support
> > Subject: Fwd: [Swift-devel] First tests with swift faster
> > Owner: Nobody
> > Requestors: wilde at ci.uchicago.edu
> > Status: new
> > Ticket URL: https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699
> >
> > David, Mihael, Yadu: could one of you try this on Beagle on
> > the faster branch?
> >
> > Does the faster branch include the PBS support for Beagle?
> >
> > It shouldn't be too hard to see what part of the PBS pool def
> > it doesn't like.
> >
> > - Mike
> >
> > ----- Forwarded Message -----
> > From: "Lorenzo Pesce"
> > To: "Swift Devel"
> > Sent: Tuesday, February 19, 2013 1:26:20 PM
> > Subject: [Swift-devel] First tests with swift faster
> >
> > This is the content of the file where we have the first
> > complaint from swift (see attached):
> >
> > key="project">CI-DEB000002
> >
> > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24
> >
> > 24
> > 172800
> > key="maxwalltime">0:10:00
> > key="lowOverallocation">100
> > key="highOverallocation">100
> >
> > 200
> > key="nodeGranularity">1
> > 1
> >
> > key="jobThrottle">47.99
> > key="initialScore">10000
> >
> > /lustre/beagle/samseaver/GS/swift.workdir
> >
> > Any ideas?
> >
> > Begin forwarded message:
> >
> > From: Sam Seaver < samseaver at gmail.com >
> > Date: February 19, 2013 1:16:28 PM CST
> > To: Lorenzo Pesce < lpesce at uchicago.edu >
> > Subject: Re: How are things going?
> >
> > I got this error.
I suspect using the new SWIFT_HOME > > directory > > means that there's possibly a missing parameter someplace: > > > > > > > > should we resume a previous calculation? [y/N] y > > rlog files displayed in reverse time order > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > y > > Using GS-20130203-0717-jgeppt98.0.rlog > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the > > declaration of element 'config'. > > > > > > Execution failed: > > Failed to parse site catalog > > swift:siteCatalog @ scheduler.k, line: 31 > > Caused by: Invalid pool entry 'pbs': > > swift:siteCatalog @ scheduler.k, line: 31 > > Caused by: java.lang.IllegalArgumentException: Missing URL > > at > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > at > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > at > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > at > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > at > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > at > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > at > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > at > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > at > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > at > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > at > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > at > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > at > > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > at > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > at > > 
org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > at > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > at > > org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > at k.thr.LWThread.run(LWThread.java:243) > > at > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > at java.util.concurrent.ThreadPoolExecutor > > $Worker.run(ThreadPoolExecutor.java:603) > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < > > samseaver at gmail.com > wrote: > > > > > > > > OK, it got to the point where it really did hang. I'm > > retrying, but with your suggestions. The other three > > finished > > fine! > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 Selecting > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > successfully:132323 Failed but can retry:183 > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 Selecting > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > successfully:132323 Failed but can retry:183 > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 Selecting > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > successfully:132323 Failed but can retry:183 > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 Selecting > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > successfully:132323 Failed but can retry:183 > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 Selecting > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > successfully:132323 Failed but can retry:183 > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 Selecting > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > lpesce at uchicago.edu > wrote: > > > > > > > > Hmm... 
> > > > > > foreach.max.threads=100 > > > > > > maybe you should increase this number a bit and see what > > happens. > > > > > > Also, I would try to replace > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > with > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > Keep me posted. Let's get this rolling. > > > > > > if it doesn't work, I can redo the packing. > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > Actually, the ten agents job does seem to be stuck in a > > strange loop. It is incrementing the number of jobs that > > has > > finished successfully, and at a fast pace, but the number > > of > > jobs its starting is decrementing much more slowly, its > > almost > > as its repeatedly attempting the same set of parameters > > multiple times... > > > > > > I'll see what it's doing in the morning > > S > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < > > samseaver at gmail.com > wrote: > > > > > > > > Seems to have worked overall this time! > > > > > > I resume four jobs, each were for a different number of > > agents > > (10,100,1000,10000) it made it easier for me to decide on > > the > > app time. Two of them have already finished i.e.: > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 Active:4 > > Checking status:1 Finished in previous run:148098 Finished > > successfully:37897 > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 Active:2 > > Checking status:1 Finished in previous run:148098 Finished > > successfully:37899 > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished in > > previous run:148098 Finished successfully:37902 > > > > > > and the only one that is showing any failure (50/110000), > > is > > the ten agents version which is so short I can understand > > why, > > but its still actively trying to run jobs and is actively > > finishing jobs, so that's good. > > > > > > Yay! 
> > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > lpesce at uchicago.edu > wrote: > > > > > > > > Good. Keep me posted, I would really like to solve your > > problems in running on Beagle this week, I wish that Swift > > would have been friendlier. > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > I just resumed the jobs that I'd killed before the system > > went > > down, lets see how it does. I always did a mini-review of > > the > > data I've got and it seems to be working as expected. > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > lpesce at uchicago.edu > wrote: > > > > > > > > I have lost track a bit of what's up. I am happy to try and > > go > > over it with you when you are ready. > > > > > > Some of the problems of swift might have improved with a > > new > > version and the new system. > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > They're not, I've not looked since Beagle came back up. > > Will > > do so later today. > > S > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > -- > > Postdoctoral Fellow > > Mathematics and Computer Science Division > > Argonne National Laboratory > > 9700 S. Cass Avenue > > Argonne, IL 60439 > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > samseaver at gmail.com > > (773) 796-7144 > > > > "We shall not cease from exploration > > And the end of all our exploring > > Will be to arrive where we started > > And know the place for the first time." > > --T. S. Eliot > > > > > > > > > > -- > > Postdoctoral Fellow > > Mathematics and Computer Science Division > > Argonne National Laboratory > > 9700 S. 
Cass Avenue > > Argonne, IL 60439 > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > samseaver at gmail.com > > (773) 796-7144 > > > > "We shall not cease from exploration > > And the end of all our exploring > > Will be to arrive where we started > > And know the place for the first time." > > --T. S. Eliot > > > > > > > > > > -- > > Postdoctoral Fellow > > Mathematics and Computer Science Division > > Argonne National Laboratory > > 9700 S. Cass Avenue > > Argonne, IL 60439 > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > samseaver at gmail.com > > (773) 796-7144 > > > > "We shall not cease from exploration > > And the end of all our exploring > > Will be to arrive where we started > > And know the place for the first time." > > --T. S. Eliot > > > > > > > > -- > > Postdoctoral Fellow > > Mathematics and Computer Science Division > > Argonne National Laboratory > > 9700 S. Cass Avenue > > Argonne, IL 60439 > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > samseaver at gmail.com > > (773) 796-7144 > > > > "We shall not cease from exploration > > And the end of all our exploring > > Will be to arrive where we started > > And know the place for the first time." > > --T. S. Eliot > > > > > > > > > > -- > > Postdoctoral Fellow > > Mathematics and Computer Science Division > > Argonne National Laboratory > > 9700 S. Cass Avenue > > Argonne, IL 60439 > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > samseaver at gmail.com > > (773) 796-7144 > > > > "We shall not cease from exploration > > And the end of all our exploring > > Will be to arrive where we started > > And know the place for the first time." > > --T. S. Eliot > > > > > > > > -- > > Postdoctoral Fellow > > Mathematics and Computer Science Division > > Argonne National Laboratory > > 9700 S. 
Cass Avenue > > Argonne, IL 60439 > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > samseaver at gmail.com > > (773) 796-7144 > > > > "We shall not cease from exploration > > And the end of all our exploring > > Will be to arrive where we started > > And know the place for the first time." > > --T. S. Eliot > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >

From wilde at mcs.anl.gov Wed Jul 10 13:09:04 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 13:09:04 -0500 (CDT)
Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
In-Reply-To: <1668136297.8844433.1373478824678.JavaMail.root@mcs.anl.gov>
Message-ID: <1733118118.8853440.1373479744368.JavaMail.root@mcs.anl.gov>

Mihael,

I've run into a more serious problem: I'm running on a single SGE node with 32 cores. The Swift script has submitted 400+ app() tasks; 32 completed, but coasters then doesn't seem to be sending new jobs to the node.

The log is at: http://www.ci.uchicago.edu/~wilde/paintgrid-20130710-1252-v9p47hm8.log

I'll try the same on a 0.94.1 rev.

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Swift Devel"
> Sent: Wednesday, July 10, 2013 12:53:44 PM
> Subject: Initial tests of new (faster) trunk
>
> Initial tests of the new trunk are working for me, but I'm seeing three odd things:
>
> 1. [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the declaration of element 'config'. (as reported in the email below)
>
> 2. {env.HOME} is not interpreted in sites.xml as in prior versions. Swift created a workdir named "$PWD/{env.HOME}/swiftwork"
>
> 3. Progress ticker lines seem to be defaulting to one per second, even with no status changing.
>
> But so far so good - a first test script using the new code is running nicely on SGE.
> > - Mike > > ----- Original Message ----- > > From: "Mihael Hategan" > > To: "David Kelly" > > Cc: swift-support at ci.uchicago.edu, "Michael Wilde" > > > > Sent: Tuesday, February 19, 2013 2:46:14 PM > > Subject: Re: [Swift Support #22699] Fwd: [Swift-devel] First tests > > with swift faster > > > > Yeah. The validation fails. You can ignore it for now. I'll fix in > > the > > future. Basically there is code to validate the sites file against > > the > > XML schema, but it fails. It's not a fatal issue though, and > > parsing > > still happens. > > > > Mihael > > > > On Tue, 2013-02-19 at 14:22 -0600, David Kelly wrote: > > > I tried updating from svn and running with the added url tags: > > > > > > > > > > > > > > > > > > > > > > > > > > url="localhost"/> > > > 1 > > > 100 > > > > > key="highOverAllocation">100 > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > 4000 > > > 00:05:00 > > > > > key="disableIdleBlockCleanup">true > > > 1 > > > 1 > > > 1 > > > batch > > > 8.00 > > > 10000 > > > > > > /lustre/beagle/davidk > > > > > > > > > > > > > > > > > > > > > I am seeing this error: > > > > > > > > > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the > > > declaration of element 'config'. > > > > > > > > > ----- Original Message ----- > > > > > > > > > From: "Mike Wilde" > > > Sent: Tuesday, February 19, 2013 1:50:15 PM > > > Subject: [Swift Support #22699] Fwd: [Swift-devel] First tests > > > with > > > swift faster > > > > > > > > > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon. > > > Transaction: Ticket created by wilde at mcs.anl.gov > > > Queue: swift-support > > > Subject: Fwd: [Swift-devel] First tests with swift faster > > > Owner: Nobody > > > Requestors: wilde at ci.uchicago.edu > > > Status: new > > > Ticket > > https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699 > > > > > > > > > > > > > David, Mihael, Yadu: could one of you try this on Beagle on the > > > faster branch? 
> > > > > > Does the faster branch include the PBS support for Beagle? > > > > > > It shouldnt be too hard to see what part of the PBS pool def it > > > doesnt like. > > > > > > - Mike > > > > > > ----- Forwarded Message ----- > > > From: "Lorenzo Pesce" > > > To: "Swift Devel" > > > Sent: Tuesday, February 19, 2013 1:26:20 PM > > > Subject: [Swift-devel] First tests with swift faster > > > > > > > > > This is the content of the file where we have the first complaint > > > from swift (see attached): > > > > > > > > > > > > > > > > > > > > > CI-DEB000002 > > > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > > > > > > > > > > > 24 > > > 172800 > > > 0:10:00 > > > 100 > > > > > key="highOverallocation">100 > > > > > > > > > 200 > > > 1 > > > 1 > > > > > > > > > 47.99 > > > 10000 > > > > > > > > > > > > > > > /lustre/beagle/samseaver/GS/swift.workdir > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > Begin forwarded message: > > > > > > > > > > > > From: Sam Seaver < samseaver at gmail.com > > > > > > > Date: February 19, 2013 1:16:28 PM CST > > > > > > To: Lorenzo Pesce < lpesce at uchicago.edu > > > > > > > Subject: Re: How are things going? > > > > > > > > > I got this error. I suspect using the new SWIFT_HOME directory > > > means that there's possibly a missing parameter someplace: > > > > > > > > > > > > should we resume a previous calculation? [y/N] y > > > rlog files displayed in reverse time order > > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > > y > > > Using GS-20130203-0717-jgeppt98.0.rlog > > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the declaration > > > of > > > element 'config'. 
> > > > > > > > > Execution failed: > > > Failed to parse site catalog > > > swift:siteCatalog @ scheduler.k, line: 31 > > > Caused by: Invalid pool entry 'pbs': > > > swift:siteCatalog @ scheduler.k, line: 31 > > > Caused by: java.lang.IllegalArgumentException: Missing URL > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > > at > > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > at > > > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > at org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > > at k.thr.LWThread.run(LWThread.java:243) > > > at > > > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > at > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < samseaver at gmail.com > > > > > > > wrote: > > > > > > > > > > > > OK, it got to the point where it really did hang. I'm retrying, > > > but > > > with your suggestions. The other three finished fine! > > > > > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > Hmm... > > > > > > > > > foreach.max.threads=100 > > > > > > > > > maybe you should increase this number a bit and see what happens. 
> > > > > > > > > Also, I would try to replace > > > > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > > > > with > > > > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > > > > Keep me posted. Let's get this rolling. > > > > > > > > > if it doesn't work, I can redo the packing. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > > > > > Actually, the ten agents job does seem to be stuck in a strange > > > loop. It is incrementing the number of jobs that has finished > > > successfully, and at a fast pace, but the number of jobs its > > > starting is decrementing much more slowly, its almost as its > > > repeatedly attempting the same set of parameters multiple > > > times... > > > > > > > > > I'll see what it's doing in the morning > > > S > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < samseaver at gmail.com > > > > > > > wrote: > > > > > > > > > > > > Seems to have worked overall this time! > > > > > > > > > I resume four jobs, each were for a different number of agents > > > (10,100,1000,10000) it made it easier for me to decide on the app > > > time. Two of them have already finished i.e.: > > > > > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 Active:4 Checking > > > status:1 Finished in previous run:148098 Finished > > > successfully:37897 > > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 Active:2 Checking > > > status:1 Finished in previous run:148098 Finished > > > successfully:37899 > > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished in > > > previous > > > run:148098 Finished successfully:37902 > > > > > > > > > and the only one that is showing any failure (50/110000), is the > > > ten agents version which is so short I can understand why, but > > > its > > > still actively trying to run jobs and is actively finishing jobs, > > > so that's good. > > > > > > > > > Yay! 
> > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > Good. Keep me posted, I would really like to solve your problems > > > in > > > running on Beagle this week, I wish that Swift would have been > > > friendlier. > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > > > > > I just resumed the jobs that I'd killed before the system went > > > down, lets see how it does. I always did a mini-review of the > > > data > > > I've got and it seems to be working as expected. > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > I have lost track a bit of what's up. I am happy to try and go > > > over > > > it with you when you are ready. > > > > > > > > > Some of the problems of swift might have improved with a new > > > version and the new system. > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > > > > > They're not, I've not looked since Beagle came back up. Will do > > > so > > > later today. > > > S > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > > -- > > > Postdoctoral Fellow > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > 9700 S. Cass Avenue > > > Argonne, IL 60439 > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > samseaver at gmail.com > > > (773) 796-7144 > > > > > > "We shall not cease from exploration > > > And the end of all our exploring > > > Will be to arrive where we started > > > And know the place for the first time." > > > --T. S. Eliot > > > > > > > > > > > > > > > -- > > > Postdoctoral Fellow > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > 9700 S. 
Cass Avenue > > > Argonne, IL 60439 > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > samseaver at gmail.com > > > (773) 796-7144 > > > > > > "We shall not cease from exploration > > > And the end of all our exploring > > > Will be to arrive where we started > > > And know the place for the first time." > > > --T. S.
Eliot > > > > > > > > > > > > I tried updating from svn and running with the added url tags: > > > > > > > > > > > > > > > > > > > > > > > url="localhost"/> > > > 1 > > > > > key="lowOverAllocation">100 > > > > > key="highOverAllocation">100 > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > 4000 > > > > > key="maxWallTime">00:05:00 > > > > > key="disableIdleBlockCleanup">true > > > 1 > > > 1 > > > 1 > > > batch > > > 8.00 > > > > > key="initialScore">10000 > > > > > > /lustre/beagle/davidk > > > > > > > > > > > > > > > > > > > > > I am seeing this error: > > > > > > > > > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the > > > declaration of element 'config'. > > > > > > > > > > > > > > > ______________________________________________________________________ > > > From: "Mike Wilde" > > > Sent: Tuesday, February 19, 2013 1:50:15 PM > > > Subject: [Swift Support #22699] Fwd: [Swift-devel] First > > > tests > > > with swift faster > > > > > > > > > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon. > > > Transaction: Ticket created by wilde at mcs.anl.gov > > > Queue: swift-support > > > Subject: Fwd: [Swift-devel] First tests with swift > > > faster > > > Owner: Nobody > > > Requestors: wilde at ci.uchicago.edu > > > Status: new > > > Ticket > > https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699 > > > > > > > > > > > > > David, Mihael, Yadu: could one of you try this on Beagle > > > on > > > the faster branch? > > > > > > Does the faster branch include the PBS support for > > > Beagle? > > > > > > It shouldnt be too hard to see what part of the PBS pool > > > def > > > it doesnt like. 
> > > > > > - Mike > > > > > > ----- Forwarded Message ----- > > > From: "Lorenzo Pesce" > > > To: "Swift Devel" > > > Sent: Tuesday, February 19, 2013 1:26:20 PM > > > Subject: [Swift-devel] First tests with swift faster > > > > > > > > > This is the content of the file where we have the first > > > complaint from swift (see attached): > > > > > > > > > > > > > > > > > > > > > > > key="project">CI-DEB000002 > > > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > > > > > > > > > > > > > key="jobsPerNode">24 > > > > > key="maxTime">172800 > > > > > key="maxwalltime">0:10:00 > > > > > key="lowOverallocation">100 > > > > > key="highOverallocation">100 > > > > > > > > > 200 > > > > > key="nodeGranularity">1 > > > 1 > > > > > > > > > > > key="jobThrottle">47.99 > > > > > key="initialScore">10000 > > > > > > > > > > > > > > > /lustre/beagle/samseaver/GS/swift.workdir > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > Begin forwarded message: > > > > > > > > > > > > From: Sam Seaver < samseaver at gmail.com > > > > > > > Date: February 19, 2013 1:16:28 PM CST > > > > > > To: Lorenzo Pesce < lpesce at uchicago.edu > > > > > > > Subject: Re: How are things going? > > > > > > > > > I got this error. I suspect using the new SWIFT_HOME > > > directory > > > means that there's possibly a missing parameter > > > someplace: > > > > > > > > > > > > should we resume a previous calculation? [y/N] y > > > rlog files displayed in reverse time order > > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > > y > > > Using GS-20130203-0717-jgeppt98.0.rlog > > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the > > > declaration of element 'config'. 
> > > > > > > > > Execution failed: > > > Failed to parse site catalog > > > swift:siteCatalog @ scheduler.k, line: 31 > > > Caused by: Invalid pool entry 'pbs': > > > swift:siteCatalog @ scheduler.k, line: 31 > > > Caused by: java.lang.IllegalArgumentException: Missing > > > URL > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > > at > > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > > at > > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > at > > > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > > at > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > at > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > at > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > at > > > org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > > at k.thr.LWThread.run(LWThread.java:243) > > > at > > > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > at java.util.concurrent.ThreadPoolExecutor > > > $Worker.run(ThreadPoolExecutor.java:603) > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < > > > samseaver at gmail.com > wrote: > > > > > > > > > > > > OK, it got to the point where it really did hang. I'm > > > retrying, but with your suggestions. The other three > > > finished > > > fine! > > > > > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 Selecting > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > Hmm... > > > > > > > > > foreach.max.threads=100 > > > > > > > > > maybe you should increase this number a bit and see what > > > happens. 
> > > > > > > > > Also, I would try to replace > > > > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > > > > with > > > > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > > > > Keep me posted. Let's get this rolling. > > > > > > > > > if it doesn't work, I can redo the packing. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > > > > > Actually, the ten agents job does seem to be stuck in a > > > strange loop. It is incrementing the number of jobs that > > > has > > > finished successfully, and at a fast pace, but the number > > > of > > > jobs its starting is decrementing much more slowly, its > > > almost > > > as its repeatedly attempting the same set of parameters > > > multiple times... > > > > > > > > > I'll see what it's doing in the morning > > > S > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < > > > samseaver at gmail.com > wrote: > > > > > > > > > > > > Seems to have worked overall this time! > > > > > > > > > I resume four jobs, each were for a different number of > > > agents > > > (10,100,1000,10000) it made it easier for me to decide on > > > the > > > app time. Two of them have already finished i.e.: > > > > > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 Active:4 > > > Checking status:1 Finished in previous run:148098 > > > Finished > > > successfully:37897 > > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 Active:2 > > > Checking status:1 Finished in previous run:148098 > > > Finished > > > successfully:37899 > > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished in > > > previous run:148098 Finished successfully:37902 > > > > > > > > > and the only one that is showing any failure (50/110000), > > > is > > > the ten agents version which is so short I can understand > > > why, > > > but its still actively trying to run jobs and is actively > > > finishing jobs, so that's good. 
> > > > > > > > > Yay! > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > Good. Keep me posted, I would really like to solve your > > > problems in running on Beagle this week, I wish that > > > Swift > > > would have been friendlier. > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > > > > > I just resumed the jobs that I'd killed before the system > > > went > > > down, lets see how it does. I always did a mini-review of > > > the > > > data I've got and it seems to be working as expected. > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > I have lost track a bit of what's up. I am happy to try > > > and > > > go > > > over it with you when you are ready. > > > > > > > > > Some of the problems of swift might have improved with a > > > new > > > version and the new system. > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > > > > > They're not, I've not looked since Beagle came back up. > > > Will > > > do so later today. > > > S > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > > -- > > > Postdoctoral Fellow > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > 9700 S. Cass Avenue > > > Argonne, IL 60439 > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > samseaver at gmail.com > > > (773) 796-7144 > > > > > > "We shall not cease from exploration > > > And the end of all our exploring > > > Will be to arrive where we started > > > And know the place for the first time." > > > --T. S. 
Eliot

From wilde at mcs.anl.gov Wed Jul 10 13:47:05 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 13:47:05 -0500 (CDT)
Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
In-Reply-To: <1733118118.8853440.1373479744368.JavaMail.root@mcs.anl.gov>
Message-ID: <1893765252.8877833.1373482025672.JavaMail.root@mcs.anl.gov>

Mihael, the 0.94 branch (latest rev) seems to behave the same way on this workflow.

What I see in the log is below, with the last set of messages repeating, with no new app tasks starting or completing, until I kill the script (i.e. Allocating blocks for a total walltime of: 35s; BlockQueueProcessor Jobs in holding queue: 32).

I'll test this again on a different scheduler. The app tasks should run for about 5 secs, even though I've used a default maxwalltime of 15 min.
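The repeating pattern described above (BlockQueueProcessor reporting the same non-zero holding queue on every poll) can be checked mechanically. A minimal sketch in Python, assuming the log line format quoted in this message; the function names and the three-repeat threshold are hypothetical and not part of Swift:

```python
import re

# Assumed line format, taken from the log excerpt in this message:
#   ... INFO BlockQueueProcessor Jobs in holding queue: 32
HOLDING_QUEUE = re.compile(r"BlockQueueProcessor Jobs in holding queue: (\d+)")


def holding_queue_sizes(lines):
    """Collect the holding-queue size from each matching log line, in order."""
    sizes = []
    for line in lines:
        match = HOLDING_QUEUE.search(line)
        if match:
            sizes.append(int(match.group(1)))
    return sizes


def looks_stalled(lines, min_repeats=3):
    """Return True if the last `min_repeats` readings are equal and non-zero.

    This is only a hint (hypothetical heuristic): a constant queue size
    can also occur briefly during normal operation.
    """
    sizes = holding_queue_sizes(lines)
    if len(sizes) < min_repeats:
        return False
    tail = sizes[-min_repeats:]
    return tail[0] > 0 and all(size == tail[0] for size in tail)
```

Passing an open log file (any iterable of lines) to `looks_stalled` would flag a run like the one below, where the holding queue stays at 32; a larger `min_repeats` makes the check less jumpy.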
- Mike

2013-07-10 13:32:21,243-0500 DEBUG swift CDM: file://localhost/data.0001.tiny : DEFAULT
2013-07-10 13:32:21,243-0500 DEBUG swift CDM: file://localhost/processpoints.py : DEFAULT
2013-07-10 13:32:21,243-0500 INFO LateBindingScheduler jobs queued: 437
2013-07-10 13:32:21,243-0500 DEBUG swift CDM: file://localhost/out/seq/seq00313 : DEFAULT
2013-07-10 13:32:21,244-0500 DEBUG swift FILE_STAGE_IN_START file=seq00313 srchost=localhost srcdir=out/seq srcname=seq00313 desthost=cluster destdir=paintgrid-20130710-1331-g435v1l3/shared/out/seq provider=file policy=DEFAULT
2013-07-10 13:32:21,248-0500 INFO LateBindingScheduler jobs queued: 437
2013-07-10 13:32:21,248-0500 DEBUG swift FILE_STAGE_IN_END file=seq00313 srchost=localhost srcdir=out/seq srcname=seq00313 desthost=cluster destdir=paintgrid-20130710-1331-g435v1l3/shared/out/seq provider=file
2013-07-10 13:32:21,248-0500 INFO swift END jobid=python-69som5cl - Staging in finished
2013-07-10 13:32:21,249-0500 DEBUG swift JOB_START jobid=python-69som5cl tr=python arguments=[processpoints.py, data.0001.tiny, out/seq/seq00313, 0.0] tmpdir=paintgrid-20130710-1331-g435v1l3/jobs/6/python-69som5cl host=cluster
2013-07-10 13:32:21,251-0500 INFO GridExec TASK_DEFINITION: Task(type=JOB_SUBMISSION, identity=urn:0-5-296-1-1-1373481110671) is /bin/bash shared/_swiftwrap python-69som5cl -jobdir 6 -scratch -e python -out out/out.00313 -err stderr.txt -i -d out/seq|out -if processpoints.py|data.0001.tiny|out/seq/seq00313 -of out/out.00313 -k -cdmfile -status provider -a processpoints.py data.0001.tiny out/seq/seq00313 0.0
2013-07-10 13:32:21,252-0500 INFO RequestHandler Handler(tag: 65, SUBMITJOB) unregistering (send)
2013-07-10 13:32:21,312-0500 INFO BlockQueueProcessor Jobs in holding queue: 32
2013-07-10 13:32:21,312-0500 INFO BlockQueueProcessor Time estimate for holding queue (seconds): 36
2013-07-10 13:32:21,312-0500 INFO BlockQueueProcessor Allocating blocks for a total walltime of: 35s
2013-07-10
13:32:22,354-0500 INFO BlockQueueProcessor Jobs in holding queue: 32
2013-07-10 13:32:22,354-0500 INFO BlockQueueProcessor Time estimate for holding queue (seconds): 36
2013-07-10 13:32:22,354-0500 INFO BlockQueueProcessor Allocating blocks for a total walltime of: 35s
2013-07-10 13:32:23,377-0500 INFO BlockQueueProcessor Jobs in holding queue: 32

----- Original Message -----
> From: "Michael Wilde"
> To: "Swift Devel"
> Sent: Wednesday, July 10, 2013 1:09:04 PM
> Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
>
> Mihael,
>
> I've run into a more serious problem: I'm running on a single SGE node with 32 cores.
>
> The swift script has submitted about 400+ app() tasks.
>
> 32 completed, but then coasters doesn't seem to be sending new jobs to the node.
>
> The log is at:
> http://www.ci.uchicago.edu/~wilde/paintgrid-20130710-1252-v9p47hm8.log
>
> I'll try the same on a 0.94.1 rev.
>
> - Mike
>
> ----- Original Message -----
> > From: "Michael Wilde"
> > To: "Swift Devel"
> > Sent: Wednesday, July 10, 2013 12:53:44 PM
> > Subject: Initial tests of new (faster) trunk
> >
> > Initial tests of the new trunk are working for me, but I'm seeing three odd things:
> >
> > 1. [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the declaration of element 'config'.
> > (as reported in the email below)
> >
> > 2. {env.HOME} is not interpreted in sites.xml as in prior versions.
> > Swift created a workdir named "$PWD/{env.HOME}/swiftwork"
> >
> > 3. Progress ticker lines seem to be defaulting to one per second, even with no status changing.
> >
> > But so far so good - a first test script using the new code is running nicely on SGE.
> > > > - Mike > > > > ----- Original Message ----- > > > From: "Mihael Hategan" > > > To: "David Kelly" > > > Cc: swift-support at ci.uchicago.edu, "Michael Wilde" > > > > > > Sent: Tuesday, February 19, 2013 2:46:14 PM > > > Subject: Re: [Swift Support #22699] Fwd: [Swift-devel] First > > > tests > > > with swift faster > > > > > > Yeah. The validation fails. You can ignore it for now. I'll fix > > > in > > > the > > > future. Basically there is code to validate the sites file > > > against > > > the > > > XML schema, but it fails. It's not a fatal issue though, and > > > parsing > > > still happens. > > > > > > Mihael > > > > > > On Tue, 2013-02-19 at 14:22 -0600, David Kelly wrote: > > > > I tried updating from svn and running with the added url tags: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > url="localhost"/> > > > > 1 > > > > > > > key="lowOverAllocation">100 > > > > > > > key="highOverAllocation">100 > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > 4000 > > > > > > > key="maxWallTime">00:05:00 > > > > > > > key="disableIdleBlockCleanup">true > > > > 1 > > > > 1 > > > > 1 > > > > batch > > > > 8.00 > > > > 10000 > > > > > > > > /lustre/beagle/davidk > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am seeing this error: > > > > > > > > > > > > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find > > > > the > > > > declaration of element 'config'. > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > From: "Mike Wilde" > > > > Sent: Tuesday, February 19, 2013 1:50:15 PM > > > > Subject: [Swift Support #22699] Fwd: [Swift-devel] First tests > > > > with > > > > swift faster > > > > > > > > > > > > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon. 
> > > > Transaction: Ticket created by wilde at mcs.anl.gov > > > > Queue: swift-support > > > > Subject: Fwd: [Swift-devel] First tests with swift faster > > > > Owner: Nobody > > > > Requestors: wilde at ci.uchicago.edu > > > > Status: new > > > > Ticket > > > https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699 > > > > > > > > > > > > > > > > > David, Mihael, Yadu: could one of you try this on Beagle on the > > > > faster branch? > > > > > > > > Does the faster branch include the PBS support for Beagle? > > > > > > > > It shouldnt be too hard to see what part of the PBS pool def it > > > > doesnt like. > > > > > > > > - Mike > > > > > > > > ----- Forwarded Message ----- > > > > From: "Lorenzo Pesce" > > > > To: "Swift Devel" > > > > Sent: Tuesday, February 19, 2013 1:26:20 PM > > > > Subject: [Swift-devel] First tests with swift faster > > > > > > > > > > > > This is the content of the file where we have the first > > > > complaint > > > > from swift (see attached): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > key="project">CI-DEB000002 > > > > > > > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > > > > > > > > > > > > > > > > 24 > > > > 172800 > > > > 0:10:00 > > > > > > > key="lowOverallocation">100 > > > > > > > key="highOverallocation">100 > > > > > > > > > > > > 200 > > > > 1 > > > > 1 > > > > > > > > > > > > 47.99 > > > > 10000 > > > > > > > > > > > > > > > > > > > > /lustre/beagle/samseaver/GS/swift.workdir > > > > > > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > > > > Begin forwarded message: > > > > > > > > > > > > > > > > From: Sam Seaver < samseaver at gmail.com > > > > > > > > > Date: February 19, 2013 1:16:28 PM CST > > > > > > > > To: Lorenzo Pesce < lpesce at uchicago.edu > > > > > > > > > Subject: Re: How are things going? > > > > > > > > > > > > I got this error. 
I suspect using the new SWIFT_HOME directory > > > > means that there's possibly a missing parameter someplace: > > > > > > > > > > > > > > > > should we resume a previous calculation? [y/N] y > > > > rlog files displayed in reverse time order > > > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > > > y > > > > Using GS-20130203-0717-jgeppt98.0.rlog > > > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the > > > > declaration > > > > of > > > > element 'config'. > > > > > > > > > > > > Execution failed: > > > > Failed to parse site catalog > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > Caused by: Invalid pool entry 'pbs': > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > Caused by: java.lang.IllegalArgumentException: Missing URL > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > at > > 
> > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > at org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > > > at k.thr.LWThread.run(LWThread.java:243) > > > > at > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > > at > > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < > > > > samseaver at gmail.com > > > > > > > > > wrote: > > > > > > > > > > > > > > > > OK, it got to the point where it really did hang. I'm retrying, > > > > but > > > > with your suggestions. The other three finished fine! 
> > > > > > > > > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > Hmm... > > > > > > > > > > > > foreach.max.threads=100 > > > > > > > > > > > > maybe you should increase this number a bit and see what > > > > happens. > > > > > > > > > > > > Also, I would try to replace > > > > > > > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > > > > > > > with > > > > > > > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > > > > > > > Keep me posted. Let's get this rolling. > > > > > > > > > > > > if it doesn't work, I can redo the packing. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > > > > > > > > > Actually, the ten agents job does seem to be stuck in a strange > > > > loop. It is incrementing the number of jobs that has finished > > > > successfully, and at a fast pace, but the number of jobs its > > > > starting is decrementing much more slowly, its almost as its > > > > repeatedly attempting the same set of parameters multiple > > > > times... > > > > > > > > > > > > I'll see what it's doing in the morning > > > > S > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < > > > > samseaver at gmail.com > > > > > > > > > wrote: > > > > > > > > > > > > > > > > Seems to have worked overall this time! > > > > > > > > > > > > I resume four jobs, each were for a different number of agents > > > > (10,100,1000,10000) it made it easier for me to decide on the > > > > app > > > > time. Two of them have already finished i.e.: > > > > > > > > > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 Active:4 > > > > Checking > > > > status:1 Finished in previous run:148098 Finished > > > > successfully:37897 > > > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 Active:2 > > > > Checking > > > > status:1 Finished in previous run:148098 Finished > > > > successfully:37899 > > > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished in > > > > previous > > > > run:148098 Finished successfully:37902 > > > > > > > > > > > > and the only one that is showing any failure (50/110000), is > > > > the > > > > ten agents version which is so short I can understand why, but > > > > its > > > > still actively trying to run jobs and is actively finishing > > > > jobs, > > > > so that's good. > > > > > > > > > > > > Yay! 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > Good. Keep me posted, I would really like to solve your > > > > problems > > > > in > > > > running on Beagle this week, I wish that Swift would have been > > > > friendlier. > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > I just resumed the jobs that I'd killed before the system went > > > > down, lets see how it does. I always did a mini-review of the > > > > data > > > > I've got and it seems to be working as expected. > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > I have lost track a bit of what's up. I am happy to try and go > > > > over > > > > it with you when you are ready. > > > > > > > > > > > > Some of the problems of swift might have improved with a new > > > > version and the new system. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > They're not, I've not looked since Beagle came back up. Will do > > > > so > > > > later today. > > > > S > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Postdoctoral Fellow > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > 9700 S. Cass Avenue > > > > Argonne, IL 60439 > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > samseaver at gmail.com > > > > (773) 796-7144 > > > > > > > > "We shall not cease from exploration > > > > And the end of all our exploring > > > > Will be to arrive where we started > > > > And know the place for the first time." > > > > --T. S. 
Eliot > > > > > > > > -- > > > > Postdoctoral Fellow > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > 9700 S.
Cass Avenue > > > > Argonne, IL 60439 > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > samseaver at gmail.com > > > > (773) 796-7144 > > > > > > > > "We shall not cease from exploration > > > > And the end of all our exploring > > > > Will be to arrive where we started > > > > And know the place for the first time." > > > > --T. S. Eliot > > > > > > > > > > > > > > > > I tried updating from svn and running with the added url tags: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > url="localhost"/> > > > > 1 > > > > > > > key="lowOverAllocation">100 > > > > > > > key="highOverAllocation">100 > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > 4000 > > > > > > > key="maxWallTime">00:05:00 > > > > > > > key="disableIdleBlockCleanup">true > > > > 1 > > > > > > > key="nodeGranularity">1 > > > > 1 > > > > batch > > > > > > > key="jobThrottle">8.00 > > > > > > > key="initialScore">10000 > > > > > > > > /lustre/beagle/davidk > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am seeing this error: > > > > > > > > > > > > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find > > > > the > > > > declaration of element 'config'. > > > > > > > > > > > > > > > > > > > > ______________________________________________________________________ > > > > From: "Mike Wilde" > > > > Sent: Tuesday, February 19, 2013 1:50:15 PM > > > > Subject: [Swift Support #22699] Fwd: [Swift-devel] > > > > First > > > > tests > > > > with swift faster > > > > > > > > > > > > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon. 
> > > > Transaction: Ticket created by wilde at mcs.anl.gov > > > > Queue: swift-support > > > > Subject: Fwd: [Swift-devel] First tests with swift > > > > faster > > > > Owner: Nobody > > > > Requestors: wilde at ci.uchicago.edu > > > > Status: new > > > > Ticket > > > https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699 > > > > > > > > > > > > > > > > > > > > > David, Mihael, Yadu: could one of you try this on > > > > Beagle > > > > on > > > > the faster branch? > > > > > > > > Does the faster branch include the PBS support for > > > > Beagle? > > > > > > > > It shouldnt be too hard to see what part of the PBS > > > > pool > > > > def > > > > it doesnt like. > > > > > > > > - Mike > > > > > > > > ----- Forwarded Message ----- > > > > From: "Lorenzo Pesce" > > > > To: "Swift Devel" > > > > Sent: Tuesday, February 19, 2013 1:26:20 PM > > > > Subject: [Swift-devel] First tests with swift faster > > > > > > > > > > > > This is the content of the file where we have the first > > > > complaint from swift (see attached): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > key="project">CI-DEB000002 > > > > > > > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > > > > > > > > > > > > > > > > > > > key="jobsPerNode">24 > > > > > > > key="maxTime">172800 > > > > > > > key="maxwalltime">0:10:00 > > > > > > > key="lowOverallocation">100 > > > > > > > key="highOverallocation">100 > > > > > > > > > > > > 200 > > > > > > > key="nodeGranularity">1 > > > > 1 > > > > > > > > > > > > > > > key="jobThrottle">47.99 > > > > > > > key="initialScore">10000 > > > > > > > > > > > > > > > > > > > > /lustre/beagle/samseaver/GS/swift.workdir > > > > > > > > > > > > > > > > > > > > Any ideas? 
> > > > > > > > > > > > Begin forwarded message: > > > > > > > > > > > > > > > > From: Sam Seaver < samseaver at gmail.com > > > > > > > > > Date: February 19, 2013 1:16:28 PM CST > > > > > > > > To: Lorenzo Pesce < lpesce at uchicago.edu > > > > > > > > > Subject: Re: How are things going? > > > > > > > > > > > > I got this error. I suspect using the new SWIFT_HOME > > > > directory > > > > means that there's possibly a missing parameter > > > > someplace: > > > > > > > > > > > > > > > > should we resume a previous calculation? [y/N] y > > > > rlog files displayed in reverse time order > > > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > > > y > > > > Using GS-20130203-0717-jgeppt98.0.rlog > > > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the > > > > declaration of element 'config'. > > > > > > > > > > > > Execution failed: > > > > Failed to parse site catalog > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > Caused by: Invalid pool entry 'pbs': > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > Caused by: java.lang.IllegalArgumentException: Missing > > > > URL > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > > > at > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > at > > > > 
org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > at > > > > org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > > > at k.thr.LWThread.run(LWThread.java:243) > > > > at > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > > at java.util.concurrent.ThreadPoolExecutor > > > > $Worker.run(ThreadPoolExecutor.java:603) > > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < > > > > samseaver at gmail.com > wrote: > > > > > > > > > > > > > > > > OK, it got to the point where it really did hang. I'm > > > > retrying, but with your suggestions. The other three > > > > finished > > > > fine! 
> > > > > > > > > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 > > > > Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 > > > > Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 > > > > Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 > > > > Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 > > > > Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 > > > > Selecting > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > Hmm... > > > > > > > > > > > > foreach.max.threads=100 > > > > > > > > > > > > maybe you should increase this number a bit and see > > > > what > > > > happens. > > > > > > > > > > > > Also, I would try to replace > > > > > > > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > > > > > > > with > > > > > > > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > > > > > > > Keep me posted. Let's get this rolling. > > > > > > > > > > > > if it doesn't work, I can redo the packing. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > > > > > > > > > Actually, the ten agents job does seem to be stuck in a > > > > strange loop. It is incrementing the number of jobs > > > > that > > > > has > > > > finished successfully, and at a fast pace, but the > > > > number > > > > of > > > > jobs its starting is decrementing much more slowly, its > > > > almost > > > > as its repeatedly attempting the same set of parameters > > > > multiple times... > > > > > > > > > > > > I'll see what it's doing in the morning > > > > S > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < > > > > samseaver at gmail.com > wrote: > > > > > > > > > > > > > > > > Seems to have worked overall this time! > > > > > > > > > > > > I resume four jobs, each were for a different number of > > > > agents > > > > (10,100,1000,10000) it made it easier for me to decide > > > > on > > > > the > > > > app time. Two of them have already finished i.e.: > > > > > > > > > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 > > > > Active:4 > > > > Checking status:1 Finished in previous run:148098 > > > > Finished > > > > successfully:37897 > > > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 > > > > Active:2 > > > > Checking status:1 Finished in previous run:148098 > > > > Finished > > > > successfully:37899 > > > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished > > > > in > > > > previous run:148098 Finished successfully:37902 > > > > > > > > > > > > and the only one that is showing any failure > > > > (50/110000), > > > > is > > > > the ten agents version which is so short I can > > > > understand > > > > why, > > > > but its still actively trying to run jobs and is > > > > actively > > > > finishing jobs, so that's good. > > > > > > > > > > > > Yay! 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > Good. Keep me posted, I would really like to solve your > > > > problems in running on Beagle this week, I wish that > > > > Swift > > > > would have been friendlier. > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > I just resumed the jobs that I'd killed before the > > > > system > > > > went > > > > down, lets see how it does. I always did a mini-review > > > > of > > > > the > > > > data I've got and it seems to be working as expected. > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > I have lost track a bit of what's up. I am happy to try > > > > and > > > > go > > > > over it with you when you are ready. > > > > > > > > > > > > Some of the problems of swift might have improved with > > > > a > > > > new > > > > version and the new system. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > They're not, I've not looked since Beagle came back up. > > > > Will > > > > do so later today. > > > > S > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Postdoctoral Fellow > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > 9700 S. 
Cass Avenue > > > > Argonne, IL 60439 > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > samseaver at gmail.com > > > > (773) 796-7144 > > > > > > > > "We shall not cease from exploration > > > > And the end of all our exploring > > > > Will be to arrive where we started > > > > And know the place for the first time." > > > > --T. S.
Eliot > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > From wilde at mcs.anl.gov Wed Jul 10 14:22:20 2013 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 10 Jul 2013 14:22:20 -0500 (CDT) Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk In-Reply-To: <1893765252.8877833.1373482025672.JavaMail.root@mcs.anl.gov> Message-ID: <1458807953.8905645.1373484140557.JavaMail.root@mcs.anl.gov> Further tests show that a simple catsn workflow fails on this SGE system in the same manner (when using the same sites file) and that the failing script from the prior log succeeds on Midway (under 0.94). A log from the catsn failure is at: http://www.ci.uchicago.edu/~wilde/catsn-20130710-1415-t7cprvsd.log - Mike ----- Original Message ----- > From: "Michael Wilde" > To: "Swift Devel" , "Mihael Hategan" > Sent: Wednesday, July 10, 2013 1:47:05 PM > Subject: Re: [Swift-devel] Coasters does only one round of tasks in faster trunk > > > Mihael, the 0.94 branch (latest rev) seems to behave the same way on > this workflow. > > What I see in the log is below, with the last set of messages > repeating, with no new app tasks starting or completing, until I > kill the script (i.e., Allocating blocks for a total walltime of: 35s; > BlockQueueProcessor Jobs in holding queue: 32) > > I'll test this again on a different scheduler. The app tasks should > run for about 5 secs, even though I've used a default maxwalltime of > 15 min.
> > - Mike > > > 2013-07-10 13:32:21,243-0500 DEBUG swift CDM: > file://localhost/data.0001.tiny : DEFAULT > 2013-07-10 13:32:21,243-0500 DEBUG swift CDM: > file://localhost/processpoints.py : DEFAULT > 2013-07-10 13:32:21,243-0500 INFO LateBindingScheduler jobs queued: > 437 > 2013-07-10 13:32:21,243-0500 DEBUG swift CDM: > file://localhost/out/seq/seq00313 : DEFAULT > 2013-07-10 13:32:21,244-0500 DEBUG swift FILE_STAGE_IN_START > file=seq00313 srchost=localhost srcdir=out/seq srcname=seq00313 > desthost\ > =cluster destdir=paintgrid-20130710-1331-g435v1l3/shared/out/seq > provider=file policy=DEFAULT > 2013-07-10 13:32:21,248-0500 INFO LateBindingScheduler jobs queued: > 437 > 2013-07-10 13:32:21,248-0500 DEBUG swift FILE_STAGE_IN_END > file=seq00313 srchost=localhost srcdir=out/seq srcname=seq00313 > desthost=c\ > luster destdir=paintgrid-20130710-1331-g435v1l3/shared/out/seq > provider=file > 2013-07-10 13:32:21,248-0500 INFO swift END jobid=python-69som5cl - > Staging in finished > 2013-07-10 13:32:21,249-0500 DEBUG swift JOB_START > jobid=python-69som5cl tr=python arguments=[processpoints.py, > data.0001.tiny, out/s\ > eq/seq00313, 0.0] > tmpdir=paintgrid-20130710-1331-g435v1l3/jobs/6/python-69som5cl > host=cluster > 2013-07-10 13:32:21,251-0500 INFO GridExec TASK_DEFINITION: > Task(type=JOB_SUBMISSION, identity=urn:0-5-296-1-1-1373481110671) is > /bi\ > n/bash shared/_swiftwrap python-69som5cl -jobdir 6 -scratch -e > python -out out/out.00313 -err stderr.txt -i -d out/seq|out -if > proce\ > sspoints.py|data.0001.tiny|out/seq/seq00313 -of out/out.00313 -k > -cdmfile -status provider -a processpoints.py data.0001.tiny > out/s\ > eq/seq00313 0.0 > 2013-07-10 13:32:21,252-0500 INFO RequestHandler Handler(tag: 65, > SUBMITJOB) unregistering (send) > 2013-07-10 13:32:21,312-0500 INFO BlockQueueProcessor Jobs in > holding queue: 32 > 2013-07-10 13:32:21,312-0500 INFO BlockQueueProcessor Time estimate > for holding queue (seconds): 36 > 2013-07-10 13:32:21,312-0500 
INFO BlockQueueProcessor Allocating > blocks for a total walltime of: 35s > 2013-07-10 13:32:22,354-0500 INFO BlockQueueProcessor Jobs in > holding queue: 32 > 2013-07-10 13:32:22,354-0500 INFO BlockQueueProcessor Time estimate > for holding queue (seconds): 36 > 2013-07-10 13:32:22,354-0500 INFO BlockQueueProcessor Allocating > blocks for a total walltime of: 35s > 2013-07-10 13:32:23,377-0500 INFO BlockQueueProcessor Jobs in > holding queue: 32 > > > ----- Original Message ----- > > From: "Michael Wilde" > > To: "Swift Devel" > > Sent: Wednesday, July 10, 2013 1:09:04 PM > > Subject: [Swift-devel] Coasters does only one round of tasks in > > faster trunk > > > > Mihael, > > > > I've run into a more serious problem: I'm running on a single SGE > > node > > with 32 cores. > > > > The swift script has submitted about 400+ app() tasks > > > > 32 completed, but then coasters doesn't seem to be sending new jobs > > to the node. > > > > The log is at: > > http://www.ci.uchicago.edu/~wilde/paintgrid-20130710-1252-v9p47hm8.log > > > > I'll try the same on a 0.94.1 rev. > > > > - Mike > > > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" > > > To: "Swift Devel" > > > Sent: Wednesday, July 10, 2013 12:53:44 PM > > > Subject: Initial tests of new (faster) trunk > > > > > > Initial tests of the new trunk are working for me, but I'm seeing > > > three odd things: > > > > > > 1. [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find > > > the > > > declaration of element 'config'. > > > (as reported in the email below) > > > > > > 2. {env.HOME} is not interpreted in sites.xml as in prior > > > versions. > > > Swift created a workdir named "$PWD/{env.HOME}/swiftwork" > > > > > > 3. Progress ticker lines seem to be defaulting to one per second, > > > even with no status changing. > > > > > > But so far so good - a first test script using the new code is > > > running nicely on SGE.
> > > > > > - Mike > > > > > > ----- Original Message ----- > > > > From: "Mihael Hategan" > > > > To: "David Kelly" > > > > Cc: swift-support at ci.uchicago.edu, "Michael Wilde" > > > > > > > > Sent: Tuesday, February 19, 2013 2:46:14 PM > > > > Subject: Re: [Swift Support #22699] Fwd: [Swift-devel] First > > > > tests > > > > with swift faster > > > > > > > > Yeah. The validation fails. You can ignore it for now. I'll fix > > > > in > > > > the > > > > future. Basically there is code to validate the sites file > > > > against > > > > the > > > > XML schema, but it fails. It's not a fatal issue though, and > > > > parsing > > > > still happens. > > > > > > > > Mihael > > > > > > > > On Tue, 2013-02-19 at 14:22 -0600, David Kelly wrote: > > > > > I tried updating from svn and running with the added url > > > > > tags: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > url="localhost"/> > > > > > 1 > > > > > > > > > key="lowOverAllocation">100 > > > > > > > > > key="highOverAllocation">100 > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > 4000 > > > > > > > > > key="maxWallTime">00:05:00 > > > > > > > > > key="disableIdleBlockCleanup">true > > > > > 1 > > > > > 1 > > > > > 1 > > > > > batch > > > > > 8.00 > > > > > > > > > key="initialScore">10000 > > > > > > > > > > /lustre/beagle/davidk > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am seeing this error: > > > > > > > > > > > > > > > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find > > > > > the > > > > > declaration of element 'config'. > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > > > > From: "Mike Wilde" > > > > > Sent: Tuesday, February 19, 2013 1:50:15 PM > > > > > Subject: [Swift Support #22699] Fwd: [Swift-devel] First > > > > > tests > > > > > with > > > > > swift faster > > > > > > > > > > > > > > > Tue Feb 19 13:50:14 2013: Request 22699 was acted upon. 
> > > > > Transaction: Ticket created by wilde at mcs.anl.gov > > > > > Queue: swift-support > > > > > Subject: Fwd: [Swift-devel] First tests with swift faster > > > > > Owner: Nobody > > > > > Requestors: wilde at ci.uchicago.edu > > > > > Status: new > > > > > Ticket > > > > https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699 > > > > > > > > > > > > > > > > > > > > > David, Mihael, Yadu: could one of you try this on Beagle on > > > > > the > > > > > faster branch? > > > > > > > > > > Does the faster branch include the PBS support for Beagle? > > > > > > > > > > It shouldnt be too hard to see what part of the PBS pool def > > > > > it > > > > > doesnt like. > > > > > > > > > > - Mike > > > > > > > > > > ----- Forwarded Message ----- > > > > > From: "Lorenzo Pesce" > > > > > To: "Swift Devel" > > > > > Sent: Tuesday, February 19, 2013 1:26:20 PM > > > > > Subject: [Swift-devel] First tests with swift faster > > > > > > > > > > > > > > > This is the content of the file where we have the first > > > > > complaint > > > > > from swift (see attached): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > key="project">CI-DEB000002 > > > > > > > > > > > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > > > > > > > > > > > > > > > > > > > > > 24 > > > > > 172800 > > > > > > > > > key="maxwalltime">0:10:00 > > > > > > > > > key="lowOverallocation">100 > > > > > > > > > key="highOverallocation">100 > > > > > > > > > > > > > > > 200 > > > > > 1 > > > > > 1 > > > > > > > > > > > > > > > > > > > key="jobThrottle">47.99 > > > > > > > > > key="initialScore">10000 > > > > > > > > > > > > > > > > > > > > > > > > > /lustre/beagle/samseaver/GS/swift.workdir > > > > > > > > > > > > > > > > > > > > > > > > > Any ideas? 
> > > > > > > > > > > > > > > Begin forwarded message: > > > > > > > > > > > > > > > > > > > > From: Sam Seaver < samseaver at gmail.com > > > > > > > > > > > Date: February 19, 2013 1:16:28 PM CST > > > > > > > > > > To: Lorenzo Pesce < lpesce at uchicago.edu > > > > > > > > > > > Subject: Re: How are things going? > > > > > > > > > > > > > > > I got this error. I suspect using the new SWIFT_HOME > > > > > directory > > > > > means that there's possibly a missing parameter someplace: > > > > > > > > > > > > > > > > > > > > should we resume a previous calculation? [y/N] y > > > > > rlog files displayed in reverse time order > > > > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > > > > y > > > > > Using GS-20130203-0717-jgeppt98.0.rlog > > > > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the > > > > > declaration > > > > > of > > > > > element 'config'. > > > > > > > > > > > > > > > Execution failed: > > > > > Failed to parse site catalog > > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > > Caused by: Invalid pool entry 'pbs': > > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > > Caused by: java.lang.IllegalArgumentException: Missing URL > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > 
org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > > > > at k.thr.LWThread.run(LWThread.java:243) > > > > > at > > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > > > at > > > > > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603) > > > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < > > > > > samseaver at gmail.com > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > OK, it got to the point where it really did hang. I'm > > > > > retrying, > > > > > but > > > > > with your suggestions. The other three finished fine! 
> > > > > > > > > > > > > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > Hmm... > > > > > > > > > > > > > > > foreach.max.threads=100 > > > > > > > > > > > > > > > maybe you should increase this number a bit and see what > > > > > happens. > > > > > > > > > > > > > > > Also, I would try to replace > > > > > > > > > > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > > > > > > > > > > with > > > > > > > > > > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > > > > > > > > > > Keep me posted. Let's get this rolling. > > > > > > > > > > > > > > > if it doesn't work, I can redo the packing. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > > > > > > > > > > > > > Actually, the ten agents job does seem to be stuck in a > > > > > strange > > > > > loop. It is incrementing the number of jobs that has finished > > > > > successfully, and at a fast pace, but the number of jobs its > > > > > starting is decrementing much more slowly, its almost as its > > > > > repeatedly attempting the same set of parameters multiple > > > > > times... > > > > > > > > > > > > > > > I'll see what it's doing in the morning > > > > > S > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < > > > > > samseaver at gmail.com > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Seems to have worked overall this time! > > > > > > > > > > > > > > > I resume four jobs, each were for a different number of > > > > > agents > > > > > (10,100,1000,10000) it made it easier for me to decide on the > > > > > app > > > > > time. Two of them have already finished i.e.: > > > > > > > > > > > > > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 Active:4 > > > > > Checking > > > > > status:1 Finished in previous run:148098 Finished > > > > > successfully:37897 > > > > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 Active:2 > > > > > Checking > > > > > status:1 Finished in previous run:148098 Finished > > > > > successfully:37899 > > > > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 Finished in > > > > > previous > > > > > run:148098 Finished successfully:37902 > > > > > > > > > > > > > > > and the only one that is showing any failure (50/110000), is > > > > > the > > > > > ten agents version which is so short I can understand why, > > > > > but > > > > > its > > > > > still actively trying to run jobs and is actively finishing > > > > > jobs, > > > > > so that's good. 
> > > > > > > > > > > > > > > Yay! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > Good. Keep me posted, I would really like to solve your > > > > > problems > > > > > in > > > > > running on Beagle this week, I wish that Swift would have > > > > > been > > > > > friendlier. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > > > > > I just resumed the jobs that I'd killed before the system > > > > > went > > > > > down, lets see how it does. I always did a mini-review of the > > > > > data > > > > > I've got and it seems to be working as expected. > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > I have lost track a bit of what's up. I am happy to try and > > > > > go > > > > > over > > > > > it with you when you are ready. > > > > > > > > > > > > > > > Some of the problems of swift might have improved with a new > > > > > version and the new system. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > > > > > They're not, I've not looked since Beagle came back up. Will > > > > > do > > > > > so > > > > > later today. > > > > > S > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Postdoctoral Fellow > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > 9700 S. 
Cass Avenue > > > > > Argonne, IL 60439 > > > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > > samseaver at gmail.com > > > > > (773) 796-7144 > > > > > > > > > > "We shall not cease from exploration > > > > > And the end of all our exploring > > > > > Will be to arrive where we started > > > > > And know the place for the first time." > > > > > --T. S. Eliot > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Postdoctoral Fellow > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > 9700 S. Cass Avenue > > > > > Argonne, IL 60439 > > > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > > samseaver at gmail.com > > > > > (773) 796-7144 > > > > > > > > > > "We shall not cease from exploration > > > > > And the end of all our exploring > > > > > Will be to arrive where we started > > > > > And know the place for the first time." > > > > > --T. S. Eliot > > > > > > > > > > > > > > > > > > > > I tried updating from svn and running with the added url > > > > > tags: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > url="localhost"/> > > > > > 1 > > > > > > > > > key="lowOverAllocation">100 > > > > > > > > > key="highOverAllocation">100 > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > 4000 > > > > > > > > > key="maxWallTime">00:05:00 > > > > > > > > > key="disableIdleBlockCleanup">true > > > > > 1 > > > > > > > > > key="nodeGranularity">1 > > > > > 1 > > > > > batch > > > > > > > > > key="jobThrottle">8.00 > > > > > > > > > key="initialScore">10000 > > > > > > > > > > /lustre/beagle/davidk > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am seeing this error: > > > > > > > > > > > > > > > [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find > > > > > the > > > > > declaration of element 'config'. 
> > > > > > > > > > > > > > > > > > > > > > > > > ______________________________________________________________________ > > > > > From: "Mike Wilde" > > > > > Sent: Tuesday, February 19, 2013 1:50:15 PM > > > > > Subject: [Swift Support #22699] Fwd: [Swift-devel] > > > > > First > > > > > tests > > > > > with swift faster > > > > > > > > > > > > > > > Tue Feb 19 13:50:14 2013: Request 22699 was acted > > > > > upon. > > > > > Transaction: Ticket created by wilde at mcs.anl.gov > > > > > Queue: swift-support > > > > > Subject: Fwd: [Swift-devel] First tests with > > > > > swift > > > > > faster > > > > > Owner: Nobody > > > > > Requestors: wilde at ci.uchicago.edu > > > > > Status: new > > > > > Ticket > > > > https://rt.ci.uchicago.edu/Ticket/Display.html?id=22699 > > > > > > > > > > > > > > > > > > > > > > > > > > David, Mihael, Yadu: could one of you try this on > > > > > Beagle > > > > > on > > > > > the faster branch? > > > > > > > > > > Does the faster branch include the PBS support for > > > > > Beagle? > > > > > > > > > > It shouldnt be too hard to see what part of the PBS > > > > > pool > > > > > def > > > > > it doesnt like. 
> > > > > > > > > > - Mike > > > > > > > > > > ----- Forwarded Message ----- > > > > > From: "Lorenzo Pesce" > > > > > To: "Swift Devel" > > > > > Sent: Tuesday, February 19, 2013 1:26:20 PM > > > > > Subject: [Swift-devel] First tests with swift faster > > > > > > > > > > > > > > > This is the content of the file where we have the > > > > > first > > > > > complaint from swift (see attached): > > > > > > > > > > > > > > > > > > > > > > > > > > > > > jobmanager="local:pbs"/> > > > > > > > > > > > > > > key="project">CI-DEB000002 > > > > > > > > > > > > > > > > > > > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > key="jobsPerNode">24 > > > > > > > > > key="maxTime">172800 > > > > > > > > > key="maxwalltime">0:10:00 > > > > > > > > > key="lowOverallocation">100 > > > > > > > > > key="highOverallocation">100 > > > > > > > > > > > > > > > 200 > > > > > > > > > key="nodeGranularity">1 > > > > > > > > > key="maxNodes">1 > > > > > > > > > > > > > > > > > > > key="jobThrottle">47.99 > > > > > > > > > key="initialScore">10000 > > > > > > > > > > > > > > > > > > > > > > > > > /lustre/beagle/samseaver/GS/swift.workdir > > > > > > > > > > > > > > > > > > > > > > > > > Any ideas? > > > > > > > > > > > > > > > Begin forwarded message: > > > > > > > > > > > > > > > > > > > > From: Sam Seaver < samseaver at gmail.com > > > > > > > > > > > Date: February 19, 2013 1:16:28 PM CST > > > > > > > > > > To: Lorenzo Pesce < lpesce at uchicago.edu > > > > > > > > > > > Subject: Re: How are things going? > > > > > > > > > > > > > > > I got this error. I suspect using the new SWIFT_HOME > > > > > directory > > > > > means that there's possibly a missing parameter > > > > > someplace: > > > > > > > > > > > > > > > > > > > > should we resume a previous calculation? 
[y/N] y > > > > > rlog files displayed in reverse time order > > > > > should I use GS-20130203-0717-jgeppt98.0.rlog ?[y/n] > > > > > y > > > > > Using GS-20130203-0717-jgeppt98.0.rlog > > > > > [Error] GS_sites.xml:1:9: cvc-elt.1: Cannot find the > > > > > declaration of element 'config'. > > > > > > > > > > > > > > > Execution failed: > > > > > Failed to parse site catalog > > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > > Caused by: Invalid pool entry 'pbs': > > > > > swift:siteCatalog @ scheduler.k, line: 31 > > > > > Caused by: java.lang.IllegalArgumentException: > > > > > Missing > > > > > URL > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.execution(SiteCatalog.java:173) > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.pool(SiteCatalog.java:100) > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.buildResources(SiteCatalog.java:60) > > > > > at > > > > > org.griphyn.vdl.karajan.lib.SiteCatalog.function(SiteCatalog.java:48) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.functions.AbstractFunction.runBody(AbstractFunction.java:38) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:147) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.Import.runBody(Import.java:269) > > > > > at > > > > > 
org.globus.cog.karajan.compiled.nodes.InternalFunction.run(InternalFunction.java:154) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.CompoundNode.runChild(CompoundNode.java:87) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.FramedInternalFunction.run(FramedInternalFunction.java:63) > > > > > at > > > > > org.globus.cog.karajan.compiled.nodes.Main.run(Main.java:79) > > > > > at k.thr.LWThread.run(LWThread.java:243) > > > > > at > > > > > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110) > > > > > at java.util.concurrent.ThreadPoolExecutor > > > > > $Worker.run(ThreadPoolExecutor.java:603) > > > > > at java.lang.Thread.run(Thread.java:722) > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:13 PM, Sam Seaver < > > > > > samseaver at gmail.com > wrote: > > > > > > > > > > > > > > > > > > > > OK, it got to the point where it really did hang. I'm > > > > > retrying, but with your suggestions. The other three > > > > > finished > > > > > fine! 
> > > > > > > > > > > > > > > > > > > > Progress: time: Tue, 19 Feb 2013 19:08:53 +0000 > > > > > Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:09:23 +0000 > > > > > Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:09:53 +0000 > > > > > Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:10:23 +0000 > > > > > Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:10:53 +0000 > > > > > Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > Progress: time: Tue, 19 Feb 2013 19:11:23 +0000 > > > > > Selecting > > > > > site:18147 Submitted:174 Active:96 Failed:2 Finished > > > > > successfully:132323 Failed but can retry:183 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 8:51 AM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > Hmm... > > > > > > > > > > > > > > > foreach.max.threads=100 > > > > > > > > > > > > > > > maybe you should increase this number a bit and see > > > > > what > > > > > happens. > > > > > > > > > > > > > > > Also, I would try to replace > > > > > > > > > > > > > > > SWIFT_HOME=/home/wilde/swift/rev/swift-r6151-cog-r3552 > > > > > > > > > > > > > > > with > > > > > > > > > > > > > > > SWIFT_HOME=/soft/swift/fast > > > > > > > > > > > > > > > Keep me posted. Let's get this rolling. > > > > > > > > > > > > > > > if it doesn't work, I can redo the packing. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 19, 2013, at 1:07 AM, Sam Seaver wrote: > > > > > > > > > > > > > > > > > > > > Actually, the ten agents job does seem to be stuck in > > > > > a > > > > > strange loop. It is incrementing the number of jobs > > > > > that > > > > > has > > > > > finished successfully, and at a fast pace, but the > > > > > number > > > > > of > > > > > jobs its starting is decrementing much more slowly, > > > > > its > > > > > almost > > > > > as its repeatedly attempting the same set of > > > > > parameters > > > > > multiple times... > > > > > > > > > > > > > > > I'll see what it's doing in the morning > > > > > S > > > > > > > > > > > > > > > > > > > > On Tue, Feb 19, 2013 at 1:00 AM, Sam Seaver < > > > > > samseaver at gmail.com > wrote: > > > > > > > > > > > > > > > > > > > > Seems to have worked overall this time! > > > > > > > > > > > > > > > I resume four jobs, each were for a different number > > > > > of > > > > > agents > > > > > (10,100,1000,10000) it made it easier for me to > > > > > decide > > > > > on > > > > > the > > > > > app time. 
Two of them have already finished i.e.: > > > > > > > > > > > > > > > > > > > > Progress: time: Mon, 18 Feb 2013 23:50:12 +0000 > > > > > Active:4 > > > > > Checking status:1 Finished in previous run:148098 > > > > > Finished > > > > > successfully:37897 > > > > > Progress: time: Mon, 18 Feb 2013 23:50:15 +0000 > > > > > Active:2 > > > > > Checking status:1 Finished in previous run:148098 > > > > > Finished > > > > > successfully:37899 > > > > > Final status: Mon, 18 Feb 2013 23:50:15 +0000 > > > > > Finished > > > > > in > > > > > previous run:148098 Finished successfully:37902 > > > > > > > > > > > > > > > and the only one that is showing any failure > > > > > (50/110000), > > > > > is > > > > > the ten agents version which is so short I can > > > > > understand > > > > > why, > > > > > but its still actively trying to run jobs and is > > > > > actively > > > > > finishing jobs, so that's good. > > > > > > > > > > > > > > > Yay! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 1:09 PM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > Good. Keep me posted, I would really like to solve > > > > > your > > > > > problems in running on Beagle this week, I wish that > > > > > Swift > > > > > would have been friendlier. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 1:01 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > > > > > I just resumed the jobs that I'd killed before the > > > > > system > > > > > went > > > > > down, lets see how it does. I always did a > > > > > mini-review > > > > > of > > > > > the > > > > > data I've got and it seems to be working as expected. > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:28 PM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > I have lost track a bit of what's up. 
I am happy to > > > > > try > > > > > and > > > > > go > > > > > over it with you when you are ready. > > > > > > > > > > > > > > > Some of the problems of swift might have improved > > > > > with > > > > > a > > > > > new > > > > > version and the new system. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Feb 18, 2013, at 12:22 PM, Sam Seaver wrote: > > > > > > > > > > > > > > > > > > > > They're not, I've not looked since Beagle came back > > > > > up. > > > > > Will > > > > > do so later today. > > > > > S > > > > > > > > > > > > > > > > > > > > On Mon, Feb 18, 2013 at 12:20 PM, Lorenzo Pesce < > > > > > lpesce at uchicago.edu > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Postdoctoral Fellow > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > 9700 S. Cass Avenue > > > > > Argonne, IL 60439 > > > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > > samseaver at gmail.com > > > > > (773) 796-7144 > > > > > > > > > > "We shall not cease from exploration > > > > > And the end of all our exploring > > > > > Will be to arrive where we started > > > > > And know the place for the first time." > > > > > --T. S. Eliot > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Postdoctoral Fellow > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > 9700 S. Cass Avenue > > > > > Argonne, IL 60439 > > > > > > > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168 > > > > > samseaver at gmail.com > > > > > (773) 796-7144 > > > > > > > > > > "We shall not cease from exploration > > > > > And the end of all our exploring > > > > > Will be to arrive where we started > > > > > And know the place for the first time." > > > > > --T. S. 
Eliot > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Postdoctoral Fellow > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > 9700 S.
Cass Avenue
> > > > > Argonne, IL 60439
> > > > >
> > > > > http://www.linkedin.com/pub/sam-seaver/0/412/168
> > > > > samseaver at gmail.com
> > > > > (773) 796-7144
> > > > >
> > > > > "We shall not cease from exploration
> > > > > And the end of all our exploring
> > > > > Will be to arrive where we started
> > > > > And know the place for the first time."
> > > > > --T. S. Eliot
> > > > >
> > > > > _______________________________________________
> > > > > Swift-devel mailing list
> > > > > Swift-devel at ci.uchicago.edu
> > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > > > >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From hategan at mcs.anl.gov Wed Jul 10 14:22:44 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 10 Jul 2013 12:22:44 -0700
Subject: [Swift-devel] Initial tests of new (faster) trunk
In-Reply-To: <1668136297.8844433.1373478824678.JavaMail.root@mcs.anl.gov>
References: <1668136297.8844433.1373478824678.JavaMail.root@mcs.anl.gov>
Message-ID: <1373484164.20771.3.camel@echo>

On Wed, 2013-07-10 at 12:53 -0500, Michael Wilde wrote:
> Initial tests of the new trunk are working for me, but I'm seeing three odd things:
>
> 1. [Error] sites.beagle.coasters.xml:1:9: cvc-elt.1: Cannot find the declaration of element 'config'.
> (as reported in the email below)

Right. sites.xml is now validated. If you want that to go away, you need to add these magic words to the beginning of your sites file:

>
> 2. {env.HOME} is not interpreted in sites.xml as in prior versions.
> Swift created a workdir named "$PWD/{env.HOME}/swiftwork"

Yeah.
sites.xml is not secretly a karajan script any more. But this can be fixed.

>
> 3. Progress ticker lines seem to be defaulting to one per second, even with no status changing.

Ah, sorry. That was a temporary change I made and forgot to revert. I use that for development since I want regular feedback in large runs.

Mihael

From hategan at mcs.anl.gov Wed Jul 10 14:25:00 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 10 Jul 2013 12:25:00 -0700
Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
In-Reply-To: <1893765252.8877833.1373482025672.JavaMail.root@mcs.anl.gov>
References: <1893765252.8877833.1373482025672.JavaMail.root@mcs.anl.gov>
Message-ID: <1373484300.20771.4.camel@echo>

On Wed, 2013-07-10 at 13:47 -0500, Michael Wilde wrote:
> Mihael, the 0.94 branch (latest rev) seems to behave the same way on this workflow.

I'll take a look. Smells like a change in 0.94 that got merged to faster that causes this.

Mihael

From wilde at mcs.anl.gov Wed Jul 10 14:28:01 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 14:28:01 -0500 (CDT)
Subject: [Swift-devel] Initial tests of new (faster) trunk
In-Reply-To: <1373484164.20771.3.camel@echo>
Message-ID: <1056785980.8908901.1373484481439.JavaMail.root@mcs.anl.gov>

>... you need to add these magic words to the beginning of your sites file:

> >
> > 2. {env.HOME} is not interpreted in sites.xml as in prior versions.
> > Swift created a workdir named "$PWD/{env.HOME}/swiftwork"
>
> Yeah. sites.xml is not secretly a karajan script any more.
>
> But this can be fixed.

Perhaps not worth any effort with the new config mechanism coming ready soon. Maybe the place to do these things is in the new format config file.
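
In the meantime, one possible stopgap for the {env.HOME} issue is to pre-process the sites file before Swift reads it. The following is only a minimal sketch under that assumption: the helper name expand_env_refs and the placeholder pattern are illustrative, and nothing here is part of Swift or of the planned config mechanism.

```python
import os
import re

# Hypothetical helper (not part of Swift): matches placeholders such as {env.HOME}.
_ENV_REF = re.compile(r"\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_refs(text, env=None):
    """Replace each {env.NAME} in text with env[NAME].

    Placeholders whose name is not present in env are left untouched,
    so a typo stays visible instead of silently becoming an empty string.
    """
    env = os.environ if env is None else env

    def _substitute(match):
        name = match.group(1)
        return env.get(name, match.group(0))

    return _ENV_REF.sub(_substitute, text)


if __name__ == "__main__":
    # Example input taken from the workdir string quoted above.
    print(expand_env_refs("{env.HOME}/swiftwork", {"HOME": "/home/user"}))
```

The expanded text could then be written to a temporary sites file and passed to Swift in place of the original.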
- Mike

From wilde at mcs.anl.gov Wed Jul 10 14:36:52 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 14:36:52 -0500 (CDT)
Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
In-Reply-To: <1373484300.20771.4.camel@echo>
Message-ID: <773893198.8916266.1373485012177.JavaMail.root@mcs.anl.gov>

> I'll take a look. Smells like a change in 0.94 that got merged to
> faster
> that causes this.

That's looking very likely. The script runs OK on midway on 0.94 but fails in the same manner there under 0.94-latest-rev, which I've been calling 0.94.1.

- Mike

From hategan at mcs.anl.gov Wed Jul 10 21:11:03 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 10 Jul 2013 19:11:03 -0700
Subject: [Swift-devel] app semantics
Message-ID: <1373508663.4315.3.camel@echo>

Maybe we talked about this before, but I think we should allow things like:

app (file outf) myapp(file inf) {
  file tmp <"tmpfile">;
  a filename(inf) stdout=filename(tmp);
  b filename(tmp) stdout=filename(outf);
}

In a more general and theoretical sense, an app block can allow some delimited side-effects while enforcing sequential execution.

From a practical perspective, this would solve the problem of large temporary files being staged in and out needlessly (e.g. tmpfile in our case).

Mihael

From wilde at mcs.anl.gov Wed Jul 10 21:49:43 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 21:49:43 -0500 (CDT)
Subject: [Swift-devel] app semantics
In-Reply-To: <1373508663.4315.3.camel@echo>
Message-ID: <1141892550.9031516.1373510983753.JavaMail.root@mcs.anl.gov>

Yes, I agree - we have long wanted to expand what an app() can do.

Most discussions centered around enabling inline code from a variety of interpreters to be embedded, with easy substitution of input and output parameters from the app() function template.

Your example could be coded like:

app (file outf) myapp (file inf)
{
  sh "-c" strcat("a ", @filename(inf), " | b ") stdout(@outf);
  # or
  sh "-c" strcat("a ", @filename(inf), ">tmp; b tmp") stdout(@outf);
}

so one could say:

bash (file outf) myapp (file inf)
{
  a @inf | b > @outf
  # or
  a @inf >tmp; b tmp >@outf
}

as well as similarly interpreted app() functions for python, tcl, R, etc. Virtually anything that can be "#!"ed.

What might also be desirable is to be able to call any Swift app() function from within an app function, i.e. if a() and b() were already coded as apps. In other words, ensure that any app() can also serve as the equivalent of a bash function. I'm not sure how this would work across interpretive languages, but one could envision simple semantic rules that would enable this.

One more case of interest is to enable any swift function call or expression to be interpreted on a single node, including all the functions it calls. This would be as you suggest in your example, for the purpose of file exchange locality.

- Mike

----- Original Message -----
> From: "Mihael Hategan"
> To: "Swift Devel"
> Sent: Wednesday, July 10, 2013 9:11:03 PM
> Subject: [Swift-devel] app semantics
>
> Maybe we talked about this before, but I think we should allow things
> like:
>
> app (file outf) myapp(file inf) {
>   file tmp <"tmpfile">;
>   a filename(inf) stdout=filename(tmp);
>   b filename(tmp) stdout=filename(outf);
> }
>
> In a more general and theoretical sense, an app block can allow some
> delimited side-effects while enforcing sequential execution.
>
> From a practical perspective, this would solve the problem of large
> temporary files being staged in and out needlessly (e.g. tmpfile in
> our
> case).
>
> Mihael
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From wilde at mcs.anl.gov Wed Jul 10 22:33:27 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 10 Jul 2013 22:33:27 -0500 (CDT)
Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
In-Reply-To: <773893198.8916266.1373485012177.JavaMail.root@mcs.anl.gov>
Message-ID: <895979428.9032425.1373513607424.JavaMail.root@mcs.anl.gov>

Mihael, your fix for this in Swift 0.94 swift-r6637 cog-r3742 works.

Thanks,

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Mihael Hategan"
> Cc: "Swift Devel"
> Sent: Wednesday, July 10, 2013 2:36:52 PM
> Subject: Re: [Swift-devel] Coasters does only one round of tasks in faster trunk
>
> > I'll take a look. Smells like a change in 0.94 that got merged to
> > faster
> > that causes this.
>
> That's looking very likely. The script runs OK on midway on 0.94 but
> fails in the same manner there under 0.94-latest-rev, which I've been
> calling 0.94.1.
>
> - Mike
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From hategan at mcs.anl.gov Thu Jul 11 00:45:35 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 10 Jul 2013 22:45:35 -0700
Subject: [Swift-devel] Coasters does only one round of tasks in faster trunk
In-Reply-To: <895979428.9032425.1373513607424.JavaMail.root@mcs.anl.gov>
References: <895979428.9032425.1373513607424.JavaMail.root@mcs.anl.gov>
Message-ID: <1373521535.12601.1.camel@echo>

Yeah. Sorry. I got carried away with my previous commit and forgot about the last bit.

Mihael

On Wed, 2013-07-10 at 22:33 -0500, Michael Wilde wrote:
> Mihael, your fix for this in Swift 0.94 swift-r6637 cog-r3742 works.
>
> Thanks,
>
> - Mike
>
>
> ----- Original Message -----
> > From: "Michael Wilde"
> > To: "Mihael Hategan"
> > Cc: "Swift Devel"
> > Sent: Wednesday, July 10, 2013 2:36:52 PM
> > Subject: Re: [Swift-devel] Coasters does only one round of tasks in faster trunk
> >
> > > I'll take a look. Smells like a change in 0.94 that got merged to
> > > faster
> > > that causes this.
> >
> > That's looking very likely. The script runs OK on midway on 0.94 but
> > fails in the same manner there under 0.94-latest-rev, which I've been
> > calling 0.94.1.
> >
> > - Mike
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From benc at hawaga.org.uk Thu Jul 11 08:14:19 2013
From: benc at hawaga.org.uk (Ben Clifford)
Date: Thu, 11 Jul 2013 13:14:19 +0000 (UTC)
Subject: [Swift-devel] app semantics
In-Reply-To: <1373508663.4315.3.camel@echo>
References: <1373508663.4315.3.camel@echo>
Message-ID:

I think (at least over beer) we have talked about two different things that sound like this:

One is allowing app blocks to be more complex out-of-swift blocks of code; in that case swift perhaps does not need to know about any intermediate files.

Another is telling swift that you don't want a file staged around the place, but still telling swift about that file and declaring the separate steps that use it in swift syntax, so that swift knows about the separate steps and can do things like managing retries of individual steps.

--
http://www.hawaga.org.uk/ben/

From tim.g.armstrong at gmail.com Thu Jul 11 08:16:05 2013
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Thu, 11 Jul 2013 08:16:05 -0500
Subject: [Swift-devel] app semantics
In-Reply-To: <1141892550.9031516.1373510983753.JavaMail.root@mcs.anl.gov>
References: <1373508663.4315.3.camel@echo> <1141892550.9031516.1373510983753.JavaMail.root@mcs.anl.gov>
Message-ID:

FWIW in Swift/T we allow arbitrary expressions in the app command line, mainly for expressiveness/convenience (e.g. you can call another app function from within an app function). We don't allow multiple statements: the main reason against that would be that the complexity of the language increases and you end up with a language-within-a-language.

In Swift/T we can actually deal with the temporary file situation above with a compile-time optimization. It's a bit complicated and depends on some other, more general, transformations taking place. The chain of reasoning is
"Command B depends on command A, so we should defer any execution until after A finishes"
=> "Command B will definitely have its input dependencies filled once A finishes, so we can launch B immediately after A finishes"
=> "We can merge B into A since they're sequentially dependent. If there's a task C that's also sequentially dependent on A, have to pick one"
=> "The result of A isn't needed outside this scope, so don't need to copy out".
The nice thing about this is that it works anywhere you have a pipeline of tasks in your code.

I've thought in the past about the idea of supporting shell statements inline in any Swift function. To fit it in the grammar you would have to change the syntax slightly (e.g. prefix it with keyword exec). You would also need to have some way of identifying which files are output files.

exec sh "-c" strcat("a ", @filename(inf), " | b ") stdout(@outf);

- Tim

On Wed, Jul 10, 2013 at 9:49 PM, Michael Wilde wrote:
>
> Yes, I agree - we have long wanted to expand what an app() can do.
>
> Most discussions centered around enabling inline code from a variety of
> interpreters to be embedded, with easy substitution of input and output
> parameters from the app() function template.
>
> Your example could be coded like:
>
> app (file outf) myapp (file inf)
> {
>   sh "-c" strcat("a ", @filename(inf), " | b ") stdout(@outf);
>   # or
>   sh "-c" strcat("a ", @filename(inf), ">tmp; b tmp") stdout(@outf);
> }
>
> so one could say:
>
> bash (file outf) myapp (file inf)
> {
>   a @inf | b > @outf
>   # or
>   a @inf >tmp; b tmp >@outf
> }
>
> as well as similarly interpreted app() functions for python, tcl, R, etc.
> Virtually anything that can be "#!"ed.
>
> What might also be desirable is to be able to call any Swift app()
> function from within an app function, i.e. if a() and b() were already coded
> as apps. In other words, ensure that any app() can also serve as the equivalent of a
> bash function. I'm not sure how this would work across interpretive
> languages, but one could envision simple semantic rules that would enable
> this.
>
> One more case of interest is to enable any swift function call or
> expression to be interpreted on a single node, including all the functions
> it calls. This would be as you suggest in your example, for the purpose of
> file exchange locality.
>
> - Mike
>
>
> ----- Original Message -----
> > From: "Mihael Hategan"
> > To: "Swift Devel"
> > Sent: Wednesday, July 10, 2013 9:11:03 PM
> > Subject: [Swift-devel] app semantics
> >
> > Maybe we talked about this before, but I think we should allow things
> > like:
> >
> > app (file outf) myapp(file inf) {
> >   file tmp <"tmpfile">;
> >   a filename(inf) stdout=filename(tmp);
> >   b filename(tmp) stdout=filename(outf);
> > }
> >
> > In a more general and theoretical sense, an app block can allow some
> > delimited side-effects while enforcing sequential execution.
> >
> > From a practical perspective, this would solve the problem of large
> > temporary files being staged in and out needlessly (e.g. tmpfile in
> > our
> > case).
> >
> > Mihael
> >
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From davidk at ci.uchicago.edu Fri Jul 12 08:09:48 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 12 Jul 2013 08:09:48 -0500 (CDT)
Subject: [Swift-devel] *.d directory location / naming?
In-Reply-To: <1797486372.11602004.1373630693285.JavaMail.root@ci.uchicago.edu>
Message-ID: <454861809.11608815.1373634588714.JavaMail.root@ci.uchicago.edu>

Is there a way to change the name and location of the *.d directory? I'm currently moving things around with a wrapper script after it completes - I couldn't find an obvious way to change it with a Swift command line option or property.

Thanks,
David

From wilde at mcs.anl.gov Fri Jul 12 09:40:11 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 12 Jul 2013 09:40:11 -0500
Subject: [Swift-devel] *.d directory location / naming?
In-Reply-To: <454861809.11608815.1373634588714.JavaMail.root@ci.uchicago.edu>
References: <1797486372.11602004.1373630693285.JavaMail.root@ci.uchicago.edu> <454861809.11608815.1373634588714.JavaMail.root@ci.uchicago.edu>
Message-ID:

You can change the runid with a command line arg I think, but not the .d suffix.

On 7/12/13, David Kelly wrote:
>
>
>
> Is there a way to change the name and location of the *.d directory? I'm
> currently moving things around with a wrapper script after it completes - I
> couldn't find an obvious way to change it with a Swift command line option
> or property.
>
>
> Thanks,
> David

--
Sent from my mobile device

From hategan at mcs.anl.gov Fri Jul 12 13:10:35 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 12 Jul 2013 11:10:35 -0700
Subject: [Swift-devel] *.d directory location / naming?
In-Reply-To: <454861809.11608815.1373634588714.JavaMail.root@ci.uchicago.edu>
References: <454861809.11608815.1373634588714.JavaMail.root@ci.uchicago.edu>
Message-ID: <1373652635.15636.1.camel@echo>

On Fri, 2013-07-12 at 08:09 -0500, David Kelly wrote:
>
>
> Is there a way to change the name and location of the *.d directory?
> I'm currently moving things around with a wrapper script after it
> completes - I couldn't find an obvious way to change it with a Swift
> command line option or property.

There is no official way, but if you need to hack your way through it, look at initDDir() in swift-int.k.
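
For the wrapper-script route David describes, a rough sketch might look like the following. It assumes the *.d directories sit directly under the directory the run was started from; the function name move_d_dirs and the destination layout are made up for illustration and are not a Swift feature.

```python
import glob
import os
import shutil
import tempfile


def move_d_dirs(run_dir, dest_dir):
    """Move every '*.d' directory directly under run_dir into dest_dir.

    Returns the list of new paths. Plain files that happen to end in
    '.d' are left where they are.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for path in sorted(glob.glob(os.path.join(run_dir, "*.d"))):
        if os.path.isdir(path):
            target = os.path.join(dest_dir, os.path.basename(path))
            shutil.move(path, target)
            moved.append(target)
    return moved


if __name__ == "__main__":
    # Small demonstration in throwaway directories (names are hypothetical).
    run_dir = tempfile.mkdtemp()
    os.mkdir(os.path.join(run_dir, "run001.d"))
    print(move_d_dirs(run_dir, os.path.join(run_dir, "logs")))
```

A wrapper would call move_d_dirs() once the swift command has exited.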

Mihael

From wilde at mcs.anl.gov Fri Jul 12 20:12:12 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 12 Jul 2013 20:12:12 -0500 (CDT)
Subject: [Swift-devel] Strange socket errors on swift.rcc
Message-ID: <1628020038.9511090.1373677932665.JavaMail.root@mcs.anl.gov>

Just a heads-up: I'm trying to test David's nswift on swift.rcc, and I keep getting:

java.lang.RuntimeException: java.net.BindException: Address already in use

I unset GLOBUS_TCP_{SOURCE,PORT}_RANGE and then I have no problems.

I see that Yadu is running about 16 swift commands (presumably for testing), and I'm wondering if they are eating up all the ports between 50000 and 51000?

Need to look at netstat and debug, but just mentioning this as a heads-up in case anyone else is seeing similar problems, or if it's an issue we should document.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From davidk at ci.uchicago.edu Fri Jul 12 20:41:18 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 12 Jul 2013 20:41:18 -0500 (CDT)
Subject: [Swift-devel] Strange socket errors on swift.rcc
In-Reply-To: <1628020038.9511090.1373677932665.JavaMail.root@mcs.anl.gov>
Message-ID: <117903781.11757136.1373679678768.JavaMail.root@ci.uchicago.edu>

I saw this today too with trunk on rcc. I opened a ticket for it earlier, https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=1036

This might be related to an older bug, https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=881

----- Original Message -----
From: "Michael Wilde"
To: "Swift Devel"
Sent: Friday, July 12, 2013 8:12:12 PM
Subject: [Swift-devel] Strange socket errors on swift.rcc

Just a heads-up: I'm trying to test David's nswift on swift.rcc, and I keep getting:

java.lang.RuntimeException: java.net.BindException: Address already in use

I unset GLOBUS_TCP_{SOURCE,PORT}_RANGE and then I have no problems.

I see that Yadu is running about 16 swift commands (presumably for testing), and I'm wondering if they are eating up all the ports between 50000 and 51000?

Need to look at netstat and debug, but just mentioning this as a heads-up in case anyone else is seeing similar problems, or if it's an issue we should document.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory
_______________________________________________
Swift-devel mailing list
Swift-devel at ci.uchicago.edu
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From iraicu at cs.iit.edu Tue Jul 16 10:04:28 2013
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Tue, 16 Jul 2013 11:04:28 -0400
Subject: [Swift-devel] Fwd: [DistComp] Call for Papers: IEEE Transactions on Cloud Computing - Special Issue on Scientific Cloud Computing (deadline Jul 31, 2014)
In-Reply-To:
References:
Message-ID: <51E560FC.1050401@cs.iit.edu>

Hi all,
Please keep this in mind, although it's still 1 year away.

Ioan

-------- Original Message --------
Subject: [DistComp] Call for Papers: IEEE Transactions on Cloud Computing - Special Issue on Scientific Cloud Computing (deadline Jul 31, 2014)
Date: Tue, 16 Jul 2013 12:05:06 +0100
From: Bogdan Nicolae
To: asr-forum at cines.fr, KerData , distributed-computing-announce at datasys.cs.iit.edu, hpc-announce at mcs.anl.gov

Dear colleagues,

Please consider the following CFP for your contributions.

-------------------------------------------------------------------------------
Call for Papers
IEEE Transactions on Cloud Computing
Special Issue on Scientific Cloud Computing
http://datasys.cs.iit.edu/events/ScienceCloud2014-TCC/
-------------------------------------------------------------------------------
IMPORTANT DATES

Paper Submissions Due: July 31, 2014
First Round Decision: September 30, 2014
Major Revisions Due (if necessary): October 31, 2014
Final Decision: December 1, 2014
Journal Publication: TBD
-------------------------------------------------------------------------------
OVERVIEW

Computational and Data-Driven Sciences have become the third and fourth pillar of scientific discovery in addition to experimental and theoretical sciences. Scientific Computing has already begun to change how science is done, enabling scientific breakthroughs through new kinds of experiments that would have been impossible only a decade ago. It is the key to solving "grand challenges" in many domains and providing breakthroughs in new knowledge, and it comes in many shapes and forms: high-performance computing (HPC) which is heavily focused on compute-intensive applications; high-throughput computing (HTC) which focuses on using many computing resources over long periods of time to accomplish its computational tasks; many-task computing (MTC) which aims to bridge the gap between HPC and HTC by focusing on using many resources over short periods of time; and data-intensive computing which is heavily focused on data distribution, data-parallel execution, and harnessing data locality by scheduling of computations close to the data.

Today's "Big Data" trend is generating datasets that are increasing exponentially in both complexity and volume, making their analysis, archival, and sharing one of the grand challenges of the 21st century. Not surprisingly, it becomes increasingly difficult to design and operate large scale systems capable of addressing these grand challenges.
This journal Special Issue on Scientific Cloud Computing in the IEEE Transactions on Cloud Computing will provide the scientific community a dedicated forum for discussing new research, development, and deployment efforts in running these kinds of scientific computing workloads on Cloud Computing infrastructures. This special issue will focus on the use of cloud-based technologies to meet new compute-intensive and data-intensive scientific challenges that are not well served by the current supercomputers, grids and HPC clusters. The special issue will aim to address questions such as:

- What architectural changes to the current cloud frameworks (hardware, operating systems, networking and/or programming models) are needed to support science?
- Dynamic information derived from remote instruments and coupled simulation, and sensor ensembles that stream data for real-time analysis are important emerging techniques in scientific and cyber-physical engineering systems. How can cloud technologies enable and adapt to these new scientific approaches dealing with dynamism?
- How are scientists using clouds?
- Are there scientific HPC/HTC/MTC workloads that are suitable candidates to take advantage of emerging cloud computing resources with high efficiency?
- Commercial public clouds provide easy access to cloud infrastructure for scientists. What are the gaps in commercial cloud offerings and how can they be adapted for running existing and novel eScience applications?
- What benefits exist by adopting the cloud model, over clusters, grids, or supercomputers?
- What factors are limiting cloud use or would make clouds more usable/efficient?
-------------------------------------------------------------------------------
TOPICS

The topics of interest include, but are not limited to, the application of Cloud in scientific applications:

- Scientific application case studies on Clouds
- Performance evaluation of Cloud technologies
- Fault tolerance and reliability in cloud systems
- Data-intensive workloads and tools on Clouds
- Programming models such as Map-Reduce
- Storage cloud architectures
- I/O and Data management in the Cloud
- Workflow and resource management in the Cloud
- NoSQL databases for scientific applications
- Data streaming and dynamic applications on Clouds
- Dynamic resource provisioning
- Many-Task Computing in the Cloud
- Application of cloud concepts in HPC environments
- Virtualized High performance parallel file systems
- Virtualized high performance I/O networks
- Virtualization and its Impact on Applications
- Distributed Operating Systems
- Many-core computing and accelerators in the Cloud
- Cloud security
-------------------------------------------------------------------------------
SUBMISSION INSTRUCTIONS

Authors are invited to submit papers with unpublished, original work to the IEEE Transactions on Cloud Computing, Special Issue on Scientific Cloud Computing. If the paper is extended from a workshop or conference paper, it must contain at least 50% new material with "brand" new ideas and results. The papers should not be longer than 14 double column pages in the IEEE TCC format. Papers should be submitted directly to TCC at https://mc.manuscriptcentral.com/tcc-cs, and "SI-ScienceCloud" should be selected.
-------------------------------------------------------------------------------
ORGANIZERS

- Kate Keahey, University of Chicago & Argonne National Laboratory, USA
- Ioan Raicu, Illinois Institute of Technology & Argonne National Lab., USA
- Kyle Chard, University of Chicago & Argonne National Laboratory, USA
- Bogdan Nicolae, IBM Research, Ireland
-------------------------------------------------------------------------------
CONTACT

Email: sciencecloud2014-tcc-editors at datasys.cs.iit.edu
Website: http://datasys.cs.iit.edu/events/ScienceCloud2014-TCC/

----------------------
Bogdan Nicolae
Exascale Systems
IBM Research, Ireland
Server 3, Damastown Industrial Park, Mulhuddart, Dublin 15, Ireland
Tel: +353 (0)1 - 826 9253
E-mail: bogdan.nicolae at ie.ibm.com
Web: http://researcher.ibm.com/person/ie-bogdan.nicolae

_______________________________________________
distributed-computing-announce mailing list
distributed-computing-announce at datasys.cs.iit.edu
http://datasys.cs.iit.edu/mailman/listinfo/distributed-computing-announce

From wilde at mcs.anl.gov Tue Jul 16 17:18:37 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 16 Jul 2013 17:18:37 -0500 (CDT)
Subject: [Swift-devel] Provider staging with local provider
In-Reply-To: <1050619105.10161889.1374012945850.JavaMail.root@mcs.anl.gov>
Message-ID: <224277050.10161990.1374013117450.JavaMail.root@mcs.anl.gov>

Mihael, I discovered by accident that provider staging works with the local execution provider if you specify swift:stagingMethod=local.

This was "accidental" in that I thought I was using execution provider=coasters for the single local setup app in my script, when in fact I was using execution provider=local. I kept getting errors like "No 'sfs' provider or alias found" until I specified local as the staging method.

But this leads me to the question of, in general, what other flexibility exists in mixing different execution providers when you have provider staging enabled.

For example, can I use *any* execution provider as long as I specify a staging method that is a valid "filesystem" provider (such as gsiftp or ssh)?
- Mike

From wilde at mcs.anl.gov Tue Jul 16 17:52:17 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 16 Jul 2013 17:52:17 -0500 (CDT)
Subject: [Swift-devel] How to run automatic coasters from a firewalled host?
In-Reply-To: <1582720570.10163744.1374014861130.JavaMail.root@mcs.anl.gov>
Message-ID: <215497827.10164038.1374015137705.JavaMail.root@mcs.anl.gov>

Mihael,

I'm trying to run automatic coasters from a host "orthros" at Argonne which is only accessible from a cryptocard-only gateway host.

I was trying to use Beagle to augment the cores available on orthros.

I can do a password-less ssh from orthros to beagle using my ssh agent.

But coasters on beagle is not starting from orthros, presumably because the remote coaster service can not connect back to the swift client.

Can we make this work by having the ssh command run by the ssh-cl proxy create a tunnel back to orthros for the remote service to connect to?

- Mike

From hategan at mcs.anl.gov Tue Jul 16 22:24:57 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 16 Jul 2013 20:24:57 -0700
Subject: [Swift-devel] Provider staging with local provider
In-Reply-To: <224277050.10161990.1374013117450.JavaMail.root@mcs.anl.gov>
References: <224277050.10161990.1374013117450.JavaMail.root@mcs.anl.gov>
Message-ID: <1374031497.8030.2.camel@echo>

On Tue, 2013-07-16 at 17:18 -0500, Michael Wilde wrote:
> Mihael, I discovered by accident that provider staging works with the local execution provider if you specify swift:stagingMethod=local.
>
> This was "accidental" in that I thought I was using execution provider=coasters for the single local setup app in my script, when in fact I was using execution provider=local. I kept getting errors like "No 'sfs' provider or alias found" until I specified local as the staging method.
>
> But this leads me to the question of, in general, what other flexibility exists in mixing different execution providers when you have provider staging enabled.
>
> For example, can I use *any* execution provider as long as I specify a staging method that is a valid "filesystem" provider (such as gsiftp or ssh)?

The provider needs to support staging. The local one does, because it was easy to implement and useful for testing. Gt2 should also support some limited subset of provider staging (i.e., it doesn't support staging conditions, like only on error, etc.). I don't know if that is sufficient to run swift with provider staging correctly.

Mihael

From hategan at mcs.anl.gov Wed Jul 17 00:46:00 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 16 Jul 2013 22:46:00 -0700
Subject: [Swift-devel] How to run automatic coasters from a firewalled host?
In-Reply-To: <215497827.10164038.1374015137705.JavaMail.root@mcs.anl.gov>
References: <215497827.10164038.1374015137705.JavaMail.root@mcs.anl.gov>
Message-ID: <1374039960.8030.7.camel@echo>

on orthros:

export GLOBUS_HOSTNAME=localhost
export GLOBUS_TCP_PORT_RANGE=x,y

then tunnel all ports between x, y on beagle to the same ports on orthros.

What that does is tell the coaster service to connect to localhost:, which is actually on beagle which is forwarded to orthros.

I think. Let me know if it works.

Mihael

On Tue, 2013-07-16 at 17:52 -0500, Michael Wilde wrote:
> Mihael,
>
> I'm trying to run automatic coasters from a host "orthros" at Argonne which is only accessible from a cryptocard-only gateway host.
>
> I was trying to use Beagle to augment the cores available on orthros.
>
> I can do a password-less ssh from orthros to beagle using my ssh agent.
>
> But coasters on beagle is not starting from orthros, presumably because the remote coaster service can not connect back to the swift client.
>
> Can we make this work by having the ssh command run by the ssh-cl proxy create a tunnel back to orthros for the remote service to connect to?
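
The tunnelling suggested in the reply above could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses OpenSSH remote forwarding (`-R`), the range 50000-50004 is a hypothetical stand-in for the unspecified "x,y", and `build_forwards` is a made-up helper name. Whether the coaster service then connects successfully is not confirmed in the thread. The script prints the ssh command instead of running it, so it can be inspected first.

```shell
#!/bin/sh
# Hypothetical port range standing in for "x,y" in the message above.
PORT_LO=50000
PORT_HI=50004

# Build one "-R port:localhost:port" option per port in the range, so that
# a connection to localhost:port on the remote host is forwarded back to
# the same port on the machine where ssh was started.
build_forwards() {
    lo=$1
    hi=$2
    args=""
    p=$lo
    while [ "$p" -le "$hi" ]; do
        args="$args -R $p:localhost:$p"
        p=$((p + 1))
    done
    # Strip the single leading space.
    echo "${args# }"
}

# Settings quoted from the thread, to be exported on orthros before swift runs.
export GLOBUS_HOSTNAME=localhost
export GLOBUS_TCP_PORT_RANGE="$PORT_LO,$PORT_HI"

# Print (do not execute) the tunnel command; -N asks ssh not to run a
# remote command and only keep the forwards open.
FORWARDS=$(build_forwards "$PORT_LO" "$PORT_HI")
echo "ssh -N $FORWARDS beagle"
```

Run on orthros, this prints a single ssh command with five `-R` options for host `beagle`. A wide port range means a long option list, which is one reason to keep the range small.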

From hategan at mcs.anl.gov Thu Jul 18 21:37:47 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 18 Jul 2013 19:37:47 -0700
Subject: [Swift-devel] app defs in tc.data optional
Message-ID: <1374201467.19468.2.camel@echo>

... is now in trunk. Documentation will probably need some updating. See comments in https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=1018

Mihael

From wilde at mcs.anl.gov Fri Jul 19 14:08:09 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 19 Jul 2013 14:08:09 -0500 (CDT)
Subject: [Swift-devel] Fwd: latest draft of data management notes
In-Reply-To: <756038696.13805860.1374250746019.JavaMail.root@mcs.anl.gov>
Message-ID: <549791899.13972445.1374260889415.JavaMail.root@mcs.anl.gov>

I've moved the doc to Google Drive, so feel free to edit and/or insert comments:

https://docs.google.com/document/d/1ylRaL2Q0CrSJKE7MrKENJkw48Koyxd2rboEHkIbI0pE/edit?usp=sharing

- Mike

From tim.g.armstrong at gmail.com Fri Jul 19 15:40:59 2013
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Fri, 19 Jul 2013 15:40:59 -0500
Subject: [Swift-devel] Swift/T mappers
Message-ID:

There is a little about this topic in the Swift/T user guide, but I thought that I could more concisely describe Swift/T mappers by relating them to Swift/K mappers.

The tl;dr version is:

1. Swift/T uses plain old vanilla functions to do everything that input mappers can do.
2. We only support output mapping individual file variables, but you can provide arbitrary string expressions, so it's very flexible.
3. Mapping arrays is feasible and on the radar, but doesn't seem to be blocking anything and will be some work.

*Input mappers*

Input mappers don't exist as a distinct feature in Swift/T. Instead you just declare an unmapped file variable, and assign it. Swift/T provides some functions to initialise arrays of files that achieve the same purpose as input mappers in Swift/K. E.g.

    file f = input_file("a_single_file.txt");
    file fs[] = glob("file*.txt");

I preferred doing things this way as you can do the same things without a special language concept of input mapping. It also makes semantics more consistent (e.g. a variable is only assigned a value via the assignment operator, rather than implicitly as in the input mappers).

*Output mappers*

The general concept of output mappers is the same as Swift/K afaict. The way I interpret it is that, if a programmer maps a variable, then assigns to that variable, then a side-effect of the assignment is that the file contents are created at or copied to that path in the file system.

We only support mapping individual files.

    file f<"output.txt">;
    f = some_function();

You can provide arbitrary string expressions as the mapping, which gives you some flexibility. E.g. if you want to create files result.1.txt, result.2.txt, ..., result.n.txt.

    file results[];
    foreach i in [1:N] {
      file f<"result." + fromint(i) + ".txt"> = some_function();
      results[i] = f;
    }

We don't have support for mapping arrays of files. I don't think there's any difficulty for language semantics, but it will need some implementation work and decisions about what mappers to support. Lack of other mappers hasn't blocked any applications so far, so it hasn't been prioritised. The current obstacles are:

- We need runtime/compiler support for attaching mappers to arrays. Fitting that logic into the STC IR will be a little convoluted but should be possible.
- It seems like there is opportunity to streamline interfaces/naming of the Swift/K mappers, but I'm not sure exactly how.

- Tim

From wozniak at mcs.anl.gov Thu Jul 25 10:30:26 2013
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Thu, 25 Jul 2013 10:30:26 -0500
Subject: [Swift-devel] dryrun for user
Message-ID: <51F14492.3040308@mcs.anl.gov>

I was trying to get dryrun working for a user yesterday but can't figure out where the output went. Is anyone else currently using this?

Thanks

--
Justin M Wozniak

From wilde at mcs.anl.gov Thu Jul 25 10:43:29 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 25 Jul 2013 10:43:29 -0500 (CDT)
Subject: [Swift-devel] dryrun for user
In-Reply-To: <51F14492.3040308@mcs.anl.gov>
Message-ID: <340996760.16185991.1374767009759.JavaMail.root@mcs.anl.gov>

Looking at the user guide, I suspect -dryrun was meant to be used with these properties to generate a GraphViz "dot" file:

    pgraph
    pgraph.graph.options
    pgraph.node.options

- Mike

From hategan at mcs.anl.gov Thu Jul 25 10:48:41 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 25 Jul 2013 08:48:41 -0700
Subject: [Swift-devel] dryrun for user
In-Reply-To: <51F14492.3040308@mcs.anl.gov>
References: <51F14492.3040308@mcs.anl.gov>
Message-ID: <1374767321.22310.5.camel@echo>

Dryrun should produce no output. It fakes app execution, but runs through the dependencies.

Mihael

From wozniak at mcs.anl.gov Thu Jul 25 11:05:17 2013
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Thu, 25 Jul 2013 11:05:17 -0500
Subject: [Swift-devel] sourceforge SVN changes
In-Reply-To: <1069254459.3187631.1371167769750.JavaMail.root@mcs.anl.gov>
References: <1069254459.3187631.1371167769750.JavaMail.root@mcs.anl.gov>
Message-ID: <51F14CBD.4010008@mcs.anl.gov>

Are these repos up to date on the web yet?

On 06/13/2013 06:56 PM, Michael Wilde wrote:
>
> This worked for me, per Mihael:
> svn co https://svn.code.sf.net/p/cogkit/svn/branches/4.1.10/src/cog/
>
> David, can you update the 0.94 source download instructions?
>
> Thanks,
>
> - Mike
>
>
> ----- Original Message -----
>> From: "Mihael Hategan"
>> To: "Swift Devel"
>> Sent: Sunday, June 2, 2013 9:44:09 PM
>> Subject: [Swift-devel] sourceforge SVN changes
>>
>> Hi,
>>
>> Sourceforge has migrated the svn repos. If you have an existing checkout you should:
>>
>> 1. make sure you have your public key in your account:
>> https://sourceforge.net/account/ssh
>>
>> 2. do a 'svn relocate "svn+ssh://@svn.code.sf.net/p/cogkit/svn/"'
>>
>> Also, we should update our checkout instructions. I believe they still support https in theory, but when I tried "https" instead of "svn+ssh" I got an "Internal server error".
>>
>> Mihael

--
Justin M Wozniak

From wozniak at mcs.anl.gov Thu Jul 25 11:06:20 2013
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Thu, 25 Jul 2013 11:06:20 -0500
Subject: [Swift-devel] dryrun for user
In-Reply-To: <1374767321.22310.5.camel@echo>
References: <51F14492.3040308@mcs.anl.gov> <1374767321.22310.5.camel@echo>
Message-ID: <51F14CFC.1060401@mcs.anl.gov>

Ok. I currently get:

    swift -dryrun -pgraph o.dot ~/hello-k.swift

    Could not start execution:
    import @ hello-k.kml, line: 5: org.globus.cog.karajan.analyzer.CompilationException: Import of 'swift.k' failed:
    generateProvenanceGraph @ swift.k, line: 132: org.globus.cog.karajan.analyzer.CompilationException: Unknown function: generateProvenanceGraph:
    java.lang.RuntimeException: Unknown function: generateProvenanceGraph

--
Justin M Wozniak

From hategan at mcs.anl.gov Thu Jul 25 12:46:15 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 25 Jul 2013 10:46:15 -0700
Subject: [Swift-devel] dryrun for user
In-Reply-To: <51F14CFC.1060401@mcs.anl.gov>
References: <51F14492.3040308@mcs.anl.gov> <1374767321.22310.5.camel@echo> <51F14CFC.1060401@mcs.anl.gov>
Message-ID: <1374774375.3589.0.camel@echo>

Looks like a bug. Trunk?

From davidk at ci.uchicago.edu Thu Jul 25 13:52:40 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Thu, 25 Jul 2013 13:52:40 -0500 (CDT)
Subject: [Swift-devel] sourceforge SVN changes
In-Reply-To: <51F14CBD.4010008@mcs.anl.gov>
Message-ID: <1030873608.14567037.1374778360156.JavaMail.root@ci.uchicago.edu>

They should be up to date now.

----- Original Message -----
> From: "Justin M Wozniak"
> To: swift-devel at ci.uchicago.edu
> Sent: Thursday, July 25, 2013 11:05:17 AM
> Subject: Re: [Swift-devel] sourceforge SVN changes

> Are these repos up to date on the web yet?

From wozniak at mcs.anl.gov Thu Jul 25 13:59:04 2013
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Thu, 25 Jul 2013 13:59:04 -0500
Subject: [Swift-devel] dryrun for user
In-Reply-To: <1374774375.3589.0.camel@echo>
References: <51F14492.3040308@mcs.anl.gov> <1374767321.22310.5.camel@echo> <51F14CFC.1060401@mcs.anl.gov> <1374774375.3589.0.camel@echo>
Message-ID: <51F17578.9020203@mcs.anl.gov>

Yes.

On 07/25/2013 12:46 PM, Mihael Hategan wrote:
> Looks like a bug. Trunk?

--
Justin M Wozniak

From hategan at mcs.anl.gov Thu Jul 25 15:32:25 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 25 Jul 2013 13:32:25 -0700
Subject: [Swift-devel] dryrun for user
In-Reply-To: <51F17578.9020203@mcs.anl.gov>
References: <51F14492.3040308@mcs.anl.gov> <1374767321.22310.5.camel@echo> <51F14CFC.1060401@mcs.anl.gov> <1374774375.3589.0.camel@echo> <51F17578.9020203@mcs.anl.gov>
Message-ID: <1374784345.26623.1.camel@echo>

I think dryrun got removed in trunk (in the execution path, but I forgot to remove it from the command line).

It was there along with "typecheck", which doesn't apply any more since the swift compiler does the typechecking.

Is there a real need for dryrun any more?

Mihael

From wozniak at mcs.anl.gov Thu Jul 25 15:59:05 2013
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Thu, 25 Jul 2013 15:59:05 -0500
Subject: [Swift-devel] dryrun for user
In-Reply-To: <1374784345.26623.1.camel@echo>
References: <51F14492.3040308@mcs.anl.gov> <1374767321.22310.5.camel@echo> <51F14CFC.1060401@mcs.anl.gov> <1374774375.3589.0.camel@echo> <51F17578.9020203@mcs.anl.gov> <1374784345.26623.1.camel@echo>
Message-ID: <51F19199.3070605@mcs.anl.gov>

No, I don't really need it - we can drop it.

--
Justin M Wozniak

From wilde at mcs.anl.gov Fri Jul 26 17:36:05 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 26 Jul 2013 17:36:05 -0500 (CDT)
Subject: [Swift-devel] swift swing gui testing
In-Reply-To: <702606509.17199043.1374877514160.JavaMail.root@mcs.anl.gov>
Message-ID: <263281022.17199508.1374878165781.JavaMail.root@mcs.anl.gov>

OK, I downloaded jfreechart and copied most of its jars to swift/lib

The GUI came up, and did *some* stuff, but not everything. E.g., the summary panel showed all zeros, but I did get some graphs and other outputs.

I'll keep fiddling...

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Mihael Hategan"
> Sent: Friday, July 26, 2013 5:25:14 PM
> Subject: swift swing gui testing
>
> Mihael, do you need to commit a jFreeChart jar? I get the error below.
>
> Also, do I need a "user preferences directory"?
>
> - Mike
>
> swift$ swift -version
> Swift trunk swift-r6668 cog-r3748
>
> swift$ swift -ui Swing catsnsleep.swift -n=10 -s=10
> Jul 26, 2013 10:22:15 PM java.util.prefs.FileSystemPreferences$1 run
> INFO: Created user preferences directory.
> java.lang.NoClassDefFoundError: org/jfree/data/xy/XYDataset
> Exception in thread "main" java.lang.NoClassDefFoundError: org/jfree/data/xy/XYDataset
>     at org.griphyn.vdl.karajan.monitor.monitors.swing.GraphsPanel.setDefaultLayout(GraphsPanel.java:206)
>     at org.griphyn.vdl.karajan.monitor.monitors.swing.GraphsPanel.loadLayout(GraphsPanel.java:193)
>     at org.griphyn.vdl.karajan.monitor.monitors.swing.GraphsPanel.(GraphsPanel.java:110)
>     at org.griphyn.vdl.karajan.monitor.monitors.swing.SwingMonitor.createTabs(SwingMonitor.java:119)
>     at org.griphyn.vdl.karajan.monitor.monitors.swing.SwingMonitor.setState(SwingMonitor.java:98)
>     at org.griphyn.vdl.karajan.monitor.MonitorAppender.(MonitorAppender.java:68)
>     at org.griphyn.vdl.karajan.Loader.setupLogging(Loader.java:581)
>     at org.griphyn.vdl.karajan.Loader.main(Loader.java:147)
> Caused by: java.lang.ClassNotFoundException: org.jfree.data.xy.XYDataset
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
>     ... 8 more
> swift$
>
>
> ----- Original Message -----
> > From: "Mihael Hategan"
> > To: "Michael Wilde"
> > Cc: "David Kelly", "Yadu Nand B", "Ketan Maheshwari", "Justin M Wozniak", "Tim Armstrong", "Scott Krieder"
> > Sent: Friday, July 26, 2013 4:31:29 PM
> > Subject: Re: Swift meeting Fri 2PM
> >
> > On Fri, 2013-07-26 at 16:12 -0500, Michael Wilde wrote:
> > > Mihael, this sounds very cool. :)
> > >
> > > Could you (re)send a pointer to Swift-devel with the info below and a pointer on how to use it?
> >
> > In trunk you would say "swift -ui Swing ...". The rest should be self explanatory.
> >
> > Mihael
> >
> > >
> > > Also to note: the swift performance plotting info in the User Guide describes Justin's adaptation of Ben's original log plotter, but uses jFreeChart as well - so we might be able to refactor the log processing and plotting into a common code base to feed both static plots and dynamic (status-monitor) plots.
> >
> > Are we talking about the code in /usertools/plotter?
> >
> > That seems to be a command-line plotting tool, like a replacement for gnuplot if I'm not missing something. I'm more concerned with replacing how the data gets to the plotting library, but again, I might be missing something.
> >
> > Mihael

From wilde at mcs.anl.gov Fri Jul 26 17:43:24 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 26 Jul 2013 17:43:24 -0500 (CDT)
Subject: [Swift-devel] swift swing gui testing
In-Reply-To: <263281022.17199508.1374878165781.JavaMail.root@mcs.anl.gov>
Message-ID: <1679951342.17206799.1374878604083.JavaMail.root@mcs.anl.gov>

Attached is what I see with a test of 10 "catsnsleep" jobs with 10-second sleeps.

I think everything is working except for the summary table.
- Mike ----- Original Message ----- > From: "Michael Wilde" > To: "Mihael Hategan" > Cc: "Swift Devel" > Sent: Friday, July 26, 2013 5:36:05 PM > Subject: Re: swift swing gui testing > > OK, I downloaded jfreechart and copied most of its jars to swift/lib > > The GUI came up, and did *some* stuff, but not everything. Eg the > summary panel showed all zeros, but I did get some graphs and other > outputs. > > I'll keep fiddling... > > - Mike > > > ----- Original Message ----- > > From: "Michael Wilde" > > To: "Mihael Hategan" > > Sent: Friday, July 26, 2013 5:25:14 PM > > Subject: swift swing gui testing > > > > > > Mihael, do you need to commit a jFreeChart jar? I get the error > > below. > > > > Also, do I need a "user preferences directory" ? > > > > - Mike > > > > swift$ swift -version > > Swift trunk swift-r6668 cog-r3748 > > > > swift$ swift -ui Swing catsnsleep.swift -n=10 -s=10 > > Jul 26, 2013 10:22:15 PM java.util.prefs.FileSystemPreferences$1 > > run > > INFO: Created user preferences directory. 
> > java.lang.NoClassDefFoundError: org/jfree/data/xy/XYDataset > > Exception in thread "main" java.lang.NoClassDefFoundError: > > org/jfree/data/xy/XYDataset > > at > > org.griphyn.vdl.karajan.monitor.monitors.swing.GraphsPanel.setDefaultLayout(GraphsPanel.java:206) > > at > > org.griphyn.vdl.karajan.monitor.monitors.swing.GraphsPanel.loadLayout(GraphsPanel.java:193) > > at > > org.griphyn.vdl.karajan.monitor.monitors.swing.GraphsPanel.(GraphsPanel.java:110) > > at > > org.griphyn.vdl.karajan.monitor.monitors.swing.SwingMonitor.createTabs(SwingMonitor.java:119) > > at > > org.griphyn.vdl.karajan.monitor.monitors.swing.SwingMonitor.setState(SwingMonitor.java:98) > > at > > org.griphyn.vdl.karajan.monitor.MonitorAppender.(MonitorAppender.java:68) > > at org.griphyn.vdl.karajan.Loader.setupLogging(Loader.java:581) > > at org.griphyn.vdl.karajan.Loader.main(Loader.java:147) > > Caused by: java.lang.ClassNotFoundException: > > org.jfree.data.xy.XYDataset > > at java.net.URLClassLoader$1.run(URLClassLoader.java:366) > > at java.net.URLClassLoader$1.run(URLClassLoader.java:355) > > at java.security.AccessController.doPrivileged(Native Method) > > at java.net.URLClassLoader.findClass(URLClassLoader.java:354) > > at java.lang.ClassLoader.loadClass(ClassLoader.java:423) > > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308) > > at java.lang.ClassLoader.loadClass(ClassLoader.java:356) > > ... 8 more > > swift$ > > > > > > ----- Original Message ----- > > > From: "Mihael Hategan" > > > To: "Michael Wilde" > > > Cc: "David Kelly" , "Yadu Nand B" > > > , "Ketan Maheshwari" > > > , "Justin M Wozniak" , > > > "Tim > > > Armstrong" , "Scott Krieder" > > > > > > Sent: Friday, July 26, 2013 4:31:29 PM > > > Subject: Re: Swift meeting Fri 2PM > > > > > > On Fri, 2013-07-26 at 16:12 -0500, Michael Wilde wrote: > > > > Mihael, this sounds very cool. 
:) > > > > > > > > Could you (re)send a pointer to Swift-devel with the info below > > > > and > > > > a pointer on how to use it? > > > > > > In trunk you would say "swift -ui Swing ...". The rest should be > > > self > > > explanatory. > > > > > > Mihael > > > > > > > > > > > Also to note: the swift performance plotting info in the User > > > > Guide > > > > describes Justin's adaptation of Ben's original log plotter, > > > > but > > > > uses > > > > jFreeChart as well - so we might be able to refactor the log > > > > processing and plotting into a common code base to feed both > > > > static > > > > plots and dynamic (status-monitor) plots. > > > > > > Are we talking about the code in /usertools/plotter? > > > > > > That seems to be a command-line plotting tool, like a replacement > > > for > > > gnuplot if I'm not missing something. I'm more concerned with > > > replacing > > > how the data gets to the plotting library, but again, I might be > > > missing > > > something. > > > > > > Mihael > > > > > > > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: Screen shot 2013-07-26 at 5.41.19 PM.png Type: image/png Size: 96784 bytes Desc: not available URL: From hategan at mcs.anl.gov Sat Jul 27 13:20:39 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sat, 27 Jul 2013 11:20:39 -0700 Subject: [Swift-devel] swift swing gui testing In-Reply-To: <1679951342.17206799.1374878604083.JavaMail.root@mcs.anl.gov> References: <1679951342.17206799.1374878604083.JavaMail.root@mcs.anl.gov> Message-ID: <1374949239.2275.0.camel@echo> On Fri, 2013-07-26 at 17:43 -0500, Michael Wilde wrote: > attached is what I see with a test of 10 "catsnsleep" jobs with 10-second sleeps. > > I think everything is working except for the summary table. Sorry. Fixed.
Mihael From wilde at mcs.anl.gov Sat Jul 27 17:00:56 2013 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 27 Jul 2013 17:00:56 -0500 (CDT) Subject: [Swift-devel] swift swing gui testing In-Reply-To: <1374949239.2275.0.camel@echo> Message-ID: <342767136.17371933.1374962456876.JavaMail.root@mcs.anl.gov> Thanks, Mihael, that works now. I think the next steps on this are to use and refine the interface, and then translate to a web-based version? - Mike ----- Original Message ----- > From: "Mihael Hategan" > To: "Michael Wilde" > Cc: "Swift Devel" > Sent: Saturday, July 27, 2013 1:20:39 PM > Subject: Re: swift swing gui testing > > On Fri, 2013-07-26 at 17:43 -0500, Michael Wilde wrote: > > attached is what I see with a test of 10 "catsnsleep" jobs with > > 10-second sleeps. > > > > I think everything is working except for the summary table. > > Sorry. Fixed. > > Mihael > > From hategan at mcs.anl.gov Sat Jul 27 17:35:33 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sat, 27 Jul 2013 15:35:33 -0700 Subject: [Swift-devel] swift swing gui testing In-Reply-To: <342767136.17371933.1374962456876.JavaMail.root@mcs.anl.gov> References: <342767136.17371933.1374962456876.JavaMail.root@mcs.anl.gov> Message-ID: <1374964533.4887.0.camel@echo> On Sat, 2013-07-27 at 17:00 -0500, Michael Wilde wrote: > Thanks, Mihael, that works now. > > I think the next steps on this are to use and refine the interface, and then translate to a web-based version? Yes. When I said feedback, I meant mostly from a perspective of how useful you find it and what design-wise comments you have. 
Mihael From davidk at ci.uchicago.edu Sun Jul 28 16:17:38 2013 From: davidk at ci.uchicago.edu (David Kelly) Date: Sun, 28 Jul 2013 16:17:38 -0500 (CDT) Subject: [Swift-devel] swift swing gui testing In-Reply-To: <1374964533.4887.0.camel@echo> Message-ID: <709090040.15042491.1375046258805.JavaMail.root@ci.uchicago.edu> I can't seem to get trunk compiled to test it out: compile: [echo] [swift]: COMPILE [mkdir] Created dir: /Users/davidk/swift-trunk/cog/modules/swift/build [javac] Compiling 427 source files to /Users/davidk/swift-trunk/cog/modules/swift/build [javac] /Users/davidk/swift-trunk/cog/modules/swift/src/org/griphyn/vdl/karajan/monitor/monitors/http/HTTPServer.java:72: cannot find symbol [javac] symbol : method bind(java.net.InetSocketAddress) [javac] location: class java.nio.channels.ServerSocketChannel [javac] channel.bind(new InetSocketAddress(port)); [javac] ^ [javac] /Users/davidk/swift-trunk/cog/modules/swift/src/org/griphyn/vdl/karajan/monitor/monitors/swing/GridView.java:344: cannot find symbol [javac] symbol : method revalidate() [javac] location: class java.awt.Container [javac] getParent().revalidate(); [javac] ^ [javac] Note: Some input files use unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. [javac] 2 errors ----- Original Message ----- > From: "Mihael Hategan" > To: "Michael Wilde" > Cc: "Swift Devel" > Sent: Saturday, July 27, 2013 5:35:33 PM > Subject: Re: [Swift-devel] swift swing gui testing > On Sat, 2013-07-27 at 17:00 -0500, Michael Wilde wrote: > > Thanks, Mihael, that works now. > > > > I think the next steps on this are to use and refine the interface, > > and then translate to a web-based version? > Yes. > When I said feedback, I meant mostly from a perspective of how useful > you find it and what design-wise comments you have. 
> Mihael > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Sun Jul 28 17:32:48 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sun, 28 Jul 2013 15:32:48 -0700 Subject: [Swift-devel] swift swing gui testing In-Reply-To: <709090040.15042491.1375046258805.JavaMail.root@ci.uchicago.edu> References: <709090040.15042491.1375046258805.JavaMail.root@ci.uchicago.edu> Message-ID: <1375050768.20487.2.camel@echo> I'm guessing you are using a pre 1.7 jdk. I'll try to make these 1.6 compatible, but 1.6 has been EOL-ed by Oracle. Mihael On Sun, 2013-07-28 at 16:17 -0500, David Kelly wrote: > I can't seem to get trunk compiled to test it out: > > compile: > [echo] [swift]: COMPILE > [mkdir] Created dir: /Users/davidk/swift-trunk/cog/modules/swift/build > [javac] Compiling 427 source files to /Users/davidk/swift-trunk/cog/modules/swift/build > [javac] /Users/davidk/swift-trunk/cog/modules/swift/src/org/griphyn/vdl/karajan/monitor/monitors/http/HTTPServer.java:72: cannot find symbol > [javac] symbol : method bind(java.net.InetSocketAddress) > [javac] location: class java.nio.channels.ServerSocketChannel > [javac] channel.bind(new InetSocketAddress(port)); > [javac] ^ > [javac] /Users/davidk/swift-trunk/cog/modules/swift/src/org/griphyn/vdl/karajan/monitor/monitors/swing/GridView.java:344: cannot find symbol > [javac] symbol : method revalidate() > [javac] location: class java.awt.Container > [javac] getParent().revalidate(); > [javac] ^ > [javac] Note: Some input files use unchecked or unsafe operations. > [javac] Note: Recompile with -Xlint:unchecked for details. 
> [javac] 2 errors > ----- Original Message ----- > > > From: "Mihael Hategan" > > To: "Michael Wilde" > > Cc: "Swift Devel" > > Sent: Saturday, July 27, 2013 5:35:33 PM > > Subject: Re: [Swift-devel] swift swing gui testing > > > On Sat, 2013-07-27 at 17:00 -0500, Michael Wilde wrote: > > > Thanks, Mihael, that works now. > > > > > > I think the next steps on this are to use and refine the interface, > > > and then translate to a web-based version? > > > Yes. > > > When I said feedback, I meant mostly from a perspective of how useful > > you find it and what design-wise comments you have. > > > Mihael > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From davidk at ci.uchicago.edu Sun Jul 28 20:37:52 2013 From: davidk at ci.uchicago.edu (David Kelly) Date: Sun, 28 Jul 2013 20:37:52 -0500 (CDT) Subject: [Swift-devel] swift swing gui testing In-Reply-To: <1375050768.20487.2.camel@echo> Message-ID: <1541078622.15060562.1375061872146.JavaMail.root@ci.uchicago.edu> That makes sense. I'm running with Java 1.6. The 1.7 JDKs won't install on my laptop because it apparently requires OS X 10.7.3 or newer. I'll see if I can test with X11 forwarding over ssh for now until I can upgrade. ----- Original Message ----- > From: "Mihael Hategan" > To: "David Kelly" > Cc: "Swift Devel" , "Michael Wilde" > > Sent: Sunday, July 28, 2013 5:32:48 PM > Subject: Re: [Swift-devel] swift swing gui testing > I'm guessing you are using a pre 1.7 jdk. I'll try to make these 1.6 > compatible, but 1.6 has been EOL-ed by Oracle.
> Mihael > On Sun, 2013-07-28 at 16:17 -0500, David Kelly wrote: > > I can't seem to get trunk compiled to test it out: > > > > compile: > > [echo] [swift]: COMPILE > > [mkdir] Created dir: > > /Users/davidk/swift-trunk/cog/modules/swift/build > > [javac] Compiling 427 source files to > > /Users/davidk/swift-trunk/cog/modules/swift/build > > [javac] > > /Users/davidk/swift-trunk/cog/modules/swift/src/org/griphyn/vdl/karajan/monitor/monitors/http/HTTPServer.java:72: > > cannot find symbol > > [javac] symbol : method bind(java.net.InetSocketAddress) > > [javac] location: class java.nio.channels.ServerSocketChannel > > [javac] channel.bind(new InetSocketAddress(port)); > > [javac] ^ > > [javac] > > /Users/davidk/swift-trunk/cog/modules/swift/src/org/griphyn/vdl/karajan/monitor/monitors/swing/GridView.java:344: > > cannot find symbol > > [javac] symbol : method revalidate() > > [javac] location: class java.awt.Container > > [javac] getParent().revalidate(); > > [javac] ^ > > [javac] Note: Some input files use unchecked or unsafe operations. > > [javac] Note: Recompile with -Xlint:unchecked for details. > > [javac] 2 errors > > ----- Original Message ----- > > > > > From: "Mihael Hategan" > > > To: "Michael Wilde" > > > Cc: "Swift Devel" > > > Sent: Saturday, July 27, 2013 5:35:33 PM > > > Subject: Re: [Swift-devel] swift swing gui testing > > > > > On Sat, 2013-07-27 at 17:00 -0500, Michael Wilde wrote: > > > > Thanks, Mihael, that works now. > > > > > > > > I think the next steps on this are to use and refine the > > > > interface, > > > > and then translate to a web-based version? > > > > > Yes. > > > > > When I said feedback, I meant mostly from a perspective of how > > > useful > > > you find it and what design-wise comments you have. 
> > > > > Mihael > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Sun Jul 28 22:51:56 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sun, 28 Jul 2013 20:51:56 -0700 Subject: [Swift-devel] swift swing gui testing In-Reply-To: <1541078622.15060562.1375061872146.JavaMail.root@ci.uchicago.edu> References: <1541078622.15060562.1375061872146.JavaMail.root@ci.uchicago.edu> Message-ID: <1375069916.25173.1.camel@echo> On Sun, 2013-07-28 at 20:37 -0500, David Kelly wrote: > That makes sense. I'm running with Java 1.6. The 1.7 JDKs won't > install on my laptop because it apparently requires OS X 10.7.3 or > newer. I'll see if I can test with X11 forwarding over ssh for now > until I can upgrade. I'll fix it. I think our unofficial policy is to support at least the two latest versions of Java. The target for swift is often supercomputers with weird setups that are laggy in having the latest Java version. I think BG/P was a good example of this. Mihael From hategan at mcs.anl.gov Mon Jul 29 01:22:19 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sun, 28 Jul 2013 23:22:19 -0700 Subject: [Swift-devel] local tests Message-ID: <1375078939.28255.0.camel@echo> What happened to tests/sites/local in 0.94? Mihael From yadudoc1729 at gmail.com Mon Jul 29 02:23:11 2013 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Mon, 29 Jul 2013 02:23:11 -0500 Subject: [Swift-devel] local tests In-Reply-To: <1375078939.28255.0.camel@echo> References: <1375078939.28255.0.camel@echo> Message-ID: Hi Mihael, Could you be a bit more specific about tests/sites/local ? I do not think I've made any changes to that folder yet. Have you noticed that a huge majority of tests are reporting failures since the 26th on Bridled?
This is from the 25th -> http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-25/tests-2013-07-25.html And from the 26th -> http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-26/tests-2013-07-26.html Curiously the tests since the 22nd have been failing for trunk. -Yadu- On Mon, Jul 29, 2013 at 1:22 AM, Mihael Hategan wrote: > What happened to tests/sites/local in 0.94? > > Mihael > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Yadu Nand B From hategan at mcs.anl.gov Mon Jul 29 02:46:59 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 29 Jul 2013 00:46:59 -0700 Subject: [Swift-devel] local tests In-Reply-To: References: <1375078939.28255.0.camel@echo> Message-ID: <1375084019.9628.3.camel@echo> On Mon, 2013-07-29 at 02:23 -0500, Yadu Nand wrote: > Hi Mihael, > > Could you be a bit more specific about tests/sites/local ? I do not > think I've made any changes to that folder yet. It's missing in SVN: https://trac.ci.uchicago.edu/swift/browser/branches/release-0.94/tests/sites > > Have you noticed that a huge majority of tests are reporting failures > since the 26th on Bridled? > This is from the 25th -> > http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-25/tests-2013-07-25.html > And from the 26th -> > http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-26/tests-2013-07-26.html That's because that directory is missing. It contains the sites file that the tests are supposed to be using. > > Curiously the tests since the 22nd have been failing for trunk. http://www.ci.uchicago.edu/swift/tests/swift-trunk/run-2013-07-22/0041-int-assignment-011635/0041-int-assignment.stdout I don't think the tests are failing. They don't seem to be working at all. 
Mihael From davidk at ci.uchicago.edu Mon Jul 29 08:47:23 2013 From: davidk at ci.uchicago.edu (David Kelly) Date: Mon, 29 Jul 2013 08:47:23 -0500 (CDT) Subject: [Swift-devel] local tests In-Reply-To: <1375084019.9628.3.camel@echo> Message-ID: <1614352190.15095893.1375105643682.JavaMail.root@ci.uchicago.edu> This was related to some changes I am making, sorry about that. The tests/sites/local and tests/sites/local-coasters directories have been restored. I think this should fix the issue with failing tests. I tried to re-run the test suite to verify, but I am running into this when I try to build: compile: [echo] [provider-coaster]: COMPILE [mkdir] Created dir: /swift/swift-0.94/cog/modules/provider-coaster/build [javac] Compiling 134 source files to /swift/swift-0.94/cog/modules/provider-coaster/build [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.5 [javac] /swift/swift-0.94/cog/modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/handlers/DeleteHandler.java:21: error: method normalize in class CoasterFileRequestHandler cannot be applied to given types; [javac] File f = normalize(getInDataAsString(0)); [javac] ^ [javac] required: RemoteFile [javac] found: String [javac] reason: actual argument String cannot be converted to RemoteFile by method invocation conversion [javac] /swift/swift-0.94/cog/modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/handlers/ExistsHandler.java:21: error: method normalize in class CoasterFileRequestHandler cannot be applied to given types; ----- Original Message ----- > From: "Mihael Hategan" > To: "Yadu Nand" > Cc: "Swift Devel" > Sent: Monday, July 29, 2013 2:46:59 AM > Subject: Re: [Swift-devel] local tests > On Mon, 2013-07-29 at 02:23 -0500, Yadu Nand wrote: > > Hi Mihael, > > > > Could you be a bit more specific about tests/sites/local ? I do not > > think I've made any changes to that folder yet. 
> It's missing in SVN: > https://trac.ci.uchicago.edu/swift/browser/branches/release-0.94/tests/sites > > > > Have you noticed that a huge majority of tests are reporting > > failures > > since the 26th on Bridled? > > This is from the 25th -> > > http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-25/tests-2013-07-25.html > > And from the 26th -> > > http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-26/tests-2013-07-26.html > That's because that directory is missing. It contains the sites file > that the tests are supposed to be using. > > > > Curiously the tests since the 22nd have been failing for trunk. > http://www.ci.uchicago.edu/swift/tests/swift-trunk/run-2013-07-22/0041-int-assignment-011635/0041-int-assignment.stdout > I don't think the tests are failing. They don't seem to be working at > all. > Mihael > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From yadudoc1729 at gmail.com Mon Jul 29 09:46:42 2013 From: yadudoc1729 at gmail.com (Yadu Nand) Date: Mon, 29 Jul 2013 09:46:42 -0500 Subject: [Swift-devel] local tests In-Reply-To: <1614352190.15095893.1375105643682.JavaMail.root@ci.uchicago.edu> References: <1375084019.9628.3.camel@echo> <1614352190.15095893.1375105643682.JavaMail.root@ci.uchicago.edu> Message-ID: > This was related to some changes I am making, sorry about that. The > tests/sites/local and tests/sites/local-coasters directories have been > restored. I think this should fix the issue with failing tests. I tried to > re-run the test suite to verify, but I am running into this when I try to > build: Okay, great. I'll rerun the tests and start working on better notification of failures. 
-Yadu- From hategan at mcs.anl.gov Mon Jul 29 11:28:53 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 29 Jul 2013 09:28:53 -0700 Subject: [Swift-devel] local tests In-Reply-To: <1614352190.15095893.1375105643682.JavaMail.root@ci.uchicago.edu> References: <1614352190.15095893.1375105643682.JavaMail.root@ci.uchicago.edu> Message-ID: <1375115333.11566.1.camel@echo> Ooops. Sorry about the compilation error. Should be fixed now. In case anybody is wondering, it turns out that spaces in file names were a problem with coasters and somehow I've been missing it for some time. Mihael On Mon, 2013-07-29 at 08:47 -0500, David Kelly wrote: > This was related to some changes I am making, sorry about that. The tests/sites/local and tests/sites/local-coasters directories have been restored. I think this should fix the issue with failing tests. I tried to re-run the test suite to verify, but I am running into this when I try to build: > > compile: > [echo] [provider-coaster]: COMPILE > [mkdir] Created dir: /swift/swift-0.94/cog/modules/provider-coaster/build [javac] Compiling 134 source files to /swift/swift-0.94/cog/modules/provider-coaster/build > [javac] warning: [options] bootstrap class path not set in conjunction with -source 1.5 > [javac] /swift/swift-0.94/cog/modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/handlers/DeleteHandler.java:21: error: method normalize in class CoasterFileRequestHandler cannot > be applied to given types; > [javac] File f = normalize(getInDataAsString(0)); > [javac] ^ > [javac] required: RemoteFile [javac] found: String > [javac] reason: actual argument String cannot be converted to RemoteFile by method invocation conversion > [javac] /swift/swift-0.94/cog/modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/handlers/ExistsHandler.java:21: error: method normalize in class CoasterFileRequestHandler cannot > be applied to given types; > > ----- Original Message ----- > > > From: "Mihael 
Hategan" > > To: "Yadu Nand" > > Cc: "Swift Devel" > > Sent: Monday, July 29, 2013 2:46:59 AM > > Subject: Re: [Swift-devel] local tests > > > On Mon, 2013-07-29 at 02:23 -0500, Yadu Nand wrote: > > > Hi Mihael, > > > > > > Could you be a bit more specific about tests/sites/local ? I do not > > > think I've made any changes to that folder yet. > > > It's missing in SVN: > > https://trac.ci.uchicago.edu/swift/browser/branches/release-0.94/tests/sites > > > > > > > Have you noticed that a huge majority of tests are reporting > > > failures > > > since the 26th on Bridled? > > > This is from the 25th -> > > > http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-25/tests-2013-07-25.html > > > And from the 26th -> > > > http://www.ci.uchicago.edu/swift/tests/swift-0.94/run-2013-07-26/tests-2013-07-26.html > > > That's because that directory is missing. It contains the sites file > > that the tests are supposed to be using. > > > > > > > Curiously the tests since the 22nd have been failing for trunk. > > > http://www.ci.uchicago.edu/swift/tests/swift-trunk/run-2013-07-22/0041-int-assignment-011635/0041-int-assignment.stdout > > > I don't think the tests are failing. They don't seem to be working at > > all. > > > Mihael > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From dsk at zimbra.ci.anl.gov Fri Jul 19 16:15:20 2013 From: dsk at zimbra.ci.anl.gov (Daniel S. Katz) Date: Fri, 19 Jul 2013 21:15:20 -0000 Subject: [Swift-devel] Swift/T mappers In-Reply-To: References: Message-ID: <792FA064-D303-4945-BF17-43EC56EC1750@zimbra.ci.anl.gov> Thanks Tim, this is quite clear and helpful. Dan On Jul 19, 2013, at 16:40, Tim Armstrong wrote: > There is a little about this topic in the Swift/T user guide, but I thought that I could more concisely describe them by relating them to Swift/K mappers. The tl;dr version is: > Swift/T uses plain old vanilla functions to do everything that input mappers can do. > We only support output mapping individual file variables, but you can provide arbitrary string expressions, so it's very flexible. > Mapping arrays is feasible and on the radar, but doesn't seem to be blocking anything and will be some work. > Input mappers > Input mappers don't exist as a distinct feature in Swift/T. Instead you just declare an unmapped file variable, and assign it. Swift/T provides some functions to initialise arrays of files that achieve the same purpose as input mappers in Swift/K. > E.g.
> > file f = input_file("a_single_file.txt"); > file fs[] = glob("file*.txt"); > > I preferred doing things this way as you can do the same things without a special language concept of input mapping. It also makes semantics more consistent (e.g. a variable is only assigned a value via the assignment operator, rather than implicitly as in the input mappers). > > Output mappers > The general concept of output mappers is the same as Swift/K afaict. The way I interpret it is that, if a programmer maps a variable, then assigns to that variable, then a side-effect of the assignment is that the file contents are created at or copied to that path in the file system. We only support mapping individual files. > > file f<"output.txt">; > f = some_function(); > > You can provide arbitrary string expressions as the mapping, which gives you some flexibility. E.g. if you want to create files result.1.txt, result.2.txt, ..., result.n.txt. > > file results[]; > foreach i in [1:N] { > file f<"result." + fromint(i) + ".txt"> = some_function(); > results[i] = f; > } > > We don't have support for mapping arrays of files. I don't think there's any difficulty for language semantics, but it will need some implementation work and decisions about what mappers to support. Lack of other mappers hasn't blocked any applications so far, so it hasn't been prioritised. The current obstacles are: > We need runtime/compiler support for attaching mappers to arrays. Fitting that logic into the STC IR will be a little convoluted but should be possible. > It seems like there is opportunity to streamline interfaces/naming of the Swift/K mappers, but I'm not sure exactly how. > - Tim > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed...
URL: From iraicu at gmail.com Wed Jul 31 12:54:57 2013 From: iraicu at gmail.com (Ioan Raicu) Date: Wed, 31 Jul 2013 17:54:57 -0000 Subject: [Swift-devel] dryrun for user In-Reply-To: <51F19199.3070605@mcs.anl.gov> References: <51F14492.3040308@mcs.anl.gov> <1374767321.22310.5.camel@echo> <51F14CFC.1060401@mcs.anl.gov> <1374774375.3589.0.camel@echo> <51F17578.9020203@mcs.anl.gov> <1374784345.26623.1.camel@echo> <51F19199.3070605@mcs.anl.gov> Message-ID: <49BE93F1-0D28-4B95-B61D-6F512F6A308F@gmail.com> I was thinking of using it to loosely couple Swift with some of the other work I am doing in my lab, specifically other run-times, as well as simulators. If it's not there anymore, we'll think of some other way to get the DAG representation out of Swift. Ioan -- ================================================ Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================ Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================ Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================ ================================================ On Jul 25, 2013, at 10:59 AM, Justin M Wozniak wrote: > > No, I don't really need it- we can drop it. > > On 07/25/2013 03:32 PM, Mihael Hategan wrote: >> I think dryrun got removed in trunk (in the execution path, but I forgot >> to remove it from the command line). >> >> It was there along with "typecheck", which doesn't apply any more since >> the swift compiler does the typechecking. >> >> Is there a real need for dryrun any more? >> >> Mihael >> >> On Thu, 2013-07-25 at 13:59 -0500, Justin M Wozniak wrote: >>> Yes. >>> >>> On 07/25/2013 12:46 PM, Mihael Hategan wrote: >>>> Looks like a bug. Trunk?
>>>> >>>> On Thu, 2013-07-25 at 11:06 -0500, Justin M Wozniak wrote: >>>>> Ok. I currently get: >>>>> >>>>> swift -dryrun -pgraph o.dot ~/hello-k.swift >>>>> >>>>> >>>>> >>>>> Could not start execution: >>>>> import @ hello-k.kml, line: 5: >>>>> org.globus.cog.karajan.analyzer.CompilationException: Import of >>>>> 'swift.k' failed: >>>>> generateProvenanceGraph @ swift.k, line: 132: >>>>> org.globus.cog.karajan.analyzer.CompilationException: Unknown function: >>>>> generateProvenanceGraph: >>>>> java.lang.RuntimeException: Unknown function: generateProvenanceGraph >>>>> >>>>> On 07/25/2013 10:48 AM, Mihael Hategan wrote: >>>>>> Dryrun should produce no output. It fakes app execution, but runs >>>>>> through the dependencies. >>>>>> >>>>>> Mihael >>>>>> >>>>>> On Thu, 2013-07-25 at 10:30 -0500, Justin M Wozniak wrote: >>>>>>> I was trying to get dryrun working for a user yesterday but can't figure >>>>>>> out where the output went- is anyone else currently using this? >>>>>>> Thanks > > > -- > Justin M Wozniak > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
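
Regarding the Java 1.6 build errors in the "swift swing gui testing" thread above: both missing symbols, ServerSocketChannel.bind(...) and java.awt.Container.revalidate(), were added in Java 7, which is consistent with Mihael's guess about a pre-1.7 JDK. The following is only a minimal sketch of 1.6-compatible equivalents, not the change that was actually committed to Swift. The class and method names (Java6Compat, openAndBind, revalidateCompat) are hypothetical and do not come from the Swift sources.

```java
import java.awt.Container;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ServerSocketChannel;
import javax.swing.JComponent;

// Hypothetical helper class; not part of Swift or CoG.
final class Java6Compat {
    private Java6Compat() {
    }

    // ServerSocketChannel.bind(SocketAddress) only exists since Java 7.
    // Binding through the channel's underlying ServerSocket also works on 1.6.
    static ServerSocketChannel openAndBind(int port) throws IOException {
        ServerSocketChannel channel = ServerSocketChannel.open();
        channel.socket().bind(new InetSocketAddress(port));
        return channel;
    }

    // Container.revalidate() only exists since Java 7.
    // JComponent.revalidate() and Container.invalidate()/validate() exist on 1.6.
    static void revalidateCompat(Container c) {
        if (c == null) {
            return; // nothing to lay out again
        }
        if (c instanceof JComponent) {
            ((JComponent) c).revalidate();
        } else {
            c.invalidate();
            c.validate();
        }
    }
}
```

Under these assumptions, the call in HTTPServer.java would become something like channel.socket().bind(new InetSocketAddress(port)), and the call in GridView.java would become Java6Compat.revalidateCompat(getParent()).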