From wilde at mcs.anl.gov Tue Nov 1 08:46:29 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 1 Nov 2011 08:46:29 -0500 (CDT)
Subject: [Swift-devel] Issues in worker-side GridFTP
Message-ID: <1439480770.151545.1320155189570.JavaMail.root@zimbra.anl.gov>

We have been discussing the feasibility of adding a data management option to Swift that performs data transfer via globus-url-copy from the worker node, likely issued from within a CDM function.

This is mainly intended for OSG sites where no other straightforward transfer option exists (e.g. on sites that have storage servers like SRM-DCache and no mounted access to the primary GridFTP server's filesystem).

I'm wondering how worker-side GridFTP will work with respect to data management functions like creation of the work directory, job directory, and transfer of utility files like swiftwrap etc.

Any ideas on how best to do this? It's almost like we should first try this in coasters using provider staging, where we start by replacing the data transfer with a worker-side invocation of guc (globus-url-copy).

Mihael, is this something that you could do for us? If you could create a test version of this using a simple "cp" in a worker.pl callout script for doing data transfer, we could then do the testing with actual guc.

- Mike

From wozniak at mcs.anl.gov Tue Nov 1 08:50:51 2011
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Tue, 1 Nov 2011 08:50:51 -0500 (Central Daylight Time)
Subject: [Swift-devel] Issues in worker-side GridFTP
In-Reply-To: <1439480770.151545.1320155189570.JavaMail.root@zimbra.anl.gov>
References: <1439480770.151545.1320155189570.JavaMail.root@zimbra.anl.gov>
Message-ID:

On Tue, 1 Nov 2011, Michael Wilde wrote:

> I'm wondering how worker-side GridFTP will work with respect to data management functions like creation of the work directory, job directory, and transfer of utility files like swiftwrap etc.

If you do this with CDM, only the user data files will be affected.
The other file operations are unchanged.

--
Justin M Wozniak

From wilde at mcs.anl.gov Tue Nov 1 09:13:52 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 1 Nov 2011 09:13:52 -0500 (CDT)
Subject: [Swift-devel] Issues in worker-side GridFTP
In-Reply-To:
Message-ID: <2033588067.151763.1320156832447.JavaMail.root@zimbra.anl.gov>

Right, and that is the problem I'm asking about: if the site has no GridFTP server, the current "setup" operations will not work and must be done in some other way (I *think*).

- Mike

----- Original Message -----
> From: "Justin M Wozniak"
> To: "Michael Wilde"
> Cc: "Mihael Hategan", "Ketan Maheshwari", "Swift Devel"
> Sent: Tuesday, November 1, 2011 8:50:51 AM
> Subject: Re: Issues in worker-side GridFTP
> On Tue, 1 Nov 2011, Michael Wilde wrote:
>
> > I'm wondering how worker-side GridFTP will work with respect to data management functions like creation of the work directory, job directory, and transfer of utility files like swiftwrap etc.
>
> If you do this with CDM, only the user data files will be affected. The other file operations are unchanged.
>
> --
> Justin M Wozniak

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Tue Nov 1 09:22:23 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Tue, 1 Nov 2011 09:22:23 -0500
Subject: [Swift-devel] Issues in worker-side GridFTP
In-Reply-To: <2033588067.151763.1320156832447.JavaMail.root@zimbra.anl.gov>
References: <2033588067.151763.1320156832447.JavaMail.root@zimbra.anl.gov>
Message-ID:

Given that provider staging is good enough for small size data transfer, it could be used to transfer setup data while a worker side guc could be used for application data.

On Tue, Nov 1, 2011 at 9:13 AM, Michael Wilde wrote:

> Right, and that is the problem I'm asking about: if the site has no GridFTP server, the current "setup" operations will not work and must be done in some other way (I *think*).
>
> - Mike
>
> ----- Original Message -----
> > From: "Justin M Wozniak"
> > To: "Michael Wilde"
> > Cc: "Mihael Hategan", "Ketan Maheshwari" <ketan at mcs.anl.gov>, "Swift Devel"
> > Sent: Tuesday, November 1, 2011 8:50:51 AM
> > Subject: Re: Issues in worker-side GridFTP
> > On Tue, 1 Nov 2011, Michael Wilde wrote:
> >
> > > I'm wondering how worker-side GridFTP will work with respect to data management functions like creation of the work directory, job directory, and transfer of utility files like swiftwrap etc.
> >
> > If you do this with CDM, only the user data files will be affected. The other file operations are unchanged.
> >
> > --
> > Justin M Wozniak
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Ketan

From wilde at mcs.anl.gov Tue Nov 1 09:31:31 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 1 Nov 2011 09:31:31 -0500 (CDT)
Subject: [Swift-devel] Issues in worker-side GridFTP
In-Reply-To:
Message-ID: <913007824.151887.1320157891214.JavaMail.root@zimbra.anl.gov>

Justin, do the current CDM and provider staging interact correctly? I.e., if we introduce a CDM mode called "worker-staging", will CDM operate with *provider staging* in the manner you described in your prior message?

The disadvantage of provider staging, at the moment, is that when enabled it applies to all sites.
But for cases where that is acceptable, this might work.

So if CDM "direct" works with provider staging, then we could, as discussed in earlier meetings, use that as a base for a worker-staging mode.

- Mike

----- Original Message -----
> From: "Ketan Maheshwari"
> To: "Michael Wilde"
> Cc: "Justin M Wozniak", "Swift Devel", "Ketan Maheshwari"
> Sent: Tuesday, November 1, 2011 9:22:23 AM
> Subject: Re: [Swift-devel] Issues in worker-side GridFTP
> Given that provider staging is good enough for small size data transfer, it could be used to transfer setup data while a worker side guc could be used for application data.
>
> On Tue, Nov 1, 2011 at 9:13 AM, Michael Wilde <wilde at mcs.anl.gov> wrote:
>
> Right, and that is the problem I'm asking about: if the site has no GridFTP server, the current "setup" operations will not work and must be done in some other way (I *think*).
>
> - Mike
>
> ----- Original Message -----
> > From: "Justin M Wozniak" <wozniak at mcs.anl.gov>
> > To: "Michael Wilde" <wilde at mcs.anl.gov>
> > Cc: "Mihael Hategan" <hategan at mcs.anl.gov>, "Ketan Maheshwari" <ketan at mcs.anl.gov>, "Swift Devel" <swift-devel at ci.uchicago.edu>
> > Sent: Tuesday, November 1, 2011 8:50:51 AM
> > Subject: Re: Issues in worker-side GridFTP
> >
> > On Tue, 1 Nov 2011, Michael Wilde wrote:
> >
> > > I'm wondering how worker-side GridFTP will work with respect to data management functions like creation of the work directory, job directory, and transfer of utility files like swiftwrap etc.
> >
> > If you do this with CDM, only the user data files will be affected. The other file operations are unchanged.
> >
> > --
> > Justin M Wozniak
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
> --
> Ketan

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wozniak at mcs.anl.gov Tue Nov 1 09:52:58 2011
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Tue, 1 Nov 2011 09:52:58 -0500 (Central Daylight Time)
Subject: [Swift-devel] Issues in worker-side GridFTP
In-Reply-To: <913007824.151887.1320157891214.JavaMail.root@zimbra.anl.gov>
References: <913007824.151887.1320157891214.JavaMail.root@zimbra.anl.gov>
Message-ID:

Yes, the CDM features were copied into _swiftwrap.staging and vdl-int-staging.k.

On Tue, 1 Nov 2011, Michael Wilde wrote:

> Justin, do the current CDM and provider staging interact correctly? I.e., if we introduce a CDM mode called "worker-staging", will CDM operate with *provider staging* in the manner you described in your prior message?
>
> The disadvantage of provider staging, at the moment, is that when enabled it applies to all sites. But for cases where that is acceptable, this might work.
>
> So if CDM "direct" works with provider staging, then we could, as discussed in earlier meetings, use that as a base for a worker-staging mode.
>
> - Mike
>
> ----- Original Message -----
>> From: "Ketan Maheshwari"
>> To: "Michael Wilde"
>> Cc: "Justin M Wozniak", "Swift Devel", "Ketan Maheshwari"
>> Sent: Tuesday, November 1, 2011 9:22:23 AM
>> Subject: Re: [Swift-devel] Issues in worker-side GridFTP
>> Given that provider staging is good enough for small size data transfer, it could be used to transfer setup data while a worker side guc could be used for application data.
>>
>> On Tue, Nov 1, 2011 at 9:13 AM, Michael Wilde <wilde at mcs.anl.gov> wrote:
>>
>> Right, and that is the problem I'm asking about: if the site has no GridFTP server, the current "setup" operations will not work and must be done in some other way (I *think*).
>>
>> - Mike
>>
>> ----- Original Message -----
>>> From: "Justin M Wozniak" <wozniak at mcs.anl.gov>
>>> To: "Michael Wilde" <wilde at mcs.anl.gov>
>>> Cc: "Mihael Hategan" <hategan at mcs.anl.gov>, "Ketan Maheshwari" <ketan at mcs.anl.gov>, "Swift Devel" <swift-devel at ci.uchicago.edu>
>>> Sent: Tuesday, November 1, 2011 8:50:51 AM
>>> Subject: Re: Issues in worker-side GridFTP
>>>
>>> On Tue, 1 Nov 2011, Michael Wilde wrote:
>>>
>>>> I'm wondering how worker-side GridFTP will work with respect to data management functions like creation of the work directory, job directory, and transfer of utility files like swiftwrap etc.
>>>
>>> If you do this with CDM, only the user data files will be affected. The other file operations are unchanged.
>>>
>>> --
>>> Justin M Wozniak
>>
>> --
>> Michael Wilde
>> Computation Institute, University of Chicago
>> Mathematics and Computer Science Division
>> Argonne National Laboratory
>>
>> _______________________________________________
>> Swift-devel mailing list
>> Swift-devel at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>
>> --
>> Ketan
>

--
Justin M Wozniak

From tim.g.armstrong at gmail.com Tue Nov 1 12:23:56 2011
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Tue, 1 Nov 2011 12:23:56 -0500
Subject: [Swift-devel] Swift 0.93 RC3 hangs after all jobs seem to be complete
In-Reply-To: <1319936310.2688.0.camel@blabla>
References: <1456767810.162441.1319645666935.JavaMail.root@zimbra-mb2.anl.gov> <1319936310.2688.0.camel@blabla>
Message-ID:

I think I'm seeing a similar deadlock in the latest version of Swift.
I'm first going to verify that this is actually happening (update Swift, recompile, etc.), but what information should I collect that would be useful for debugging?

- Tim

On Sat, Oct 29, 2011 at 7:58 PM, Mihael Hategan wrote:

> This deadlock is now fixed (swift r5262).
>
> On Wed, 2011-10-26 at 11:14 -0500, David Kelly wrote:
> > I think I've found a way to reproduce this. From the test suite, if you run language-behaviour/mappers/075-array-mapper.swift a few times, you'll run into a deadlock which looks very similar to the one Sheri is seeing. Here is the jstack:
> >
> > http://www.ci.uchicago.edu/~davidk/logs/jstack20111025110620.log
> >
> > David
> >
> > ----- Original Message -----
> > > From: "Michael Wilde"
> > > To: "Mihael Hategan", "David Kelly" <davidk at ci.uchicago.edu>
> > > Cc: "Swift Devel", "Sheri Mickelson" <mickelso at mcs.anl.gov>
> > > Sent: Tuesday, October 25, 2011 2:10:04 PM
> > > Subject: Re: Swift 0.93 RC3 hangs after all jobs seem to be complete
> > > Mihael, David,
> > >
> > > Can you both report on what you believe the status of this bug is?
> > >
> > > I think the subject line here is a bit misleading, in that it seems that a similar thing - i.e. the workflow deadlocks - was happening both at the start and at the end of various scripts, and possibly at intermediate points.
> > >
> > > I *think* that Sheri was seeing hangs at the start and in the middle; David was seeing hangs at the end.
> > >
> > > Talking to David just now he reported diagnosing his hang case down to a situation where the coaster scheduler emits a "null" (ill-formed) job to PBS at the tail end of a workflow. He inserted a workaround to ignore (not submit) such "null" jobs. I'm not sure if that was committed, or just tested. David, can you post the details?
> > >
> > > Mihael, did you look at the jstack that Sheri attached to the posting below?
> > >
> > > Do you have any theories or fixes for this issue or issues? Unless we believe it's resolved, David, please file in Bugzilla and attach relevant postings from Sheri, David, and others on this bug.
> > >
> > > Thanks,
> > >
> > > - Mike
> > >
> > > ----- Original Message -----
> > > > From: "Sheri Mickelson"
> > > > To: "Mihael Hategan"
> > > > Cc: "Michael Wilde", "David Kelly"
> > > > Sent: Wednesday, October 12, 2011 10:34:43 AM
> > > > Subject: Re: Swift 0.93 RC3 hangs after all jobs seem to be complete
> > > > I just tried running again on fusion with 0.93RC3 and it hung right away. It started with "No events in 10s." and then it looks like it hung. This was run using coasters and I manually killed it after about 5 minutes. I attached both the log file and the jstack info.
> > > >
> > > > Thanks, Sheri
> > > >
> > > > On Oct 7, 2011, at 2:47 PM, Mihael Hategan wrote:
> > > >
> > > > > Yeah, so the hang checker doesn't show anything. Which means it's not a swift flow issue.
> > > > >
> > > > > I would do what Mike says with jstack as soon after the hang checker kicks in as possible.
> > > > >
> > > > > Mihael
> > > > >
> > > > > On Fri, 2011-10-07 at 12:12 -0500, Michael Wilde wrote:
> > > > >> Was: Re: Swift 0.93RC2 is bad - Re: Help on fusion
> > > > >> Changed subject so you can see what this is regarding, Mihael.
> > > > >>
> > > > >> ---
> > > > >>
> > > > >> Sheri, could you run this again? (Or have you already, and if so, did it run to completion?)
> > > > >>
> > > > >> What I saw in the log yesterday was that all jobs that were submitted to coasters ran successfully, including all of their data transfers.
> > > > >>
> > > > >> But I also see that the Swift "hang checker" went off, which indicates that some Java activity was indeed hung.
> > > > >>
> > > > >> When this happens again, can you run the command "jstack -l PID" where PID is the process ID of the Swift Java command (which you can best locate by using "ps -u $USER -H" and locating the java process below the swift command). Then send us the jstack output in addition to the associated Swift log.
> > > > >>
> > > > >> Mihael, in the meantime, can you take a look at the log to see if you can spot any incomplete Swift activities that may be hanging the run?
> > > > >>
> > > > >> Thanks,
> > > > >>
> > > > >> - Mike
> > > > >>
> > > > >> ----- Original Message -----
> > > > >>> From: "Sheri Mickelson"
> > > > >>> To: "David Kelly"
> > > > >>> Cc: "Michael Wilde"
> > > > >>> Sent: Thursday, October 6, 2011 3:23:57 PM
> > > > >>> Subject: Re: Swift 0.93RC2 is bad - Re: Help on fusion
> > > > >>> Here's the log file.
> > > > >>>
> > > > >>> On Oct 6, 2011, at 3:19 PM, David Kelly wrote:
> > > > >>>
> > > > >>>> Hi Sheri,
> > > > >>>>
> > > > >>>> Could you please send the log file so we can take a closer look and see what's going on there?
> > > > >>>>
> > > > >>>> Thanks,
> > > > >>>> David
> > > > >>>>
> > > > >>>> ----- Original Message -----
> > > > >>>>> From: "Sheri Mickelson"
> > > > >>>>> To: "David Kelly"
> > > > >>>>> Cc: "Michael Wilde"
> > > > >>>>> Sent: Thursday, October 6, 2011 3:07:44 PM
> > > > >>>>> Subject: Re: Swift 0.93RC2 is bad - Re: Help on fusion
> > > > >>>>> I just tried this version and had a little bit more luck. It looked like everything was running fine, but now it looks like it's hung near the end. I keep getting the message "Finished successfully:66". The message before that was "Checking status:1 Finished successfully:65".
> > > > >>>>>
> > > > >>>>> Thanks, Sheri
> > > > >>>>>
> > > > >>>>> On Oct 6, 2011, at 2:14 PM, David Kelly wrote:
> > > > >>>>>
> > > > >>>>>> It's been a while since RC2 was created. There have been quite a lot of fixes since then, so I just created a new 0.93 RC3. The direct download can be found at:
> > > > >>>>>>
> > > > >>>>>> http://www.ci.uchicago.edu/swift/packages/swift-0.93RC3.tar.gz
> > > > >>>>>>
> > > > >>>>>> Hope this helps.
> > > > >>>>>>
> > > > >>>>>> Thanks,
> > > > >>>>>> David
> > > > >>>>>>
> > > > >>>>>> ----- Original Message -----
> > > > >>>>>>> From: "Michael Wilde"
> > > > >>>>>>> To: "Sheri Mickelson"
> > > > >>>>>>> Cc: "David Kelly"
> > > > >>>>>>> Sent: Thursday, October 6, 2011 12:17:56 PM
> > > > >>>>>>> Subject: Swift 0.93RC2 is bad - Re: Help on fusion
> > > > >>>>>>> Sheri,
> > > > >>>>>>>
> > > > >>>>>>> Your AMWG script is failing because the swift-0.93RC2 release is bad.
> > > > >>>>>>>
> > > > >>>>>>> The error it's showing in the log is this:
> > > > >>>>>>> "2011-10-06 11:46:24,635-0500 DEBUG vdl:execute2 APPLICATION_EXCEPTION jobid=ncatted-se54rxgk - Application exception: null
> > > > >>>>>>> Caused by: org.globus.cog.abstraction.impl.common.task.TaskSubmissionException: lowOverallocation must be < 1.0 (currently 100.0)"
> > > > >>>>>>>
> > > > >>>>>>> ...which was fixed in SVN for 0.93.
> > > > >>>>>>>
> > > > >>>>>>> Did you load this from a tarball or from SVN?
> > > > >>>>>>>
> > > > >>>>>>> David, do we have a more recent 0.93 release candidate?
> > > > >>>>>>>
> > > > >>>>>>> If not, then can you build a 0.93 from SVN? If not, we can do that for you. I'll start a build in the meantime just in case.
> > > > >>>>>>>
> > > > >>>>>>> Sorry about this error, Sheri.
> > > > >>>>>>>
> > > > >>>>>>> - Mike
> > > > >>>>>>>
> > > > >>>>>>> ----- Original Message -----
> > > > >>>>>>>> From: "Sheri Mickelson"
> > > > >>>>>>>> To: "Michael Wilde"
> > > > >>>>>>>> Sent: Thursday, October 6, 2011 11:52:58 AM
> > > > >>>>>>>> Subject: Re: Help on fusion
> > > > >>>>>>>> I have everything in /fusion/gpfs/home/mickelso/amwg-swift/svnRepo/swift
> > > > >>>>>>>>
> > > > >>>>>>>> I believe the pathnames are correct.
> > > > >>>>>>>>
> > > > >>>>>>>> I have not tried running on localhost.
> > > > >>>>>>>>
> > > > >>>>>>>> I'm using swift version swift-0.93RC2.
> > > > >>>>>>>>
> > > > >>>>>>>> I'm not at Argonne today, but will be in tomorrow.
> > > > >>>>>>>>
> > > > >>>>>>>> -Sheri
> > > > >>>>>>>>
> > > > >>>>>>>> On Oct 6, 2011, at 11:39 AM, Michael Wilde wrote:
> > > > >>>>>>>>
> > > > >>>>>>>>> Hi Sheri,
> > > > >>>>>>>>>
> > > > >>>>>>>>> can you point me to the log, run directory, and work dir of this run?
> > > > >>>>>>>>>
> > > > >>>>>>>>> I think we'll need to look into the log, and the .d directories, and possibly the work dir to locate the stdout of the failing apps.
> > > > >>>>>>>>>
> > > > >>>>>>>>> - are the pathnames correct?
> > > > >>>>>>>>>
> > > > >>>>>>>>> - does the run work on localhost? (i.e., are the PBS jobs running or failing)?
> > > > >>>>>>>>>
> > > > >>>>>>>>> - which Swift rev are you using?
> > > > >>>>>>>>>
> > > > >>>>>>>>> Are you at Argonne? I can stop by and we can debug.
> > > > >>>>>>>>>
> > > > >>>>>>>>> - Mike
> > > > >>>>>>>>>
> > > > >>>>>>>>> ----- Original Message -----
> > > > >>>>>>>>>> From: "Sheri Mickelson"
> > > > >>>>>>>>>> To: "Michael Wilde"
> > > > >>>>>>>>>> Sent: Thursday, October 6, 2011 10:32:38 AM
> > > > >>>>>>>>>> Subject: Help on fusion
> > > > >>>>>>>>>> Hi Mike,
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> The AMWG people at NCAR want to incorporate the swift version to their main branch. Rob's at NCAR right now and wants to have this done as soon as possible. I've been working on incorporating the changes that were made in the last release and believe that it's in decent shape. I want to test it on fusion, though, just to make sure I'm handling the env variables correctly. I'm running into an error when I run. I'm getting "Failed to transfer wrapper log for job" for all of the app calls. What usually causes this? I'm stuck on where to look.
> > > > >>>>>>>>>>
> > > > >>>>>>>>>> Thanks, Sheri
> > > > >>>>>>>>>
> > > > >>>>>>>>> --
> > > > >>>>>>>>> Michael Wilde
> > > > >>>>>>>>> Computation Institute, University of Chicago
> > > > >>>>>>>>> Mathematics and Computer Science Division
> > > > >>>>>>>>> Argonne National Laboratory
> > > > >>>>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> --
> > > > >>>>>>> Michael Wilde
> > > > >>>>>>> Computation Institute, University of Chicago
> > > > >>>>>>> Mathematics and Computer Science Division
> > > > >>>>>>> Argonne National Laboratory
> > > > >>
> > > > >
> > > >
> > > --
> > > Michael Wilde
> > > Computation Institute, University of Chicago
> > > Mathematics and Computer Science Division
> > > Argonne National Laboratory
> >
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From hategan at mcs.anl.gov Tue Nov 1 12:41:09 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 01 Nov 2011 10:41:09 -0700
Subject: [Swift-devel] Swift 0.93 RC3 hangs after all jobs seem to be complete
In-Reply-To:
References: <1456767810.162441.1319645666935.JavaMail.root@zimbra-mb2.anl.gov> <1319936310.2688.0.camel@blabla>
Message-ID: <1320169269.2261.0.camel@blabla>

jstack -l

On Tue, 2011-11-01 at 12:23 -0500, Tim Armstrong wrote:
> I think I'm seeing a similar deadlock in the latest version of Swift. I'm first going to verify that this is actually happening (update Swift, recompile, etc), but what information should I collect that would be useful for debugging?
>
> - Tim
>
> On Sat, Oct 29, 2011 at 7:58 PM, Mihael Hategan wrote:
> This deadlock is now fixed (swift r5262).
>
> On Wed, 2011-10-26 at 11:14 -0500, David Kelly wrote:
> > I think I've found a way to reproduce this.
> > > > >>>>>>>>>> > > > > >>>>>>>>>> Thanks, Sheri > > > > >>>>>>>>> > > > > >>>>>>>>> -- > > > > >>>>>>>>> Michael Wilde > > > > >>>>>>>>> Computation Institute, University of Chicago > > > > >>>>>>>>> Mathematics and Computer Science Division > > > > >>>>>>>>> Argonne National Laboratory > > > > >>>>>>>>> > > > > >>>>>>> > > > > >>>>>>> -- > > > > >>>>>>> Michael Wilde > > > > >>>>>>> Computation Institute, University of Chicago > > > > >>>>>>> Mathematics and Computer Science Division > > > > >>>>>>> Argonne National Laboratory > > > > >> > > > > > > > > > > > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > From davidk at ci.uchicago.edu Tue Nov 1 12:52:21 2011 From: davidk at ci.uchicago.edu (David Kelly) Date: Tue, 1 Nov 2011 12:52:21 -0500 (CDT) Subject: [Swift-devel] Swift 0.93 RC3 hangs after all jobs seem to be complete In-Reply-To: Message-ID: <1529774640.170206.1320169941955.JavaMail.root@zimbra-mb2.anl.gov> The last fix seems to be working for the specific issue I was seeing. I ran the array mapper script 100 times with no freezing. David ----- Original Message ----- > From: "Tim Armstrong" > To: "Mihael Hategan" > Cc: "David Kelly" , "Swift Devel" , "Sheri Mickelson" > > Sent: Tuesday, November 1, 2011 12:23:56 PM > Subject: Re: [Swift-devel] Swift 0.93 RC3 hangs after all jobs seem to be complete > I think I'm seeing a similar deadlock in the latest version of > Swift. I'm first going to verify that this is actually happening > (update Swift, recompile, etc.), but what information should I collect > that would be useful for debugging?
> > - Tim > > > On Sat, Oct 29, 2011 at 7:58 PM, Mihael Hategan < hategan at mcs.anl.gov > > wrote: > > > This deadlock is now fixed (swift r5262). > > > > > On Wed, 2011-10-26 at 11:14 -0500, David Kelly wrote: > > I think I've found a way to reproduce this. From the test suite, if > > you run language-behaviour/mappers/075-array-mapper.swift a few > > times, you'll run into a deadlock which looks very similar to the > > one Sheri is seeing. Here is the jstack: > > > > http://www.ci.uchicago.edu/~davidk/logs/jstack20111025110620.log > > > > David > > > > ----- Original Message ----- > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Mihael Hategan" < hategan at mcs.anl.gov >, "David Kelly" < > > > davidk at ci.uchicago.edu > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Sheri > > > Mickelson" < mickelso at mcs.anl.gov > > > > Sent: Tuesday, October 25, 2011 2:10:04 PM > > > Subject: Re: Swift 0.93 RC3 hangs after all jobs seem to be > > > complete > > > Mihael, David, > > > > > > Can you both report on what you believe the status of this bug is? > > > > > > I think the subject line here is a bit misleading, in that it > > > seems > > > that a similar thing - i.e. the workflow deadlocks - was happening > > > both > > > at the start and at the end of various scripts, and possibly at > > > intermediate points. > > > > > > I *think* that Sheri was seeing hangs at the start and in the > > > middle; > > > David was seeing hangs at the end. > > > > > > Talking to David just now he reported diagnosing his hang case > > > down to > > > a situation where the coaster scheduler emits a "null" > > > (ill-formed) > > > job to PBS at the tail end of a workflow. He inserted a workaround > > > to > > > ignore (not submit) such "null" jobs. I'm not sure if that was > > > committed, or just tested. David, can you post the details? > > > > > > Mihael, did you look at the jstack that Sheri attached to the > > > posting > > > below?
> > > > > > Do you have any theories or fixes for this issue or issues? Unless > > > we > > > believe it's resolved, David, please file in bugzilla and attach > > > relevant postings from Sheri, David, and others on this bug. > > > > > > Thanks, > > > > > > - Mike > > > > > > > > > ----- Original Message ----- > > > > From: "Sheri Mickelson" < mickelso at mcs.anl.gov > > > > > To: "Mihael Hategan" < hategan at mcs.anl.gov > > > > > Cc: "Michael Wilde" < wilde at mcs.anl.gov >, "David Kelly" > > > > < davidk at ci.uchicago.edu > > > > > Sent: Wednesday, October 12, 2011 10:34:43 AM > > > > Subject: Re: Swift 0.93 RC3 hangs after all jobs seem to be > > > > complete > > > > I just tried running again on fusion with 0.93RC3 and it hung > > > > right > > > > away. > > > > It started with "No events in 10s." and then it looks like it > > > > hung. > > > > This was run using coasters and I manually killed it after about > > > > 5 > > > > minutes. > > > > I attached both the log file and the jstack info. > > > > > > > > Thanks, Sheri > > > > > > > > > > > > > > > > > > > > > > > > On Oct 7, 2011, at 2:47 PM, Mihael Hategan wrote: > > > > > > > > > Yeah, so the hang checker doesn't show anything. Which means > > > > > it's > > > > > not a > > > > > swift flow issue. > > > > > > > > > > I would do what Mike says with jstack as soon after the hang > > > > > checker > > > > > kicks in as possible. > > > > > > > > > > Mihael > > > > > > > > > > On Fri, 2011-10-07 at 12:12 -0500, Michael Wilde wrote: > > > > >> Was: Re: Swift 0.93RC2 is bad - Re: Help on fusion > > > > >> Changed subject so you can see what this is regarding, > > > > >> Mihael. > > > > >> > > > > >> --- > > > > >> > > > > >> Sheri, could you run this again? (Or have you already, and if > > > > >> so, > > > > >> did it run to completion?)
> > > > >> > > > > >> What I saw in the log yesterday was that all jobs that were > > > > >> submitted to coasters ran successfully, including all of > > > > >> their > > > > >> data > > > > >> transfers. > > > > >> > > > > >> But I also see that the Swift "hang checker" went off, which > > > > >> indicates that some Java activity was indeed hung. > > > > >> > > > > >> When this happens again, can you run the command "jstack -l > > > > >> PID" > > > > >> where PID is the process of the Swift Java command (which you > > > > >> can > > > > >> best locate by using "ps -u $USER -H" and locate the java > > > > >> process > > > > >> below the swift command). Then send us the jstack output in > > > > >> addition to the associated Swift log. > > > > >> > > > > >> Mihael, in the meantime, can you take a look at the log to > > > > >> see if > > > > >> you can spot any incomplete Swift activities that may be > > > > >> hanging > > > > >> the run? > > > > >> > > > > >> Thanks, > > > > >> > > > > >> - Mike > > > > >> > > > > >> > > > > >> ----- Original Message ----- > > > > >>> From: "Sheri Mickelson" < mickelso at mcs.anl.gov > > > > > >>> To: "David Kelly" < davidk at ci.uchicago.edu > > > > > >>> Cc: "Michael Wilde" < wilde at mcs.anl.gov > > > > > >>> Sent: Thursday, October 6, 2011 3:23:57 PM > > > > >>> Subject: Re: Swift 0.93RC2 is bad - Re: Help on fusion > > > > >>> Here's the log file. > > > > >>> > > > > >>> > > > > >>> > > > > >>> On Oct 6, 2011, at 3:19 PM, David Kelly wrote: > > > > >>> > > > > >>>> Hi Sheri, > > > > >>>> > > > > >>>> Could you please send the log file so we can take a closer > > > > >>>> look > > > > >>>> and > > > > >>>> see what's going on there? 
> > > > >>>> > > > > >>>> Thanks, > > > > >>>> David > > > > >>>> > > > > >>>> ----- Original Message ----- > > > > >>>>> From: "Sheri Mickelson" < mickelso at mcs.anl.gov > > > > > >>>>> To: "David Kelly" < davidk at ci.uchicago.edu > > > > > >>>>> Cc: "Michael Wilde" < wilde at mcs.anl.gov > > > > > >>>>> Sent: Thursday, October 6, 2011 3:07:44 PM > > > > >>>>> Subject: Re: Swift 0.93RC2 is bad - Re: Help on fusion > > > > >>>>> I just tried this version and had a little bit more luck. > > > > >>>>> It > > > > >>>>> looked > > > > >>>>> like everything was running fine, but now it looks like > > > > >>>>> it's > > > > >>>>> hung > > > > >>>>> near > > > > >>>>> the end. I keep getting the message "Finished > > > > >>>>> successfully:66". > > > > >>>>> The > > > > >>>>> message before that was "Checking status:1 Finished > > > > >>>>> successfully:65". > > > > >>>>> > > > > >>>>> Thanks, Sheri > > > > >>>>> > > > > >>>>> On Oct 6, 2011, at 2:14 PM, David Kelly wrote: > > > > >>>>> > > > > >>>>>> > > > > >>>>>> It's been a while since RC2 was created. There have been > > > > >>>>>> quite > > > > >>>>>> a > > > > >>>>>> lot > > > > >>>>>> of fixes since then, so I just created a new 0.93 RC3. > > > > >>>>>> The > > > > >>>>>> direct > > > > >>>>>> download can be found at: > > > > >>>>>> > > > > >>>>>> http://www.ci.uchicago.edu/swift/packages/swift-0.93RC3.tar.gz > > > > >>>>>> > > > > >>>>>> Hope this helps. 
> > > > >>>>>> > > > > >>>>>> Thanks, > > > > >>>>>> David > > > > >>>>>> > > > > >>>>>> ----- Original Message ----- > > > > >>>>>>> From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > >>>>>>> To: "Sheri Mickelson" < mickelso at mcs.anl.gov > > > > > >>>>>>> Cc: "David Kelly" < davidk at ci.uchicago.edu > > > > > >>>>>>> Sent: Thursday, October 6, 2011 12:17:56 PM > > > > >>>>>>> Subject: Swift 0.93RC2 is bad - Re: Help on fusion > > > > >>>>>>> Sheri, > > > > >>>>>>> > > > > >>>>>>> Your AMWG script is failing because the swift-0.93RC2 > > > > >>>>>>> release > > > > >>>>>>> is > > > > >>>>>>> bad. > > > > >>>>>>> > > > > >>>>>>> The error its showing in the log is this: "2011-10-06 > > > > >>>>>>> 11:46:24,635-0500 DEBUG vdl:execute2 > > > > >>>>>>> APPLICATION_EXCEPTION > > > > >>>>>>> jobid=ncatted-se54rxgk - Application exception: null > > > > >>>>>>> Caused by: > > > > >>>>>>> org > > > > >>>>>>> .globus > > > > >>>>>>> .cog.abstraction.impl.common.task.TaskSubmissionException: > > > > >>>>>>> lowOverallocation must be < 1.0 (currently 100.0)" > > > > >>>>>>> > > > > >>>>>>> ...which was fixed in SVN for 0.93. > > > > >>>>>>> > > > > >>>>>>> Did you load this from a tarball or from SVN? > > > > >>>>>>> > > > > >>>>>>> David, do we have a more recent 0.93 release candidate? > > > > >>>>>>> > > > > >>>>>>> If not, then can you build an 0.93 from SVN? If not, we > > > > >>>>>>> can > > > > >>>>>>> do > > > > >>>>>>> that > > > > >>>>>>> for you. I'll start a build in the meantime just in > > > > >>>>>>> case. > > > > >>>>>>> > > > > >>>>>>> Sorry about this error, Sheri. 
> > > > >>>>>>> > > > > >>>>>>> - Mike > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> > > > > >>>>>>> ----- Original Message ----- > > > > >>>>>>>> From: "Sheri Mickelson" < mickelso at mcs.anl.gov > > > > > >>>>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > >>>>>>>> Sent: Thursday, October 6, 2011 11:52:58 AM > > > > >>>>>>>> Subject: Re: Help on fusion > > > > >>>>>>>> I have everything in > > > > >>>>>>>> /fusion/gpfs/home/mickelso/amwg-swift/svnRepo/swift > > > > >>>>>>>> > > > > >>>>>>>> I believe the pathnames are correct. > > > > >>>>>>>> > > > > >>>>>>>> I have not tried running on localhost. > > > > >>>>>>>> > > > > >>>>>>>> I'm using swift version swift-0.93RC2. > > > > >>>>>>>> > > > > >>>>>>>> I'm not at Argonne today, but will be in tomorrow. > > > > >>>>>>>> > > > > >>>>>>>> -Sheri > > > > >>>>>>>> > > > > >>>>>>>> On Oct 6, 2011, at 11:39 AM, Michael Wilde wrote: > > > > >>>>>>>> > > > > >>>>>>>>> Hi Sheri, > > > > >>>>>>>>> > > > > >>>>>>>>> can you point me to the log, run directory, and work > > > > >>>>>>>>> dir > > > > >>>>>>>>> of > > > > >>>>>>>>> this > > > > >>>>>>>>> run? > > > > >>>>>>>>> > > > > >>>>>>>>> I trhink we'll need to look into to the log, and the > > > > >>>>>>>>> .d > > > > >>>>>>>>> directories, > > > > >>>>>>>>> and possibly the work dir to locate the stdout of the > > > > >>>>>>>>> failing > > > > >>>>>>>>> apps. > > > > >>>>>>>>> > > > > >>>>>>>>> - are the pathnames correct? > > > > >>>>>>>>> > > > > >>>>>>>>> - does the run work on localhost? (ie, are the PBS > > > > >>>>>>>>> jobs > > > > >>>>>>>>> running > > > > >>>>>>>>> or > > > > >>>>>>>>> failing)? > > > > >>>>>>>>> > > > > >>>>>>>>> - which Swift rev are you using? > > > > >>>>>>>>> > > > > >>>>>>>>> Are you at Argonne? I can stop by and we can debug. 
> > > > >>>>>>>>> > > > > >>>>>>>>> - Mike > > > > >>>>>>>>> > > > > >>>>>>>>> > > > > >>>>>>>>> ----- Original Message ----- > > > > >>>>>>>>>> From: "Sheri Mickelson" < mickelso at mcs.anl.gov > > > > > >>>>>>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > >>>>>>>>>> Sent: Thursday, October 6, 2011 10:32:38 AM > > > > >>>>>>>>>> Subject: Help on fusion > > > > >>>>>>>>>> Hi Mike, > > > > >>>>>>>>>> > > > > >>>>>>>>>> The AMWG people at NCAR want to incorporate the swift > > > > >>>>>>>>>> version > > > > >>>>>>>>>> to > > > > >>>>>>>>>> their > > > > >>>>>>>>>> main branch. Rob's at NCAR right now and wants to > > > > >>>>>>>>>> have > > > > >>>>>>>>>> this > > > > >>>>>>>>>> done > > > > >>>>>>>>>> as > > > > >>>>>>>>>> soon as possible. I've been working on incorporating > > > > >>>>>>>>>> the > > > > >>>>>>>>>> changes > > > > >>>>>>>>>> that > > > > >>>>>>>>>> were made in the last release and believe that it's > > > > >>>>>>>>>> in > > > > >>>>>>>>>> descent > > > > >>>>>>>>>> shape. > > > > >>>>>>>>>> I want to test it on fusion, though, just to make > > > > >>>>>>>>>> sure > > > > >>>>>>>>>> I'm > > > > >>>>>>>>>> handling > > > > >>>>>>>>>> the env variables correctly. I'm running into an > > > > >>>>>>>>>> error > > > > >>>>>>>>>> when > > > > >>>>>>>>>> I > > > > >>>>>>>>>> run. > > > > >>>>>>>>>> I'm getting "Failed to transfer wrapper log for job > > > > >>>>>>>>>> > > > > >>>>>>>>>> for > > > > >>>>>>>>>> all > > > > >>>>>>>>>> of > > > > >>>>>>>>>> the app calls. What usually causes this? I'm stuck on > > > > >>>>>>>>>> where > > > > >>>>>>>>>> to > > > > >>>>>>>>>> look. 
> > > > >>>>>>>>>> > > > > >>>>>>>>>> Thanks, Sheri > > > > >>>>>>>>> > > > > >>>>>>>>> -- > > > > >>>>>>>>> Michael Wilde > > > > >>>>>>>>> Computation Institute, University of Chicago > > > > >>>>>>>>> Mathematics and Computer Science Division > > > > >>>>>>>>> Argonne National Laboratory > > > > >>>>>>>>> > > > > >>>>>>> > > > > >>>>>>> -- > > > > >>>>>>> Michael Wilde > > > > >>>>>>> Computation Institute, University of Chicago > > > > >>>>>>> Mathematics and Computer Science Division > > > > >>>>>>> Argonne National Laboratory > > > > >> > > > > > > > > > > > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From davidk at ci.uchicago.edu Tue Nov 1 14:38:23 2011
From: davidk at ci.uchicago.edu (David Kelly)
Date: Tue, 1 Nov 2011 14:38:23 -0500 (CDT)
Subject: [Swift-devel] Could not convert value to boolean: null
Message-ID: <37283659.170465.1320176303678.JavaMail.root@zimbra-mb2.anl.gov>

I've noticed that with 0.93, I will occasionally see this error message "Could not convert value to boolean: null." The error is sporadic. It happens in maybe 1 out of every 50 attempts. When the error occurs, it happens immediately and exits. I've seen this with a few different scripts, but here is one from the test suite that just failed.

0723-simplemapper-nopadding.swift
----
type messagefile;

(messagefile t) write() {
    app {
        echo @filename(t) stdout=@filename(t);
    }
}

messagefile outfile[] <simple_mapper;
                       prefix="0723-simplemapper-nopadding.",
                       suffix=".out",
                       padding="0">;

outfile[0] = write();
outfile[5] = write();
outfile[75943] = write();
---

I added a dumpStack() to the location that was printing the error to get a little more detail.
---- Swift svn swift-r5262 cog-r3314 (cog modified locally) RunID: 20111101-1414-r4ut0gjf Progress: time: Tue, 01 Nov 2011 14:14:08 -0500 java.lang.Exception: Stack trace at java.lang.Thread.dumpStack(Thread.java:1249) at org.globus.cog.karajan.util.TypeUtil.toBoolean(TypeUtil.java:127) at org.griphyn.vdl.karajan.lib.Mark.function(Mark.java:30) at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62) at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Execution failed: Could not convert value to boolean: null ----- It looks like the error is somehow related to Karajan 
getting null values from the stack. I have this as bug 585 (https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=585)

David

From hategan at mcs.anl.gov Tue Nov 1 14:42:03 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 01 Nov 2011 12:42:03 -0700
Subject: [Swift-devel] Could not convert value to boolean: null
In-Reply-To: <37283659.170465.1320176303678.JavaMail.root@zimbra-mb2.anl.gov>
References: <37283659.170465.1320176303678.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <1320176523.3848.0.camel@blabla>

Saw it too. I'm working on it.

On Tue, 2011-11-01 at 14:38 -0500, David Kelly wrote:
> I've noticed that with 0.93, I will occasionally see this error message "Could not convert value to boolean: null." The error is sporadic. It happens in maybe 1 out of every 50 attempts. When the error occurs, it happens immediately and exits. I've seen this with a few different scripts, but here is one from the test suite that just failed.
>
> 0723-simplemapper-nopadding.swift
> ----
> type messagefile;
>
> (messagefile t) write() {
>     app {
>         echo @filename(t) stdout=@filename(t);
>     }
> }
>
> messagefile outfile[] <simple_mapper;
>                        prefix="0723-simplemapper-nopadding.",
>                        suffix=".out",
>                        padding="0">;
>
> outfile[0] = write();
> outfile[5] = write();
> outfile[75943] = write();
> ---
>
> I added a dumpStack() to the location that was printing the error to get a little more detail.
> ---- > Swift svn swift-r5262 cog-r3314 (cog modified locally) > > RunID: 20111101-1414-r4ut0gjf > Progress: time: Tue, 01 Nov 2011 14:14:08 -0500 > java.lang.Exception: Stack trace > at java.lang.Thread.dumpStack(Thread.java:1249) > at org.globus.cog.karajan.util.TypeUtil.toBoolean(TypeUtil.java:127) > at org.griphyn.vdl.karajan.lib.Mark.function(Mark.java:30) > at org.griphyn.vdl.karajan.lib.VDLFunction.post(VDLFunction.java:62) > at org.globus.cog.karajan.workflow.nodes.AbstractSequentialWithArguments.completed(AbstractSequentialWithArguments.java:194) > at org.globus.cog.karajan.workflow.nodes.FlowNode.complete(FlowNode.java:214) > at org.globus.cog.karajan.workflow.nodes.FlowContainer.post(FlowContainer.java:58) > at org.globus.cog.karajan.workflow.nodes.functions.AbstractFunction.post(AbstractFunction.java:28) > at org.globus.cog.karajan.workflow.nodes.Sequential.startNext(Sequential.java:29) > at org.globus.cog.karajan.workflow.nodes.Sequential.executeChildren(Sequential.java:20) > at org.globus.cog.karajan.workflow.nodes.FlowContainer.execute(FlowContainer.java:63) > at org.globus.cog.karajan.workflow.nodes.FlowNode.restart(FlowNode.java:139) > at org.globus.cog.karajan.workflow.nodes.FlowNode.start(FlowNode.java:197) > at org.globus.cog.karajan.workflow.FlowElementWrapper.start(FlowElementWrapper.java:227) > at org.globus.cog.karajan.workflow.events.EventBus.start(EventBus.java:104) > at org.globus.cog.karajan.workflow.events.EventTargetPair.run(EventTargetPair.java:40) > at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) > at java.util.concurrent.FutureTask.run(FutureTask.java:138) > at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) > at java.lang.Thread.run(Thread.java:662) > Execution failed: > Could not convert value to boolean: null > 
----- > > It looks like the error is somehow related to Karajan getting null values from the stack. I have this as bug 585 (https://bugzilla.mcs.anl.gov/swift/show_bug.cgi?id=585) > > David > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From ketancmaheshwari at gmail.com Thu Nov 3 09:42:57 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 3 Nov 2011 09:42:57 -0500 Subject: [Swift-devel] Issues in worker-side GridFTP In-Reply-To: <1439480770.151545.1320155189570.JavaMail.root@zimbra.anl.gov> References: <1439480770.151545.1320155189570.JavaMail.root@zimbra.anl.gov> Message-ID: On Tue, Nov 1, 2011 at 8:46 AM, Michael Wilde wrote: > We have been discussing the feasibility of adding a data management option > to Swift that performs data transfer via globus-url-copy from the worker > node, likely issued from within a CDM function. > > This is mainly intended for OSG sites where no other straightforward > transfer option exists (e.g. on sites that have storage servers like > SRM-DCache and no mounted access to the primary GridFTP server's > filesystem). > > I'm wondering how worker-side GridFTP will work with respect to data > management functions like creation of the work directory, job directory, > and transfer of utility files like swiftwrap etc. > > Any ideas on how best to do this? It's almost like we should first try this > in coasters using provider staging, where we start by replacing the data > transfer with a worker-side invocation of guc. > > Mihael, is this something that you could do for us? If you could create a > test version of this using a simple "cp" in a worker.pl callout script > for doing data transfer, we could then do the testing with actual guc. > I can start working on this. I will keep you posted as and when I have questions or need help debugging.
>
> - Mike
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

--
Ketan

From iraicu at cs.iit.edu Sun Nov 6 01:44:21 2011
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sun, 06 Nov 2011 01:44:21 -0500
Subject: [Swift-devel] win an Apple iPad 2 tablet -- Call for Participation at MTAGS11, co-located with Supercomputing 2011 on Monday November 14th
Message-ID: <4EB62CC5.3060007@cs.iit.edu>

Call for Participation
---------------------------------------------------------------------------------------
The 4th ACM Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) 2011
http://datasys.cs.iit.edu/events/MTAGS11/
---------------------------------------------------------------------------------------
November 14th, 2011
Seattle, Washington, USA
Co-located with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC11)
=======================================================================================

The 4th workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) will provide the scientific community a dedicated forum for presenting new research, development, and deployment efforts of large-scale many-task computing (MTC) applications on large scale clusters, Grids, Supercomputers, and Cloud Computing infrastructure.

We have assembled an excellent workshop program, with a keynote, a panel, 6 peer-reviewed papers, and 1 invited paper. The full program can be found at http://datasys.cs.iit.edu/events/MTAGS11/program.html.

Program highlights:

Keynote: "Mixing Cloud and Grid Resources for Many Task Computing"
Professor David Abramson, Monash University

Panel: "Many-Task Computing meets Exascales"
Dr. Dan Reed, Corporate Vice President of Technology Policy and Strategy, Microsoft Research
Professor David Abramson, Monash University
Professor Jack Dongarra, University of Tennessee
Dr. Daniel S. Katz, Senior Fellow in the Computation Institute, University of Chicago & Argonne National Lab.

Papers:

"Parallel High-resolution Climate Data Analysis using Swift"
Matthew Woitaszek, John Dennis, Taleena Sines

"Riding the Elephant: Managing Ensembles with Hadoop"
Elif Dede, Madhusudan Govindaraju, Dan Gunter, Lavanya Ramakrishnan

"A Dependency-Driven Formulation of Parareal: Parallel-in-Time Solution of PDEs as a Many-Task Application"
Wael Elwasif, Samantha Foley, David Bernholdt, Lee Berry, D. Samaddar, David Newman, Raul Sanchez

"Design and Implementation of “Many Parallel Task” Hybrid Subsurface Model"
Khushbu Agarwal, Jared Chase, Karen Schuchardt, Timothy Scheibe, Bruce Palmer, Todd Elsethagen

"Toward Scalable I/O Architecture for Exascale Systems"
Yong Chen

"High Performance Matrix Inversion Based on LU Factorization for Multicore Architectures"
Jack Dongarra, Mathieu Faverge, Hatem Ltaief, Piotr Luszczek

"MATE-EC2: A Middleware for Processing Data with AWS", Invited Paper
Tekin Bicer, David Chiu, Gagan Agrawal

Everyone who attends the MTAGS 2011 workshop is welcome to participate in the attendee prize giveaway. This includes all attendees, speakers, panelists, program committee members, and steering committee members. To be eligible to win, you must register online at https://docs.google.com/spreadsheet/viewform?hl=en_US&formkey=dDdKdU84VHdPODgyUThLakQ0QUdmS2c6MA#gid=0, with your name, affiliation, and email address, and must be present at 5:30PM on the day of the workshop (11-14-11) to win.

Committee Members
---------------------------------------------------------------------------------------
Workshop Chairs
* Ioan Raicu, Illinois Institute of Technology & Argonne National Laboratory
* Ian Foster, University of Chicago & Argonne National Laboratory
* Yong Zhao, University of Electronic Science and Technology of China

Steering Committee
* David Abramson, Monash University, Australia
* Jack Dongarra, University of Tennessee, USA
* Geoffrey Fox, Indiana University, USA
* Manish Parashar, Rutgers University, USA
* Marc Snir, University of Illinois at Urbana Champaign, USA
* Xian-He Sun, Illinois Institute of Technology, USA
* Weimin Zheng, Tsinghua University, China

Technical Committee
* Roger Barga, Microsoft Research, USA
* Mihai Budiu, Microsoft Research, USA
* Rajkumar Buyya, University of Melbourne, Australia
* Catalin Dumitrescu, Fermi National Labs, USA
* Alexandru Iosup, Delft University of Technology, Netherlands
* Florin Isaila, Universidad Carlos III de Madrid, Spain
* Michael Isard, Microsoft Research, USA
* Kamil Iskra, Argonne National Laboratory, USA
* Hui Jin, Illinois Institute of Technology, USA
* Daniel S. Katz, University of Chicago, USA
* Tevfik Kosar, Louisiana State University, USA
* Zhiling Lan, Illinois Institute of Technology, USA
* Reagan Moore, University of North Carolina, Chapel Hill, USA
* Jose Moreira, IBM Research, USA
* Marlon Pierce, Indiana University, USA
* Judy Qiu, Indiana University, USA
* Lavanya Ramakrishnan, Lawrence Berkeley National Laboratory, USA
* Matei Ripeanu, University of British Columbia, Canada
* Alain Roy, University of Wisconsin, Madison, USA
* Edward Walker, Whitworth University, USA
* Mike Wilde, University of Chicago & Argonne National Laboratory, USA
* Matthew Woitaszek, The University Corporation for Atmospheric Research, USA
* Ken Yocum, University of California at San Diego, USA
* Zhifeng Yun, Louisiana State University, USA

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================
=================================================================

From iraicu at cs.iit.edu Sun Nov 6 01:51:38 2011
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sun, 06 Nov 2011 01:51:38 -0500
Subject: [Swift-devel] win an Amazon Kindle Fire tablet -- Call for Participation at DataCloud-SC11, co-located with Supercomputing 2011 on Monday November 14th
Message-ID: <4EB62E7A.4060607@cs.iit.edu>

Call for Participation
---------------------------------------------------------------------------------------
The Second International Workshop on Data Intensive Computing in the Clouds
(DataCloud-SC11) 2011
http://datasys.cs.iit.edu/events/DataCloud-SC11/index.html
---------------------------------------------------------------------------------------
November 14th, 2011
Seattle, Washington, USA
Co-located with the IEEE/ACM International Conference for High Performance Computing, Networking, Storage and Analysis (SC11)
=======================================================================================

The second international workshop on Data-intensive Computing in the Clouds (DataCloud-SC11) will provide the scientific community a dedicated forum for discussing new research, development, and deployment efforts in running data-intensive computing workloads on Cloud Computing infrastructures.

We have assembled an excellent workshop program, with a keynote and 9 peer-reviewed papers. The full program can be found at http://datasys.cs.iit.edu/events/DataCloud-SC11/program.html.

Program highlights:

Keynote: "Data Intensive Applications on Clouds"
Geoffrey Fox, Professor, Indiana University

Papers:

"I/O Performance of Virtualized Cloud Environments"
Devarshi Ghoshal, Richard Canon, Lavanya Ramakrishnan

"OAuth and ABE based Authorization in Semi-Trusted Cloud Computing"
Anuchart Tassanaviboon, Guang Gong

"Dynamic Split Model of Resource Utilization in MapReduce"
Xiaowei Wang, Jie Zhang, Huaming Liao, Li Zha

"Evaluating the suitability of MapReduce for surface temperature analysis codes"
Vinay Sudhakaran, Neil Chue Hong

"Designing a Secure Storage Repository for Sharing Scientific Datasets using Public Clouds"
Alok Gautam Kumbhare, Yogesh Simmhan, Viktor Prasanna

"Performance Evaluation of Scheduling Algorithms for Database Services with Soft and Hard SLAs"
Hyun Moon, Yun Chi, Hakan Hacigumus

"Efficient Processing of RDF Graph Pattern Matching on Map Reduce Platforms"
Padmashree Ravindra, Seokyong Hong, HyeongSik Kim, Kemafor Anyanwu

"Describing Cloud Usage with Excess Entropy"
Charles Loboz

"Design Patterns for Scientific Applications in DryadLINQ CTP"
Hui Li, Yang Ruan, Yuduo Zhou, Judy Qiu, Geoffrey Fox

Everyone who attends the DataCloud-SC11 2011 workshop is welcome to participate in the attendee prize giveaway. This includes all attendees, speakers, panelists, program committee members, and steering committee members. To be eligible to win, you must register online at https://docs.google.com/spreadsheet/viewform?formkey=dDZrQTZrd0pkWWllbDdCN0twendjNUE6MQ, with your name, affiliation, and email address, and must be present at 5:00PM on the day of the workshop (11-14-11) to win.

Committee Members
---------------------------------------------------------------------------------------
Workshop Chairs
* Ioan Raicu, Illinois Institute of Technology & Argonne National Laboratory, USA
* Tevfik Kosar, University at Buffalo, USA
* Roger Barga, Microsoft Research, USA

Steering Committee
* Ian Foster, University of Chicago & Argonne National Laboratory, USA
* Geoffrey Fox, Indiana University, USA
* James Hamilton, Amazon, USA
* Manish Parashar, Rutgers University, USA
* Dan Reed, Microsoft Research, USA
* Rich Wolski, University of California at Santa Barbara, USA
* Rong Chang, IBM, USA

Technical Committee
* David Abramson, Monash University, Australia
* Abhishek Chandra, University of Minnesota, USA
* Yong Chen, Texas Tech University, USA
* Terence Critchlow, Pacific Northwest National Laboratory, USA
* Murat Demirbas, SUNY Buffalo, USA
* Jaliya Ekanayake, Microsoft Research, USA
* Rob Gillen, Oak Ridge National Laboratory, USA
* Maria Indrawan, Monash University, Australia
* Alexandru Iosup, Delft University of Technology, Netherlands
* Hui Jin, Illinois Institute of Technology, USA
* Peter Kacsuk, Hungarian Academy of Sciences, Hungary
* Dan S. Katz, University of Chicago, USA
* Steven Ko, SUNY Buffalo, USA
* Gregor von Laszewski, Indiana University, USA
* Erwin Laure, CERN, Switzerland
* Reagan Moore, University of North Carolina at Chapel Hill, USA
* Jim Myers, Rensselaer Polytechnic Institute, USA
* Judy Qiu, Indiana University, USA
* Lavanya Ramakrishnan, Lawrence Berkeley National Laboratory, USA
* Florian Schintke, Zuse Institute Berlin, Germany
* Borja Sotomayor, University of Chicago, USA
* Ian Taylor, Cardiff University, UK
* Bernard Traversat, Oracle Corporation, USA

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================
=================================================================

From iraicu at cs.iit.edu Fri Nov 4 21:40:10 2011
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Fri, 04 Nov 2011 21:40:10 -0500
Subject: [Swift-devel] CFP: 12th IEEE/ACM International Symposium on Cluster, Grid and Cloud Computing (CCGrid 2012) Ottawa, Canada
Message-ID: <4EB4A20A.1010901@cs.iit.edu>

CALL FOR PAPERS

12th IEEE/ACM International Symposium on Cluster, Grid and Cloud Computing (CCGrid 2012)
Ottawa, Canada
May 13-16, 2012
http://www.cloudbus.org/ccgrid2012

Rapid advances in processing, communication and systems/middleware technologies are leading to new paradigms and platforms for computing, ranging from computing Clusters to widely distributed Grid and emerging Clouds.
CCGrid is a series of very successful conferences, sponsored by the IEEE Computer Society Technical Committee on Scalable Computing (TCSC) and ACM, with the overarching goal of bringing together international researchers, developers, and users and to provide an international forum to present leading research activities and results on a broad range of topics related to these platforms and paradigms and their applications. The conference features keynotes, technical presentations, posters and research demos, workshops, tutorials, as well as the SCALE challenges featuring live demonstrations. In 2012, CCGrid will come to Canada for the first time and will be held in Ottawa, the capital city. CCGrid 2012 will have a focus on important and immediate issues that are significantly influencing all aspects of cluster, cloud and grid computing. Topics of interest include, but are not limited to: * Applications and Experiences: Applications to real and complex problems in science, engineering, business and society; User studies; Experiences with large-scale deployments systems or applications. * Architecture: System architectures, Design and deployment. * Autonomic Computing and Cyberinfrastructure: Self managed behavior, models and technologies; Autonomic paradigms and approaches (control-based, bio-inspired, emergent, etc.); Bio-inspired approaches to management; SLA definition and enforcement. * Performance Modeling and Evaluation: Performance models; Monitoring and evaluation tools, Analysis of system/application performance; Benchmarks and testbeds. * Programming Models, Systems, and Fault-Tolerant Computing: Programming models for cluster, clouds and grid computing; fault tolerant infrastructure and algorithms; systems software to enable efficient computing. * Multicore and Accelerator-based Computing: Software and application techniques to utilize multicore architectures and accelerators/heterogeneous computing systems. 
* Scheduling and Resource Management: Techniques to schedule jobs and resources on clusters, clouds and grid computing platforms. * Cloud Computing: Cloud architectures; Software tools and techniques for clouds. PAPER SUBMISSION Authors are invited to submit papers electronically. Submitted manuscripts should be structured as technical papers and may not exceed 8 letter size (8.5 x 11) pages including figures, tables and references using the IEEE format for conference proceedings (print area of 6-1/2 inches (16.51 cm) wide by 8-7/8 inches (22.51 cm) high, two-column format with columns 3-1/16 inches (7.85 cm) wide with a 3/8 inch (0.81 cm) space between them, single-spaced 10-point Times fully justified text). Submissions not conforming to these guidelines may be returned without review. Authors should submit the manuscript in PDF format and make sure that the file will print on a printer that uses letter size (8.5 x 11) paper. The official language of the meeting is English. All manuscripts will be reviewed and will be judged on correctness, originality, technical strength, significance, quality of presentation, and interest and relevance to the conference attendees. Submitted papers must represent original unpublished research that is not currently under review for any other conference or journal. Papers not following these guidelines will be rejected without review and further action may be taken, including (but not limited to) notifications sent to the heads of the institutions of the authors and sponsors of the conference. Submissions received after the due date, exceeding the page limit, or not appropriately structured may not be considered. Authors may contact the conference chairs for more information. The proceedings will be published through the IEEE Computer Society Press, USA and will be made available online through the IEEE Digital Library. 
Submission Link: https://www.easychair.org/account/signin.cgi?conf=ccgrid2012 JOURNAL SPECIAL ISSUE The top 6 highly rated papers from the CCGrid 2012 conference will be invited to be extended for publication in a special issue of the "Future Generation Computer Systems (FGCS)" Journal published by Elsevier Press. CHAIRS General Chair * Shikharesh Majumdar, Carleton University, Canada Honorary Chair * Geoffrey Fox, Indiana University, USA Program Committee Co-Chairs * Rajkumar Buyya, University of Melbourne, Australia * Pavan Balaji, Argonne National Laboratory, USA Program Committee Vice-chairs * Daniel S. Katz (Applications and Experiences) * Dhabaleswar K. Panda (Architecture) * Manish Parashar (Middleware, Autonomic Computing, and Cyberinfrastructure) * Ahmad Afsahi (Performance Modeling and Analysis) * Xian-He Sun (Performance Measurement and Evaluation) * William Gropp (Programming Models, Systems, and Fault-Tolerant Computing) * David Bader (Multicore and Accelerator-based Computing) * Thomas Fahringer (Scheduling and Resource Management) * Ignacio Martin Llorente and Madhusudhan Govindaraju (Cloud Computing) Cyber Co-Chairs * Anton Beloglazov, The University of Melbourne, Australia * Suraj Pandey, CSIRO, Australia * Trevor Gelowsky, Carleton University, Canada Workshops Co-Chairs * Marin Litoiu, York University, Canada * Mukaddim Pathan, Telstra Corporation Limited, Australia Publicity Chairs * Helen Karatza, Aristotle University of Thessaloniki, Greece * Ioan Raicu, Illinois Institute of Technology & Argonne National Labs, USA * Bruno Schulze, National Laboratory for Scientific Computing, Brazil * G Subrahmanya VRK Rao, Cognizant Technology Solutions, India Tutorials Co-Chairs * Sushil K.
Prasad, Georgia State University, USA * Rob Simmonds, Westgrid, Canada Doctoral Symposium Co-Chairs * Carlos Varela, Rensselaer Polytechnic Institute, USA * Yogesh Simmhan, University of Southern California Poster and Research Demo Co-Chairs * Suraj Pandey, CSIRO, Australia SCALE Challenge Coordinator * Shantenu Jha, Rutgers and Louisiana State University Steering Committee * Henri Bal, Vrije University, The Netherlands * Pavan Balaji, Argonne National Laboratory, USA * Rajkumar Buyya, University of Melbourne, Australia (Chair) * Franck Cappello, University of Paris-Sud, France * Jack Dongarra, University of Tennessee & ORNL, USA * Dick Epema, Technical University of Delft, The Netherlands * Thomas Fahringer, University of Innsbruck, Austria * Ian Foster, University of Chicago, USA * Wolfgang Gentzsch, DEISA, Germany * Hai Jin, Huazhong University of Science & Technology, China * Craig Lee, The Aerospace Corporation, USA (Co-Chair) * Laurent Lefevre, INRIA, France * Geng Lin, Dell Inc., USA * Manish Parashar, Rutgers: The State University of New Jersey, USA * Shikharesh Majumdar, Carleton University, Canada * Satoshi Matsuoka, Tokyo Institute of Technology, Japan * Omer Rana, Cardiff University, UK * Paul Roe, Queensland University of Technology, Australia * Bruno Schulze, LNCC, Brazil * Nalini Venkatasubramanian, University of California, USA * Carlos Varela, Rensselaer Polytechnic Institute, USA IMPORTANT DATES Papers Due: 25 November 2011 Notification of Acceptance: 30 January 2012 Camera Ready Papers Due: 27 February 2012 Sponsors: IEEE Computer Society (TCSC) & ACM SIGARCH (approval pending) -- ================================================================= Ioan Raicu, Ph.D. 
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From wilde at mcs.anl.gov Mon Nov 7 12:09:10 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 7 Nov 2011 12:09:10 -0600 (CST) Subject: [Swift-devel] Next provider staging enhancements In-Reply-To: <48325388.4253.1320689196583.JavaMail.root@zimbra.anl.gov> Message-ID: <2080076761.4272.1320689350368.JavaMail.root@zimbra.anl.gov> Mihael, Justin, and I spoke by phone on Friday on this topic. The outcome was that Mihael will work on the following provider staging enhancements: 1. Make non-encrypted data transfer the default for provider staging. This yields 20X performance improvement (transfer speed) in laptop tests. One suspected problem: Ketan tried testing this by manually changing the Java code flag mentioned in Mihael's email on the topic, and did not observe any speedup. So that needs to be investigated: was the setting incorrect, or did it not yield the expected improvement? We should set up a test that demonstrates the expected speed range, and make that speed test a part of the test suite. 2. Mihael will test and if necessary fix the ability of provider staging to work with gsiftp:// URIs. Ketan will create a test for this as well, and then adapt the ExTENCI applications to use this method. 3. 
Mihael will adapt the provider staging "SFS" (shared filesystem) staging method to create a new staging method (like worker-gridftp) which uses globus-url-copy on the worker node as the staging mechanism. That could perhaps be further generalized, but that generalization could come later. Justin pointed out that, so that the worker does not block other job execution activities during these staging operations, the globus-url-copy processes need to run asynchronously with respect to normal worker job processing. Mihael proposed, I think, to do item 1 right away, for 0.93, and to do items 2 & 3 in a new 0.93.1 branch. That way they will not slow down release of 0.93, but will be made available much sooner than they would be if they were made on trunk and then slotted for 0.94. Mihael, all, can you comment on whether I have described this all correctly? Thanks, - Mike From wilde at mcs.anl.gov Tue Nov 8 04:41:57 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 8 Nov 2011 04:41:57 -0600 (CST) Subject: [Swift-devel] Can we re-activate coaster worker timeout capability? Message-ID: <571270834.7056.1320748917383.JavaMail.root@zimbra.anl.gov> When using Swift with the OSG Glidein Workload Management System, which we need to do for the ExTENCI project, as well as with our own pilot job tools in bin/grid, we often will have more workers starting than we need. It would be handy to re-instate some variation of the older worker timeout feature: when coasters is running in persistent passive mode, a worker option would specify an idle period T, and a worker that receives no work for that long would exit cleanly. That way the pilot factory can aggressively launch workers, and if it overshoots, the excess workers will exit to avoid wasting site CPU resources. Can that be done in a reasonable manner, avoiding the pitfalls that led to the removal of the timeout feature? If so, I'll file an enhancement bug for this. 
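To make the request concrete, here is a minimal sketch of the idle-timeout behavior described above. It is written in Python rather than the Perl of worker.pl, and it is only an illustration under stated assumptions: the names IdleTimer, worker_loop, next_job, run_job, idle_timeout and poll_interval are hypothetical and do not correspond to anything in Swift or coasters.

```python
import time


class IdleTimer:
    """Tracks how long a worker has gone without receiving work.

    Hypothetical helper for illustration; not part of worker.pl or coasters.
    """

    def __init__(self, idle_timeout, clock=time.monotonic):
        self.idle_timeout = idle_timeout  # the period T, in seconds
        self.clock = clock                # injectable so the logic can be tested
        self.last_work = clock()

    def job_received(self):
        # Any activity resets the idle period.
        self.last_work = self.clock()

    def expired(self):
        # True once at least T seconds have passed with no work.
        return self.clock() - self.last_work >= self.idle_timeout


def worker_loop(next_job, run_job, timer, poll_interval=1.0, sleep=time.sleep):
    """Run jobs until the worker has been idle for the timeout period.

    next_job() is assumed to return a job, or None when nothing is queued.
    Returns the number of jobs executed before the clean exit.
    """
    done = 0
    while True:
        job = next_job()
        if job is not None:
            timer.job_received()
            run_job(job)
            done += 1
            timer.job_received()  # idle time is counted from the end of the job
        elif timer.expired():
            return done           # clean exit: no work for T seconds
        else:
            sleep(poll_interval)
```

In this sketch the timer is reset both when a job arrives and when it finishes, so a long-running job is never counted as idle time; whether that matches what the old feature did is not something this thread establishes.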
- Mike From skenny at uchicago.edu Tue Nov 8 16:36:42 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Tue, 8 Nov 2011 14:36:42 -0800 Subject: [Swift-devel] [Swift-user] gram on ranger In-Reply-To: References: <501704482.165901.1319823662516.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: thought i'd revisit this since anjali re-ran this workflow with fewer jobs (~85K) and perhaps the info would be useful. it showed a similar pattern in that it finished all jobs but one (that is, we were missing a single output file) and hung indefinitely on the last 'finished successfully...' so this discussion seems to have turned mostly to how coasters requests cores. however, i have to say that *generally* in the past when swift/coasters has requested too many cores for the given queue gram complains and you see it in the gram log, which is not the case here. that said, if you want em: the swift log is in /home/skenny/swift_logs on ci and the coaster log was too big for my home on ci (and has since been appended to so make sure to match the dates with the swift log), but if someone has access to ranger it's in /var/tmp/skenny_swift on login3 we're continuing to use the same swift version and sites file since it's at least helping us push thru much of the work (doing manual resumes/restarts). ~sk On Fri, Oct 28, 2011 at 11:02 AM, Justin M Wozniak wrote: > > I think count is the number of processes. PBSExecutor uses it, that may > be a good place to look. In the Coasters context, I think it is the > number of invocations of worker.pl . > > On Fri, 28 Oct 2011, David Kelly wrote: > > > Just to clarify - when coasters is being used, count represents the > > number of coaster blocks? Then to get the number of cores to request, I > > should use count*workersPerNode? > > > > What about in the case where coasters is not used? 
> > > > ----- Original Message ----- > >> From: "Mihael Hategan" > >> To: "David Kelly" > >> Cc: "Anjali Raja" , "Swift Devel" < > swift-devel at ci.uchicago.edu>, "Swift User" > >> , "Ketan Maheshwari" < > ketancmaheshwari at gmail.com> > >> Sent: Thursday, October 20, 2011 9:08:46 PM > >> Subject: Re: [Swift-devel] [Swift-user] gram on ranger > >> On Thu, 2011-10-20 at 21:03 -0500, David Kelly wrote: > >>> Yep, this is using coasters > >>> > >> > >> Then no. Count is whatever the block allocation algorithm decides it > >> should be. > >> > >>>>> > >>>>> Should count=32 in the second case? Am I misunderstanding what > >>>>> 'count' is? Is there any way to get the exact number of > >>>>> applications? > >>>> > >>>> Coasters? > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Justin M Wozniak > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -- Sarah Kenny Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III University of California Irvine, Dept. of Neurology ~ 773-818-8300 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From iraicu at cs.iit.edu Tue Nov 8 16:39:25 2011 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Tue, 08 Nov 2011 16:39:25 -0600 Subject: [Swift-devel] Call for Participation: Tutorial -- Using and Building Infrastructure Clouds for Science at Supercomputing/SC 2011 Message-ID: <4EB9AF9D.7050605@cs.iit.edu> CALL FOR PARTICIPATION The International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing/SC) 2011 Tutorial: Using and Building Infrastructure Clouds for Science http://sc11.supercomputing.org/schedule/event_detail.php?evid=tut159 DATE: Sunday, November 13th, 2011 TIME: 8:30AM - 5:00PM ROOM: TCC 303 ABSTRACT: Infrastructure-as-a-service (IaaS) cloud computing (sometimes also called "infrastructure cloud computing") has recently emerged as a promising outsourcing paradigm: it has been widely embraced commercially and is also beginning to make inroads in scientific communities. Although popular, the understanding of its benefits, challenges, modes of use, and general applicability as an outsourcing paradigm for science is still in its infancy, which gives rise to many myths and misconceptions. Without specific and accurate information it is hard for the scientific communities to understand whether this new paradigm is worthwhile and, if so, how to best develop, leverage, and invest in it. Our objective in this tutorial is to facilitate the introduction of infrastructure cloud computing to scientific communities and provide accurate and up-to-date information about features that could affect its use in science: to conquer myths, highlight opportunities, and equip the attendees with a better understanding of the relevance of cloud computing to their scientific domain. To this end, we have developed a tutorial that mixes the discussion of various aspects of cloud computing for science, such as performance, privacy and standards, with practical exercises using infrastructure clouds and state-of-the-art tools. 
Chair/Presenter Details: Katarzyna Keahey - Argonne National Laboratory John Bresnahan - Argonne National Laboratory David LaBissoniere - University of Chicago Paul Marshall - University of Colorado Patrick Armstrong - University of Victoria -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From wilde at mcs.anl.gov Tue Nov 8 21:05:08 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 8 Nov 2011 21:05:08 -0600 (CST) Subject: [Swift-devel] [Swift-user] gram on ranger In-Reply-To: Message-ID: <1601551403.11124.1320807908046.JavaMail.root@zimbra.anl.gov> Sarah, David, All, I've just tried to review the lengthy email conversation on this problem (or set of problems) and I found it very hard to discern how many different problem symptoms are involved. I filed bug 593 to cover the issue that you reported on Oct 11. I know that David and Ketan both worked on this and David made some changes to the provider but there have been no updates to that ticket to help understand where things stand. Also, there have been several fixes to 0.93 in the past week, but at the same time Ketan has encountered new SGE provider issues (with MPI jobs but perhaps applicable to all jobs) with the latest code, and I just filed those as bug 624. 
David and Ketan, please work together to discuss the status, review the specifics of Sarah's issue, and see if she is running with the latest 0.93 revisions. David, since you made the last set of changes to the SGE provider, can you take ownership of this, and update SGE with comments and/or new tickets? Also, can you try to create a test case that can replicate Sarah's problem? It seems to me that this problem/thread started off as a "some jobs at the end of a long workflow don't complete", then changed to "SGE is rejecting my workflow at some point with no explanation", and now we're back to the "last job doesn't complete" symptom. Let's get the symptoms clearly identified, matched with the logs and with the Swift+CoG revisions being used, and then matched against what other things we suspect are still broken in the SGE provider. Further, we are not doing any GRAM-SGE testing to my knowledge, yet that's what Sarah is using, so we should add that to the test cases for SGE. Perhaps we should discuss with Sarah whether a Coaster-SSH-SGE config would help us get in sync and make use of Ranger more reliable, at least until we have time to test the GRAM case. Thanks, - Mike ----- Original Message ----- > From: "Sarah Kenny" > To: "Justin M Wozniak" > Cc: "Anjali Raja" , "Swift Devel" > Sent: Tuesday, November 8, 2011 4:36:42 PM > Subject: Re: [Swift-devel] [Swift-user] gram on ranger > thought i'd revisit this since anjali re-ran this workflow with fewer > jobs (~85K) and perhaps the info would be useful. it showed a similar > pattern in that it finished all jobs but one (that is, we were missing > a single output file) and hung indefinitely on the last 'finished > successfully...' > > so this discussion seems to have turned mostly to how coasters > requests cores. however, i have to say that *generally* in the past > when swift/coasters has requested too many cores for the given queue > gram complains and you see it in the gram log, which is not the case > here. 
> > that said, if you want em: the swift log is in /home/skenny/swift_logs > on ci and the coaster log was too big for my home on ci (and has since > been appended to so make sure to match the dates with the swift log), > but if someone has access to ranger it's in /var/tmp/skenny_swift on > login3 > > we're continuing to use the same swift version and sites file since > it's at least helping us push thru much of the work (doing manual > resumes/restarts). > > ~sk > > > On Fri, Oct 28, 2011 at 11:02 AM, Justin M Wozniak < > wozniak at mcs.anl.gov > wrote: > > > > I think count is the number of processes. PBSExecutor uses it, that > may > be a good place to look. In the Coasters context, I think it is the > number of invocations of worker.pl . > > > > > On Fri, 28 Oct 2011, David Kelly wrote: > > > Just to clarify - when coasters is being used, count represents the > > number of coaster blocks? Then to get the number of cores to > > request, I > > should use count*workersPerNode? > > > > What about in the case where coasters is not used? > > > > ----- Original Message ----- > >> From: "Mihael Hategan" < hategan at mcs.anl.gov > > >> To: "David Kelly" < davidk at ci.uchicago.edu > > >> Cc: "Anjali Raja" < anjraja at gmail.com >, "Swift Devel" < > >> swift-devel at ci.uchicago.edu >, "Swift User" > >> < swift-user at ci.uchicago.edu >, "Ketan Maheshwari" < > >> ketancmaheshwari at gmail.com > > >> Sent: Thursday, October 20, 2011 9:08:46 PM > >> Subject: Re: [Swift-devel] [Swift-user] gram on ranger > >> On Thu, 2011-10-20 at 21:03 -0500, David Kelly wrote: > >>> Yep, this is using coasters > >>> > >> > >> Then no. Count is whatever the block allocation algorithm decides > >> it > >> should be. > >> > >>>>> > >>>>> Should count=32 in the second case? Am I misunderstanding what > >>>>> 'count' is? Is there any way to get the exact number of > >>>>> applications? > >>>> > >>>> Coasters? 
> > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Justin M Wozniak > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Sarah Kenny > Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III > University of California Irvine, Dept. of Neurology ~ 773-818-8300 > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From ketancmaheshwari at gmail.com Thu Nov 10 16:04:16 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 10 Nov 2011 16:04:16 -0600 Subject: [Swift-devel] job fails with exit code 522 and connection to worker lost Message-ID: Mihael, On my latest run of the scec workflow after the worker fixes, I see the code 522 error message and connection to worker lost issues. The run is using OSG sites obtained from a gen_greensites test done this morning. Looking for error 522 in worker.pl I find this line: queueCmd((nullCB(), "JOBSTATUS", $jobid, FAILED, "522", "Could not write to file: $!")); The log for this run is: http://www.ci.uchicago.edu/~ketan/postproc-20111110-0935-kjqnp1d6.log Could it be that the recent mods to worker.pl have caused this? The OSG sites used are as follows: LIGO_UWM_NEMO__osg-nemo-ce.phys.uwm.edu OU_OSCER_ATLAS__grid1.oscer.ou.edu RENCI-Blueridge__brgw1.renci.org FNAL_FERMIGRID__fermigridosg1.fnal.gov Regards, -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ketancmaheshwari at gmail.com Thu Nov 10 16:11:29 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Thu, 10 Nov 2011 16:11:29 -0600 Subject: [Swift-devel] job fails with exit code 522 and connection to worker lost In-Reply-To: References: Message-ID: Looking further into logs, I see java.net.SocketException: Broken pipe about 12 times in there. On Thu, Nov 10, 2011 at 4:04 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Mihael, > > On my latest run of scec workflow after the worker fixes, I see the code > 522 error message and connection to worker lost issues. > > The run is using OSG sites obtained from a gen_greensites test done this > morning. > > Looking for error 522 on worker.pl I find this line: > > queueCmd((nullCB(), "JOBSTATUS", $jobid, FAILED, "522", "Could not write > to file: $!")); > > The log for this run is: > http://www.ci.uchicago.edu/~ketan/postproc-20111110-0935-kjqnp1d6.log > > Could it be that the recent mods on worker.pl has caused this? > > The OSG sites used are as follows: > LIGO_UWM_NEMO__osg-nemo-ce.phys.uwm.edu > OU_OSCER_ATLAS__grid1.oscer.ou.edu > RENCI-Blueridge__brgw1.renci.org > FNAL_FERMIGRID__fermigridosg1.fnal.gov > > > Regards, > -- > Ketan > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aespinosa at cs.uchicago.edu Fri Nov 11 10:55:05 2011 From: aespinosa at cs.uchicago.edu (Allan Espinosa) Date: Fri, 11 Nov 2011 10:55:05 -0600 Subject: [Swift-devel] hippie-ish language design book Message-ID: Anyone checked out its contents? How did you guys find it? http://createyourproglang.com/ -Allan -- Allan M. 
Espinosa PhD student, Computer Science University of Chicago From davidk at ci.uchicago.edu Sat Nov 12 01:53:09 2011 From: davidk at ci.uchicago.edu (David Kelly) Date: Sat, 12 Nov 2011 01:53:09 -0600 (CST) Subject: [Swift-devel] [Swift-user] gram on ranger In-Reply-To: Message-ID: <1349889700.12818.1321084389586.JavaMail.root@zimbra-mb2.anl.gov> Sarah, I just submitted a fix that might help. There was an issue with the provider not always correctly detecting when the job was completed. The fix is in the 0.93 source. Can you give it a try and let me know if you still see any issues? Thanks. David ----- Original Message ----- > From: "Sarah Kenny" > To: "Justin M Wozniak" > Cc: "David Kelly" , "Swift Devel" , "Anjali Raja" > > Sent: Tuesday, November 8, 2011 4:36:42 PM > Subject: Re: [Swift-devel] [Swift-user] gram on ranger > thought i'd revisit this since anjali re-ran this workflow with fewer > jobs (~85K) and perhaps the info would be useful. it showed a similar > pattern in that it finished all jobs but one (that is, we were missing > a single output file) and hung indefinitely on the last 'finished > successfully...' > > so this discussion seems to have turned mostly to how coasters > requests cores. however, i have to say that *generally* in the past > when swift/coasters has requested too many cores for the given queue > gram complains and you see it in the gram log, which is not the case > here. > > that said, if you want em: the swift log is in /home/skenny/swift_logs > on ci and the coaster log was too big for my home on ci (and has since > been appended to so make sure to match the dates with the swift log), > but if someone has access to ranger it's in /var/tmp/skenny_swift on > login3 > > we're continuing to use the same swift version and sites file since > it's at least helping us push thru much of the work (doing manual > resumes/restarts). 
> > ~sk > > > On Fri, Oct 28, 2011 at 11:02 AM, Justin M Wozniak < > wozniak at mcs.anl.gov > wrote: > > > > I think count is the number of processes. PBSExecutor uses it, that > may > be a good place to look. In the Coasters context, I think it is the > number of invocations of worker.pl . > > > > > On Fri, 28 Oct 2011, David Kelly wrote: > > > Just to clarify - when coasters is being used, count represents the > > number of coaster blocks? Then to get the number of cores to > > request, I > > should use count*workersPerNode? > > > > What about in the case where coasters is not used? > > > > ----- Original Message ----- > >> From: "Mihael Hategan" < hategan at mcs.anl.gov > > >> To: "David Kelly" < davidk at ci.uchicago.edu > > >> Cc: "Anjali Raja" < anjraja at gmail.com >, "Swift Devel" < > >> swift-devel at ci.uchicago.edu >, "Swift User" > >> < swift-user at ci.uchicago.edu >, "Ketan Maheshwari" < > >> ketancmaheshwari at gmail.com > > >> Sent: Thursday, October 20, 2011 9:08:46 PM > >> Subject: Re: [Swift-devel] [Swift-user] gram on ranger > >> On Thu, 2011-10-20 at 21:03 -0500, David Kelly wrote: > >>> Yep, this is using coasters > >>> > >> > >> Then no. Count is whatever the block allocation algorithm decides > >> it > >> should be. > >> > >>>>> > >>>>> Should count=32 in the second case? Am I misunderstanding what > >>>>> 'count' is? Is there any way to get the exact number of > >>>>> applications? > >>>> > >>>> Coasters? > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Justin M Wozniak > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Sarah Kenny > Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III > University of California Irvine, Dept. 
of Neurology ~ 773-818-8300 From wilde at mcs.anl.gov Sat Nov 12 13:31:02 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 12 Nov 2011 13:31:02 -0600 (CST) Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> Message-ID: <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> Can the first person who has time try to address the problem below? I'm about to head to SC. Thanks, - Mike ----- Forwarded Message ----- From: "Fangfang Xia" To: "Michael Wilde" Cc: "Ketan Maheshwari" , "Scott Devoid" Sent: Saturday, November 12, 2011 1:27:29 PM Subject: Swift question Hi Mike and Ketan, Thanks for the guide. I tried to follow the "cat" example, and got the following error: 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer. 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0 I looked at the PBS script and somehow it's blank. I have attached the full log file. Could you please take a look and let me know how to proceed? Thanks, Fangfang On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: > Hi Fangfang, Scott, > > Sorry for the late reply! I think the best roadmap to follow is this: > > - try running the sample tutorial Swift script on Beagle using the instructions posted at: > > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle > > This tiny tutorial contains a simple Swift script that does N "cat" commands in parallel to "process" an input file and create an output file.
It contains all the related config files you need to run on Beagle, and is thus a good "Hello World" application. You can then copy catsn.swift to create the first Swift script to run your actual applications. > > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th floor near my office, 5141. Ketan can help answer any questions you have, and will be your personal contact to help you make good use of Beagle. > > - then do your first Model-SEED script based on catsn.swift, first with N = 1 to just ensure that you have described your app's command line(s) correctly to Swift and that the app is getting invoked and returning output correctly. > > - then, with help from Ketan as needed, start scaling up to increasingly larger runs. > > I'll try to stay close in the loop and help out as needed. > > Do you have any questions I can answer to get started? If you are at Argonne and available today, perhaps I can join you and Ketan in an introductory meeting. I'm free from 3 to 4:40 today or after 5:30. Otherwise, please do this at your joint convenience. > > Regards, > > - Mike > > > > > ----- Original Message ----- >> From: "Fangfang Xia" >> To: "Michael Wilde" >> Sent: Monday, October 31, 2011 12:44:23 PM >> Subject: Re: How is install/test of Model SEED on Beagle going? >> Hi Mike, >> >> We got two types of flux balance analysis to run on Beagle. I was >> wondering if we should test them with Swift to see if things scale. >> Both operations take about 40 seconds to run on sandbox. Ideally we >> should also test two more expensive computations, "fba single knockouts" >> and "gapfilling", but I won't be able to resolve the problems with >> those until I meet with Chris this week.
>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh >> >> fbacheckgrowth -model iJR904.16242 >> fbafva -model iJR904.16242 >> >> You can find the descriptions of these tools at: >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED >> >> I've been switching between PrgEnv-pgi/gcc to get Perl modules and >> mfatoolkit to compile. And I still seem to be getting the cc1plus >> error with gcc which you don't have. So if this version doesn't work >> well on multiple processors, I'll need your help with recompiling my >> updated mfatoolkit in >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case >> you need to test something there. >> >> Thanks, >> Fangfang >> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >> >>> Hi Fangfang, >>> >>> I was able to build that directory using the gcc module; I pasted the >>> make output below. It gave many warnings, but I did not get the >>> cc1plus libmpc.so error that you encountered. >>> >>> My build is in $HOME/wilde/mfatoolkit >>> >>> I ran this on sandbox.beagle.ci.uchicago.edu.
>>> >>> - Mike >>> >>> ---- make output: >>> >>> sandbox$ make >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>> Data*)': >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >>> used in arithmetic >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >>> used in arithmetic >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >>> bool, std::string&, bool)': >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>> converting 'false' to pointer type for argument 1 of >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> std::char_traits, _Alloc = std::allocator]' >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>> converting 'false' to pointer type for argument 1 of >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> std::char_traits, _Alloc = std::allocator]' >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>> converting 
'false' to pointer type for argument 1 of >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> std::char_traits, _Alloc = std::allocator]' >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>> OptimizationParameter*)': >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>> converting 'false' to pointer type for argument 1 of >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> std::char_traits, _Alloc = std::allocator]' >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>> std::string)': >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>> converting to non-pointer type 'int' from NULL >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c 
/home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >>> Species::AddpKab(std::string, bool)': >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> std::allocator, value_type = int]' >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> std::allocator, value_type = int]' >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> _Alloc>::push_back(const value_type&) [with _Tp = int, 
_Alloc = >>> std::allocator, value_type = int]' >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> std::allocator, value_type = int]' >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>> multi-character character constant >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>> /home/wilde/mfatoolkit/Source >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>> /home/wilde/mfatoolkit/Source/driver.o >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>> /home/wilde/mfatoolkit/Source/Species.o >>> /home/wilde/mfatoolkit/Source/Data.o >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>> /home/wilde/mfatoolkit/Source/Identity.o >>> /home/wilde/mfatoolkit/Source/Reaction.o >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>> /home/wilde/mfatoolkit/Source/AtomType.o >>> /home/wilde/mfatoolkit/Source/Gene.o >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>> /home/wilde/mfatoolkit/Source/stringDB.o >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>> -lcplex -lm -lpthread -lz 
>>> sandbox$ >>> >>> >>> ----- Original Message ----- >>>> From: "Fangfang Xia" >>>> To: "Michael Wilde" >>>> Cc: "Scott Devoid" >>>> Sent: Monday, October 24, 2011 5:20:20 PM >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>> Hi Mike, >>>> >>>> This is very helpful. Thanks for pointing out the difference >>>> between >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >>>> compile our core c++ code. >>>> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >>>> error while loading shared libraries: libmpc.so.2: cannot open >>>> shared >>>> object file: No such file or directory >>>> >>>> It looks like something is wrong with cc1plus. I suppose it's part >>>> of >>>> the g++? I don't know what it does. >>>> >>>> So we resolved the perl dependency issues, and we were able to >>>> compile >>>> the code with the default PrgEnv-pgi just for testing purposes. It >>>> seems we still have some issues with our new pipeline code. But I >>>> don't think we are very far from giving you a running example. >>>> >>>> Just in case you could help us with the gcc compilation issue, I >>>> have >>>> 777'ed my directory and here's the steps to compile the core C++ >>>> code: >>>> >>>> source >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >>>> cd >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >>>> make >>>> >>>> Thanks, >>>> Fangfang >>>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >>>> >>>>> Sounds great, thanks for the update, Fangfang. >>>>> >>>>> One question: what compiler are you using? >>>>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats >>>>> because the GCC module will create code that we can run in >>>>> parallel, >>>>> multiple programs in parallel per compute node. 
The PrgEnv modules >>>>> all create code that expects to run only one program per node >>>>> (because it's meant for MPI, OpenMP, etc.). >>>>> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >>>>> and gfortran) may be more like the traditional Linux gcc than >>>>> PrgEnv-gnu. >>>>> >>>>> The default PrgEnv (at least for me) is pgi. So before I build >>>>> software I do: >>>>> >>>>> module unload PrgEnv-pgi >>>>> module load gcc >>>>> >>>>> Let me know if I can help; if you want I can try to build you a >>>>> libxml2 using gcc. >>>>> Same for Perl if it needs to be executed as multiple copies per node >>>>> in >>>>> parallel. >>>>> >>>>> We can discuss more next week, and I'll be working off and on this >>>>> weekend. >>>>> >>>>> Regards, >>>>> >>>>> - Mike >>>>> >>>>> >>>>> ----- Original Message ----- >>>>>> From: "Fangfang Xia" >>>>>> To: "Michael Wilde" >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >>>>>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>>>> Hi Mike, >>>>>> >>>>>> We encountered some dependency issues while attempting to install >>>>>> some >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >>>>>> systems >>>>>> folks to help install libxml2. I'm also looking into ways to >>>>>> install >>>>>> it in a user directory. I get the feeling that things should be >>>>>> resolved after our group meeting on Monday. So we'll keep you >>>>>> posted. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fangfang >>>>>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >>>>>> >>>>>>> Hi Fangfang, Scott, >>>>>>> >>>>>>> Any progress - can I try it soon? >>>>>>> >>>>>>> Or, any problems that I can help with? I'm at Argonne today >>>>>>> (5141) >>>>>>> if >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30.
>>>>>>> >>>>>>> Regards, >>>>>>> >>>>>>> - Mike >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Michael Wilde >>>>>>> Computation Institute, University of Chicago >>>>>>> Mathematics and Computer Science Division >>>>>>> Argonne National Laboratory >>>>>>> >>>>> >>>>> -- >>>>> Michael Wilde >>>>> Computation Institute, University of Chicago >>>>> Mathematics and Computer Science Division >>>>> Argonne National Laboratory >>>>> >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From ketancmaheshwari at gmail.com Sat Nov 12 14:02:47 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Sat, 12 Nov 2011 14:02:47 -0600 Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> Message-ID: Hello Fangfang, The log file does not seem to be found. Could you attach it, please? From this line: Illegal value for ppn. Must be an integer. It looks like the sites file is not configured correctly for the PBS provider. Could you post your sites.xml? Were there any error messages on the command line? Regards, Ketan On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote: > Can the first person who has time try to address the problem below? > I'm about to head to SC.
> > Thanks, > > - Mike > > > ----- Forwarded Message ----- > From: "Fangfang Xia" > To: "Michael Wilde" > Cc: "Ketan Maheshwari" , "Scott Devoid" < > devoid at ci.uchicago.edu> > Sent: Saturday, November 12, 2011 1:27:29 PM > Subject: Swift question > > Hi Mike and Ketan, > > Thanks for the guide. I tried to follow the "cat" example, and got the > following error: > > 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to > /home/fangfang/.globus/scripts/PBS6954924010553344333.submit > 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: > Block-1112-210706-000000 is: Block-1112-2107 > 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting > block task: Cannot submit job: Illegal value for ppn. Must be an integer. > 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0 > > I looked at the PBS script and somehow it's blank. I have attached the > full log file. Could you please take a look and let me know how to proceed? > > Thanks, > > Fangfang > > On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: > > > Hi Fangfang, Scott, > > > > Sorry for the late reply! I think the best roadmap to follow is this: > > > > - try running the sample tutorial Swift script on Beagle using the > instructions posted at: > > > > > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle > > > > This tiny tutorial contains a simple Swift script that does N "cat" > commands in parallel to "process" an input file and create an output file. > It contains all the related config files you need to run on Beagle, and is > thus a good "Hello World" application. You can then copy catsn.swift to > create the first Swift script to run your actual applications. > > > > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle > Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th > floor near my office, 5141.
Ketan can help answer any questions you have, > and will be your personal contact to help you make good use of Beagle. > > > > - then do your first Model-SEED script based on catsn.swift, first with > N = 1 to just ensure that you have described your app's command line(s) > correctly to Swift and that the app is getting invoked and returning output > correctly. > > > > - then, with help from Ketan as needed, start scaling up to increasingly > larger runs. > > > > I'll try to stay close in the loop and help out as needed. > > > > Do you have any questions I can answer to get started? If you are at > Argonne and available today, perhaps I can join you and Ketan in an > introductory meeting. I'm free from 3 to 4:40 today or after 5:30. > Otherwise, please do this at your joint convenience. > > > > Regards, > > > > - Mike > > > > > > > > > > ----- Original Message ----- > >> From: "Fangfang Xia" > >> To: "Michael Wilde" > >> Sent: Monday, October 31, 2011 12:44:23 PM > >> Subject: Re: How is install/test of Model SEED on Beagle going? > >> Hi Mike, > >> > >> We got two types of flux balance analysis to run on Beagle. I was > >> wondering if we should test them with Swift to see if things scale. > >> Both operations take about 40 seconds to run on sandbox. Ideally we > >> should also test two more expensive computations, "fba single knockouts" > >> and "gapfilling", but I won't be able to resolve the problems with > >> those until I meet with Chris this week. > >> > >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh > >> > >> fbacheckgrowth -model iJR904.16242 > >> fbafva -model iJR904.16242 > >> > >> You can find the descriptions of these tools at: > >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED > >> > >> I've been switching between PrgEnv-pgi/gcc to get Perl modules and > >> mfatoolkit to compile. And I still seem to be getting the cc1plus > >> error with gcc which you don't have.
So if this version doesn't work > >> well on multiple processors, I'll need your help with recompiling my > >> updated mfatoolkit in > >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. > >> > >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case > >> you need to test something there. > >> > >> Thanks, > >> Fangfang > >> > >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: > >> > >>> Hi Fangfang, > >>> > >>> I was able to build that directory using the gcc module; I past the > >>> make output below. It gave many warnings, but I did not get the > >>> cc1plus libmpc.so error that you encountered. > >>> > >>> My build is in $HOME/wilde/mfatoolkit > >>> > >>> I ran this on sandbox.beagle.ci.uchicago.edu. > >>> > >>> - Mike > >>> > >>> ---- make output: > >>> > >>> sandbox$ make > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, > >>> Data*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL > >>> used in arithmetic > >>> 
/home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, > >>> bool, std::string&, bool)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::IdentifyReactionLoops(Data*, > >>> OptimizationParameter*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, > >>> std::string)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: > >>> converting to non-pointer type 'int' from NULL > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void > >>> Species::AddpKab(std::string, bool)': > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: > >>> multi-character character constant > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> > 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit > >>> /home/wilde/mfatoolkit/Source/driver.o > >>> /home/wilde/mfatoolkit/Source/MFAProblem.o > >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o > >>> /home/wilde/mfatoolkit/Source/SCIPapi.o > >>> /home/wilde/mfatoolkit/Source/GLPKapi.o > >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o > >>> /home/wilde/mfatoolkit/Source/SolverInterface.o > >>> /home/wilde/mfatoolkit/Source/Species.o > >>> /home/wilde/mfatoolkit/Source/Data.o > >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o > >>> /home/wilde/mfatoolkit/Source/Identity.o > >>> /home/wilde/mfatoolkit/Source/Reaction.o > >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomCPP.o > >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomType.o > >>> /home/wilde/mfatoolkit/Source/Gene.o > >>> /home/wilde/mfatoolkit/Source/GeneInterval.o > >>> /home/wilde/mfatoolkit/Source/stringDB.o > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk > >>> > -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic > >>> -lcplex -lm -lpthread -lz > >>> sandbox$ > >>> > >>> > >>> ----- Original Message ----- > >>>> From: "Fangfang Xia" > >>>> To: "Michael Wilde" > >>>> Cc: "Scott Devoid" > >>>> Sent: Monday, October 24, 2011 5:20:20 PM > >>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>> Hi Mike, > >>>> > >>>> This is very helpful. Thanks for pointing out the difference > >>>> between > >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to > >>>> compile our core c++ code. 
> >>>> > >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: > >>>> error while loading shared libraries: libmpc.so.2: cannot open > >>>> shared > >>>> object file: No such file or directory > >>>> > >>>> It looks like something is wrong with cc1plus. I suppose it's part > >>>> of > >>>> the g++? I don't know what it does. > >>>> > >>>> So we resolved the perl dependency issues, and we were able to > >>>> compile > >>>> the code with the default PrgEnv-pgi just for testing purposes. It > >>>> seems we still have some issues with our new pipeline code. But I > >>>> don't think we are very far from giving you a running example. > >>>> > >>>> Just in case you could help us with the gcc compilation issue, I > >>>> have > >>>> 777'ed my directory and here's the steps to compile the core C++ > >>>> code: > >>>> > >>>> source > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh > >>>> cd > >>>> > /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux > >>>> make > >>>> > >>>> Thanks, > >>>> Fangfang > >>>> > >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: > >>>> > >>>>> Sounds great, thanks for the update, Fangfang. > >>>>> > >>>>> One question: what compiler are you using? > >>>>> > >>>>> I'd like to suggest, for the first pass, that you use the "gcc" > >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats > >>>>> because the GCC module will create code that we can run in > >>>>> parallel, > >>>>> multiple programs in parallel per compute node. The PrgEnv modules > >>>>> all create code that expects to run only one program per node, > >>>>> because its meant for MPI, OpenMP, etc). > >>>>> > >>>>> Also, I think that the gcc module (which I think includes gcc, g++ > >>>>> and gfortran) may be more like the traditional Linux gcc than > >>>>> PrgEnv-gnu. > >>>>> > >>>>> The default PrgEnv (at least for me) is pgi. 
So before i build > >>>>> software I do: > >>>>> > >>>>> module unload PrgEnv-pgi > >>>>> module load gcc > >>>>> > >>>>> Let me know if I can help; if you want i can try to build you a > >>>>> libxml2 using gcc. > >>>>> Same for Perl if it needs to be executed multiple copies per node > >>>>> in > >>>>> parallel. > >>>>> > >>>>> We can discuss more next week, and I'll be working off and on this > >>>>> weekend. > >>>>> > >>>>> Regards, > >>>>> > >>>>> - Mike > >>>>> > >>>>> > >>>>> ----- Original Message ----- > >>>>>> From: "Fangfang Xia" > >>>>>> To: "Michael Wilde" > >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" > >>>>>> > >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM > >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>>>> Hi Mike, > >>>>>> > >>>>>> We encountered some dependency issues while attempting to install > >>>>>> some > >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle > >>>>>> systems > >>>>>> folks to help install libxml2. I'm also looking into ways to > >>>>>> install > >>>>>> it in a user directory. I get the feeling that things should be > >>>>>> resolved after our group meeting on Monday. So we'll keep you > >>>>>> posted. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Fangfang > >>>>>> > >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: > >>>>>> > >>>>>>> Hi Fangfang, Scott, > >>>>>>> > >>>>>>> Any progress - can I try it soon? > >>>>>>> > >>>>>>> Or, any problems that I can help with? Im at Argonne today > >>>>>>> (5141) > >>>>>>> if > >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>>
> >>>>>>> - Mike
> >>>>>>>
> >>>>>>> --
> >>>>>>> Michael Wilde
> >>>>>>> Computation Institute, University of Chicago
> >>>>>>> Mathematics and Computer Science Division
> >>>>>>> Argonne National Laboratory
>
-- Ketan

From ketancmaheshwari at gmail.com Sat Nov 12 14:21:18 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 12 Nov 2011 14:21:18 -0600
Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question
In-Reply-To: <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com>
References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com>
Message-ID:

Hello Fangfang,

Could you replace the following line:

24:cray:pack

with this one:

pbs.aprun;pbs.mpp;depth=24

in your sites.xml? The line you have is an obsolete form from the 0.92 version of Swift. It should work now.
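[Archive note: the list software stripped the XML tags from this thread, so only the values of the two lines survive. The fragment below is a minimal sketch of what the swap might look like in sites.xml. The `profile` element, the `globus` namespace, and the `ppn` and `providerAttributes` keys are assumptions that do not appear in the message itself; `ppn` is suggested only by the "Illegal value for ppn" error quoted further down. Only the two values are taken from the message.]

```xml
<!-- Hypothetical reconstruction: element and attribute names are assumptions;
     only the two values are quoted from the message. -->

<!-- Swift 0.92-style line to remove (assumed key "ppn") -->
<profile namespace="globus" key="ppn">24:cray:pack</profile>

<!-- Replacement line (assumed key "providerAttributes") -->
<profile namespace="globus" key="providerAttributes">pbs.aprun;pbs.mpp;depth=24</profile>
```

[If the real key names differ, the Beagle section of the Swift site guide linked later in this thread is the place to check them.]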
Regards,
Ketan

On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote:

> Hi Ketan,
>
> Thanks for getting back to me so promptly. I have attached the log file,
> and here's the content of sites.xml:
>
>
> CI-DEB000002
>
> 24:cray:pack
>
> 24
> 1000
> 1
> 1
> 1
>
> .63
> 10000
>
>
> /lustre/beagle/fangfang/swift-lab/swift.workdir
>
>
> There's no error message on the command line.
>
>
> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote:
>
> Hello Fangfang,
>
> I could not find the log file. Could you attach it, please?
>
> From this line:
> Illegal value for ppn. Must be an integer.
>
> It looks like the sites file is not configured correctly for the PBS provider.
> Could you post your sites.xml?
>
> Were there any error messages on the command line?
>
> Regards,
> Ketan
>
> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote:
>
>> Can the first person who has time try to address the problem below?
>> I'm about to head to SC.
>>
>> Thanks,
>>
>> - Mike
>>
>>
>> ----- Forwarded Message -----
>> From: "Fangfang Xia"
>> To: "Michael Wilde"
>> Cc: "Ketan Maheshwari" , "Scott Devoid" <devoid at ci.uchicago.edu>
>> Sent: Saturday, November 12, 2011 1:27:29 PM
>> Subject: Swift question
>>
>> Hi Mike and Ketan,
>>
>> Thanks for the guide. I tried to follow the "cat" example, and got the
>> following error:
>>
>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit
>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107
>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer.
>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0
>>
>> I looked at the PBS script and somehow it's blank. I have attached the
>> full log file. Could you please take a look and let me know how to proceed?
>>
>> Thanks,
>>
>> Fangfang
>>
>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote:
>>
>> > Hi Fangfang, Scott,
>> >
>> > Sorry for the late reply! I think the best roadmap to follow is this:
>> >
>> > - Try running the sample tutorial Swift script on Beagle using the
>> > instructions posted at:
>> >
>> > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle
>> >
>> > This tiny tutorial contains a simple Swift script that runs N "cat"
>> > commands in parallel to "process" an input file and create an output file.
>> > It contains all the related config files you need to run on Beagle, and is
>> > thus a good "Hello World" application. You can then copy catsn.swift to
>> > create the first Swift script to run your actual applications.
>> >
>> > - Set up a face-to-face meeting with Ketan Maheshwari, the Beagle
>> > Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th
>> > floor near my office, 5141. Ketan can help answer any questions you have,
>> > and will be your personal contact to help you make good use of Beagle.
>> >
>> > - Then write your first Model-SEED script based on catsn.swift, first with
>> > N = 1, just to ensure that you have described your app's command line(s)
>> > correctly to Swift and that the app is getting invoked and returning output
>> > correctly.
>> >
>> > - Then, with help from Ketan as needed, start scaling up to
>> > increasingly larger runs.
>> >
>> > I'll try to stay close in the loop and help out as needed.
>> >
>> > Do you have any questions I can answer to get started? If you are at
>> > Argonne and available today, perhaps I can join you and Ketan in an
>> > introductory meeting. I'm free from 3 to 4:40 today or after 5:30.
>> > Otherwise, please do this at your joint convenience.
>> >
>> > Regards,
>> >
>> > - Mike
>> >
>> >
>> > ----- Original Message -----
>> >> From: "Fangfang Xia"
>> >> To: "Michael Wilde"
>> >> Sent: Monday, October 31, 2011 12:44:23 PM
>> >> Subject: Re: How is install/test of Model SEED on Beagle going?
>> >> Hi Mike,
>> >>
>> >> We got two types of flux balance analysis to run on Beagle. I was
>> >> wondering if we should test them with Swift to see if things scale.
>> >> Both operations take about 40 seconds to run on sandbox. Ideally we
>> >> should also test two more expensive computations, "fba single knockouts"
>> >> and "gapfilling", but I won't be able to resolve the problems with
>> >> those until I meet with Chris this week.
>> >>
>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh
>> >>
>> >> fbacheckgrowth -model iJR904.16242
>> >> fbafva -model iJR904.16242
>> >>
>> >> You can find the descriptions of these tools at:
>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED
>> >>
>> >> I've been switching between PrgEnv-pgi and gcc to get the Perl modules and
>> >> mfatoolkit to compile. And I still seem to be getting the cc1plus
>> >> error with gcc, which you don't have. So if this version doesn't work
>> >> well on multiple processors, I'll need your help with recompiling my
>> >> updated mfatoolkit in
>> >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit.
>> >>
>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case
>> >> you need to test something there.
>> >>
>> >> Thanks,
>> >> Fangfang
>> >>
>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote:
>> >>
>> >>> Hi Fangfang,
>> >>>
>> >>> I was able to build that directory using the gcc module; I pasted the
>> >>> make output below. It gave many warnings, but I did not get the
>> >>> cc1plus libmpc.so error that you encountered.
>> >>>
>> >>> My build is in $HOME/wilde/mfatoolkit
>> >>>
>> >>> I ran this on sandbox.beagle.ci.uchicago.edu.
>> >>> >> >>> - Mike >> >>> >> >>> ---- make output: >> >>> >> >>> sandbox$ make >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >> >>> Data*)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >> >>> used in arithmetic >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >> >>> used in arithmetic >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >> >>> bool, std::string&, bool)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> 
std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >> >>> OptimizationParameter*)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >> >>> std::string)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >> >>> converting to non-pointer type 'int' from NULL >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ 
-DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >> >>> Species::AddpKab(std::string, bool)': >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> 
>>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >> >>> multi-character character constant >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ 
-DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM 
-DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> >> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >> >>> /home/wilde/mfatoolkit/Source/driver.o >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >> >>> /home/wilde/mfatoolkit/Source/Species.o >> >>> /home/wilde/mfatoolkit/Source/Data.o >> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >> >>> /home/wilde/mfatoolkit/Source/Identity.o >> >>> 
/home/wilde/mfatoolkit/Source/Reaction.o >> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >> >>> /home/wilde/mfatoolkit/Source/AtomType.o >> >>> /home/wilde/mfatoolkit/Source/Gene.o >> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >> >>> /home/wilde/mfatoolkit/Source/stringDB.o >> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >> >>> >> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >> >>> -lcplex -lm -lpthread -lz >> >>> sandbox$ >> >>> >> >>> >> >>> ----- Original Message ----- >> >>>> From: "Fangfang Xia" >> >>>> To: "Michael Wilde" >> >>>> Cc: "Scott Devoid" >> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >> >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >> >>>> Hi Mike, >> >>>> >> >>>> This is very helpful. Thanks for pointing out the difference >> >>>> between >> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >> >>>> compile our core c++ code. >> >>>> >> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >> >>>> error while loading shared libraries: libmpc.so.2: cannot open >> >>>> shared >> >>>> object file: No such file or directory >> >>>> >> >>>> It looks like something is wrong with cc1plus. I suppose it's part >> >>>> of >> >>>> the g++? I don't know what it does. >> >>>> >> >>>> So we resolved the perl dependency issues, and we were able to >> >>>> compile >> >>>> the code with the default PrgEnv-pgi just for testing purposes. It >> >>>> seems we still have some issues with our new pipeline code. But I >> >>>> don't think we are very far from giving you a running example. 
>> >>>> >> >>>> Just in case you could help us with the gcc compilation issue, I >> >>>> have >> >>>> 777'ed my directory and here are the steps to compile the core C++ >> >>>> code: >> >>>> >> >>>> source >> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >> >>>> cd >> >>>> >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >> >>>> make >> >>>> >> >>>> Thanks, >> >>>> Fangfang >> >>>> >> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >> >>>> >> >>>>> Sounds great, thanks for the update, Fangfang. >> >>>>> >> >>>>> One question: what compiler are you using? >> >>>>> >> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). That's >> >>>>> because the gcc module will create code that we can run in >> >>>>> parallel, >> >>>>> with multiple programs per compute node. The PrgEnv modules >> >>>>> all create code that expects to run only one program per node >> >>>>> (because it's meant for MPI, OpenMP, etc.). >> >>>>> >> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >> >>>>> and gfortran) may be more like the traditional Linux gcc than >> >>>>> PrgEnv-gnu. >> >>>>> >> >>>>> The default PrgEnv (at least for me) is pgi. So before I build >> >>>>> software I do: >> >>>>> >> >>>>> module unload PrgEnv-pgi >> >>>>> module load gcc >> >>>>> >> >>>>> Let me know if I can help; if you want, I can try to build you a >> >>>>> libxml2 using gcc. >> >>>>> Same for Perl if multiple copies need to run per node >> >>>>> in >> >>>>> parallel. >> >>>>> >> >>>>> We can discuss more next week, and I'll be working off and on this >> >>>>> weekend. 
>> >>>>> >> >>>>> Regards, >> >>>>> >> >>>>> - Mike >> >>>>> >> >>>>> >> >>>>> ----- Original Message ----- >> >>>>>> From: "Fangfang Xia" >> >>>>>> To: "Michael Wilde" >> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >> >>>>>> >> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >> >>>>>> Hi Mike, >> >>>>>> >> >>>>>> We encountered some dependency issues while attempting to install >> >>>>>> some >> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >> >>>>>> systems >> >>>>>> folks to help install libxml2. I'm also looking into ways to >> >>>>>> install >> >>>>>> it in a user directory. I get the feeling that things should be >> >>>>>> resolved after our group meeting on Monday. So we'll keep you >> >>>>>> posted. >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Fangfang >> >>>>>> >> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >> >>>>>> >> >>>>>>> Hi Fangfang, Scott, >> >>>>>>> >> >>>>>>> Any progress - can I try it soon? >> >>>>>>> >> >>>>>>> Or, any problems that I can help with? I'm at Argonne today >> >>>>>>> (5141) >> >>>>>>> if >> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
>> >>>>>>> >> >>>>>>> Regards, >> >>>>>>> >> >>>>>>> - Mike >> >>>>>>> >> >>>>>>> >> >>>>>>> -- >> >>>>>>> Michael Wilde >> >>>>>>> Computation Institute, University of Chicago >> >>>>>>> Mathematics and Computer Science Division >> >>>>>>> Argonne National Laboratory >> >>>>>>> >> >>>>> >> >>>>> -- >> >>>>> Michael Wilde >> >>>>> Computation Institute, University of Chicago >> >>>>> Mathematics and Computer Science Division >> >>>>> Argonne National Laboratory >> >>>>> >> >>> >> >>> -- >> >>> Michael Wilde >> >>> Computation Institute, University of Chicago >> >>> Mathematics and Computer Science Division >> >>> Argonne National Laboratory >> >>> >> > >> > -- >> > Michael Wilde >> > Computation Institute, University of Chicago >> > Mathematics and Computer Science Division >> > Argonne National Laboratory >> > >> >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > > > -- > Ketan > > > > > -- Ketan From fangfang.xia at gmail.com Sat Nov 12 14:07:15 2011 From: fangfang.xia at gmail.com (Fangfang Xia) Date: Sat, 12 Nov 2011 14:07:15 -0600 Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> Message-ID: <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> Hi Ketan, Thanks for getting back to me so promptly. I have attached the log file, and here's the content of sites.xml: CI-DEB000002 24:cray:pack 24 1000 1 1 1 .63 10000 /lustre/beagle/fangfang/swift-lab/swift.workdir There's no error message on the command line. 
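[Editorial note] The log line quoted in this thread, "Illegal value for ppn. Must be an integer.", suggests that the value 24:cray:pack from the sites file was rejected because it is not a plain integer. The sketch below only illustrates that kind of check. It is not Swift's actual validation code, and the function name is_valid_ppn is hypothetical.

```python
def is_valid_ppn(value: str) -> bool:
    """Hypothetical check: accept only a plain positive ASCII integer string."""
    text = value.strip()
    # isascii() + isdigit() rejects signs, colons, empty strings,
    # and non-ASCII digit characters.
    return text.isascii() and text.isdigit() and int(text) > 0


if __name__ == "__main__":
    # Values taken from the sites.xml content quoted in this thread.
    for candidate in ("24", "24:cray:pack"):
        verdict = "accepted" if is_valid_ppn(candidate) else "rejected"
        print(f"ppn={candidate!r}: {verdict}")
```

Under this assumed rule, a bare 24 passes while the colon-separated form does not, which is consistent with the error reported in the log.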
On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: > Hello Fangfang, > > The log file does not seem to be attached. Could you attach it, please? > > From this line: > Illegal value for ppn. Must be an integer. > > It looks like the sites file is not configured correctly for the PBS provider. Could you post your sites.xml? > > Were there any error messages on the command line? > > Regards, > Ketan > > On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote: > Can the first person who has time try to address the problem below? > I'm about to head to SC. > > Thanks, > > - Mike > > > ----- Forwarded Message ----- > From: "Fangfang Xia" > To: "Michael Wilde" > Cc: "Ketan Maheshwari" , "Scott Devoid" > Sent: Saturday, November 12, 2011 1:27:29 PM > Subject: Swift question > > Hi Mike and Ketan, > > Thanks for the guide. I tried to follow the "cat" example, and got the following error: > > 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit > 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107 > 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer. > 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0 > > I looked at the PBS script and somehow it's blank. I have attached the full log file. Could you please take a look and let me know how to proceed? > > Thanks, > > Fangfang > > On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: > > > Hi Fangfang, Scott, > > > > Sorry for the late reply! 
I think the best roadmap to follow is this: > > > > - try running the sample tutorial Swift script on Beagle using the instructions posted at: > > > > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle > > > > This tiny tutorial contains a simple Swift script that does N "cat" commands in parallel to "process" an input file and create an output file. It contains all the related config files you need to run on Beagle, and is thus a good "Hello World" application. You can then copy catsn.swift to create the first Swift script to run your actual applications. > > > > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th floor near my office, 5141. Ketan can help answer any questions you have, and will be your personal contact to help you make good use of Beagle. > > > > - then do your first Model-SEED script based on catsn.swift, first with N = 1 to just ensure that you have described your app's command line(s) correctly to Swift and that the app is getting invoked and returning output correctly. > > > > - then, with help from Ketan as needed, start scaling up to increasingly larger runs. > > > > I'll try to stay close in the loop and help out as needed. > > > > Do you have any questions I can answer to get started? If you are at Argonne and available today, perhaps I can join you and Ketan in an introductory meeting. I'm free from 3 to 4:40 today or after 5:30. Otherwise, please do this at your joint convenience. > > > > Regards, > > > > - Mike > > > > > > > > > > ----- Original Message ----- > >> From: "Fangfang Xia" > >> To: "Michael Wilde" > >> Sent: Monday, October 31, 2011 12:44:23 PM > >> Subject: Re: How is install/test of Model SEED on Beagle going? > >> Hi Mike, > >> > >> We got two types of flux balance analysis to run on Beagle. I was > >> wondering if we should test them with Swift to see if things scale. 
> >> Both operations take about 40 seconds to run on sandbox. Ideally we > >> should also test two more expensive computations, "fba single knockouts" > >> and "gapfilling", but I won't be able to resolve the problems with > >> those until I meet with Chris this week. > >> > >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh > >> > >> fbacheckgrowth -model iJR904.16242 > >> fbafva -model iJR904.16242 > >> > >> You can find the descriptions of these tools at: > >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED > >> > >> I've been switching between PrgEnv-pgi/gcc to get Perl modules and > >> mfatoolkit to compile. And I still seem to be getting the cc1plus > >> error with gcc, which you don't have. So if this version doesn't work > >> well on multiple processors, I'll need your help with recompiling my > >> updated mfatoolkit in > >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. > >> > >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case > >> you need to test something there. > >> > >> Thanks, > >> Fangfang > >> > >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: > >> > >>> Hi Fangfang, > >>> > >>> I was able to build that directory using the gcc module; I pasted the > >>> make output below. It gave many warnings, but I did not get the > >>> cc1plus libmpc.so error that you encountered. > >>> > >>> My build is in $HOME/wilde/mfatoolkit > >>> > >>> I ran this on sandbox.beagle.ci.uchicago.edu. 
> >>> > >>> - Mike > >>> > >>> ---- make output: > >>> > >>> sandbox$ make > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, > >>> Data*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, > >>> bool, std::string&, bool)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> 
/home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::IdentifyReactionLoops(Data*, > >>> OptimizationParameter*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, > >>> std::string)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: > >>> converting to non-pointer type 'int' from NULL > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void > >>> Species::AddpKab(std::string, bool)': > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> 
/home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: > >>> multi-character character constant > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit > >>> /home/wilde/mfatoolkit/Source/driver.o > >>> /home/wilde/mfatoolkit/Source/MFAProblem.o > >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o > >>> /home/wilde/mfatoolkit/Source/SCIPapi.o > >>> /home/wilde/mfatoolkit/Source/GLPKapi.o > >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o > >>> /home/wilde/mfatoolkit/Source/SolverInterface.o > >>> /home/wilde/mfatoolkit/Source/Species.o > >>> /home/wilde/mfatoolkit/Source/Data.o > >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o > >>> /home/wilde/mfatoolkit/Source/Identity.o > >>> /home/wilde/mfatoolkit/Source/Reaction.o > >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomCPP.o > >>> 
/home/wilde/mfatoolkit/Source/UtilityFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomType.o > >>> /home/wilde/mfatoolkit/Source/Gene.o > >>> /home/wilde/mfatoolkit/Source/GeneInterval.o > >>> /home/wilde/mfatoolkit/Source/stringDB.o > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic > >>> -lcplex -lm -lpthread -lz > >>> sandbox$ > >>> > >>> > >>> ----- Original Message ----- > >>>> From: "Fangfang Xia" > >>>> To: "Michael Wilde" > >>>> Cc: "Scott Devoid" > >>>> Sent: Monday, October 24, 2011 5:20:20 PM > >>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>> Hi Mike, > >>>> > >>>> This is very helpful. Thanks for pointing out the difference > >>>> between > >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to > >>>> compile our core c++ code. > >>>> > >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: > >>>> error while loading shared libraries: libmpc.so.2: cannot open > >>>> shared > >>>> object file: No such file or directory > >>>> > >>>> It looks like something is wrong with cc1plus. I suppose it's part > >>>> of > >>>> the g++? I don't know what it does. > >>>> > >>>> So we resolved the perl dependency issues, and we were able to > >>>> compile > >>>> the code with the default PrgEnv-pgi just for testing purposes. It > >>>> seems we still have some issues with our new pipeline code. But I > >>>> don't think we are very far from giving you a running example. 
> >>>> > >>>> Just in case you could help us with the gcc compilation issue, I > >>>> have > >>>> 777'ed my directory and here are the steps to compile the core C++ > >>>> code: > >>>> > >>>> source > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh > >>>> cd > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux > >>>> make > >>>> > >>>> Thanks, > >>>> Fangfang > >>>> > >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: > >>>> > >>>>> Sounds great, thanks for the update, Fangfang. > >>>>> > >>>>> One question: what compiler are you using? > >>>>> > >>>>> I'd like to suggest, for the first pass, that you use the "gcc" > >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). That's > >>>>> because the gcc module will create code that we can run in > >>>>> parallel, > >>>>> with multiple programs per compute node. The PrgEnv modules > >>>>> all create code that expects to run only one program per node > >>>>> (because it's meant for MPI, OpenMP, etc.). > >>>>> > >>>>> Also, I think that the gcc module (which I think includes gcc, g++ > >>>>> and gfortran) may be more like the traditional Linux gcc than > >>>>> PrgEnv-gnu. > >>>>> > >>>>> The default PrgEnv (at least for me) is pgi. So before I build > >>>>> software I do: > >>>>> > >>>>> module unload PrgEnv-pgi > >>>>> module load gcc > >>>>> > >>>>> Let me know if I can help; if you want, I can try to build you a > >>>>> libxml2 using gcc. > >>>>> Same for Perl if multiple copies need to run per node > >>>>> in > >>>>> parallel. > >>>>> > >>>>> We can discuss more next week, and I'll be working off and on this > >>>>> weekend. 
> >>>>> > >>>>> Regards, > >>>>> > >>>>> - Mike > >>>>> > >>>>> > >>>>> ----- Original Message ----- > >>>>>> From: "Fangfang Xia" > >>>>>> To: "Michael Wilde" > >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" > >>>>>> > >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM > >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>>>> Hi Mike, > >>>>>> > >>>>>> We encountered some dependency issues while attempting to install > >>>>>> some > >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle > >>>>>> systems > >>>>>> folks to help install libxml2. I'm also looking into ways to > >>>>>> install > >>>>>> it in a user directory. I get the feeling that things should be > >>>>>> resolved after our group meeting on Monday. So we'll keep you > >>>>>> posted. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Fangfang > >>>>>> > >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: > >>>>>> > >>>>>>> Hi Fangfang, Scott, > >>>>>>> > >>>>>>> Any progress - can I try it soon? > >>>>>>> > >>>>>>> Or, any problems that I can help with? I'm at Argonne today > >>>>>>> (5141) > >>>>>>> if > >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
> >>>>>>> > >>>>>>> Regards, > >>>>>>> > >>>>>>> - Mike > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> Michael Wilde > >>>>>>> Computation Institute, University of Chicago > >>>>>>> Mathematics and Computer Science Division > >>>>>>> Argonne National Laboratory > >>>>>>> > >>>>> > >>>>> -- > >>>>> Michael Wilde > >>>>> Computation Institute, University of Chicago > >>>>> Mathematics and Computer Science Division > >>>>> Argonne National Laboratory > >>>>> > >>> > >>> -- > >>> Michael Wilde > >>> Computation Institute, University of Chicago > >>> Mathematics and Computer Science Division > >>> Argonne National Laboratory > >>> > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > Ketan > > -------------- next part -------------- A non-text attachment was scrubbed... Name: catsn-20111112-1921-u149j6i5.log Type: application/octet-stream Size: 17084 bytes Desc: not available From wilde at mcs.anl.gov Sat Nov 12 14:29:11 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 12 Nov 2011 14:29:11 -0600 (CST) Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: Message-ID: <268940120.23686.1321129751936.JavaMail.root@zimbra.anl.gov> Ketan, David, I assume that Fangfang got this from the example that the siteguide entry for beagle points to. Ketan, please make sure that this is fixed. 
David, can you make sure that the Beagle example from the siteguide is added to the test suite? Further, the example should be inlined in the site guide, not point to anyone's home directory. Thanks, - Mike ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Fangfang Xia" > Cc: "Swift Devel" > Sent: Saturday, November 12, 2011 2:21:18 PM > Subject: Re: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question > Hello Fangfang, > > > Could you replace the following line: > 24:cray:pack > > > with this one: > key="ppn">pbs.aprun;pbs.mpp;depth=24 > > > in your sites.xml? > > > The line you have is an obsolete form from the 0.92 version of Swift. > > > It should work now. > > Regards, > Ketan > > > > > On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia < fangfang.xia at gmail.com > > wrote: > > > > > Hi Ketan, > > > Thanks for getting back to me so promptly. I have attached the log > file, and here's the content of sites.xml: > > > > > > > CI-DEB000002 > > > 24:cray:pack > > > 24 > 1000 > 1 > 1 > 1 > > > .63 > 10000 > > > > >/lustre/beagle/fangfang/swift-lab/swift.workdir > > > > > There's no error message on the command line. > > > > > > > > On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: > > > Hello Fangfang, > > > The log file does not seem to be attached. Could you attach it, please? > > > From this line: > Illegal value for ppn. Must be an integer. > > > It looks like the sites file is not configured correctly for the PBS provider. > Could you post your sites.xml? > > > Were there any error messages on the command line? > > > Regards, > Ketan > > > On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde < wilde at mcs.anl.gov > > wrote: > > > Can the first person who has time try to address the problem below? > I'm about to head to SC. 
> > Thanks, > > - Mike > > > ----- Forwarded Message ----- > From: "Fangfang Xia" < fangfang.xia at gmail.com > > To: "Michael Wilde" < wilde at mcs.anl.gov > > Cc: "Ketan Maheshwari" < ketan at mcs.anl.gov >, "Scott Devoid" < > devoid at ci.uchicago.edu > > Sent: Saturday, November 12, 2011 1:27:29 PM > Subject: Swift question > > Hi Mike and Ketan, > > Thanks for the guide. I tried to follow the "cat" example, and got the > following error: > > 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script > to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit > 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: > Block-1112-210706-000000 is: Block-1112-2107 > 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting > block task: Cannot submit job: Illegal value for ppn. Must be an > integer. > 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: > 0 > > I looked at the PBS script and somehow it's blank. I have attached the > full log file. Could you please take a look and let me know how to > proceed? > > Thanks, > > Fangfang > > On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: > > > Hi Fangfang, Scott, > > > > Sorry for the late reply! I think the best roadmap to follow is > > this: > > > > - try running the sample tutorial Swift script on Beagle using the > > instructions posted at: > > > > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle > > > > This tiny tutorial contains a simple Swift script that does N "cat" > > commands in parallel to "process" an input file and create an output > > file. It contains all the related config files you need to run on > > Beagle, and is thus a good "Hello World" application. You can then > > copy catsn.swift to create the first Swift script to run your actual > > applications. > > > > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle > > Catalyst for Swift applications. 
Ketan is based here at Argonne, on > > the 5th floor near my office, 5141. Ketan can help answer any > > questions you have, and will be your personal contact to help you > > make good use of Beagle. > > > > - then do your first Model-SEED script based on catsn.swift, first > > with N = 1 to just ensure that you have described your app's command > > line(s) correctly to Swift and that the app is getting invoked and > > returning output correctly. > > > > - then, with help from Ketan as needed, start scaling up to > > increasingly larger runs. > > > > I'll try to stay close in the loop and help out as needed. > > > > Do you have any questions I can answer to get started? If you are at > > Argonne and available today, perhaps I can join you and Ketan in an > > introductory meeting. I'm free from 3 to 4:40 today or after 5:30. > > Otherwise, please do this at your joint convenience. > > > > Regards, > > > > - Mike > > > > > > > > > > ----- Original Message ----- > >> From: "Fangfang Xia" < fangfang.xia at gmail.com > > >> To: "Michael Wilde" < wilde at mcs.anl.gov > > >> Sent: Monday, October 31, 2011 12:44:23 PM > >> Subject: Re: How is install/test of Model SEED on Beagle going? > >> Hi Mike, > >> > >> We got two types of flux balance analysis to run on beagle. I was > >> wondering if we should test them with Swift to see if things scale. > >> Both operations take about 40 seconds to run on sandbox. Ideally we > >> should also test two more expensive computations, "fba single > >> knockouts" > >> and "gapfilling", but I won't be able to resolve the problems with > >> those until I meet with Chris this week.
> >> > >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh > >> > >> fbacheckgrowth -model iJR904.16242 > >> fbafva -model iJR904.16242 > >> > >> You can find the descriptions of these tools at: > >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED > >> > >> I've been switching between PrgEnv-pgi/gcc to get perl modules and > >> mfatoolkit to compile. And I still seem to be getting the cc1plus > >> error with gcc which you don't have. So if this version doesn't > >> work > >> well on multiple processors, I'll need your help with recompiling > >> my > >> updated mfatoolkit in > >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. > >> > >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in > >> case > >> you need to test something there. > >> > >> Thanks, > >> Fangfang > >> > >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: > >> > >>> Hi Fangfang, > >>> > >>> I was able to build that directory using the gcc module; I past > >>> the > >>> make output below. It gave many warnings, but I did not get the > >>> cc1plus libmpc.so error that you encountered. > >>> > >>> My build is in $HOME/wilde/mfatoolkit > >>> > >>> I ran this on sandbox.beagle.ci.uchicago.edu . 
> >>> > >>> - Mike > >>> > >>> ---- make output: > >>> > >>> sandbox$ make > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, > >>> Data*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: > >>> NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: > >>> NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, > >>> OptimizationParameter*, > >>> bool, std::string&, bool)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, 
_Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::IdentifyReactionLoops(Data*, > >>> OptimizationParameter*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, > >>> std::string)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: > >>> converting to non-pointer type 'int' from NULL > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function > >>> 'void > >>> Species::AddpKab(std::string, bool)': > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp 
= int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: > >>> multi-character character constant > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit > >>> /home/wilde/mfatoolkit/Source/driver.o > >>> /home/wilde/mfatoolkit/Source/MFAProblem.o > >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o > >>> /home/wilde/mfatoolkit/Source/SCIPapi.o > >>> /home/wilde/mfatoolkit/Source/GLPKapi.o > >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o > >>> /home/wilde/mfatoolkit/Source/SolverInterface.o > >>> /home/wilde/mfatoolkit/Source/Species.o > >>> /home/wilde/mfatoolkit/Source/Data.o > >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o > >>> /home/wilde/mfatoolkit/Source/Identity.o > >>> /home/wilde/mfatoolkit/Source/Reaction.o > >>> 
/home/wilde/mfatoolkit/Source/GlobalFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomCPP.o > >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomType.o > >>> /home/wilde/mfatoolkit/Source/Gene.o > >>> /home/wilde/mfatoolkit/Source/GeneInterval.o > >>> /home/wilde/mfatoolkit/Source/stringDB.o > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic > >>> -lcplex -lm -lpthread -lz > >>> sandbox$ > >>> > >>> > >>> ----- Original Message ----- > >>>> From: "Fangfang Xia" < fangfang.xia at gmail.com > > >>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>>> Cc: "Scott Devoid" < devoid at ci.uchicago.edu > > >>>> Sent: Monday, October 24, 2011 5:20:20 PM > >>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>> Hi Mike, > >>>> > >>>> This is very helpful. Thanks for pointing out the difference > >>>> between > >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to > >>>> compile our core c++ code. > >>>> > >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: > >>>> error while loading shared libraries: libmpc.so.2: cannot open > >>>> shared > >>>> object file: No such file or directory > >>>> > >>>> It looks like something is wrong with cc1plus. I suppose it's > >>>> part > >>>> of > >>>> the g++? I don't know what it does. > >>>> > >>>> So we resolved the perl dependency issues, and we were able to > >>>> compile > >>>> the code with the default PrgEnv-pgi just for testing purposes. > >>>> It > >>>> seems we still have some issues with our new pipeline code. But I > >>>> don't think we are very far from giving you a running example. 
> >>>> > >>>> Just in case you could help us with the gcc compilation issue, I > >>>> have > >>>> 777'ed my directory and here's the steps to compile the core C++ > >>>> code: > >>>> > >>>> source > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh > >>>> cd > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux > >>>> make > >>>> > >>>> Thanks, > >>>> Fangfang > >>>> > >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: > >>>> > >>>>> Sounds great, thanks for the update, Fangfang. > >>>>> > >>>>> One question: what compiler are you using? > >>>>> > >>>>> I'd like to suggest, for the first pass, that you use the "gcc" > >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats > >>>>> because the GCC module will create code that we can run in > >>>>> parallel, > >>>>> multiple programs in parallel per compute node. The PrgEnv > >>>>> modules > >>>>> all create code that expects to run only one program per node, > >>>>> because its meant for MPI, OpenMP, etc). > >>>>> > >>>>> Also, I think that the gcc module (which I think includes gcc, > >>>>> g++ > >>>>> and gfortran) may be more like the traditional Linux gcc than > >>>>> PrgEnv-gnu. > >>>>> > >>>>> The default PrgEnv (at least for me) is pgi. So before i build > >>>>> software I do: > >>>>> > >>>>> module unload PrgEnv-pgi > >>>>> module load gcc > >>>>> > >>>>> Let me know if I can help; if you want i can try to build you a > >>>>> libxml2 using gcc. > >>>>> Same for Perl if it needs to be executed multiple copies per > >>>>> node > >>>>> in > >>>>> parallel. > >>>>> > >>>>> We can discuss more next week, and I'll be working off and on > >>>>> this > >>>>> weekend. 
> >>>>> > >>>>> Regards, > >>>>> > >>>>> - Mike > >>>>> > >>>>> > >>>>> ----- Original Message ----- > >>>>>> From: "Fangfang Xia" < fangfang.xia at gmail.com > > >>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>>>>> Cc: "Fangfang Xia" < fangfang at uchicago.edu >, "Scott Devoid" > >>>>>> < devoid at ci.uchicago.edu > > >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM > >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>>>> Hi Mike, > >>>>>> > >>>>>> We encountered some dependency issues while attempting to > >>>>>> install > >>>>>> some > >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle > >>>>>> systems > >>>>>> folks to help install libxml2. I'm also looking into ways to > >>>>>> install > >>>>>> it in a user directory. I get the feeling that things should be > >>>>>> resolved after our group meeting on Monday. So we'll keep you > >>>>>> posted. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Fangfang > >>>>>> > >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: > >>>>>> > >>>>>>> Hi Fangfang, Scott, > >>>>>>> > >>>>>>> Any progress - can I try it soon? > >>>>>>> > >>>>>>> Or, any problems that I can help with? Im at Argonne today > >>>>>>> (5141) > >>>>>>> if > >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
> >>>>>>> > >>>>>>> Regards, > >>>>>>> > >>>>>>> - Mike > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> Michael Wilde > >>>>>>> Computation Institute, University of Chicago > >>>>>>> Mathematics and Computer Science Division > >>>>>>> Argonne National Laboratory > >>>>>>> > >>>>> > >>>>> -- > >>>>> Michael Wilde > >>>>> Computation Institute, University of Chicago > >>>>> Mathematics and Computer Science Division > >>>>> Argonne National Laboratory > >>>>> > >>> > >>> -- > >>> Michael Wilde > >>> Computation Institute, University of Chicago > >>> Mathematics and Computer Science Division > >>> Argonne National Laboratory > >>> > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > -- > Ketan > > > > > > > > > -- > Ketan > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From fangfang.xia at gmail.com Sat Nov 12 14:30:28 2011 From: fangfang.xia at gmail.com (Fangfang Xia) Date: Sat, 12 Nov 2011 14:30:28 -0600 Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> Message-ID: <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> Thanks. 
The "Illegal value for ppn" line seems to persist in the log. On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote: > Hello Fangfang, > > Could you replace the following line: > 24:cray:pack > > with this one: > pbs.aprun;pbs.mpp;depth=24 > > in your sites.xml. > > The line you have is an obsolete form from the 0.92 version of Swift. > > It should work now. > > Regards, > Ketan > > > On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote: > Hi Ketan, > > Thanks for getting back to me so promptly. I have attached the log file, and here's the content of sites.xml: > > > > > CI-DEB000002 > > 24:cray:pack > > 24 > 1000 > 1 > 1 > 1 > > .63 > 10000 > > > /lustre/beagle/fangfang/swift-lab/swift.workdir > > > > There's no error message on the command line. > > > > On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: > >> Hello Fangfang, >> >> The log file does not seem to be found. Could you attach it please. >> >> From this line: >> Illegal value for ppn. Must be an integer. >> >> Looks like the sites file is not configured well for the pbs provider. Could you post your sites.xml. >> >> Were there any error messages on the command line? >> >> Regards, >> Ketan >> >> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote: >> Can the first person who has time try to address the problem below? >> I'm about to head to SC. >> >> Thanks, >> >> - Mike >> >> >> ----- Forwarded Message ----- >> From: "Fangfang Xia" >> To: "Michael Wilde" >> Cc: "Ketan Maheshwari" , "Scott Devoid" >> Sent: Saturday, November 12, 2011 1:27:29 PM >> Subject: Swift question >> >> Hi Mike and Ketan, >> >> Thanks for the guide.
I tried to follow the "cat" example, and got the following error: >> >> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit >> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107 >> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer. >> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0 >> >> I looked at the PBS script and somehow it's blank. I have attached the full log file. Could you please take a look and let me know how to proceed? >> >> Thanks, >> >> Fangfang >> >> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: >> >> > Hi Fangfang, Scott, >> > >> > Sorry for the late reply! I think the best roadmap to follow is this: >> > >> > - try running the sample tutorial Swift script on Beagle using the instructions posted at: >> > >> > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle >> > >> > This tiny tutorial contains a simple Swift script that does N "cat" commands in parallel to "process" an input file and create an output file. It contains all the related config files you need to run on Beagle, and is thus a good "Hello World" application. You can then copy catsn.swift to create the first Swift script to run your actual applications. >> > >> > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th floor near my office, 5141. Ketan can help answer any questions you have, and will be your personal contact to help you make good use of Beagle. >> > >> > - then do your first Model-SEED script based on catsn.swift, first with N = 1 to just ensure that you have described your app's command line(s) correctly to Swift and that the app is getting invoked and returning output correctly.
>> > >> > - then, with help form Ketan as needed, start scaling up to increasingly larger runs. >> > >> > I'll try to stay close in the loop and help out as needed. >> > >> > Do you have any questions I can answer to get started? If you are at Argonne and available today, perhaps I can join you and Ketan in an introductory meeting. Im free from 3 to 4:40 today or after 5:30. Otherwise, pelase do this at your joint conveniences. >> > >> > Regards, >> > >> > - Mike >> > >> > >> > >> > >> > ----- Original Message ----- >> >> From: "Fangfang Xia" >> >> To: "Michael Wilde" >> >> Sent: Monday, October 31, 2011 12:44:23 PM >> >> Subject: Re: How is install/test of Model SEED on Beagle going? >> >> Hi Mike, >> >> >> >> We got two types of flux balance analysis to run on beagle. I was >> >> wondering if we should test them with Swift to see if things scale. >> >> Both operations take about 40 seconds to run on sandbox. Ideally we >> >> should also test two more expensive computation "fba single knockouts" >> >> and "gapfilling", but I won't be able to resolve the problems with >> >> those until I meet with Chris this week. >> >> >> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh >> >> >> >> fbacheckgrowth -model iJR904.16242 >> >> fbafva -model iJR904.16242 >> >> >> >> You can find the descriptions of these tools at: >> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED >> >> >> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and >> >> mfatoolkit to compile. And I still seem to be getting the cc1plus >> >> error with gcc which you don't have. So if this version doesn't work >> >> well on multiple processors, I'll need your help with recompiling my >> >> updated mfatoolkit in >> >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >> >> >> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case >> >> you need to test something there. 
>> >> >> >> Thanks, >> >> Fangfang >> >> >> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >> >> >> >>> Hi Fangfang, >> >>> >> >>> I was able to build that directory using the gcc module; I past the >> >>> make output below. It gave many warnings, but I did not get the >> >>> cc1plus libmpc.so error that you encountered. >> >>> >> >>> My build is in $HOME/wilde/mfatoolkit >> >>> >> >>> I ran this on sandbox.beagle.ci.uchicago.edu. >> >>> >> >>> - Mike >> >>> >> >>> ---- make output: >> >>> >> >>> sandbox$ make >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >> >>> Data*)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >> >>> used in arithmetic >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >> >>> used in arithmetic >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >> >>> bool, std::string&, bool)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 
'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >> >>> OptimizationParameter*)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >> >>> converting 'false' to pointer type for argument 1 of >> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >> >>> std::char_traits, _Alloc = std::allocator]' >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >> >>> std::string)': >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >> >>> converting to non-pointer type 'int' from NULL >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG 
-DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >> >>> 
Species::AddpKab(std::string, bool)': >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >> >>> std::allocator, value_type = int]' >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >> >>> multi-character character constant >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD 
-DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> 
-I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >> >>> /home/wilde/mfatoolkit/Source >> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >> >>> /home/wilde/mfatoolkit/Source/driver.o >> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >> >>> 
/home/wilde/mfatoolkit/Source/SCIPapi.o >> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >> >>> /home/wilde/mfatoolkit/Source/Species.o >> >>> /home/wilde/mfatoolkit/Source/Data.o >> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >> >>> /home/wilde/mfatoolkit/Source/Identity.o >> >>> /home/wilde/mfatoolkit/Source/Reaction.o >> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >> >>> /home/wilde/mfatoolkit/Source/AtomType.o >> >>> /home/wilde/mfatoolkit/Source/Gene.o >> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >> >>> /home/wilde/mfatoolkit/Source/stringDB.o >> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >> >>> -lcplex -lm -lpthread -lz >> >>> sandbox$ >> >>> >> >>> >> >>> ----- Original Message ----- >> >>>> From: "Fangfang Xia" >> >>>> To: "Michael Wilde" >> >>>> Cc: "Scott Devoid" >> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >> >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >> >>>> Hi Mike, >> >>>> >> >>>> This is very helpful. Thanks for pointing out the difference >> >>>> between >> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >> >>>> compile our core c++ code. >> >>>> >> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >> >>>> error while loading shared libraries: libmpc.so.2: cannot open >> >>>> shared >> >>>> object file: No such file or directory >> >>>> >> >>>> It looks like something is wrong with cc1plus. I suppose it's part >> >>>> of >> >>>> the g++? I don't know what it does. 
>> >>>> >> >>>> So we resolved the perl dependency issues, and we were able to >> >>>> compile >> >>>> the code with the default PrgEnv-pgi just for testing purposes. It >> >>>> seems we still have some issues with our new pipeline code. But I >> >>>> don't think we are very far from giving you a running example. >> >>>> >> >>>> Just in case you could help us with the gcc compilation issue, I >> >>>> have >> >>>> 777'ed my directory and here are the steps to compile the core C++ >> >>>> code: >> >>>> >> >>>> source >> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >> >>>> cd >> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >> >>>> make >> >>>> >> >>>> Thanks, >> >>>> Fangfang >> >>>> >> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >> >>>> >> >>>>> Sounds great, thanks for the update, Fangfang. >> >>>>> >> >>>>> One question: what compiler are you using? >> >>>>> >> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). That's >> >>>>> because the GCC module will create code that we can run in >> >>>>> parallel, >> >>>>> multiple programs in parallel per compute node. The PrgEnv modules >> >>>>> all create code that expects to run only one program per node, >> >>>>> because it's meant for MPI, OpenMP, etc. >> >>>>> >> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >> >>>>> and gfortran) may be more like the traditional Linux gcc than >> >>>>> PrgEnv-gnu. >> >>>>> >> >>>>> The default PrgEnv (at least for me) is pgi. So before I build >> >>>>> software I do: >> >>>>> >> >>>>> module unload PrgEnv-pgi >> >>>>> module load gcc >> >>>>> >> >>>>> Let me know if I can help; if you want I can try to build you a >> >>>>> libxml2 using gcc. >> >>>>> Same for Perl, if multiple copies need to run per node >> >>>>> in >> >>>>> parallel. 
>> >>>>> >> >>>>> We can discuss more next week, and I'll be working off and on this >> >>>>> weekend. >> >>>>> >> >>>>> Regards, >> >>>>> >> >>>>> - Mike >> >>>>> >> >>>>> >> >>>>> ----- Original Message ----- >> >>>>>> From: "Fangfang Xia" >> >>>>>> To: "Michael Wilde" >> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >> >>>>>> >> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >> >>>>>> Hi Mike, >> >>>>>> >> >>>>>> We encountered some dependency issues while attempting to install >> >>>>>> some >> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >> >>>>>> systems >> >>>>>> folks to help install libxml2. I'm also looking into ways to >> >>>>>> install >> >>>>>> it in a user directory. I get the feeling that things should be >> >>>>>> resolved after our group meeting on Monday. So we'll keep you >> >>>>>> posted. >> >>>>>> >> >>>>>> Thanks, >> >>>>>> >> >>>>>> Fangfang >> >>>>>> >> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >> >>>>>> >> >>>>>>> Hi Fangfang, Scott, >> >>>>>>> >> >>>>>>> Any progress - can I try it soon? >> >>>>>>> >> >>>>>>> Or, any problems that I can help with? I'm at Argonne today >> >>>>>>> (5141) >> >>>>>>> if >> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
>> >>>>>>> >> >>>>>>> Regards, >> >>>>>>> >> >>>>>>> - Mike >> >>>>>>> >> >>>>>>> >> >>>>>>> -- >> >>>>>>> Michael Wilde >> >>>>>>> Computation Institute, University of Chicago >> >>>>>>> Mathematics and Computer Science Division >> >>>>>>> Argonne National Laboratory >> >>>>>>> >> >>>>> >> >>>>> -- >> >>>>> Michael Wilde >> >>>>> Computation Institute, University of Chicago >> >>>>> Mathematics and Computer Science Division >> >>>>> Argonne National Laboratory >> >>>>> >> >>> >> >>> -- >> >>> Michael Wilde >> >>> Computation Institute, University of Chicago >> >>> Mathematics and Computer Science Division >> >>> Argonne National Laboratory >> >>> >> > >> > -- >> > Michael Wilde >> > Computation Institute, University of Chicago >> > Mathematics and Computer Science Division >> > Argonne National Laboratory >> > >> >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> >> >> -- >> Ketan >> >> > > > > > > -- > Ketan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: catsn-20111112-2025-g4qxj66a.log Type: application/octet-stream Size: 17027 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wilde at mcs.anl.gov Sat Nov 12 14:41:25 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 12 Nov 2011 14:41:25 -0600 (CST) Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> Message-ID: <482380763.23702.1321130485233.JavaMail.root@zimbra.anl.gov> Ketan, can you send Fangfang a working example, and make sure that he's running the suggested Swift version for Beagle? Fangfang, please bear with us - the interface between Swift and Cray PBS has gone through a lot of recent changes, and it seems that the example is out of sync with the Swift version you are running. Thanks, - Mike ----- Original Message ----- > From: "Fangfang Xia" > To: "Ketan Maheshwari" > Cc: "Swift Devel" > Sent: Saturday, November 12, 2011 2:30:28 PM > Subject: Re: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question > Thanks. The "Illegal value for ppn" line seems to persist in the log. > > > > > > > > > > > On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote: > > > Hello Fangfang, > > > Could you replace the following line: > 24:cray:pack > > > with this one: > key="ppn">pbs.aprun;pbs.mpp;depth=24 > > > in your sites.xml. > > > The line you have is an obsolete form from the 0.92 version of Swift. > > > It should work now. > > Regards, > Ketan > > > > > On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia < fangfang.xia at gmail.com > > wrote: > > > > > Hi Ketan, > > > Thanks for getting back to me so promptly. I have attached the log > file, and here's the content of sites.xml: > > > > > > > CI-DEB000002 > > > 24:cray:pack > > > 24 > 1000 > 1 > 1 > 1 > > > .63 > 10000 > > > > >/lustre/beagle/fangfang/swift-lab/swift.workdir > > > > > There's no error message on the command line. > > > > > > > > On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: > > > Hello Fangfang, > > > The log file does not seem to be attached. Could you attach it, please? 
> > > From this line: > Illegal value for ppn. Must be an integer. > > > It looks like the sites file is not configured correctly for the PBS provider. > Could you post your sites.xml? > > > Were there any error messages on the command line? > > > Regards, > Ketan > > > On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde < wilde at mcs.anl.gov > > wrote: > > > Can the first person who has time try to address the problem below? > I'm about to head to SC. > > Thanks, > > - Mike > > > ----- Forwarded Message ----- > From: "Fangfang Xia" < fangfang.xia at gmail.com > > To: "Michael Wilde" < wilde at mcs.anl.gov > > Cc: "Ketan Maheshwari" < ketan at mcs.anl.gov >, "Scott Devoid" < > devoid at ci.uchicago.edu > > Sent: Saturday, November 12, 2011 1:27:29 PM > Subject: Swift question > > Hi Mike and Ketan, > > Thanks for the guide. I tried to follow the "cat" example, and got the > following error: > > 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script > to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit > 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: > Block-1112-210706-000000 is: Block-1112-2107 > 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting > block task: Cannot submit job: Illegal value for ppn. Must be an > integer. > 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: > 0 > > I looked at the PBS script and somehow it's blank. I have attached the > full log file. Could you please take a look and let me know how to > proceed? > > Thanks, > > Fangfang > > On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: > > > Hi Fangfang, Scott, > > > > Sorry for the late reply! 
I think the best roadmap to follow is > > this: > > > > - try running the sample tutorial Swift script on Beagle using the > > instructions posted at: > > > > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle > > > > This tiny tutorial contains a simple Swift script that does N "cat" > > commands in parallel to "process" an input file and create an output > > file. It contains all the related config files you need to run on > > Beagle, and is thus a good "Hello World" application. You can then > > copy catsn.swift to create the first Swift script to run your actual > > applications. > > > > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle > > Catalyst for Swift applications. Ketan is based here at Argonne, on > > the 5th floor near my office, 5141. Ketan can help answer any > > questions you have, and will be your personal contact to help you > > make good use of Beagle. > > > > - then do your first Model-SEED script based on catsn.swift, first > > with N = 1 to just ensure that you have described your app's command > > line(s) correctly to Swift and that the app is getting invoked and > > returning output correctly. > > > > - then, with help from Ketan as needed, start scaling up to > > increasingly larger runs. > > > > I'll try to stay closely in the loop and help out as needed. > > > > Do you have any questions I can answer to get started? If you are at > > Argonne and available today, perhaps I can join you and Ketan in an > > introductory meeting. I'm free from 3 to 4:40 today or after 5:30. > > Otherwise, please do this at your joint convenience. > > > > Regards, > > > > - Mike > > > > > > > > > > ----- Original Message ----- > >> From: "Fangfang Xia" < fangfang.xia at gmail.com > > >> To: "Michael Wilde" < wilde at mcs.anl.gov > > >> Sent: Monday, October 31, 2011 12:44:23 PM > >> Subject: Re: How is install/test of Model SEED on Beagle going? 
> >> Hi Mike, > >> > >> We got two types of flux balance analysis to run on Beagle. I was > >> wondering if we should test them with Swift to see if things scale. > >> Both operations take about 40 seconds to run on sandbox. Ideally we > >> should also test two more expensive computations, "fba single > >> knockouts" > >> and "gapfilling", but I won't be able to resolve the problems with > >> those until I meet with Chris this week. > >> > >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh > >> > >> fbacheckgrowth -model iJR904.16242 > >> fbafva -model iJR904.16242 > >> > >> You can find the descriptions of these tools at: > >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED > >> > >> I've been switching between PrgEnv-pgi/gcc to get perl modules and > >> mfatoolkit to compile. And I still seem to be getting the cc1plus > >> error with gcc which you don't have. So if this version doesn't > >> work > >> well on multiple processors, I'll need your help with recompiling > >> my > >> updated mfatoolkit in > >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. > >> > >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in > >> case > >> you need to test something there. > >> > >> Thanks, > >> Fangfang > >> > >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: > >> > >>> Hi Fangfang, > >>> > >>> I was able to build that directory using the gcc module; I pasted > >>> the > >>> make output below. It gave many warnings, but I did not get the > >>> cc1plus libmpc.so error that you encountered. > >>> > >>> My build is in $HOME/wilde/mfatoolkit > >>> > >>> I ran this on sandbox.beagle.ci.uchicago.edu. 
> >>> > >>> - Mike > >>> > >>> ---- make output: > >>> > >>> sandbox$ make > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, > >>> Data*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: > >>> NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: > >>> NULL > >>> used in arithmetic > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, > >>> OptimizationParameter*, > >>> bool, std::string&, bool)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, 
_Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::IdentifyReactionLoops(Data*, > >>> OptimizationParameter*)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: > >>> converting 'false' to pointer type for argument 1 of > >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const > >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = > >>> std::char_traits, _Alloc = std::allocator]' > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function > >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, > >>> std::string)': > >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: > >>> converting to non-pointer type 'int' from NULL > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function > >>> 'void > >>> Species::AddpKab(std::string, bool)': > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp 
= int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing > >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, > >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = > >>> std::allocator, value_type = int]' > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: > >>> multi-character character constant > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o > >>> /home/wilde/mfatoolkit/Source > >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD > >>> -DLINUX > >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include > >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ > >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit > >>> /home/wilde/mfatoolkit/Source/driver.o > >>> /home/wilde/mfatoolkit/Source/MFAProblem.o > >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o > >>> /home/wilde/mfatoolkit/Source/SCIPapi.o > >>> /home/wilde/mfatoolkit/Source/GLPKapi.o > >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o > >>> /home/wilde/mfatoolkit/Source/SolverInterface.o > >>> /home/wilde/mfatoolkit/Source/Species.o > >>> /home/wilde/mfatoolkit/Source/Data.o > >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o > >>> /home/wilde/mfatoolkit/Source/Identity.o > >>> /home/wilde/mfatoolkit/Source/Reaction.o > >>> 
/home/wilde/mfatoolkit/Source/GlobalFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomCPP.o > >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o > >>> /home/wilde/mfatoolkit/Source/AtomType.o > >>> /home/wilde/mfatoolkit/Source/Gene.o > >>> /home/wilde/mfatoolkit/Source/GeneInterval.o > >>> /home/wilde/mfatoolkit/Source/stringDB.o > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk > >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic > >>> -lcplex -lm -lpthread -lz > >>> sandbox$ > >>> > >>> > >>> ----- Original Message ----- > >>>> From: "Fangfang Xia" < fangfang.xia at gmail.com > > >>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>>> Cc: "Scott Devoid" < devoid at ci.uchicago.edu > > >>>> Sent: Monday, October 24, 2011 5:20:20 PM > >>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>> Hi Mike, > >>>> > >>>> This is very helpful. Thanks for pointing out the difference > >>>> between > >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to > >>>> compile our core c++ code. > >>>> > >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: > >>>> error while loading shared libraries: libmpc.so.2: cannot open > >>>> shared > >>>> object file: No such file or directory > >>>> > >>>> It looks like something is wrong with cc1plus. I suppose it's > >>>> part > >>>> of > >>>> the g++? I don't know what it does. > >>>> > >>>> So we resolved the perl dependency issues, and we were able to > >>>> compile > >>>> the code with the default PrgEnv-pgi just for testing purposes. > >>>> It > >>>> seems we still have some issues with our new pipeline code. But I > >>>> don't think we are very far from giving you a running example. 
> >>>> > >>>> Just in case you could help us with the gcc compilation issue, I > >>>> have > >>>> 777'ed my directory and here's the steps to compile the core C++ > >>>> code: > >>>> > >>>> source > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh > >>>> cd > >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux > >>>> make > >>>> > >>>> Thanks, > >>>> Fangfang > >>>> > >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: > >>>> > >>>>> Sounds great, thanks for the update, Fangfang. > >>>>> > >>>>> One question: what compiler are you using? > >>>>> > >>>>> I'd like to suggest, for the first pass, that you use the "gcc" > >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats > >>>>> because the GCC module will create code that we can run in > >>>>> parallel, > >>>>> multiple programs in parallel per compute node. The PrgEnv > >>>>> modules > >>>>> all create code that expects to run only one program per node, > >>>>> because its meant for MPI, OpenMP, etc). > >>>>> > >>>>> Also, I think that the gcc module (which I think includes gcc, > >>>>> g++ > >>>>> and gfortran) may be more like the traditional Linux gcc than > >>>>> PrgEnv-gnu. > >>>>> > >>>>> The default PrgEnv (at least for me) is pgi. So before i build > >>>>> software I do: > >>>>> > >>>>> module unload PrgEnv-pgi > >>>>> module load gcc > >>>>> > >>>>> Let me know if I can help; if you want i can try to build you a > >>>>> libxml2 using gcc. > >>>>> Same for Perl if it needs to be executed multiple copies per > >>>>> node > >>>>> in > >>>>> parallel. > >>>>> > >>>>> We can discuss more next week, and I'll be working off and on > >>>>> this > >>>>> weekend. 
> >>>>> > >>>>> Regards, > >>>>> > >>>>> - Mike > >>>>> > >>>>> > >>>>> ----- Original Message ----- > >>>>>> From: "Fangfang Xia" < fangfang.xia at gmail.com > > >>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>>>>> Cc: "Fangfang Xia" < fangfang at uchicago.edu >, "Scott Devoid" > >>>>>> < devoid at ci.uchicago.edu > > >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM > >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? > >>>>>> Hi Mike, > >>>>>> > >>>>>> We encountered some dependency issues while attempting to > >>>>>> install > >>>>>> some > >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle > >>>>>> systems > >>>>>> folks to help install libxml2. I'm also looking into ways to > >>>>>> install > >>>>>> it in a user directory. I get the feeling that things should be > >>>>>> resolved after our group meeting on Monday. So we'll keep you > >>>>>> posted. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Fangfang > >>>>>> > >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: > >>>>>> > >>>>>>> Hi Fangfang, Scott, > >>>>>>> > >>>>>>> Any progress - can I try it soon? > >>>>>>> > >>>>>>> Or, any problems that I can help with? Im at Argonne today > >>>>>>> (5141) > >>>>>>> if > >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
> >>>>>>> > >>>>>>> Regards, > >>>>>>> > >>>>>>> - Mike > >>>>>>> > >>>>>>> > >>>>>>> -- > >>>>>>> Michael Wilde > >>>>>>> Computation Institute, University of Chicago > >>>>>>> Mathematics and Computer Science Division > >>>>>>> Argonne National Laboratory > >>>>>>> > >>>>> > >>>>> -- > >>>>> Michael Wilde > >>>>> Computation Institute, University of Chicago > >>>>> Mathematics and Computer Science Division > >>>>> Argonne National Laboratory > >>>>> > >>> > >>> -- > >>> Michael Wilde > >>> Computation Institute, University of Chicago > >>> Mathematics and Computer Science Division > >>> Argonne National Laboratory > >>> > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > -- > Ketan > > > > > > > > > -- > Ketan > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Sat Nov 12 15:00:26 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Sat, 12 Nov 2011 13:00:26 -0800 Subject: [Swift-devel] extending swift with workflow optimization algorithms In-Reply-To: <20110708152527.12700w6kenz5kbav@webmail.auth.gr> References: <20110708152527.12700w6kenz5kbav@webmail.auth.gr> Message-ID: <1321131626.2898.3.camel@blabla> This paper presents some heuristics on that topic: http://www.ci.uchicago.edu/swift/papers/jogc_03.pdf Mihael On Fri, 2011-07-08 
at 15:25 +0300, Efthymia Tsamoura wrote: > Hello > > I am a PhD student and am currently working on workflow > optimization problems in distributed environments. I would like to > ask whether there are cases where changing the order of task > invocation in a scientific workflow changes its performance > without, however, affecting the produced results. In the > following, I present a small use case of the problem I am interested in: > > Suppose that a company wants to obtain a list of email addresses of > potential customers, selecting only those who have a good payment > history for at least one card and a credit rating above some > threshold. The company has the right to use the following web services: > > WS1 : SSN id (ssn, threshold) -> credit rating (cr) > WS2 : SSN id (ssn) -> credit card numbers (ccn) > WS3 : card number (ccn, good) -> good history (gph) > WS4 : SSN id (ssn) -> email addresses (ea) > > The input data, containing customer identifiers (ssn) and other > relevant information, is stored in a local data resource. Two possible > linear web service workflows that can be formed to process the input > data using the above services are C1 = WS2,WS3,WS1,WS4 and C2 = > WS1,WS2,WS3,WS4. In the first workflow, the customers having a > good payment history are selected first (WS2,WS3), and then the > remaining customers whose credit rating is below the threshold are > filtered out (through WS1). The C2 workflow applies the same two filters in > the reverse order. The two workflows may have different > performance: if WS3 filters out more data than WS1, then it will be > more beneficial to invoke WS3 before WS1, so that the subsequent > web services in the workflow process less data. 
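The ordering effect described above can be illustrated with a small sketch. It is not part of the original message: the two filter functions are hypothetical stand-ins for WS1 and for the WS2/WS3 pair, and the pass rates (about one in two and about one in ten) are made-up numbers chosen only to show that the same result can cost a different number of service calls.

```python
def count_calls(records, filters):
    """Apply filters in order; return (survivors, calls made at each stage)."""
    calls = []
    current = list(records)
    for f in filters:
        calls.append(len(current))  # one service call per record still in play
        current = [r for r in current if f(r)]
    return current, calls


# Hypothetical stand-ins for the web services in the example.
def good_credit_rating(ssn):
    # Plays the role of WS1: lets roughly half of the customers through.
    return ssn % 2 == 0


def good_payment_history(ssn):
    # Plays the role of WS2 followed by WS3: lets roughly one in ten through.
    return ssn % 10 == 0


customers = range(1000)

# C1-like order: the more selective filter runs first.
out_c1, calls_c1 = count_calls(customers, [good_payment_history, good_credit_rating])
# C2-like order: the less selective filter runs first.
out_c2, calls_c2 = count_calls(customers, [good_credit_rating, good_payment_history])

print("same result:", out_c1 == out_c2)
print("calls per stage, selective filter first:", calls_c1, "total", sum(calls_c1))
print("calls per stage, selective filter last: ", calls_c2, "total", sum(calls_c2))
```

Both orderings return the same customers; only the number of calls made by the second stage differs, which is the quantity an ordering optimizer would try to minimize.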
> > It would be very useful to know if there exist similar scientific > workflow examples (where users have many options for ordering the > workflow tasks but cannot decide which task ordering to use, while the > workflow performance depends on the workflow task invocation order) > and if you are interested in extending Swift with optimization > algorithms for such workflows. > > I am asking because I have recently developed an optimization > algorithm for this problem and I would like to test its performance in > a real-world workflow management system with real-world workflows. > > P.S.: references to publications or any other information dealing with > scientific workflows of the above rationale will be extremely useful. > > Thank you very much for your time > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From ketancmaheshwari at gmail.com Sat Nov 12 15:43:52 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Sat, 12 Nov 2011 15:43:52 -0600 Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> Message-ID: Hello Fangfang, Sorry, I made a mistake in the new line: in place of key="ppn", it should be key="providerAttributes". So the line should be as follows: pbs.aprun;pbs.mpp;depth=24 I just tested this on Beagle and it works now. Regards, Ketan On Sat, Nov 12, 2011 at 2:30 PM, Fangfang Xia wrote: > Thanks. The "Illegal value for ppn" line seems to persist in the log. 
> > > > On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote: > > Hello Fangfang, > > Could you replace the following line: > 24:cray:pack > > with this one: > pbs.aprun;pbs.mpp;depth=24 > > in your sites.xml. > > The line you have is obsoleted form from the 0.92 version of Swift. > > It should work now. > > Regards, > Ketan > > > On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote: > >> Hi Ketan, >> >> Thanks for getting back to me so promptly. I have attached the log file, >> and here's the content of sites.xml: >> >> >> >> >> CI-DEB000002 >> >> 24:cray:pack >> >> 24 >> 1000 >> 1 >> 1 >> 1 >> >> .63 >> 10000 >> >> >> > >/lustre/beagle/fangfang/swift-lab/swift.workdir >> >> >> >> There's no error message on the command line. >> >> >> >> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: >> >> Hello Fangfang, >> >> The log file does not seem to be found. Could you attach it please. >> >> From this line: >> Illegal value for ppn. Must be an integer. >> >> Looks like the sites file is not configured well for the pbs provider. >> Could you post your sites.xml. >> >> Were there any error messages on commandline? >> >> Regards, >> Ketan >> >> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote: >> >>> Can the first person who has time try to address the problem below? >>> Im about to head to SC. >>> >>> Thanks, >>> >>> - Mike >>> >>> >>> ----- Forwarded Message ----- >>> From: "Fangfang Xia" >>> To: "Michael Wilde" >>> Cc: "Ketan Maheshwari" , "Scott Devoid" < >>> devoid at ci.uchicago.edu> >>> Sent: Saturday, November 12, 2011 1:27:29 PM >>> Subject: Swift question >>> >>> Hi Mike and Ketan, >>> >>> Thanks for the guide. 
I tried to follow the "cat" example, and got the >>> following error: >>> >>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script >>> to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit >>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: >>> Block-1112-210706-000000 is: Block-1112-2107 >>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting >>> block task: Cannot submit job: Illegal value for ppn. Must be an integer. >>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0 >>> >>> I looked at the PBS script and somehow it's blank. I have attached the >>> full log file. Could you please take a look and let me know how to proceed? >>> >>> Thanks, >>> >>> Fangfang >>> >>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: >>> >>> > Hi Fangfang, Scott, >>> > >>> > Sorry for the late reply! I think the best roadmap to follow is this: >>> > >>> > - try running the sample tutorial Swift script on Beagle using the >>> instructions posted at: >>> > >>> > >>> http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle >>> > >>> > This tiny tutorial contains a simple Swift script that does N "cat" >>> commends in parallel to "process" an input file and create an output file. >>> It contains all the related config files you need to run on Beagle, and is >>> thus a good "Hello World" application. You can then copy catsn.swift to >>> create the first Swift script to run your actual applications. >>> > >>> > - set up a face to face meeting with Ketan Maheshwari, the Beagle >>> Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th >>> floor near my office, 5141. Ketan can help answer any questions you have, >>> and will be your personal contact to help you make good use of Beagle. 
>>> > >>> > - then do your first Model-SEED script based on catsn.swift, first >>> with N = 1 to just ensure that you have described your app's command >>> line(s) correctly to Swift and that the app is getting invoked and >>> returning output correctly. >>> > >>> > - then, with help form Ketan as needed, start scaling up to >>> increasingly larger runs. >>> > >>> > I'll try to stay close in the loop and help out as needed. >>> > >>> > Do you have any questions I can answer to get started? If you are at >>> Argonne and available today, perhaps I can join you and Ketan in an >>> introductory meeting. Im free from 3 to 4:40 today or after 5:30. >>> Otherwise, pelase do this at your joint conveniences. >>> > >>> > Regards, >>> > >>> > - Mike >>> > >>> > >>> > >>> > >>> > ----- Original Message ----- >>> >> From: "Fangfang Xia" >>> >> To: "Michael Wilde" >>> >> Sent: Monday, October 31, 2011 12:44:23 PM >>> >> Subject: Re: How is install/test of Model SEED on Beagle going? >>> >> Hi Mike, >>> >> >>> >> We got two types of flux balance analysis to run on beagle. I was >>> >> wondering if we should test them with Swift to see if things scale. >>> >> Both operations take about 40 seconds to run on sandbox. Ideally we >>> >> should also test two more expensive computation "fba single knockouts" >>> >> and "gapfilling", but I won't be able to resolve the problems with >>> >> those until I meet with Chris this week. >>> >> >>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh >>> >> >>> >> fbacheckgrowth -model iJR904.16242 >>> >> fbafva -model iJR904.16242 >>> >> >>> >> You can find the descriptions of these tools at: >>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED >>> >> >>> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and >>> >> mfatoolkit to compile. And I still seem to be getting the cc1plus >>> >> error with gcc which you don't have. 
So if this version doesn't work >>> >> well on multiple processors, I'll need your help with recompiling my >>> >> updated mfatoolkit in >>> >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >>> >> >>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case >>> >> you need to test something there. >>> >> >>> >> Thanks, >>> >> Fangfang >>> >> >>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >>> >> >>> >>> Hi Fangfang, >>> >>> >>> >>> I was able to build that directory using the gcc module; I past the >>> >>> make output below. It gave many warnings, but I did not get the >>> >>> cc1plus libmpc.so error that you encountered. >>> >>> >>> >>> My build is in $HOME/wilde/mfatoolkit >>> >>> >>> >>> I ran this on sandbox.beagle.ci.uchicago.edu. >>> >>> >>> >>> - Mike >>> >>> >>> >>> ---- make output: >>> >>> >>> >>> sandbox$ make >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>> >>> Data*)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >>> >>> used in arithmetic >>> >>> 
/home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >>> >>> used in arithmetic >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >>> >>> bool, std::string&, bool)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>> >>> OptimizationParameter*)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>> >>> std::string)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>> >>> converting to non-pointer type 'int' 
from NULL >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>> >>> 
/home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >>> >>> Species::AddpKab(std::string, bool)': >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> 
/home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>> >>> multi-character character constant >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> 
>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>> >>> /home/wilde/mfatoolkit/Source/driver.o >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >>> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >>> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>> >>> /home/wilde/mfatoolkit/Source/Species.o >>> >>> /home/wilde/mfatoolkit/Source/Data.o >>> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>> >>> /home/wilde/mfatoolkit/Source/Identity.o >>> >>> /home/wilde/mfatoolkit/Source/Reaction.o >>> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>> >>> /home/wilde/mfatoolkit/Source/AtomType.o >>> >>> /home/wilde/mfatoolkit/Source/Gene.o >>> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>> >>> /home/wilde/mfatoolkit/Source/stringDB.o >>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>> >>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>> >>> -lcplex -lm -lpthread -lz >>> >>> sandbox$ >>> >>> >>> >>> >>> >>> ----- Original Message ----- >>> >>>> From: "Fangfang Xia" >>> >>>> To: "Michael Wilde" >>> >>>> Cc: "Scott Devoid" >>> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >>> >>>> Subject: Re: How is 
install/test of Model SEED on Beagle going? >>> >>>> Hi Mike, >>> >>>> >>> >>>> This is very helpful. Thanks for pointing out the difference >>> >>>> between >>> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >>> >>>> compile our core c++ code. >>> >>>> >>> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >>> >>>> error while loading shared libraries: libmpc.so.2: cannot open >>> >>>> shared >>> >>>> object file: No such file or directory >>> >>>> >>> >>>> It looks like something is wrong with cc1plus. I suppose it's part >>> >>>> of >>> >>>> the g++? I don't know what it does. >>> >>>> >>> >>>> So we resolved the perl dependency issues, and we were able to >>> >>>> compile >>> >>>> the code with the default PrgEnv-pgi just for testing purposes. It >>> >>>> seems we still have some issues with our new pipeline code. But I >>> >>>> don't think we are very far from giving you a running example. >>> >>>> >>> >>>> Just in case you could help us with the gcc compilation issue, I >>> >>>> have >>> >>>> 777'ed my directory and here's the steps to compile the core C++ >>> >>>> code: >>> >>>> >>> >>>> source >>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >>> >>>> cd >>> >>>> >>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >>> >>>> make >>> >>>> >>> >>>> Thanks, >>> >>>> Fangfang >>> >>>> >>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >>> >>>> >>> >>>>> Sounds great, thanks for the update, Fangfang. >>> >>>>> >>> >>>>> One question: what compiler are you using? >>> >>>>> >>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >>> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats >>> >>>>> because the GCC module will create code that we can run in >>> >>>>> parallel, >>> >>>>> multiple programs in parallel per compute node. 
The PrgEnv modules >>> >>>>> all create code that expects to run only one program per node, >>> >>>>> because its meant for MPI, OpenMP, etc). >>> >>>>> >>> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >>> >>>>> and gfortran) may be more like the traditional Linux gcc than >>> >>>>> PrgEnv-gnu. >>> >>>>> >>> >>>>> The default PrgEnv (at least for me) is pgi. So before i build >>> >>>>> software I do: >>> >>>>> >>> >>>>> module unload PrgEnv-pgi >>> >>>>> module load gcc >>> >>>>> >>> >>>>> Let me know if I can help; if you want i can try to build you a >>> >>>>> libxml2 using gcc. >>> >>>>> Same for Perl if it needs to be executed multiple copies per node >>> >>>>> in >>> >>>>> parallel. >>> >>>>> >>> >>>>> We can discuss more next week, and I'll be working off and on this >>> >>>>> weekend. >>> >>>>> >>> >>>>> Regards, >>> >>>>> >>> >>>>> - Mike >>> >>>>> >>> >>>>> >>> >>>>> ----- Original Message ----- >>> >>>>>> From: "Fangfang Xia" >>> >>>>>> To: "Michael Wilde" >>> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >>> >>>>>> >>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >>> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>> >>>>>> Hi Mike, >>> >>>>>> >>> >>>>>> We encountered some dependency issues while attempting to install >>> >>>>>> some >>> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >>> >>>>>> systems >>> >>>>>> folks to help install libxml2. I'm also looking into ways to >>> >>>>>> install >>> >>>>>> it in a user directory. I get the feeling that things should be >>> >>>>>> resolved after our group meeting on Monday. So we'll keep you >>> >>>>>> posted. >>> >>>>>> >>> >>>>>> Thanks, >>> >>>>>> >>> >>>>>> Fangfang >>> >>>>>> >>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >>> >>>>>> >>> >>>>>>> Hi Fangfang, Scott, >>> >>>>>>> >>> >>>>>>> Any progress - can I try it soon? >>> >>>>>>> >>> >>>>>>> Or, any problems that I can help with? 
Im at Argonne today >>> >>>>>>> (5141) >>> >>>>>>> if >>> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. >>> >>>>>>> >>> >>>>>>> Regards, >>> >>>>>>> >>> >>>>>>> - Mike >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> -- >>> >>>>>>> Michael Wilde >>> >>>>>>> Computation Institute, University of Chicago >>> >>>>>>> Mathematics and Computer Science Division >>> >>>>>>> Argonne National Laboratory >>> >>>>>>> >>> >>>>> >>> >>>>> -- >>> >>>>> Michael Wilde >>> >>>>> Computation Institute, University of Chicago >>> >>>>> Mathematics and Computer Science Division >>> >>>>> Argonne National Laboratory >>> >>>>> >>> >>> >>> >>> -- >>> >>> Michael Wilde >>> >>> Computation Institute, University of Chicago >>> >>> Mathematics and Computer Science Division >>> >>> Argonne National Laboratory >>> >>> >>> > >>> > -- >>> > Michael Wilde >>> > Computation Institute, University of Chicago >>> > Mathematics and Computer Science Division >>> > Argonne National Laboratory >>> > >>> >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >>> _______________________________________________ >>> Swift-devel mailing list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >> >> >> >> -- >> Ketan >> >> >> >> >> > > > -- > Ketan > > > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
From fangfang.xia at gmail.com Sat Nov 12 15:55:15 2011
From: fangfang.xia at gmail.com (Fangfang Xia)
Date: Sat, 12 Nov 2011 15:55:15 -0600
Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question
In-Reply-To:
References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com>
Message-ID: <2F4CFB07-EADD-4A05-8D99-AFC3E7D99329@gmail.com>

Thanks Mike and Ketan.

I think it's nice that Swift is preinstalled on Beagle. When I do "module load swift", I get "Swift version swift-0.93RC3 loaded"; is that the latest Swift?

This time I get a command line error:

login2:catsn > swift -config cf -tc.file tc -sites.file sites.xml catsn.swift -n=2
Swift svn swift-r5205 cog-r3293

RunID: 20111112-2151-rbu7ivof
Progress: time: Sat, 12 Nov 2011 21:51:47 +0000
Progress: time: Sat, 12 Nov 2011 21:51:59 +0000 Submitted:1 Active:1
Progress: time: Sat, 12 Nov 2011 21:52:10 +0000 Submitted:1 Active:1
Exception in cat:
Arguments: [data.txt]
Host: pbs
Directory: catsn-20111112-2151-rbu7ivof/jobs/v/cat-vbgmznik
- - -

Caused by: Task failed: 1112-510947-000001 Block task ended prematurely

Exception in cat:
Arguments: [data.txt]
Host: pbs
Directory: catsn-20111112-2151-rbu7ivof/jobs/t/cat-tbgmznik
- - -

Caused by: Task failed: 1112-510947-000001 Block task ended prematurely

Final status: time: Sat, 12 Nov 2011 21:52:10 +0000 Failed:2
The following errors have occurred:
1. Task failed: 1112-510947-000001 Block task ended prematurely (2 times)

On Nov 12, 2011, at 3:43 PM, Ketan Maheshwari wrote:

> Hello Fangfang,
>
> Sorry, I made a mistake in the new line, in place of key="ppn", it should be key="providerAttributes".
>
> So the line should be as follows:
>
> pbs.aprun;pbs.mpp;depth=24
>
> I just tested this on Beagle and it works now.
>
> Regards,
> Ketan
>
> On Sat, Nov 12, 2011 at 2:30 PM, Fangfang Xia wrote:
> Thanks. The "Illegal value for ppn" line seems to persist in the log.
>
> On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote:
>
>> Hello Fangfang,
>>
>> Could you replace the following line:
>> 24:cray:pack
>>
>> with this one:
>> pbs.aprun;pbs.mpp;depth=24
>>
>> in your sites.xml.
>>
>> The line you have is an obsolete form from the 0.92 version of Swift.
>>
>> It should work now.
>>
>> Regards,
>> Ketan
>>
>> On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote:
>> Hi Ketan,
>>
>> Thanks for getting back to me so promptly. I have attached the log file, and here's the content of sites.xml:
>>
>> CI-DEB000002
>>
>> 24:cray:pack
>>
>> 24
>> 1000
>> 1
>> 1
>> 1
>>
>> .63
>> 10000
>>
>> /lustre/beagle/fangfang/swift-lab/swift.workdir
>>
>> There's no error message on the command line.
>>
>> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote:
>>
>>> Hello Fangfang,
>>>
>>> The log file does not seem to be found. Could you attach it please.
>>>
>>> From this line:
>>> Illegal value for ppn. Must be an integer.
>>>
>>> Looks like the sites file is not configured well for the pbs provider. Could you post your sites.xml.
>>>
>>> Were there any error messages on the command line?
>>>
>>> Regards,
>>> Ketan
>>>
>>> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote:
>>> Can the first person who has time try to address the problem below?
>>> I'm about to head to SC.
>>>
>>> Thanks,
>>>
>>> - Mike
>>>
>>> ----- Forwarded Message -----
>>> From: "Fangfang Xia"
>>> To: "Michael Wilde"
>>> Cc: "Ketan Maheshwari" , "Scott Devoid"
>>> Sent: Saturday, November 12, 2011 1:27:29 PM
>>> Subject: Swift question
>>>
>>> Hi Mike and Ketan,
>>>
>>> Thanks for the guide. I tried to follow the "cat" example, and got the following error:
>>>
>>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit
>>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107
>>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer.
>>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0
>>>
>>> I looked at the PBS script and somehow it's blank. I have attached the full log file. Could you please take a look and let me know how to proceed?
>>>
>>> Thanks,
>>>
>>> Fangfang
>>>
>>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote:
>>>
>>> > Hi Fangfang, Scott,
>>> >
>>> > Sorry for the late reply! I think the best roadmap to follow is this:
>>> >
>>> > - try running the sample tutorial Swift script on Beagle using the instructions posted at:
>>> >
>>> > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle
>>> >
>>> > This tiny tutorial contains a simple Swift script that does N "cat" commands in parallel to "process" an input file and create an output file. It contains all the related config files you need to run on Beagle, and is thus a good "Hello World" application. You can then copy catsn.swift to create the first Swift script to run your actual applications.
>>> >
>>> > - set up a face to face meeting with Ketan Maheshwari, the Beagle Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th floor near my office, 5141. Ketan can help answer any questions you have, and will be your personal contact to help you make good use of Beagle.
>>> >
>>> > - then do your first Model-SEED script based on catsn.swift, first with N = 1 to just ensure that you have described your app's command line(s) correctly to Swift and that the app is getting invoked and returning output correctly.
>>> >
>>> > - then, with help from Ketan as needed, start scaling up to increasingly larger runs.
>>> >
>>> > I'll try to stay close in the loop and help out as needed.
>>> >
>>> > Do you have any questions I can answer to get started? If you are at Argonne and available today, perhaps I can join you and Ketan in an introductory meeting. I'm free from 3 to 4:40 today or after 5:30. Otherwise, please do this at your joint convenience.
>>> >
>>> > Regards,
>>> >
>>> > - Mike
>>> >
>>> > ----- Original Message -----
>>> >> From: "Fangfang Xia"
>>> >> To: "Michael Wilde"
>>> >> Sent: Monday, October 31, 2011 12:44:23 PM
>>> >> Subject: Re: How is install/test of Model SEED on Beagle going?
>>> >> Hi Mike,
>>> >>
>>> >> We got two types of flux balance analysis to run on beagle. I was
>>> >> wondering if we should test them with Swift to see if things scale.
>>> >> Both operations take about 40 seconds to run on sandbox. Ideally we
>>> >> should also test two more expensive computation "fba single knockouts"
>>> >> and "gapfilling", but I won't be able to resolve the problems with
>>> >> those until I meet with Chris this week.
>>> >>
>>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh
>>> >>
>>> >> fbacheckgrowth -model iJR904.16242
>>> >> fbafva -model iJR904.16242
>>> >>
>>> >> You can find the descriptions of these tools at:
>>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED
>>> >>
>>> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and
>>> >> mfatoolkit to compile. And I still seem to be getting the cc1plus
>>> >> error with gcc which you don't have.
So if this version doesn't work >>> >> well on multiple processors, I'll need your help with recompiling my >>> >> updated mfatoolkit in >>> >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >>> >> >>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case >>> >> you need to test something there. >>> >> >>> >> Thanks, >>> >> Fangfang >>> >> >>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >>> >> >>> >>> Hi Fangfang, >>> >>> >>> >>> I was able to build that directory using the gcc module; I past the >>> >>> make output below. It gave many warnings, but I did not get the >>> >>> cc1plus libmpc.so error that you encountered. >>> >>> >>> >>> My build is in $HOME/wilde/mfatoolkit >>> >>> >>> >>> I ran this on sandbox.beagle.ci.uchicago.edu. >>> >>> >>> >>> - Mike >>> >>> >>> >>> ---- make output: >>> >>> >>> >>> sandbox$ make >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>> >>> Data*)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >>> >>> used in arithmetic >>> >>> 
/home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >>> >>> used in arithmetic >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >>> >>> bool, std::string&, bool)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>> >>> OptimizationParameter*)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>> >>> converting 'false' to pointer type for argument 1 of >>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>> >>> std::char_traits, _Alloc = std::allocator]' >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>> >>> std::string)': >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>> >>> converting to non-pointer type 'int' 
from NULL >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source 
>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >>> >>> Species::AddpKab(std::string, bool)': >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>> >>> std::allocator, value_type = int]' >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>> 
>>> multi-character character constant >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o 
>>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>> >>> /home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>> >>> 
/home/wilde/mfatoolkit/Source >>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>> >>> /home/wilde/mfatoolkit/Source/driver.o >>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >>> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >>> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>> >>> /home/wilde/mfatoolkit/Source/Species.o >>> >>> /home/wilde/mfatoolkit/Source/Data.o >>> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>> >>> /home/wilde/mfatoolkit/Source/Identity.o >>> >>> /home/wilde/mfatoolkit/Source/Reaction.o >>> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>> >>> /home/wilde/mfatoolkit/Source/AtomType.o >>> >>> /home/wilde/mfatoolkit/Source/Gene.o >>> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>> >>> /home/wilde/mfatoolkit/Source/stringDB.o >>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>> >>> -lcplex -lm -lpthread -lz >>> >>> sandbox$ >>> >>> >>> >>> >>> >>> ----- Original Message ----- >>> >>>> From: "Fangfang Xia" >>> >>>> To: "Michael Wilde" >>> >>>> Cc: "Scott Devoid" >>> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >>> >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>> >>>> Hi Mike, >>> >>>> >>> >>>> This is very helpful. Thanks for pointing out the difference >>> >>>> between >>> >>>> PrgEnv-pgi and gcc. 
Here's an error message we got when trying to >>> >>>> compile our core c++ code. >>> >>>> >>> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >>> >>>> error while loading shared libraries: libmpc.so.2: cannot open >>> >>>> shared >>> >>>> object file: No such file or directory >>> >>>> >>> >>>> It looks like something is wrong with cc1plus. I suppose it's part >>> >>>> of >>> >>>> the g++? I don't know what it does. >>> >>>> >>> >>>> So we resolved the perl dependency issues, and we were able to >>> >>>> compile >>> >>>> the code with the default PrgEnv-pgi just for testing purposes. It >>> >>>> seems we still have some issues with our new pipeline code. But I >>> >>>> don't think we are very far from giving you a running example. >>> >>>> >>> >>>> Just in case you could help us with the gcc compilation issue, I >>> >>>> have >>> >>>> 777'ed my directory and here's the steps to compile the core C++ >>> >>>> code: >>> >>>> >>> >>>> source >>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >>> >>>> cd >>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >>> >>>> make >>> >>>> >>> >>>> Thanks, >>> >>>> Fangfang >>> >>>> >>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >>> >>>> >>> >>>>> Sounds great, thanks for the update, Fangfang. >>> >>>>> >>> >>>>> One question: what compiler are you using? >>> >>>>> >>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >>> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats >>> >>>>> because the GCC module will create code that we can run in >>> >>>>> parallel, >>> >>>>> multiple programs in parallel per compute node. The PrgEnv modules >>> >>>>> all create code that expects to run only one program per node, >>> >>>>> because its meant for MPI, OpenMP, etc). 
>>> >>>>> >>> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >>> >>>>> and gfortran) may be more like the traditional Linux gcc than >>> >>>>> PrgEnv-gnu. >>> >>>>> >>> >>>>> The default PrgEnv (at least for me) is pgi. So before i build >>> >>>>> software I do: >>> >>>>> >>> >>>>> module unload PrgEnv-pgi >>> >>>>> module load gcc >>> >>>>> >>> >>>>> Let me know if I can help; if you want i can try to build you a >>> >>>>> libxml2 using gcc. >>> >>>>> Same for Perl if it needs to be executed multiple copies per node >>> >>>>> in >>> >>>>> parallel. >>> >>>>> >>> >>>>> We can discuss more next week, and I'll be working off and on this >>> >>>>> weekend. >>> >>>>> >>> >>>>> Regards, >>> >>>>> >>> >>>>> - Mike >>> >>>>> >>> >>>>> >>> >>>>> ----- Original Message ----- >>> >>>>>> From: "Fangfang Xia" >>> >>>>>> To: "Michael Wilde" >>> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >>> >>>>>> >>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >>> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>> >>>>>> Hi Mike, >>> >>>>>> >>> >>>>>> We encountered some dependency issues while attempting to install >>> >>>>>> some >>> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >>> >>>>>> systems >>> >>>>>> folks to help install libxml2. I'm also looking into ways to >>> >>>>>> install >>> >>>>>> it in a user directory. I get the feeling that things should be >>> >>>>>> resolved after our group meeting on Monday. So we'll keep you >>> >>>>>> posted. >>> >>>>>> >>> >>>>>> Thanks, >>> >>>>>> >>> >>>>>> Fangfang >>> >>>>>> >>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >>> >>>>>> >>> >>>>>>> Hi Fangfang, Scott, >>> >>>>>>> >>> >>>>>>> Any progress - can I try it soon? >>> >>>>>>> >>> >>>>>>> Or, any problems that I can help with? Im at Argonne today >>> >>>>>>> (5141) >>> >>>>>>> if >>> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
>>> >>>>>>> >>> >>>>>>> Regards, >>> >>>>>>> >>> >>>>>>> - Mike >>> >>>>>>> >>> >>>>>>> >>> >>>>>>> -- >>> >>>>>>> Michael Wilde >>> >>>>>>> Computation Institute, University of Chicago >>> >>>>>>> Mathematics and Computer Science Division >>> >>>>>>> Argonne National Laboratory >>> >>>>>>> >>> >>>>> >>> >>>>> -- >>> >>>>> Michael Wilde >>> >>>>> Computation Institute, University of Chicago >>> >>>>> Mathematics and Computer Science Division >>> >>>>> Argonne National Laboratory >>> >>>>> >>> >>> >>> >>> -- >>> >>> Michael Wilde >>> >>> Computation Institute, University of Chicago >>> >>> Mathematics and Computer Science Division >>> >>> Argonne National Laboratory >>> >>> >>> > >>> > -- >>> > Michael Wilde >>> > Computation Institute, University of Chicago >>> > Mathematics and Computer Science Division >>> > Argonne National Laboratory >>> > >>> >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >>> _______________________________________________ >>> Swift-devel mailing list >>> Swift-devel at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >>> >>> >>> -- >>> Ketan >>> >>> >> >> >> >> >> >> -- >> Ketan >> >> > > > > > > -- > Ketan > > -------------- next part -------------- An HTML attachment was scrubbed... 
From ketancmaheshwari at gmail.com Sat Nov 12 16:31:53 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 12 Nov 2011 16:31:53 -0600
Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question
In-Reply-To: <2F4CFB07-EADD-4A05-8D99-AFC3E7D99329@gmail.com>
References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> <2F4CFB07-EADD-4A05-8D99-AFC3E7D99329@gmail.com>
Message-ID:

Hello Fangfang,

I logged out of my screen session before my test completed. Now I can see that the run I tested was submitted, but in the end I saw the same message.

I am looking into this and will get back to you soon.

Regards,
Ketan

On Sat, Nov 12, 2011 at 3:55 PM, Fangfang Xia wrote:

> Thanks Mike and Ketan.
>
> I think it's nice that Swift is preinstalled on Beagle. When I do "module load swift", I get "Swift version swift-0.93RC3 loaded"; is that the latest Swift?
>
> This time I get a command line error:
>
> login2:catsn > swift -config cf -tc.file tc -sites.file sites.xml catsn.swift -n=2
> Swift svn swift-r5205 cog-r3293
>
> RunID: 20111112-2151-rbu7ivof
> Progress: time: Sat, 12 Nov 2011 21:51:47 +0000
> Progress: time: Sat, 12 Nov 2011 21:51:59 +0000 Submitted:1 Active:1
> Progress: time: Sat, 12 Nov 2011 21:52:10 +0000 Submitted:1 Active:1
> Exception in cat:
> Arguments: [data.txt]
> Host: pbs
> Directory: catsn-20111112-2151-rbu7ivof/jobs/v/cat-vbgmznik
> - - -
>
> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely
>
> Exception in cat:
> Arguments: [data.txt]
> Host: pbs
> Directory: catsn-20111112-2151-rbu7ivof/jobs/t/cat-tbgmznik
> - - -
>
> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely
>
> Final status: time: Sat, 12 Nov 2011 21:52:10 +0000 Failed:2
> The following errors have occurred:
> 1. Task failed: 1112-510947-000001 Block task ended prematurely (2 times)
>
> On Nov 12, 2011, at 3:43 PM, Ketan Maheshwari wrote:
>
> Hello Fangfang,
>
> Sorry, I made a mistake in the new line, in place of key="ppn", it should be key="providerAttributes".
>
> So the line should be as follows:
>
> key="providerAttributes">pbs.aprun;pbs.mpp;depth=24
>
> I just tested this on Beagle and it works now.
>
> Regards,
> Ketan
>
> On Sat, Nov 12, 2011 at 2:30 PM, Fangfang Xia wrote:
>
>> Thanks. The "Illegal value for ppn" line seems to persist in the log.
>>
>> On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote:
>>
>> Hello Fangfang,
>>
>> Could you replace the following line:
>> 24:cray:pack
>>
>> with this one:
>> pbs.aprun;pbs.mpp;depth=24
>>
>> in your sites.xml.
>>
>> The line you have is an obsolete form from the 0.92 version of Swift.
>>
>> It should work now.
>>
>> Regards,
>> Ketan
>>
>> On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote:
>>
>>> Hi Ketan,
>>>
>>> Thanks for getting back to me so promptly. I have attached the log file, and here's the content of sites.xml:
>>>
>>> CI-DEB000002
>>>
>>> 24:cray:pack
>>>
>>> 24
>>> 1000
>>> 1
>>> 1
>>> 1
>>>
>>> .63
>>> 10000
>>>
>>> /lustre/beagle/fangfang/swift-lab/swift.workdir
>>>
>>> There's no error message on the command line.
>>>
>>> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote:
>>>
>>> Hello Fangfang,
>>>
>>> The log file does not seem to be found. Could you attach it please.
>>>
>>> From this line:
>>> Illegal value for ppn. Must be an integer.
>>>
>>> Looks like the sites file is not configured well for the pbs provider. Could you post your sites.xml.
>>>
>>> Were there any error messages on the command line?
>>>
>>> Regards,
>>> Ketan
>>>
>>> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote:
>>>
>>>> Can the first person who has time try to address the problem below?
>>>> I'm about to head to SC.
>>>>
>>>> Thanks,
>>>>
>>>> - Mike
>>>>
>>>> ----- Forwarded Message -----
>>>> From: "Fangfang Xia"
>>>> To: "Michael Wilde"
>>>> Cc: "Ketan Maheshwari" , "Scott Devoid" <devoid at ci.uchicago.edu>
>>>> Sent: Saturday, November 12, 2011 1:27:29 PM
>>>> Subject: Swift question
>>>>
>>>> Hi Mike and Ketan,
>>>>
>>>> Thanks for the guide. I tried to follow the "cat" example, and got the following error:
>>>>
>>>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit
>>>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107
>>>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer.
>>>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0
>>>>
>>>> I looked at the PBS script and somehow it's blank. I have attached the full log file. Could you please take a look and let me know how to proceed?
>>>>
>>>> Thanks,
>>>>
>>>> Fangfang
>>>>
>>>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote:
>>>>
>>>> > Hi Fangfang, Scott,
>>>> >
>>>> > Sorry for the late reply! I think the best roadmap to follow is this:
>>>> >
>>>> > - try running the sample tutorial Swift script on Beagle using the instructions posted at:
>>>> >
>>>> > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle
>>>> >
>>>> > This tiny tutorial contains a simple Swift script that does N "cat" commands in parallel to "process" an input file and create an output file. It contains all the related config files you need to run on Beagle, and is thus a good "Hello World" application. You can then copy catsn.swift to create the first Swift script to run your actual applications.
>>>> > >>>> > - set up a face to face meeting with Ketan Maheshwari, the Beagle >>>> Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th >>>> floor near my office, 5141. Ketan can help answer any questions you have, >>>> and will be your personal contact to help you make good use of Beagle. >>>> > >>>> > - then do your first Model-SEED script based on catsn.swift, first >>>> with N = 1 to just ensure that you have described your app's command >>>> line(s) correctly to Swift and that the app is getting invoked and >>>> returning output correctly. >>>> > >>>> > - then, with help form Ketan as needed, start scaling up to >>>> increasingly larger runs. >>>> > >>>> > I'll try to stay close in the loop and help out as needed. >>>> > >>>> > Do you have any questions I can answer to get started? If you are at >>>> Argonne and available today, perhaps I can join you and Ketan in an >>>> introductory meeting. Im free from 3 to 4:40 today or after 5:30. >>>> Otherwise, pelase do this at your joint conveniences. >>>> > >>>> > Regards, >>>> > >>>> > - Mike >>>> > >>>> > >>>> > >>>> > >>>> > ----- Original Message ----- >>>> >> From: "Fangfang Xia" >>>> >> To: "Michael Wilde" >>>> >> Sent: Monday, October 31, 2011 12:44:23 PM >>>> >> Subject: Re: How is install/test of Model SEED on Beagle going? >>>> >> Hi Mike, >>>> >> >>>> >> We got two types of flux balance analysis to run on beagle. I was >>>> >> wondering if we should test them with Swift to see if things scale. >>>> >> Both operations take about 40 seconds to run on sandbox. Ideally we >>>> >> should also test two more expensive computation "fba single >>>> knockouts" >>>> >> and "gapfilling", but I won't be able to resolve the problems with >>>> >> those until I meet with Chris this week. 
>>>> >> >>>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh >>>> >> >>>> >> fbacheckgrowth -model iJR904.16242 >>>> >> fbafva -model iJR904.16242 >>>> >> >>>> >> You can find the descriptions of these tools at: >>>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED >>>> >> >>>> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and >>>> >> mfatoolkit to compile. And I still seem to be getting the cc1plus >>>> >> error with gcc which you don't have. So if this version doesn't work >>>> >> well on multiple processors, I'll need your help with recompiling my >>>> >> updated mfatoolkit in >>>> >> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >>>> >> >>>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case >>>> >> you need to test something there. >>>> >> >>>> >> Thanks, >>>> >> Fangfang >>>> >> >>>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >>>> >> >>>> >>> Hi Fangfang, >>>> >>> >>>> >>> I was able to build that directory using the gcc module; I past the >>>> >>> make output below. It gave many warnings, but I did not get the >>>> >>> cc1plus libmpc.so error that you encountered. >>>> >>> >>>> >>> My build is in $HOME/wilde/mfatoolkit >>>> >>> >>>> >>> I ran this on sandbox.beagle.ci.uchicago.edu. 
>>>> >>> >>>> >>> - Mike >>>> >>> >>>> >>> ---- make output: >>>> >>> >>>> >>> sandbox$ make >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>>> >>> Data*)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >>>> >>> used in arithmetic >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >>>> >>> used in arithmetic >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >>>> >>> bool, std::string&, bool)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const 
>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>>> >>> OptimizationParameter*)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>>> >>> std::string)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>>> >>> converting to non-pointer type 'int' from NULL >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o 
>>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >>>> >>> Species::AddpKab(std::string, bool)': >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp 
= int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>>> >>> multi-character character constant >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> >>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>>> >>> /home/wilde/mfatoolkit/Source/driver.o >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >>>> >>> 
/home/wilde/mfatoolkit/Source/CPLEXapi.o >>>> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>>> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>>> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>>> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>>> >>> /home/wilde/mfatoolkit/Source/Species.o >>>> >>> /home/wilde/mfatoolkit/Source/Data.o >>>> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>>> >>> /home/wilde/mfatoolkit/Source/Identity.o >>>> >>> /home/wilde/mfatoolkit/Source/Reaction.o >>>> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>>> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>>> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>>> >>> /home/wilde/mfatoolkit/Source/AtomType.o >>>> >>> /home/wilde/mfatoolkit/Source/Gene.o >>>> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>>> >>> /home/wilde/mfatoolkit/Source/stringDB.o >>>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>>> >>> >>>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>>> >>> -lcplex -lm -lpthread -lz >>>> >>> sandbox$ >>>> >>> >>>> >>> >>>> >>> ----- Original Message ----- >>>> >>>> From: "Fangfang Xia" >>>> >>>> To: "Michael Wilde" >>>> >>>> Cc: "Scott Devoid" >>>> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >>>> >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>> >>>> Hi Mike, >>>> >>>> >>>> >>>> This is very helpful. Thanks for pointing out the difference >>>> >>>> between >>>> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >>>> >>>> compile our core c++ code. >>>> >>>> >>>> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >>>> >>>> error while loading shared libraries: libmpc.so.2: cannot open >>>> >>>> shared >>>> >>>> object file: No such file or directory >>>> >>>> >>>> >>>> It looks like something is wrong with cc1plus. I suppose it's part >>>> >>>> of >>>> >>>> the g++? I don't know what it does. 
>>>> >>>> >>>> >>>> So we resolved the perl dependency issues, and we were able to >>>> >>>> compile >>>> >>>> the code with the default PrgEnv-pgi just for testing purposes. It >>>> >>>> seems we still have some issues with our new pipeline code. But I >>>> >>>> don't think we are very far from giving you a running example. >>>> >>>> >>>> >>>> Just in case you could help us with the gcc compilation issue, I >>>> >>>> have >>>> >>>> 777'ed my directory and here's the steps to compile the core C++ >>>> >>>> code: >>>> >>>> >>>> >>>> source >>>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >>>> >>>> cd >>>> >>>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >>>> >>>> make >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Fangfang >>>> >>>> >>>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >>>> >>>> >>>> >>>>> Sounds great, thanks for the update, Fangfang. >>>> >>>>> >>>> >>>>> One question: what compiler are you using? >>>> >>>>> >>>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >>>> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats >>>> >>>>> because the GCC module will create code that we can run in >>>> >>>>> parallel, >>>> >>>>> multiple programs in parallel per compute node. The PrgEnv modules >>>> >>>>> all create code that expects to run only one program per node, >>>> >>>>> because its meant for MPI, OpenMP, etc). >>>> >>>>> >>>> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >>>> >>>>> and gfortran) may be more like the traditional Linux gcc than >>>> >>>>> PrgEnv-gnu. >>>> >>>>> >>>> >>>>> The default PrgEnv (at least for me) is pgi. So before i build >>>> >>>>> software I do: >>>> >>>>> >>>> >>>>> module unload PrgEnv-pgi >>>> >>>>> module load gcc >>>> >>>>> >>>> >>>>> Let me know if I can help; if you want i can try to build you a >>>> >>>>> libxml2 using gcc. 
>>>> >>>>> Same for Perl if it needs to be executed multiple copies per node >>>> >>>>> in >>>> >>>>> parallel. >>>> >>>>> >>>> >>>>> We can discuss more next week, and I'll be working off and on this >>>> >>>>> weekend. >>>> >>>>> >>>> >>>>> Regards, >>>> >>>>> >>>> >>>>> - Mike >>>> >>>>> >>>> >>>>> >>>> >>>>> ----- Original Message ----- >>>> >>>>>> From: "Fangfang Xia" >>>> >>>>>> To: "Michael Wilde" >>>> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >>>> >>>>>> >>>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >>>> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>> >>>>>> Hi Mike, >>>> >>>>>> >>>> >>>>>> We encountered some dependency issues while attempting to install >>>> >>>>>> some >>>> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >>>> >>>>>> systems >>>> >>>>>> folks to help install libxml2. I'm also looking into ways to >>>> >>>>>> install >>>> >>>>>> it in a user directory. I get the feeling that things should be >>>> >>>>>> resolved after our group meeting on Monday. So we'll keep you >>>> >>>>>> posted. >>>> >>>>>> >>>> >>>>>> Thanks, >>>> >>>>>> >>>> >>>>>> Fangfang >>>> >>>>>> >>>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >>>> >>>>>> >>>> >>>>>>> Hi Fangfang, Scott, >>>> >>>>>>> >>>> >>>>>>> Any progress - can I try it soon? >>>> >>>>>>> >>>> >>>>>>> Or, any problems that I can help with? Im at Argonne today >>>> >>>>>>> (5141) >>>> >>>>>>> if >>>> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
>>>> >>>>>>> >>>> >>>>>>> Regards, >>>> >>>>>>> >>>> >>>>>>> - Mike >>>> >>>>>>> >>>> >>>>>>> >>>> >>>>>>> -- >>>> >>>>>>> Michael Wilde >>>> >>>>>>> Computation Institute, University of Chicago >>>> >>>>>>> Mathematics and Computer Science Division >>>> >>>>>>> Argonne National Laboratory >>>> >>>>>>> >>>> >>>>> >>>> >>>>> -- >>>> >>>>> Michael Wilde >>>> >>>>> Computation Institute, University of Chicago >>>> >>>>> Mathematics and Computer Science Division >>>> >>>>> Argonne National Laboratory >>>> >>>>> >>>> >>> >>>> >>> -- >>>> >>> Michael Wilde >>>> >>> Computation Institute, University of Chicago >>>> >>> Mathematics and Computer Science Division >>>> >>> Argonne National Laboratory >>>> >>> >>>> > >>>> > -- >>>> > Michael Wilde >>>> > Computation Institute, University of Chicago >>>> > Mathematics and Computer Science Division >>>> > Argonne National Laboratory >>>> > >>>> >>>> >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >>>> >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>> >>> >>> >>> >>> -- >>> Ketan >>> >>> >>> >>> >>> >> >> >> -- >> Ketan >> >> >> >> >> > > > -- > Ketan > > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Sat Nov 12 22:39:36 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 12 Nov 2011 22:39:36 -0600 Subject: [Swift-devel] swift pbs/beagle broken In-Reply-To: References: Message-ID: Ketan, can you post the submit script and site file? On 11/12/11, Ketan Maheshwari wrote: > Hi, > > It seems the pbs-coaster provider (local:pbs) is broken for swift. 
I tried
> swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting the same
> response:
>
> Swift svn swift-r5205 cog-r3293
>
> RunID: 20111113-0216-1d35h7eb
> Progress: time: Sun, 13 Nov 2011 02:16:54 +0000
> site setting workersPerNode has been replaced with jobsPerNode!
> Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1
> Failed to transfer wrapper log for job cat-1hg8aoik
> Exception in cat:
> Arguments: [data.txt]
> Host: pbs
> Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik
> stderr.txt:
>
> stdout.txt:
>
> ----
>
> Caused by: Task failed: 1113-160254-000000 Block task ended prematurely
>
>
> Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1
> The following errors have occurred:
> 1. Task failed: 1113-160254-000000 Block task ended prematurely
>
>
>
> Trying the submit script outside of swift also does not seem to be working.
> The scripts get submitted to the queue and immediately exits without
> writing anything to stdout or stderr.
>
> Were there any recent changes that could have affected this?
>
> I remember to have tried this successfully in the last week of last month.
>
> Regards,
> --
> Ketan
>

--
Sent from my mobile device

From wilde at mcs.anl.gov Sun Nov 13 00:54:56 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 00:54:56 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To:
Message-ID: <1133228392.24383.1321167296764.JavaMail.root@zimbra.anl.gov>

OK, I don't need these; I can reproduce the problem as well.

For some reason, the coaster worker is exiting immediately.

I see a few possibilities:

- Beagle networking may have changed, making it no longer possible to
  reach the coaster service from the compute nodes using the previous IP
  address ranges.

- the worker.pl script is not being created in $HOME/.globus/coasters

Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Saturday, November 12, 2011 8:39:36 PM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> Ketan, can you post the submit script and site file?
>
> On 11/12/11, Ketan Maheshwari wrote:
> > Hi,
> >
> > It seems the pbs-coaster provider (local:pbs) is broken for swift. I
> > tried
> > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting the
> > same
> > response:
> >
> > Swift svn swift-r5205 cog-r3293
> >
> > RunID: 20111113-0216-1d35h7eb
> > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000
> > site setting workersPerNode has been replaced with jobsPerNode!
> > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1
> > Failed to transfer wrapper log for job cat-1hg8aoik
> > Exception in cat:
> > Arguments: [data.txt]
> > Host: pbs
> > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik
> > stderr.txt:
> >
> > stdout.txt:
> >
> > ----
> >
> > Caused by: Task failed: 1113-160254-000000 Block task ended
> > prematurely
> >
> >
> > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1
> > The following errors have occurred:
> > 1. Task failed: 1113-160254-000000 Block task ended prematurely
> >
> >
> >
> > Trying the submit script outside of swift also does not seem to be
> > working.
> > The scripts get submitted to the queue and immediately exits without
> > writing anything to stdout or stderr.
> >
> > Were there any recent changes that could have affected this?
> >
> > I remember to have tried this successfully in the last week of last
> > month.
> > > > Regards, > > -- > > Ketan > > > > -- > Sent from my mobile device > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From davidk at ci.uchicago.edu Sun Nov 13 01:23:27 2011 From: davidk at ci.uchicago.edu (David Kelly) Date: Sun, 13 Nov 2011 01:23:27 -0600 (CST) Subject: [Swift-devel] swift pbs/beagle broken In-Reply-To: <1133228392.24383.1321167296764.JavaMail.root@zimbra.anl.gov> Message-ID: <685658645.13542.1321169007332.JavaMail.root@zimbra-mb2.anl.gov> I'm seeing the same thing.. coasters is immediately failing. My first thought is that it might be some type of new network restriction, or possibly something with the recent authentication changes. ----- Original Message ----- > From: "Michael Wilde" > To: "Ketan Maheshwari" > Cc: "Swift Devel" > Sent: Sunday, November 13, 2011 12:54:56 AM > Subject: Re: [Swift-devel] swift pbs/beagle broken > OK, I dont need these; I can reproduce the problem as well. > > For some reason, the coaster worker is exiting immediately. > > I see a few possibilities: > > - Beagle networking may have changed, making it no longer possible to > reach the coaster service from the compute nodes using the previous IP > address ranges. > > - the worker.pl script is not being created in $HOME/.globus/coasters > > Mike > > > ----- Original Message ----- > > From: "Michael Wilde" > > To: "Ketan Maheshwari" > > Cc: "Swift Devel" > > Sent: Saturday, November 12, 2011 8:39:36 PM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > Ketan, can you post the submit script and site file? > > > > On 11/12/11, Ketan Maheshwari wrote: > > > Hi, > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for swift. 
> > > I > > > tried > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting the > > > same > > > response: > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > RunID: 20111113-0216-1d35h7eb > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > site setting workersPerNode has been replaced with jobsPerNode! > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > Exception in cat: > > > Arguments: [data.txt] > > > Host: pbs > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > stderr.txt: > > > > > > stdout.txt: > > > > > > ---- > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > prematurely > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > The following errors have occurred: > > > 1. Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to be > > > working. > > > The scripts get submitted to the queue and immediately exits > > > without > > > writing anything to stdout or stderr. > > > > > > Were there any recent changes that could have affected this? > > > > > > I remember to have tried this successfully in the last week of > > > last > > > month. 
> > >
> > > Regards,
> > > --
> > > Ketan
> > >
> >
> > --
> > Sent from my mobile device
> > _______________________________________________
> > Swift-devel mailing list
> > Swift-devel at ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From wilde at mcs.anl.gov Sun Nov 13 09:18:58 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 09:18:58 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <685658645.13542.1321169007332.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <1390226474.24668.1321197538456.JavaMail.root@zimbra.anl.gov>

It seems that the problem is less likely to be related to network
connectivity. I tested access from a compute node to a login host, and
that still seems to work as required (i.e., both netcat and a worker.pl
can connect to a login host at the 192.5.86.10X addresses). Manually
started coasters and workers also seem to be able to connect and run jobs.

I'm not sure why we are not seeing any output in the .stdout and .stderr
files from the jobs that Swift is generating. Simple submit-file tests
show the same odd behavior. I'm assuming for now that this is due to my
incorrect usage or wrong assumptions rather than a PBS issue.

I'll next try to re-create the failure that the Swift-generated jobs are
seeing.

- Mike

----- Original Message -----
> From: "David Kelly"
> To: "Michael Wilde"
> Cc: "Swift Devel" , "Ketan Maheshwari"
> Sent: Saturday, November 12, 2011 11:23:27 PM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> I'm seeing the same thing.. coasters is immediately failing.
> > My first thought is that it might be some type of new network > restriction, or possibly something with the recent authentication > changes. > > ----- Original Message ----- > > From: "Michael Wilde" > > To: "Ketan Maheshwari" > > Cc: "Swift Devel" > > Sent: Sunday, November 13, 2011 12:54:56 AM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > OK, I dont need these; I can reproduce the problem as well. > > > > For some reason, the coaster worker is exiting immediately. > > > > I see a few possibilities: > > > > - Beagle networking may have changed, making it no longer possible > > to > > reach the coaster service from the compute nodes using the previous > > IP > > address ranges. > > > > - the worker.pl script is not being created in > > $HOME/.globus/coasters > > > > Mike > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" > > > To: "Ketan Maheshwari" > > > Cc: "Swift Devel" > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > Ketan, can you post the submit script and site file? > > > > > > On 11/12/11, Ketan Maheshwari wrote: > > > > Hi, > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for > > > > swift. > > > > I > > > > tried > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting > > > > the > > > > same > > > > response: > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > site setting workersPerNode has been replaced with jobsPerNode! 
> > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > Exception in cat: > > > > Arguments: [data.txt] > > > > Host: pbs > > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > stderr.txt: > > > > > > > > stdout.txt: > > > > > > > > ---- > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > > prematurely > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > > The following errors have occurred: > > > > 1. Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to > > > > be > > > > working. > > > > The scripts get submitted to the queue and immediately exits > > > > without > > > > writing anything to stdout or stderr. > > > > > > > > Were there any recent changes that could have affected this? > > > > > > > > I remember to have tried this successfully in the last week of > > > > last > > > > month. 
> > > > > > > > Regards, > > > > -- > > > > Ketan > > > > > > > > > > -- > > > Sent from my mobile device > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Sun Nov 13 09:20:25 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sun, 13 Nov 2011 09:20:25 -0600
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <1133228392.24383.1321167296764.JavaMail.root@zimbra.anl.gov>
References: <1133228392.24383.1321167296764.JavaMail.root@zimbra.anl.gov>
Message-ID:

I tried with a simple /bin/date command at the end of the submit script removing the call to worker.pl:

#CoG This script generated by CoG
#CoG by class: class org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor
#CoG on date: 2011/11/13 02:16:54

#PBS -S /bin/bash
#PBS -N Block-1113-1602
#PBS -m n
#PBS -A CI-DEB000002
#PBS -l mppwidth=3,mppnppn=1,mppdepth=24
#PBS -l walltime=00:10:00
#PBS -o /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout
#PBS -e /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr
WORKER_LOGGING_LEVEL=NONE
#PBS -v WORKER_LOGGING_LEVEL
cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh -c /bin/date

=======

This fails too. The queue cancels the job as soon as it starts running, without writing anything to stdout or stderr.
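One way to tell "the command never ran" apart from "the scheduler's -o/-e files are not being written" is to redirect output inside the launched command itself. The following is only a minimal sketch of that idea, not part of the generated script: LAUNCHER and OUT_DIR are hypothetical names introduced here, and the sketch assumes OUT_DIR is a directory the compute node can write to.

```shell
#!/bin/sh
# Diagnostic sketch (hypothetical names): capture output inside the launched
# command instead of relying on the scheduler's -o/-e files.
# LAUNCHER is an assumption: set it to e.g. "aprun -B" inside a PBS job on the
# Cray; left empty, the command simply runs on the local host.
LAUNCHER=${LAUNCHER:-}
# OUT_DIR is an assumption too; point it at a directory the node can write to.
OUT_DIR=${OUT_DIR:-$(mktemp -d)}

# The redirection is part of the string handed to /bin/sh -c, so it is the
# launched shell (not qsub) that opens the output files.
$LAUNCHER /bin/sh -c "/bin/date > '$OUT_DIR/date.out' 2> '$OUT_DIR/date.err'"

# Record the exit status of the launched command, as the submit script does.
echo $? > "$OUT_DIR/exitcode"
echo "results in $OUT_DIR"
```

If date.out is non-empty while the PBS .stdout file stays empty, the job itself is running and the problem is more likely in how the scheduler output files are handled.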
On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde wrote: > OK, I dont need these; I can reproduce the problem as well. > > For some reason, the coaster worker is exiting immediately. > > I see a few possibilities: > > - Beagle networking may have changed, making it no longer possible to > reach the coaster service from the compute nodes using the previous IP > address ranges. > > - the worker.pl script is not being created in $HOME/.globus/coasters > > Mike > > > ----- Original Message ----- > > From: "Michael Wilde" > > To: "Ketan Maheshwari" > > Cc: "Swift Devel" > > Sent: Saturday, November 12, 2011 8:39:36 PM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > Ketan, can you post the submit script and site file? > > > > On 11/12/11, Ketan Maheshwari wrote: > > > Hi, > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for swift. I > > > tried > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting the > > > same > > > response: > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > RunID: 20111113-0216-1d35h7eb > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > site setting workersPerNode has been replaced with jobsPerNode! > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > Exception in cat: > > > Arguments: [data.txt] > > > Host: pbs > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > stderr.txt: > > > > > > stdout.txt: > > > > > > ---- > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > prematurely > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > The following errors have occurred: > > > 1. Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to be > > > working. 
> > > The scripts get submitted to the queue and immediately exits without > > > writing anything to stdout or stderr. > > > > > > Were there any recent changes that could have affected this? > > > > > > I remember to have tried this successfully in the last week of last > > > month. > > > > > > Regards, > > > -- > > > Ketan > > > > > > > -- > > Sent from my mobile device > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > >

--
Ketan

From wilde at mcs.anl.gov Sun Nov 13 09:28:29 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 09:28:29 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To:
Message-ID: <1248689057.24670.1321198109090.JavaMail.root@zimbra.anl.gov>

Two thoughts here, Ketan:

- When I tried my manual coaster test, I replaced the options "-n 3 -N 1 -cc none -d 24 -F exclusive" on aprun with simply "-B", which says "use the options from qsub". I was going to go back and see if there was some subtle new mismatch between these qsub and aprun processor-layout options.

- I realized that manually testing the swift-generated submit file will give new errors because the swift service is no longer alive and listening on the port that the worker will try to connect to. Also, it seemed that the .pl file itself that automatic coaster bootstrap places in ~/.globus/coasters was not there. I'm assuming that Swift removes these files when it exits, but I need to verify that this is true and that the failure is not due to a missing .pl file. I suspect that this is normal and is not the problem, but again, we need to keep debugging until the root cause is found.
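The two conditions in the second point (is worker.pl present, and is anything listening on the service port) can be checked by hand before resubmitting. This is a rough sketch only: the function names, COASTER_DIR, SERVICE_HOST and SERVICE_PORT are illustrative assumptions, not Swift settings, and the port check assumes a netcat variant that supports -z and -w.

```shell
#!/bin/sh
# Sketch of two pre-flight checks (all names below are illustrative assumptions).

# Report whether worker.pl exists in the given coasters directory.
check_worker_script() {
    if [ -f "$1/worker.pl" ]; then echo "present"; else echo "missing"; fi
}

# Report whether something is listening on host:port, using netcat's
# scan-only mode (-z) with a 5-second timeout (-w 5).
check_service_port() {
    if nc -z -w 5 "$1" "$2" >/dev/null 2>&1; then
        echo "reachable"
    else
        echo "unreachable"
    fi
}

COASTER_DIR=${COASTER_DIR:-$HOME/.globus/coasters}
echo "worker.pl: $(check_worker_script "$COASTER_DIR")"

# SERVICE_HOST and SERVICE_PORT are placeholders; fill them in from your own
# run. The port check is skipped when they are not set.
if [ -n "${SERVICE_HOST:-}" ] && [ -n "${SERVICE_PORT:-}" ]; then
    echo "service: $(check_service_port "$SERVICE_HOST" "$SERVICE_PORT")"
fi
```

Running this from a compute node while a Swift run is still active would separate a missing-script problem from a connectivity problem; after Swift has exited, an "unreachable" result is expected.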
Mike ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Michael Wilde" > Cc: "Swift Devel" > Sent: Sunday, November 13, 2011 7:20:25 AM > Subject: Re: [Swift-devel] swift pbs/beagle broken > I tried with a simple /bin/date command at the end of the submit > script removing the call to worker.pl : > > > > #CoG This script generated by CoG > #CoG by class: class > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > #CoG on date: 2011/11/13 02:16:54 > > > #PBS -S /bin/bash > #PBS -N Block-1113-1602 > #PBS -m n > #PBS -A CI-DEB000002 > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > #PBS -l walltime=00:10:00 > #PBS -o > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > #PBS -e > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > WORKER_LOGGING_LEVEL=NONE > #PBS -v WORKER_LOGGING_LEVEL > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh -c > /bin/date > > > ======= > > > This fails too. The queue cancels the job as soon as it starts > running, without writing anything to stdout or stderr. > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < wilde at mcs.anl.gov > > wrote: > > > OK, I dont need these; I can reproduce the problem as well. > > For some reason, the coaster worker is exiting immediately. > > I see a few possibilities: > > - Beagle networking may have changed, making it no longer possible to > reach the coaster service from the compute nodes using the previous IP > address ranges. > > - the worker.pl script is not being created in $HOME/.globus/coasters > > Mike > > > > > > ----- Original Message ----- > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > Ketan, can you post the submit script and site file? 
> > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > wrote: > > > Hi, > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for swift. > > > I > > > tried > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting the > > > same > > > response: > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > RunID: 20111113-0216-1d35h7eb > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > site setting workersPerNode has been replaced with jobsPerNode! > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > Exception in cat: > > > Arguments: [data.txt] > > > Host: pbs > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > stderr.txt: > > > > > > stdout.txt: > > > > > > ---- > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > prematurely > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > The following errors have occurred: > > > 1. Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to be > > > working. > > > The scripts get submitted to the queue and immediately exits > > > without > > > writing anything to stdout or stderr. > > > > > > Were there any recent changes that could have affected this? > > > > > > I remember to have tried this successfully in the last week of > > > last > > > month. 
> > > > > > Regards, > > > -- > > > Ketan > > > > > > > -- > > Sent from my mobile device > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > > > -- > Ketan

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Sun Nov 13 09:35:24 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sun, 13 Nov 2011 09:35:24 -0600
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <1248689057.24670.1321198109090.JavaMail.root@zimbra.anl.gov>
References: <1248689057.24670.1321198109090.JavaMail.root@zimbra.anl.gov>
Message-ID:

On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde wrote:

> 2 thoughts here, Ketan:
>
> - when I tried my manual coaster test, I replaced the options "-n 3 -N 1
> -cc none -d 24 -F exclusive" on aprun with simply "-B" which says "use the
> options from qsub". I was going to go back and see if there was some subtle
> new mismatch between these qsub and aprun processor-layout options.
>

I tried the -B option:

#CoG This script generated by CoG
#CoG by class: class org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor
#CoG on date: 2011/11/13 02:16:54

#PBS -S /bin/bash
#PBS -N Block-1113-1602
#PBS -m n
#PBS -A CI-DEB000002
#PBS -l mppwidth=3,mppnppn=1,mppdepth=24
#PBS -l walltime=00:10:00
#PBS -o /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout
#PBS -e /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr
WORKER_LOGGING_LEVEL=NONE
#PBS -v WORKER_LOGGING_LEVEL
cd / && aprun -B /bin/sh -c /bin/date
/bin/echo $? >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode

And see the same behavior.
The exitcode file is indeed updated each time with a code 0. > > - I realized that manually testing the swift-generated submit file will > give new errors because the swift service is no longer alive and listening > on the port that the worker will try to connect to. Also, it seemed that > the .pl file itself that automatic coaster bootstrap places in > ~/.globus/coasters was not there. Im assuming that Swift removes these > files when it exits, but need to verify that this is true and that the > failure is not due to a missing .pl file. I suspect that this is normal > and is not the problem, but again, we need to keep debugging until the root > cause is found. > Mike > > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Michael Wilde" > > Cc: "Swift Devel" > > Sent: Sunday, November 13, 2011 7:20:25 AM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > I tried with a simple /bin/date command at the end of the submit > > script removing the call to worker.pl : > > > > > > > > #CoG This script generated by CoG > > #CoG by class: class > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > #CoG on date: 2011/11/13 02:16:54 > > > > > > #PBS -S /bin/bash > > #PBS -N Block-1113-1602 > > #PBS -m n > > #PBS -A CI-DEB000002 > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > #PBS -l walltime=00:10:00 > > #PBS -o > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > #PBS -e > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > WORKER_LOGGING_LEVEL=NONE > > #PBS -v WORKER_LOGGING_LEVEL > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh -c > > /bin/date > > > > > > ======= > > > > > > This fails too. The queue cancels the job as soon as it starts > > running, without writing anything to stdout or stderr. > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < wilde at mcs.anl.gov > > > wrote: > > > > > > OK, I dont need these; I can reproduce the problem as well. 
> > > > For some reason, the coaster worker is exiting immediately. > > > > I see a few possibilities: > > > > - Beagle networking may have changed, making it no longer possible to > > reach the coaster service from the compute nodes using the previous IP > > address ranges. > > > > - the worker.pl script is not being created in $HOME/.globus/coasters > > > > Mike > > > > > > > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > Ketan, can you post the submit script and site file? > > > > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > wrote: > > > > Hi, > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for swift. > > > > I > > > > tried > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting the > > > > same > > > > response: > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > site setting workersPerNode has been replaced with jobsPerNode! > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > Exception in cat: > > > > Arguments: [data.txt] > > > > Host: pbs > > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > stderr.txt: > > > > > > > > stdout.txt: > > > > > > > > ---- > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > > prematurely > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > > The following errors have occurred: > > > > 1. 
Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to be > > > > working. > > > > The scripts get submitted to the queue and immediately exits > > > > without > > > > writing anything to stdout or stderr. > > > > > > > > Were there any recent changes that could have affected this? > > > > > > > > I remember to have tried this successfully in the last week of > > > > last > > > > month. > > > > > > > > Regards, > > > > -- > > > > Ketan > > > > > > > > > > -- > > > Sent from my mobile device > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > >

--
Ketan

From wilde at mcs.anl.gov Sun Nov 13 09:36:26 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 09:36:26 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <1248689057.24670.1321198109090.JavaMail.root@zimbra.anl.gov>
Message-ID: <1812855757.24674.1321198586792.JavaMail.root@zimbra.anl.gov>

Ketan, here are two cross-checks to try:

1) Go back to your last successful runs (DSSAT?) and re-try a simple catsn with the same rev of swift+cog you used for that run.

2) Test on crow with that known working rev and the current rev.

I'm testing for the moment on beagle using the RC3 I get from "module load swift".
- Mike ----- Original Message ----- > From: "Michael Wilde" > To: "Ketan Maheshwari" > Cc: "Swift Devel" > Sent: Sunday, November 13, 2011 7:28:29 AM > Subject: Re: [Swift-devel] swift pbs/beagle broken > 2 thoughts here, Ketan: > > - when I tried my manual coaster test, I replaced the options "-n 3 -N > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" which says > "use the options from qsub". I was going to go back and see if there > was some subtle new mismatch between these qsub and aprun > processor-layout options. > > - I realized that manually testing the swift-generated submit file > will give new errors because the swift service is no longer alive and > listening on the port that the worker will try to connect to. Also, it > seemed that the .pl file itself that automatic coaster bootstrap > places in ~/.globus/coasters was not there. Im assuming that Swift > removes these files when it exits, but need to verify that this is > true and that the failure is not due to a missing .pl file. I suspect > that this is normal and is not the problem, but again, we need to keep > debugging until the root cause is found. 
> > Mike > > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Michael Wilde" > > Cc: "Swift Devel" > > Sent: Sunday, November 13, 2011 7:20:25 AM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > I tried with a simple /bin/date command at the end of the submit > > script removing the call to worker.pl : > > > > > > > > #CoG This script generated by CoG > > #CoG by class: class > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > #CoG on date: 2011/11/13 02:16:54 > > > > > > #PBS -S /bin/bash > > #PBS -N Block-1113-1602 > > #PBS -m n > > #PBS -A CI-DEB000002 > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > #PBS -l walltime=00:10:00 > > #PBS -o > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > #PBS -e > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > WORKER_LOGGING_LEVEL=NONE > > #PBS -v WORKER_LOGGING_LEVEL > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh -c > > /bin/date > > > > > > ======= > > > > > > This fails too. The queue cancels the job as soon as it starts > > running, without writing anything to stdout or stderr. > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < wilde at mcs.anl.gov > > > > > wrote: > > > > > > OK, I dont need these; I can reproduce the problem as well. > > > > For some reason, the coaster worker is exiting immediately. > > > > I see a few possibilities: > > > > - Beagle networking may have changed, making it no longer possible > > to > > reach the coaster service from the compute nodes using the previous > > IP > > address ranges. 
> > > > - the worker.pl script is not being created in > > $HOME/.globus/coasters > > > > Mike > > > > > > > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > Ketan, can you post the submit script and site file? > > > > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > > > > wrote: > > > > Hi, > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for > > > > swift. > > > > I > > > > tried > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting > > > > the > > > > same > > > > response: > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > site setting workersPerNode has been replaced with jobsPerNode! > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > Exception in cat: > > > > Arguments: [data.txt] > > > > Host: pbs > > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > stderr.txt: > > > > > > > > stdout.txt: > > > > > > > > ---- > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > > prematurely > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > > The following errors have occurred: > > > > 1. Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to > > > > be > > > > working. > > > > The scripts get submitted to the queue and immediately exits > > > > without > > > > writing anything to stdout or stderr. 
> > > > > > > > Were there any recent changes that could have affected this? > > > > > > > > I remember to have tried this successfully in the last week of > > > > last > > > > month. > > > > > > > > Regards, > > > > -- > > > > Ketan > > > > > > > > > > -- > > > Sent from my mobile device > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 09:45:05 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 09:45:05 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To:
Message-ID: <1532372571.24697.1321199105753.JavaMail.root@zimbra.anl.gov>

But if you put an explicit output redirection in the /bin/sh -c command, you will see that those commands are indeed executing and generating output.

So, as I mentioned earlier, I don't know whether the qsub -o and -e flags have changed behavior (e.g., they now can't write to /home???) or whether we are using them incorrectly.

But I think we need to go backwards and see why this is not working with the swift-generated qsub files.

We should next add the two tags to the sites file to obtain a log from the worker, on the (untested!) assumption that the worker is really starting in the automatic swift case:

<profile namespace="globus" key="workerLoggingLevel">DEBUG</profile>
<profile namespace="globus" key="workerLoggingDirectory">/lustre/beagle/wilde/beagle</profile>

- Mike

----- Original Message ----- > From: "Ketan Maheshwari" > To: "Michael Wilde" > Cc: "Swift Devel" > Sent: Sunday, November 13, 2011 7:35:24 AM > Subject: Re: [Swift-devel] swift pbs/beagle broken > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde < wilde at mcs.anl.gov > > wrote: > > > 2 thoughts here, Ketan: > > - when I tried my manual coaster test, I replaced the options "-n 3 -N > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" which says > "use the options from qsub". I was going to go back and see if there > was some subtle new mismatch between these qsub and aprun > processor-layout options. > > > > I tried the -B option: > > > > #CoG This script generated by CoG > #CoG by class: class > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > #CoG on date: 2011/11/13 02:16:54 > > > #PBS -S /bin/bash > #PBS -N Block-1113-1602 > #PBS -m n > #PBS -A CI-DEB000002 > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > #PBS -l walltime=00:10:00 > #PBS -o > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > #PBS -e > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > WORKER_LOGGING_LEVEL=NONE > #PBS -v WORKER_LOGGING_LEVEL > cd / && aprun -B /bin/sh -c /bin/date > /bin/echo $? > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > And see the same behavior. The exitcode file is indeed updated each > time with a code 0. > > > > - I realized that manually testing the swift-generated submit file > will give new errors because the swift service is no longer alive and > listening on the port that the worker will try to connect to. Also, it > seemed that the .pl file itself that automatic coaster bootstrap > places in ~/.globus/coasters was not there. Im assuming that Swift > removes these files when it exits, but need to verify that this is > true and that the failure is not due to a missing .pl file.
I suspect > that this is normal and is not the problem, but again, we need to keep > debugging until the root cause is found. > > > > Mike > > > ----- Original Message ----- > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > I tried with a simple /bin/date command at the end of the submit > > script removing the call to worker.pl : > > > > > > > > #CoG This script generated by CoG > > #CoG by class: class > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > #CoG on date: 2011/11/13 02:16:54 > > > > > > #PBS -S /bin/bash > > #PBS -N Block-1113-1602 > > #PBS -m n > > #PBS -A CI-DEB000002 > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > #PBS -l walltime=00:10:00 > > #PBS -o > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > #PBS -e > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > WORKER_LOGGING_LEVEL=NONE > > #PBS -v WORKER_LOGGING_LEVEL > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh -c > > /bin/date > > > > > > ======= > > > > > > This fails too. The queue cancels the job as soon as it starts > > running, without writing anything to stdout or stderr. > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < wilde at mcs.anl.gov > > > > > wrote: > > > > > > OK, I dont need these; I can reproduce the problem as well. > > > > For some reason, the coaster worker is exiting immediately. > > > > I see a few possibilities: > > > > - Beagle networking may have changed, making it no longer possible > > to > > reach the coaster service from the compute nodes using the previous > > IP > > address ranges. 
> > > > - the worker.pl script is not being created in > > $HOME/.globus/coasters > > > > Mike > > > > > > > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > Ketan, can you post the submit script and site file? > > > > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > > > > wrote: > > > > Hi, > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for > > > > swift. > > > > I > > > > tried > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting > > > > the > > > > same > > > > response: > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > site setting workersPerNode has been replaced with jobsPerNode! > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > Exception in cat: > > > > Arguments: [data.txt] > > > > Host: pbs > > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > stderr.txt: > > > > > > > > stdout.txt: > > > > > > > > ---- > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > > prematurely > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > > The following errors have occurred: > > > > 1. Task failed: 1113-160254-000000 Block task ended prematurely > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not seem to > > > > be > > > > working. > > > > The scripts get submitted to the queue and immediately exits > > > > without > > > > writing anything to stdout or stderr. 
> > > > > > > > Were there any recent changes that could have affected this? > > > > > > > > I remember to have tried this successfully in the last week of > > > > last > > > > month. > > > > > > > > Regards, > > > > -- > > > > Ketan > > > > > > > > > > -- > > > Sent from my mobile device > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > > > -- > Ketan

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 09:51:57 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 09:51:57 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <1532372571.24697.1321199105753.JavaMail.root@zimbra.anl.gov>
Message-ID: <1321029876.24711.1321199517139.JavaMail.root@zimbra.anl.gov>

I've backed up and just ran a test from swift (automatic). In that case I am *not* getting an exitcode file. Are you getting one?

- Mike

----- Original Message ----- > From: "Michael Wilde" > To: "Ketan Maheshwari" > Cc: "Swift Devel" > Sent: Sunday, November 13, 2011 7:45:05 AM > Subject: Re: [Swift-devel] swift pbs/beagle broken > But if you put an explicit output redirection in the /bin/sh -c > command, you will see that those commands are indeed executing and > generating output. > > So like I mentioned earlier, I dont know if the qsub -o and -e flags > have changed behavior (eg they now cant write to /home???), or if we > are using them incorrectly.
> > But I think we need to go backwards and see why this is not working > with the swift-generated qsub files. > > We should next add the two tags to the sites file to obtain a log from > the worker, on the (untested!) assumption that the worker is really > starting in the automatic swift case: > > DEBUG > key="workerLoggingDirectory">/lustre/beagle/wilde/beagle > > - Mike > > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Michael Wilde" > > Cc: "Swift Devel" > > Sent: Sunday, November 13, 2011 7:35:24 AM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde < wilde at mcs.anl.gov > > > wrote: > > > > > > 2 thoughts here, Ketan: > > > > - when I tried my manual coaster test, I replaced the options "-n 3 > > -N > > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" which says > > "use the options from qsub". I was going to go back and see if there > > was some subtle new mismatch between these qsub and aprun > > processor-layout options. > > > > > > > > I tried the -B option: > > > > > > > > #CoG This script generated by CoG > > #CoG by class: class > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > #CoG on date: 2011/11/13 02:16:54 > > > > > > #PBS -S /bin/bash > > #PBS -N Block-1113-1602 > > #PBS -m n > > #PBS -A CI-DEB000002 > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > #PBS -l walltime=00:10:00 > > #PBS -o > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > #PBS -e > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > WORKER_LOGGING_LEVEL=NONE > > #PBS -v WORKER_LOGGING_LEVEL > > cd / && aprun -B /bin/sh -c /bin/date > > /bin/echo $? > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > And see the same behavior. The exitcode file is indeed updated each > > time with a code 0. 
> > > > > > > > - I realized that manually testing the swift-generated submit file > > will give new errors because the swift service is no longer alive > > and > > listening on the port that the worker will try to connect to. Also, > > it > > seemed that the .pl file itself that automatic coaster bootstrap > > places in ~/.globus/coasters was not there. Im assuming that Swift > > removes these files when it exits, but need to verify that this is > > true and that the failure is not due to a missing .pl file. I > > suspect > > that this is normal and is not the problem, but again, we need to > > keep > > debugging until the root cause is found. > > > > > > > > Mike > > > > > > ----- Original Message ----- > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > I tried with a simple /bin/date command at the end of the submit > > > script removing the call to worker.pl : > > > > > > > > > > > > #CoG This script generated by CoG > > > #CoG by class: class > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > #PBS -S /bin/bash > > > #PBS -N Block-1113-1602 > > > #PBS -m n > > > #PBS -A CI-DEB000002 > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > #PBS -l walltime=00:10:00 > > > #PBS -o > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > #PBS -e > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > WORKER_LOGGING_LEVEL=NONE > > > #PBS -v WORKER_LOGGING_LEVEL > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh -c > > > /bin/date > > > > > > > > > ======= > > > > > > > > > This fails too. The queue cancels the job as soon as it starts > > > running, without writing anything to stdout or stderr. 
> > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > wilde at mcs.anl.gov > > > > > > > wrote: > > > > > > > > > OK, I dont need these; I can reproduce the problem as well. > > > > > > For some reason, the coaster worker is exiting immediately. > > > > > > I see a few possibilities: > > > > > > - Beagle networking may have changed, making it no longer possible > > > to > > > reach the coaster service from the compute nodes using the > > > previous > > > IP > > > address ranges. > > > > > > - the worker.pl script is not being created in > > > $HOME/.globus/coasters > > > > > > Mike > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > Ketan, can you post the submit script and site file? > > > > > > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > > > > > wrote: > > > > > Hi, > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken for > > > > > swift. > > > > > I > > > > > tried > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but getting > > > > > the > > > > > same > > > > > response: > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > site setting workersPerNode has been replaced with > > > > > jobsPerNode! 
> > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > > Exception in cat: > > > > > Arguments: [data.txt] > > > > > Host: pbs > > > > > Directory: catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > stderr.txt: > > > > > > > > > > stdout.txt: > > > > > > > > > > ---- > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task ended > > > > > prematurely > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 Failed:1 > > > > > The following errors have occurred: > > > > > 1. Task failed: 1113-160254-000000 Block task ended > > > > > prematurely > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not seem > > > > > to > > > > > be > > > > > working. > > > > > The scripts get submitted to the queue and immediately exits > > > > > without > > > > > writing anything to stdout or stderr. > > > > > > > > > > Were there any recent changes that could have affected this? > > > > > > > > > > I remember to have tried this successfully in the last week of > > > > > last > > > > > month. 
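A minimal sketch of the explicit-redirection test mentioned in the quoted message, assuming a POSIX /bin/sh. The function name run_redirected and the LAUNCHER variable are hypothetical and do not come from Swift or CoG; on Beagle, LAUNCHER would presumably hold the aprun invocation, and the output file would need to be on a filesystem the compute node can write to.

```shell
#!/bin/sh
# Hypothetical helper: run COMMAND through /bin/sh -c and send its stdout
# to a file named by the caller, rather than relying on the scheduler's
# -o/-e handling.
# Usage: run_redirected OUTFILE COMMAND
# LAUNCHER is an assumption (for example LAUNCHER="aprun -n 1");
# when it is empty or unset the command runs on the local node.
run_redirected() {
    outfile=$1
    command=$2
    # ${LAUNCHER:-} is unquoted on purpose so that it splits into words.
    # The redirection target is passed as $1 of the inner shell, which
    # keeps the file name out of the command string itself.
    ${LAUNCHER:-} /bin/sh -c "$command"' > "$1"' sh "$outfile"
}

# Example (not executed here):
#   run_redirected /some/writable/dir/date.out /bin/date
```

If the file is written while the scheduler's own stdout and stderr files stay empty, that would point at the -o/-e handling rather than at the command itself.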
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 15:52:58 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 15:52:58 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <1321029876.24711.1321199517139.JavaMail.root@zimbra.anl.gov>
Message-ID: <331770719.25046.1321221178389.JavaMail.root@zimbra.anl.gov>

It's starting to look like some kind of aprun-based failure. I see this
from more detailed logging I put into the generated script:

IN .submit script
aprun: Unexpected close of the apsys control connection
aprun: Exiting due to errors. Application aborted
aprun rc 1

I was led off track by the fact that the exitcode file is missing.
Seems that it's generated but then removed before we can see it.
I suspect one part of the provider thinks the worker-launch job
succeeded, and hence removes the exitcode file, but another part
realizes that the job failed. (conjecture...)

Now that that part is partially explained, I think I can go back to
debugging this from manual qsubs, which should go faster.

I'm still unsure if the missing stdout/err files are due to a Beagle
issue; it's starting to look more like they're due to the weird way in
which the aprun dies.

Digging deeper...

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Sunday, November 13, 2011 7:51:57 AM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> I've backed up and just did a test from Swift (automatic).
>
> I see that in that case I am *not* getting an exitcode file.
> Are you getting one?
>
> - Mike
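A minimal sketch of the kind of extra logging described above, assuming a POSIX shell. The function name run_and_log_rc is hypothetical, and the "rc" line format is only a guess at how the "aprun rc 1" line in the log was produced; the actual lines added to the generated script are not shown in the thread.

```shell
#!/bin/sh
# Hypothetical wrapper: announce that the submit script is running, run
# the given command, print "<command> rc <code>", and return that code.
# Usage: run_and_log_rc COMMAND [ARGS...]
run_and_log_rc() {
    echo "IN .submit script"
    "$@"
    rc=$?
    # $1 is the command name, so wrapping aprun prints "aprun rc N".
    echo "$1 rc $rc"
    return "$rc"
}

# Example (not executed here):
#   run_and_log_rc aprun -B /bin/sh -c /bin/date
```

Capturing the return code in a variable immediately after the command matters, because any later command would overwrite $?.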
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 16:41:36 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 16:41:36 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <331770719.25046.1321221178389.JavaMail.root@zimbra.anl.gov>
Message-ID: <165246084.25103.1321224096311.JavaMail.root@zimbra.anl.gov>

I tracked the message below down to the fact that aprun doesn't like
"&" in its command string. I vaguely recall reporting something similar
to Cray way back and they agreed it's a bug.

But it seems that the *original* Swift command string did not have a
"&" in it, so I'm back to square one.

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Sunday, November 13, 2011 1:52:58 PM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> It's starting to look like some kind of aprun-based failure. I see this
> from more detailed logging I put into the generated script:
>
> IN .submit script
> aprun: Unexpected close of the apsys control connection
> aprun: Exiting due to errors. Application aborted
> aprun rc 1
>
> I was led off track by the fact that the exitcode file is missing.
> Seems that it's generated but then removed before we can see it. I
> suspect one part of the provider thinks the worker-launch job
> succeeded, and hence removes the exitcode file, but another part
> realizes that the job failed. (conjecture...)
>
> Now that that part is partially explained, I think I can go back to
> debugging this from manual qsubs, which should go faster.
>
> I'm still unsure if the missing stdout/err files are due to a Beagle
> issue; it's starting to look more like they're due to the weird way in
> which the aprun dies.
>
> Digging deeper...
>
> - Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 17:08:36 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 17:08:36 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <165246084.25103.1321224096311.JavaMail.root@zimbra.anl.gov>
Message-ID: <150435543.25151.1321225716416.JavaMail.root@zimbra.anl.gov>

OK, as some of you can see in the message I just sent to beagle-support:
it now looks to me like the root problem of the swift jobs failing is
that our home dirs are not being seen on the compute nodes, hence the
swift-generated PBS script to launch the coaster workers can't find the
worker.pl script that swift copied to $HOME/.globus/coasters.

This is what I see:

The following was run under qsub -I; the line "total 0" shows that
/home/wilde was empty as seen by the compute node.

login1$ aprun /bin/sh -c 'hostname; ls -l /home/wilde/; mount | grep home; '
nid00466
total 0
/autonfs/home on /autonfs/home type dvs (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
/autonfs/home on /autonfs/home type dvs (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
Application 863284 resources: utime ~0s, stime ~0s
login1$

Can anyone verify that they are seeing the same symptom?

Thanks,

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Sunday, November 13, 2011 2:41:36 PM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> I tracked the message below down to the fact that aprun doesn't like
> "&" in its command string. I vaguely recall reporting something
> similar to Cray way back and they agreed it's a bug.
>
> But it seems that the *original* Swift command string did not have a
> "&" in it, so I'm back to square one.
>
> - Mike
I was going to go back and see if > > > > > there > > > > > was some subtle new mismatch between these qsub and aprun > > > > > processor-layout options. > > > > > > > > > > > > > > > > > > > > I tried the -B option: > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > #CoG by class: class > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > #PBS -N Block-1113-1602 > > > > > #PBS -m n > > > > > #PBS -A CI-DEB000002 > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > #PBS -l walltime=00:10:00 > > > > > #PBS -o > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > #PBS -e > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > cd / && aprun -B /bin/sh -c /bin/date > > > > > /bin/echo $? > > > > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > > > > > > > > > > And see the same behavior. The exitcode file is indeed updated > > > > > each > > > > > time with a code 0. > > > > > > > > > > > > > > > > > > > > - I realized that manually testing the swift-generated submit > > > > > file > > > > > will give new errors because the swift service is no longer > > > > > alive > > > > > and > > > > > listening on the port that the worker will try to connect to. > > > > > Also, > > > > > it > > > > > seemed that the .pl file itself that automatic coaster > > > > > bootstrap > > > > > places in ~/.globus/coasters was not there. Im assuming that > > > > > Swift > > > > > removes these files when it exits, but need to verify that > > > > > this > > > > > is > > > > > true and that the failure is not due to a missing .pl file. 
I > > > > > suspect > > > > > that this is normal and is not the problem, but again, we need > > > > > to > > > > > keep > > > > > debugging until the root cause is found. > > > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > > > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > I tried with a simple /bin/date command at the end of the > > > > > > submit > > > > > > script removing the call to worker.pl : > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > #CoG by class: class > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > #PBS -N Block-1113-1602 > > > > > > #PBS -m n > > > > > > #PBS -A CI-DEB000002 > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > #PBS -l walltime=00:10:00 > > > > > > #PBS -o > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > #PBS -e > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh > > > > > > -c > > > > > > /bin/date > > > > > > > > > > > > > > > > > > ======= > > > > > > > > > > > > > > > > > > This fails too. The queue cancels the job as soon as it > > > > > > starts > > > > > > running, without writing anything to stdout or stderr. 
> > > > > > > > > > > > > > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > OK, I dont need these; I can reproduce the problem as well. > > > > > > > > > > > > For some reason, the coaster worker is exiting immediately. > > > > > > > > > > > > I see a few possibilities: > > > > > > > > > > > > - Beagle networking may have changed, making it no longer > > > > > > possible > > > > > > to > > > > > > reach the coaster service from the compute nodes using the > > > > > > previous > > > > > > IP > > > > > > address ranges. > > > > > > > > > > > > - the worker.pl script is not being created in > > > > > > $HOME/.globus/coasters > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > Ketan, can you post the submit script and site file? > > > > > > > > > > > > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken > > > > > > > > for > > > > > > > > swift. 
> > > > > > > > I > > > > > > > > tried > > > > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but > > > > > > > > getting > > > > > > > > the > > > > > > > > same > > > > > > > > response: > > > > > > > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > > > > site setting workersPerNode has been replaced with > > > > > > > > jobsPerNode! > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > > > > > Exception in cat: > > > > > > > > Arguments: [data.txt] > > > > > > > > Host: pbs > > > > > > > > Directory: > > > > > > > > catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > > > > stderr.txt: > > > > > > > > > > > > > > > > stdout.txt: > > > > > > > > > > > > > > > > ---- > > > > > > > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task > > > > > > > > ended > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > Failed:1 > > > > > > > > The following errors have occurred: > > > > > > > > 1. Task failed: 1113-160254-000000 Block task ended > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not > > > > > > > > seem > > > > > > > > to > > > > > > > > be > > > > > > > > working. > > > > > > > > The scripts get submitted to the queue and immediately > > > > > > > > exits > > > > > > > > without > > > > > > > > writing anything to stdout or stderr. > > > > > > > > > > > > > > > > Were there any recent changes that could have affected > > > > > > > > this? > > > > > > > > > > > > > > > > I remember to have tried this successfully in the last > > > > > > > > week > > > > > > > > of > > > > > > > > last > > > > > > > > month. 
> > > > > > > > > > > > > > > > Regards, > > > > > > > > -- > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Sent from my mobile device > > > > > > > _______________________________________________ > > > > > > > Swift-devel mailing list > > > > > > > Swift-devel at ci.uchicago.edu > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > -- > > > > > > Michael Wilde > > > > > > Computation Institute, University of Chicago > > > > > > Mathematics and Computer Science Division > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Ketan > > > > > > > > > > -- > > > > > Michael Wilde > > > > > Computation Institute, University of Chicago > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Ketan > > > > > > > > -- > > > > Michael Wilde > > > > Computation Institute, University of Chicago > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at 
ci.uchicago.edu
> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hockyg at uchicago.edu Sun Nov 13 17:15:50 2011
From: hockyg at uchicago.edu (Glen Hocky)
Date: Sun, 13 Nov 2011 18:15:50 -0500
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <150435543.25151.1321225716416.JavaMail.root@zimbra.anl.gov>
References: <165246084.25103.1321224096311.JavaMail.root@zimbra.anl.gov> <150435543.25151.1321225716416.JavaMail.root@zimbra.anl.gov>
Message-ID:

Mike,

I just checked to help you narrow down the problem. I'm not using beagle right now. Anyway, to me it looks like /autonfs actually is mounted, but not with the correct user permissions. See below, run under qsub -I:

hockyg at login2:~> aprun /bin/sh -c 'hostname; ls -ld /autonfs/home/wilde; '
nid00467
drwxr-xr-x 2 root root 2 2010-12-15 15:59 /autonfs/home/wilde
Application 863300 resources: utime ~0s, stime ~0s

aprun /bin/sh -c 'hostname; ls -ld /lustre/beagle/hockyg; '
nid00467
drwxr-xr-x 3 hockyg ci-users 4096 2011-02-16 14:10 /lustre/beagle/hockyg
Application 863302 resources: utime ~0s, stime ~0s

hockyg at login2:~> aprun /bin/sh -c 'hostname; df -h; '
.nid00467
Filesystem               Size  Used Avail Use% Mounted on
rootfs                   166G   72G   86G  46% /
initramdevs               16G  728K   16G   1% /dev
rwvar                     16G   80K   16G   1% /var
rwtmp                     16G     0   16G   0% /tmp
31 at gni:/lustrefs      449T  129T  298T  31% /lustre/beagle
initramdevs               16G  728K   16G   1% /var/spool/alps
/                        166G   72G   86G  46% /
/gpfs/pads               350T  275T   75T  79% /gpfs/pads
/autonfs/home             26T  1.0M   26T   1% /autonfs/home
/soft                    493G  109G  359G  24% /soft
/ufs                      37G  702M   35G   2% /ufs
/.shared/class/cnos/etc  166G   72G   86G  46% /etc
initramdevs               16G  728K   16G   1% /dev
rwtmp                     16G     0   16G   0% /tmp
rwvar                     16G   80K   16G   1% /var
initramdevs               16G  728K   16G   1% /var/spool/alps
/gpfs/pads               350T  275T   75T  79% /gpfs/pads
/autonfs/home             26T  1.0M   26T   1% /autonfs/home
/soft                    493G  109G  359G  24% /soft
/ufs                      37G  702M   35G   2% /ufs
31 at gni:/lustrefs      449T  129T  298T  31% /lustre/beagle
Application 863301 resources: utime ~0s, stime ~0s

On Sun, Nov 13, 2011 at 6:08 PM, Michael Wilde wrote:

> OK, as some of you can see in the mesg I just send to beagle-support: it
> now looks ot me like the root problem of the swift jobs failing is that our
> home dirs are not beng seen on the compute nodes, hance the swift-generated
> PBS script to launch the coaster workers cant find the worker.pl script
> that swift copied to $HOME/.globus/coasters.
>
> This is what I see:
>
> The following was run under qsub -I; the line "total 0" shows that
> /home/wilde was empty as seen by the compute node.
>
> login1$ aprun /bin/sh -c 'hostname; ls -l /home/wilde/; mount | grep home;
> '
> nid00466
> total 0
> /autonfs/home on /autonfs/home type dvs
> (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
> /autonfs/home on /autonfs/home type dvs
> (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
> Application 863284 resources: utime ~0s, stime ~0s
> login1$
>
> Can anyone verify that they are seeing the same symptom?
>
> Thanks,
>
> - Mike
>
>
> ----- Original Message -----
> > From: "Michael Wilde"
> > To: "Ketan Maheshwari"
> > Cc: "Swift Devel"
> > Sent: Sunday, November 13, 2011 2:41:36 PM
> > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > I tracked the message below down to the fact that aprun doesnt like
> > "&" in its command string.
I vaguely recall reporting something > > similar to Cray way back and they agreed its a bug. > > > > But it seems that the *original* Swift command string did not have a > > "&" in it, so Im back to square one. > > > > - Mike > > > > ----- Original Message ----- > > > From: "Michael Wilde" > > > To: "Ketan Maheshwari" > > > Cc: "Swift Devel" > > > Sent: Sunday, November 13, 2011 1:52:58 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > Its starting to look like some kind of aprun-based failure. I see > > > this > > > from more detailed logging I put into the generated script: > > > > > > IN .submit script > > > aprun: Unexpected close of the apsys control connection > > > aprun: Exiting due to errors. Application aborted > > > aprun rc 1 > > > > > > I was led off track by the fact that the exitcode file is missing. > > > Seems that its generated but then removed before we can see it. I > > > suspect one part of the provider thinks the worker-launch job > > > succeeded, and hence removes the exitcode file, but another part > > > realizes that the job failed. (conjecture...) > > > > > > Now that that part is partially explained, I think I can go back to > > > debugging this from manual qsubs which should go faster. > > > > > > Im still unsure if the missing stdout/err files is due to a Beagle > > > issue; starting to look more like maybe due to the weird way in > > > which > > > the aprun dies. > > > > > > Digging deeper... > > > > > > - Mike > > > > > > ----- Original Message ----- > > > > From: "Michael Wilde" > > > > To: "Ketan Maheshwari" > > > > Cc: "Swift Devel" > > > > Sent: Sunday, November 13, 2011 7:51:57 AM > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > Ive backed up and just did a test from swift (automatic) > > > > > > > > I see that in that case I am *not* getting an exitcode file. > > > > Are you getting one? 
> > > > > > > > - Mike > > > > > > > > ----- Original Message ----- > > > > > From: "Michael Wilde" > > > > > To: "Ketan Maheshwari" > > > > > Cc: "Swift Devel" > > > > > Sent: Sunday, November 13, 2011 7:45:05 AM > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > But if you put an explicit output redirection in the /bin/sh -c > > > > > command, you will see that those commands are indeed executing > > > > > and > > > > > generating output. > > > > > > > > > > So like I mentioned earlier, I dont know if the qsub -o and -e > > > > > flags > > > > > have changed behavior (eg they now cant write to /home???), or > > > > > if > > > > > we > > > > > are using them incorrectly. > > > > > > > > > > But I think we need to go backwards and see why this is not > > > > > working > > > > > with the swift-generated qsub files. > > > > > > > > > > We should next add the two tags to the sites file to obtain a > > > > > log > > > > > from > > > > > the worker, on the (untested!) assumption that the worker is > > > > > really > > > > > starting in the automatic swift case: > > > > > > > > > > > > > > key="workerLoggingLevel">DEBUG > > > > > > > > > key="workerLoggingDirectory">/lustre/beagle/wilde/beagle > > > > > > > > > > - Mike > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > From: "Ketan Maheshwari" > > > > > > To: "Michael Wilde" > > > > > > Cc: "Swift Devel" > > > > > > Sent: Sunday, November 13, 2011 7:35:24 AM > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde < > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > 2 thoughts here, Ketan: > > > > > > > > > > > > - when I tried my manual coaster test, I replaced the options > > > > > > "-n > > > > > > 3 > > > > > > -N > > > > > > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" which > > > > > > says > > > > > > "use the options from qsub". 
I was going to go back and see if > > > > > > there > > > > > > was some subtle new mismatch between these qsub and aprun > > > > > > processor-layout options. > > > > > > > > > > > > > > > > > > > > > > > > I tried the -B option: > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > #CoG by class: class > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > #PBS -N Block-1113-1602 > > > > > > #PBS -m n > > > > > > #PBS -A CI-DEB000002 > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > #PBS -l walltime=00:10:00 > > > > > > #PBS -o > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > #PBS -e > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > cd / && aprun -B /bin/sh -c /bin/date > > > > > > /bin/echo $? > > > > > > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > > > > > > > > > > > > > And see the same behavior. The exitcode file is indeed updated > > > > > > each > > > > > > time with a code 0. > > > > > > > > > > > > > > > > > > > > > > > > - I realized that manually testing the swift-generated submit > > > > > > file > > > > > > will give new errors because the swift service is no longer > > > > > > alive > > > > > > and > > > > > > listening on the port that the worker will try to connect to. > > > > > > Also, > > > > > > it > > > > > > seemed that the .pl file itself that automatic coaster > > > > > > bootstrap > > > > > > places in ~/.globus/coasters was not there. Im assuming that > > > > > > Swift > > > > > > removes these files when it exits, but need to verify that > > > > > > this > > > > > > is > > > > > > true and that the failure is not due to a missing .pl file. 
I > > > > > > suspect > > > > > > that this is normal and is not the problem, but again, we need > > > > > > to > > > > > > keep > > > > > > debugging until the root cause is found. > > > > > > > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > > > > > > > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > I tried with a simple /bin/date command at the end of the > > > > > > > submit > > > > > > > script removing the call to worker.pl : > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > #CoG by class: class > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > #PBS -N Block-1113-1602 > > > > > > > #PBS -m n > > > > > > > #PBS -A CI-DEB000002 > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > #PBS -o > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > #PBS -e > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive /bin/sh > > > > > > > -c > > > > > > > /bin/date > > > > > > > > > > > > > > > > > > > > > ======= > > > > > > > > > > > > > > > > > > > > > This fails too. The queue cancels the job as soon as it > > > > > > > starts > > > > > > > running, without writing anything to stdout or stderr. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > OK, I dont need these; I can reproduce the problem as well. > > > > > > > > > > > > > > For some reason, the coaster worker is exiting immediately. > > > > > > > > > > > > > > I see a few possibilities: > > > > > > > > > > > > > > - Beagle networking may have changed, making it no longer > > > > > > > possible > > > > > > > to > > > > > > > reach the coaster service from the compute nodes using the > > > > > > > previous > > > > > > > IP > > > > > > > address ranges. > > > > > > > > > > > > > > - the worker.pl script is not being created in > > > > > > > $HOME/.globus/coasters > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > Ketan, can you post the submit script and site file? > > > > > > > > > > > > > > > > On 11/12/11, Ketan Maheshwari < ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is broken > > > > > > > > > for > > > > > > > > > swift. 
> > > > > > > > > I > > > > > > > > > tried > > > > > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but > > > > > > > > > getting > > > > > > > > > the > > > > > > > > > same > > > > > > > > > response: > > > > > > > > > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > > > > > site setting workersPerNode has been replaced with > > > > > > > > > jobsPerNode! > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 Active:1 > > > > > > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > > > > > > Exception in cat: > > > > > > > > > Arguments: [data.txt] > > > > > > > > > Host: pbs > > > > > > > > > Directory: > > > > > > > > > catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > > > > > stderr.txt: > > > > > > > > > > > > > > > > > > stdout.txt: > > > > > > > > > > > > > > > > > > ---- > > > > > > > > > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task > > > > > > > > > ended > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > Failed:1 > > > > > > > > > The following errors have occurred: > > > > > > > > > 1. Task failed: 1113-160254-000000 Block task ended > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does not > > > > > > > > > seem > > > > > > > > > to > > > > > > > > > be > > > > > > > > > working. > > > > > > > > > The scripts get submitted to the queue and immediately > > > > > > > > > exits > > > > > > > > > without > > > > > > > > > writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > Were there any recent changes that could have affected > > > > > > > > > this? 
> > > > > > > > > > > > > > > > > > I remember to have tried this successfully in the last > > > > > > > > > week > > > > > > > > > of > > > > > > > > > last > > > > > > > > > month. > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > -- > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Sent from my mobile device > > > > > > > > _______________________________________________ > > > > > > > > Swift-devel mailing list > > > > > > > > Swift-devel at ci.uchicago.edu > > > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > -- > > > > > > > Michael Wilde > > > > > > > Computation Institute, University of Chicago > > > > > > > Mathematics and Computer Science Division > > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Ketan > > > > > > > > > > > > -- > > > > > > Michael Wilde > > > > > > Computation Institute, University of Chicago > > > > > > Mathematics and Computer Science Division > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Ketan > > > > > > > > > > -- > > > > > Michael Wilde > > > > > Computation Institute, University of Chicago > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > -- > > > > Michael Wilde > > > > Computation Institute, University of Chicago > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > 
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> > >
> > > --
> > > Michael Wilde
> > > Computation Institute, University of Chicago
> > > Mathematics and Computer Science Division
> > > Argonne National Laboratory
> > >
> > > _______________________________________________
> > > Swift-devel mailing list
> > > Swift-devel at ci.uchicago.edu
> > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>

From wilde at mcs.anl.gov Sun Nov 13 19:48:35 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 19:48:35 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <150435543.25151.1321225716416.JavaMail.root@zimbra.anl.gov>
Message-ID: <1857708396.25288.1321235315227.JavaMail.root@zimbra.anl.gov>

OK, here is a simple fix for this problem. Just add handling of the variable "SWIFT_USERHOME" to your swift command script; then do:

export SWIFT_USERHOME=/lustre/beagle/wilde
swift etc

This makes swift use $SWIFT_USERHOME instead of $HOME to locate the .globus directory. This will of course break if a swift run needs to locate your certificates; possibly you can get around that with a symlink. But I suspect most uses of this will be for local execution on systems like Beagle, where the home dirs are not usable from the compute nodes.
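For the symlink idea, here is a minimal, untested sketch. It assumes that your credentials live under $HOME/.globus on the login node and that Swift, once user.home points at $SWIFT_USERHOME, looks for them under $SWIFT_USERHOME/.globus. The function name link_globus_creds and the list of entries (certificates, usercert.pem, userkey.pem) are illustrative assumptions, not something taken from the Swift source.

```shell
#!/bin/sh
# Hypothetical helper (not part of Swift): expose existing credentials
# under the alternate home that SWIFT_USERHOME points to.
# Usage: link_globus_creds REAL_HOME ALTERNATE_HOME
link_globus_creds() {
    real_home=$1
    swift_home=$2

    # The .globus directory must exist before we can link into it.
    mkdir -p "$swift_home/.globus" || return 1

    # Assumed credential entries; adjust to whatever your ~/.globus holds.
    for entry in certificates usercert.pem userkey.pem; do
        src="$real_home/.globus/$entry"
        dst="$swift_home/.globus/$entry"
        # Link only what exists, and never overwrite something already there.
        if [ -e "$src" ] && [ ! -e "$dst" ] && [ ! -L "$dst" ]; then
            ln -s "$src" "$dst" || return 1
        fi
    done
}

# Example, on the login node:
#   export SWIFT_USERHOME=/lustre/beagle/wilde
#   link_globus_creds "$HOME" "$SWIFT_USERHOME"
```

The links point back into the real home directory, so they will only resolve where that directory is visible (the login node). That should be enough if only the swift client needs the certificates, but that is an assumption to verify.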
Here's the 1-line fix:

login$ pwd
/home/wilde/swift/src/0.93/cog/modules/swift/bin
login$ svn diff
Index: swift
===================================================================
--- swift (revision 5284)
+++ swift (working copy)
@@ -86,6 +86,7 @@
 updateOptions "$X509_USER_PROXY" "X509_USER_PROXY"
 updateOptions "$SWIFT_HOME" "COG_INSTALL_PATH"
 updateOptions "$SWIFT_HOME" "swift.home"
+updateOptions "$SWIFT_USERHOME" "user.home"
 #Use /dev/urandom instead of /dev/random for seeding RNGs
 #This will lower the randomness of the seed, but avoid
 #large delays if /dev/random does not have enough entropy collected
login$

If others can confirm that this works, I'll check it in.

- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Sunday, November 13, 2011 3:08:36 PM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> OK, as some of you can see in the message I just sent to beagle-support,
> it now looks to me like the root problem of the swift jobs failing is
> that our home dirs are not being seen on the compute nodes, hence the
> swift-generated PBS script to launch the coaster workers can't find the
> worker.pl script that swift copied to $HOME/.globus/coasters.
>
> This is what I see:
>
> The following was run under qsub -I; the line "total 0" shows that
> /home/wilde was empty as seen by the compute node.
>
> login1$ aprun /bin/sh -c 'hostname; ls -l /home/wilde/; mount | grep home; '
> nid00466
> total 0
> /autonfs/home on /autonfs/home type dvs (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
> /autonfs/home on /autonfs/home type dvs (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
> Application 863284 resources: utime ~0s, stime ~0s
> login1$
>
> Can anyone verify that they are seeing the same symptom?
>
> Thanks,
>
> - Mike
>
>
> ----- Original Message -----
> > From: "Michael Wilde"
> > To: "Ketan Maheshwari"
> > Cc: "Swift Devel"
> > Sent: Sunday, November 13, 2011 2:41:36 PM
> > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > I tracked the message below down to the fact that aprun doesn't like
> > "&" in its command string. I vaguely recall reporting something
> > similar to Cray way back and they agreed it's a bug.
> >
> > But it seems that the *original* Swift command string did not have a
> > "&" in it, so I'm back to square one.
> >
> > - Mike
> >
> > ----- Original Message -----
> > > From: "Michael Wilde"
> > > To: "Ketan Maheshwari"
> > > Cc: "Swift Devel"
> > > Sent: Sunday, November 13, 2011 1:52:58 PM
> > > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > > It's starting to look like some kind of aprun-based failure. I see
> > > this from more detailed logging I put into the generated script:
> > >
> > > IN .submit script
> > > aprun: Unexpected close of the apsys control connection
> > > aprun: Exiting due to errors. Application aborted
> > > aprun rc 1
> > >
> > > I was led off track by the fact that the exitcode file is missing.
> > > Seems that it's generated but then removed before we can see it. I
> > > suspect one part of the provider thinks the worker-launch job
> > > succeeded, and hence removes the exitcode file, but another part
> > > realizes that the job failed. (conjecture...)
> > >
> > > Now that that part is partially explained, I think I can go back to
> > > debugging this from manual qsubs, which should go faster.
> > >
> > > I'm still unsure if the missing stdout/err files are due to a Beagle
> > > issue; it's starting to look more like they're due to the weird way
> > > in which the aprun dies.
> > >
> > > Digging deeper...
> > >
> > > - Mike
> > >
> > > ----- Original Message -----
> > > > From: "Michael Wilde"
> > > > To: "Ketan Maheshwari"
> > > > Cc: "Swift Devel"
> > > > Sent: Sunday, November 13, 2011 7:51:57 AM
> > > > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > > > I've backed up and just did a test from swift (automatic).
> > > >
> > > > I see that in that case I am *not* getting an exitcode file.
> > > > Are you getting one?
> > > >
> > > > - Mike
> > > >
> > > > ----- Original Message -----
> > > > > From: "Michael Wilde"
> > > > > To: "Ketan Maheshwari"
> > > > > Cc: "Swift Devel"
> > > > > Sent: Sunday, November 13, 2011 7:45:05 AM
> > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > > > > But if you put an explicit output redirection in the /bin/sh -c
> > > > > command, you will see that those commands are indeed executing
> > > > > and generating output.
> > > > >
> > > > > So like I mentioned earlier, I don't know if the qsub -o and -e
> > > > > flags have changed behavior (e.g. they now can't write to
> > > > > /home???), or if we are using them incorrectly.
> > > > >
> > > > > But I think we need to go backwards and see why this is not
> > > > > working with the swift-generated qsub files.
> > > > >
> > > > > We should next add the two tags to the sites file to obtain a log
> > > > > from the worker, on the (untested!) assumption that the worker is
> > > > > really starting in the automatic swift case:
> > > > >
> > > > > <profile namespace="globus" key="workerLoggingLevel">DEBUG</profile>
> > > > > <profile namespace="globus" key="workerLoggingDirectory">/lustre/beagle/wilde/beagle</profile>
> > > > >
> > > > > - Mike
> > > > >
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Ketan Maheshwari"
> > > > > > To: "Michael Wilde"
> > > > > > Cc: "Swift Devel"
> > > > > > Sent: Sunday, November 13, 2011 7:35:24 AM
> > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > > > > > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde <wilde at mcs.anl.gov> wrote:
> > > > > >
> > > > > > 2 thoughts here, Ketan:
> > > > > >
> > > > > > - when I tried my manual coaster test, I replaced the options
> > > > > > "-n 3 -N 1 -cc none -d 24 -F exclusive" on aprun with simply "-B",
> > > > > > which says "use the options from qsub". I was going to go back and
> > > > > > see if there was some subtle new mismatch between these qsub and
> > > > > > aprun processor-layout options.
> > > > > > > > > > > > > > > > > > > > > > > > I tried the -B option: > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > #CoG by class: class > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > #PBS -N Block-1113-1602 > > > > > > #PBS -m n > > > > > > #PBS -A CI-DEB000002 > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > #PBS -l walltime=00:10:00 > > > > > > #PBS -o > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > #PBS -e > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > cd / && aprun -B /bin/sh -c /bin/date > > > > > > /bin/echo $? > > > > > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > > > > > > > > > > > > > And see the same behavior. The exitcode file is indeed > > > > > > updated > > > > > > each > > > > > > time with a code 0. > > > > > > > > > > > > > > > > > > > > > > > > - I realized that manually testing the swift-generated > > > > > > submit > > > > > > file > > > > > > will give new errors because the swift service is no longer > > > > > > alive > > > > > > and > > > > > > listening on the port that the worker will try to connect > > > > > > to. > > > > > > Also, > > > > > > it > > > > > > seemed that the .pl file itself that automatic coaster > > > > > > bootstrap > > > > > > places in ~/.globus/coasters was not there. Im assuming that > > > > > > Swift > > > > > > removes these files when it exits, but need to verify that > > > > > > this > > > > > > is > > > > > > true and that the failure is not due to a missing .pl file. 
> > > > > > I > > > > > > suspect > > > > > > that this is normal and is not the problem, but again, we > > > > > > need > > > > > > to > > > > > > keep > > > > > > debugging until the root cause is found. > > > > > > > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > > > > > > > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > I tried with a simple /bin/date command at the end of the > > > > > > > submit > > > > > > > script removing the call to worker.pl : > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > #CoG by class: class > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > #PBS -N Block-1113-1602 > > > > > > > #PBS -m n > > > > > > > #PBS -A CI-DEB000002 > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > #PBS -o > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > #PBS -e > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive > > > > > > > /bin/sh > > > > > > > -c > > > > > > > /bin/date > > > > > > > > > > > > > > > > > > > > > ======= > > > > > > > > > > > > > > > > > > > > > This fails too. The queue cancels the job as soon as it > > > > > > > starts > > > > > > > running, without writing anything to stdout or stderr. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > OK, I dont need these; I can reproduce the problem as > > > > > > > well. > > > > > > > > > > > > > > For some reason, the coaster worker is exiting > > > > > > > immediately. > > > > > > > > > > > > > > I see a few possibilities: > > > > > > > > > > > > > > - Beagle networking may have changed, making it no longer > > > > > > > possible > > > > > > > to > > > > > > > reach the coaster service from the compute nodes using the > > > > > > > previous > > > > > > > IP > > > > > > > address ranges. > > > > > > > > > > > > > > - the worker.pl script is not being created in > > > > > > > $HOME/.globus/coasters > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > Ketan, can you post the submit script and site file? > > > > > > > > > > > > > > > > On 11/12/11, Ketan Maheshwari < > > > > > > > > ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is > > > > > > > > > broken > > > > > > > > > for > > > > > > > > > swift. 
> > > > > > > > > I > > > > > > > > > tried > > > > > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 but > > > > > > > > > getting > > > > > > > > > the > > > > > > > > > same > > > > > > > > > response: > > > > > > > > > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > > > > > site setting workersPerNode has been replaced with > > > > > > > > > jobsPerNode! > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > Active:1 > > > > > > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > > > > > > Exception in cat: > > > > > > > > > Arguments: [data.txt] > > > > > > > > > Host: pbs > > > > > > > > > Directory: > > > > > > > > > catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > > > > > stderr.txt: > > > > > > > > > > > > > > > > > > stdout.txt: > > > > > > > > > > > > > > > > > > ---- > > > > > > > > > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block task > > > > > > > > > ended > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > Failed:1 > > > > > > > > > The following errors have occurred: > > > > > > > > > 1. Task failed: 1113-160254-000000 Block task ended > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does > > > > > > > > > not > > > > > > > > > seem > > > > > > > > > to > > > > > > > > > be > > > > > > > > > working. > > > > > > > > > The scripts get submitted to the queue and immediately > > > > > > > > > exits > > > > > > > > > without > > > > > > > > > writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > Were there any recent changes that could have affected > > > > > > > > > this? 
> > > > > > > > > > > > > > > > > > I remember to have tried this successfully in the last > > > > > > > > > week > > > > > > > > > of > > > > > > > > > last > > > > > > > > > month. > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > -- > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Sent from my mobile device > > > > > > > > _______________________________________________ > > > > > > > > Swift-devel mailing list > > > > > > > > Swift-devel at ci.uchicago.edu > > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > -- > > > > > > > Michael Wilde > > > > > > > Computation Institute, University of Chicago > > > > > > > Mathematics and Computer Science Division > > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Ketan > > > > > > > > > > > > -- > > > > > > Michael Wilde > > > > > > Computation Institute, University of Chicago > > > > > > Mathematics and Computer Science Division > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Ketan > > > > > > > > > > -- > > > > > Michael Wilde > > > > > Computation Institute, University of Chicago > > > > > Mathematics and Computer Science Division > > > > > Argonne National Laboratory > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > -- > > > > Michael Wilde > > > > Computation Institute, University of Chicago > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > 
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 20:00:57 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 20:00:57 -0600 (CST)
Subject: [Swift-devel] Extra jobs in coaster runs?
In-Reply-To: <528098462.25292.1321235701022.JavaMail.root@zimbra.anl.gov>
Message-ID: <488176303.25299.1321236057141.JavaMail.root@zimbra.anl.gov>

Mihael, All,

I noticed in hunting down the beagle bug that we're getting "spurious" jobs submitted, seemingly at the end of a coaster run.

For example, after running 1000 simple cats, I see these two extra jobs in cleanup state: in the run below, 519103 and 519105 were the "main" jobs that ran my app() calls: 1 app call for 519103, 1000 for 519105. Then as these scripts terminate, it seems that some weird (cleanup?) job gets sent, with the jobname "cog-000000".

If that is indeed cleanup, any idea why it gets sent as a separate job instead of doing the cleanup action as part of the existing coaster block?
In that case can we avoid having to wait on another queued job to finish the script?

- Mike

login1$ qstat -u wilde

sdb:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
519103.sdb           wilde    developm Block-1114-3601   19725     3  --     -- 00:02 C 00:00
519104.sdb           wilde    developm cog-000000        18568     3  --     -- 00:00 C 00:00
519105.sdb           wilde    developm Block-1114-3901   20898     3  --     -- 00:02 C 00:00
519106.sdb           wilde    developm cog-000000        21367     3  --     -- 00:00 C 00:00
login1$

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Sun Nov 13 20:05:15 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sun, 13 Nov 2011 20:05:15 -0600
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To: <1857708396.25288.1321235315227.JavaMail.root@zimbra.anl.gov>
References: <150435543.25151.1321225716416.JavaMail.root@zimbra.anl.gov> <1857708396.25288.1321235315227.JavaMail.root@zimbra.anl.gov>
Message-ID:

This fix works for me. I tested with one catsn job on Beagle.

On Sun, Nov 13, 2011 at 7:48 PM, Michael Wilde wrote:

> OK, here is a simple fix for this problem. Just add the variable
> "SWIFT_USERHOME" to your swift command; then do:
>
> export SWIFT_USERHOME=/lustre/beagle/wilde
> swift etc
>
> This makes swift use $SWIFT_USERHOME instead of $HOME to locate the
> .globus directory.
>
> This will of course mess up if a swift run needs to locate your
> certificates; possibly you can get around that with a symlink. But I
> suspect most uses of this will be for local execution on systems like
> Beagle with non-writeable home dirs.
> > > > > --
> > > > > Michael Wilde
> > > > > Computation Institute, University of Chicago
> > > > > Mathematics and
Computer Science Division
> > > > > Argonne National Laboratory

--
Ketan

From hategan at mcs.anl.gov Sun Nov 13 20:12:31 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sun, 13 Nov 2011 18:12:31 -0800
Subject: [Swift-devel] Extra jobs in coaster runs?
In-Reply-To: <488176303.25299.1321236057141.JavaMail.root@zimbra.anl.gov>
References: <488176303.25299.1321236057141.JavaMail.root@zimbra.anl.gov>
Message-ID: <1321236751.6521.1.camel@blabla>

On Sun, 2011-11-13 at 20:00 -0600, Michael Wilde wrote:
> Mihael, All,
>
> I noticed in hunting down the beagle bug that we're getting "spurious"
> jobs submitted, seemingly at the end of a coaster run. For example,
> after running 1000 simple cats, I see these two extra in cleanup
> state: in the run below 51903 and 51905 were the "main" jobs that ran
> my app() calls. 1 app call for 51903, 1000 for 51905.
>
> Then as these scripts terminate, it seems that some weird (cleanup?)
> job gets sent, with the jobname "cog-000000".
>
> If that is indeed cleanup, any idea why it gets sent as a separate job
> instead of doing the cleanup action as part of the existing coaster
> block? In that case can we avoid having to wait on another queued job
> to finish the script?

The cleanup jobs are actually sent as batch jobs, so swift doesn't wait
for them.

Mihael

From wilde at mcs.anl.gov Sun Nov 13 20:16:49 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 20:16:49 -0600 (CST)
Subject: [Swift-devel] Extra jobs in coaster runs?
In-Reply-To: <1321236751.6521.1.camel@blabla>
Message-ID: <1612443729.25315.1321237009115.JavaMail.root@zimbra.anl.gov>

Ah, cool - makes sense, sounds good. I'll file a ticket to document this.

- Mike

----- Original Message -----
> From: "Mihael Hategan"
> To: "Michael Wilde"
> Cc: "Swift Devel"
> Sent: Sunday, November 13, 2011 6:12:31 PM
> Subject: Re: Extra jobs in coaster runs?
> On Sun, 2011-11-13 at 20:00 -0600, Michael Wilde wrote:
> > Mihael, All,
> >
> > I noticed in hunting down the beagle bug that we're getting "spurious"
> > jobs submitted, seemingly at the end of a coaster run.
For example,
> > after running 1000 simple cats, I see these two extra in cleanup
> > state: in the run below 51903 and 51905 were the "main" jobs that ran
> > my app() calls. 1 app call for 51903, 1000 for 51905.
> >
> > Then as these scripts terminate, it seems that some weird (cleanup?)
> > job gets sent, with the jobname "cog-000000".
> >
> > If that is indeed cleanup, any idea why it gets sent as a separate job
> > instead of doing the cleanup action as part of the existing coaster
> > block? In that case can we avoid having to wait on another queued job
> > to finish the script?
>
> The cleanup jobs are actually sent as batch jobs, so swift doesn't wait
> for them.
>
> Mihael

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Sun Nov 13 20:34:51 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sun, 13 Nov 2011 20:34:51 -0600 (CST)
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To:
Message-ID: <24844164.25319.1321238091049.JavaMail.root@zimbra.anl.gov>

OK. I tested with a 1-job run and a 1000-job run.

I committed this "feature" to 0.93 as Swift rev 5285. I'm assuming it's safe
because no one is likely to set SWIFT_USERHOME unless we tell them to.

David, can you build and post a new RC?
(Do you know how to mark the release as 0.93RC5 per the method Justin
described in our last meeting? So that it shows up in the Swift log as that
release name...)

Ketan, can you see if this now gets Fangfang rolling?

Thanks, all.

- Mike

----- Original Message -----
> From: "Ketan Maheshwari"
> To: "Michael Wilde"
> Cc: "Swift Devel"
> Sent: Sunday, November 13, 2011 6:05:15 PM
> Subject: Re: [Swift-devel] swift pbs/beagle broken
> This fix works for me. I tested with one catsn job on Beagle.
>
>
> On Sun, Nov 13, 2011 at 7:48 PM, Michael Wilde < wilde at mcs.anl.gov >
> wrote:
>
>
> OK, here is a simple fix for this problem.
Just add the variable
> "SWIFT_USERHOME" to your swift command; then do:
>
> export SWIFT_USERHOME=/lustre/beagle/wilde
> swift etc
>
> This makes swift use $SWIFT_USERHOME instead of $HOME to locate the
> .globus directory.
>
> This will of course mess up if a swift run needs to locate your
> certificates; possibly you can get around that with a symlink. But I
> suspect most uses of this will be for local execution on systems like
> Beagle with non-writeable home dirs.
>
> Here's the 1-line fix:
>
> login$ pwd
> /home/wilde/swift/src/0.93/cog/modules/swift/bin
> login$ svn diff
> Index: swift
> ===================================================================
> --- swift (revision 5284)
> +++ swift (working copy)
> @@ -86,6 +86,7 @@
> updateOptions "$X509_USER_PROXY" "X509_USER_PROXY"
> updateOptions "$SWIFT_HOME" "COG_INSTALL_PATH"
> updateOptions "$SWIFT_HOME" "swift.home"
> +updateOptions "$SWIFT_USERHOME" "user.home"
> #Use /dev/urandom instead of /dev/random for seeding RNGs
> #This will lower the randomness of the seed, but avoid
> #large delays if /dev/random does not have enough entropy collected
> login$
>
> If others can confirm that this works, I'll check it in.
>
>
> - Mike
>
>
>
> ----- Original Message -----
> > From: "Michael Wilde" < wilde at mcs.anl.gov >
> > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >
> > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >
> >
> > Sent: Sunday, November 13, 2011 3:08:36 PM
> > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > OK, as some of you can see in the message I just sent to beagle-support:
> > it now looks to me like the root problem of the swift jobs failing is
> > that our home dirs are not being seen on the compute nodes, hence the
> > swift-generated PBS script to launch the coaster workers can't find the
> > worker.pl script that swift copied to $HOME/.globus/coasters.
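
A minimal sketch of how the one-line change quoted above plausibly takes effect. It assumes the launcher turns each value/name pair into a -Dname=value JVM argument and skips pairs whose value is empty; add_jvm_property and build_options are hypothetical stand-ins written for illustration, not the updateOptions function from the swift script, whose body is not shown in this thread.

```shell
#!/bin/sh
# Hypothetical stand-in for the launcher helper (assumption: the real
# updateOptions may differ). Appends -D<name>=<value> to OPTIONS only
# when the value is non-empty.
OPTIONS=""
add_jvm_property() {
    value=$1
    name=$2
    if [ -n "$value" ]; then
        OPTIONS="$OPTIONS -D$name=$value"
    fi
}

# Builds the option string for the single property discussed in the thread.
# With SWIFT_USERHOME unset or empty, nothing is added, so the JVM keeps its
# default user.home; when it is set, user.home is overridden.
build_options() {
    OPTIONS=""
    add_jvm_property "${SWIFT_USERHOME:-}" "user.home"
    printf '%s\n' "$OPTIONS"
}
```

Under these assumptions, running with SWIFT_USERHOME=/lustre/beagle/wilde would yield the argument -Duser.home=/lustre/beagle/wilde, while leaving the variable unset changes nothing. That matches the remark in the thread that the change is safe for anyone who does not set the variable.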
> >
> > This is what I see:
> >
> > The following was run under qsub -I; the line "total 0" shows that
> > /home/wilde was empty as seen by the compute node.
> >
> > login1$ aprun /bin/sh -c 'hostname; ls -l /home/wilde/; mount | grep home; '
> > nid00466
> > total 0
> > /autonfs/home on /autonfs/home type dvs
> > (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
> > /autonfs/home on /autonfs/home type dvs
> > (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5)
> > Application 863284 resources: utime ~0s, stime ~0s
> > login1$
> >
> > Can anyone verify that they are seeing the same symptom?
> >
> > Thanks,
> >
> > - Mike
> >
> >
> > ----- Original Message -----
> > > From: "Michael Wilde" < wilde at mcs.anl.gov >
> > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >
> > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >
> > > Sent: Sunday, November 13, 2011 2:41:36 PM
> > > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > > I tracked the message below down to the fact that aprun doesn't like
> > > "&" in its command string. I vaguely recall reporting something
> > > similar to Cray way back and they agreed it's a bug.
> > >
> > > But it seems that the *original* Swift command string did not have a
> > > "&" in it, so I'm back to square one.
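
The compute-node check quoted above (an "ls -l" of the home directory printing "total 0") can be wrapped as a small test. This is only a sketch: dir_has_entries and first_visible_dir are hypothetical helper names, and the check is meaningful only when it runs where the workers run, for example inside the batch job, since the login node may see a directory that the compute nodes do not.

```shell
#!/bin/sh
# Succeeds only if the directory exists and contains at least one entry.
dir_has_entries() {
    dir=$1
    [ -d "$dir" ] || return 1
    # ls -A also lists dotfiles (e.g. .globus) but omits . and ..
    [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}

# Prints the first candidate directory that has entries; fails if none do.
first_visible_dir() {
    for candidate in "$@"; do
        if dir_has_entries "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

A call such as first_visible_dir "$HOME" /lustre/beagle/wilde would then report which of the two locations from the thread is actually populated on the node where it runs.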
> > > > > > - Mike > > > > > > ----- Original Message ----- > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > Sent: Sunday, November 13, 2011 1:52:58 PM > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > Its starting to look like some kind of aprun-based failure. I > > > > see > > > > this > > > > from more detailed logging I put into the generated script: > > > > > > > > IN .submit script > > > > aprun: Unexpected close of the apsys control connection > > > > aprun: Exiting due to errors. Application aborted > > > > aprun rc 1 > > > > > > > > I was led off track by the fact that the exitcode file is > > > > missing. > > > > Seems that its generated but then removed before we can see it. > > > > I > > > > suspect one part of the provider thinks the worker-launch job > > > > succeeded, and hence removes the exitcode file, but another part > > > > realizes that the job failed. (conjecture...) > > > > > > > > Now that that part is partially explained, I think I can go back > > > > to > > > > debugging this from manual qsubs which should go faster. > > > > > > > > Im still unsure if the missing stdout/err files is due to a > > > > Beagle > > > > issue; starting to look more like maybe due to the weird way in > > > > which > > > > the aprun dies. > > > > > > > > Digging deeper... > > > > > > > > - Mike > > > > > > > > ----- Original Message ----- > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > Sent: Sunday, November 13, 2011 7:51:57 AM > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > Ive backed up and just did a test from swift (automatic) > > > > > > > > > > I see that in that case I am *not* getting an exitcode file. > > > > > Are you getting one? 
> > > > > > > > > > - Mike > > > > > > > > > > ----- Original Message ----- > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > Sent: Sunday, November 13, 2011 7:45:05 AM > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > But if you put an explicit output redirection in the /bin/sh > > > > > > -c > > > > > > command, you will see that those commands are indeed > > > > > > executing > > > > > > and > > > > > > generating output. > > > > > > > > > > > > So like I mentioned earlier, I dont know if the qsub -o and > > > > > > -e > > > > > > flags > > > > > > have changed behavior (eg they now cant write to /home???), > > > > > > or > > > > > > if > > > > > > we > > > > > > are using them incorrectly. > > > > > > > > > > > > But I think we need to go backwards and see why this is not > > > > > > working > > > > > > with the swift-generated qsub files. > > > > > > > > > > > > We should next add the two tags to the sites file to obtain > > > > > > a > > > > > > log > > > > > > from > > > > > > the worker, on the (untested!) 
assumption that the worker is > > > > > > really > > > > > > starting in the automatic swift case: > > > > > > > > > > > > > > > > > key="workerLoggingLevel">DEBUG > > > > > > > > > > > key="workerLoggingDirectory">/lustre/beagle/wilde/beagle > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > Sent: Sunday, November 13, 2011 7:35:24 AM > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde < > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > 2 thoughts here, Ketan: > > > > > > > > > > > > > > - when I tried my manual coaster test, I replaced the > > > > > > > options > > > > > > > "-n > > > > > > > 3 > > > > > > > -N > > > > > > > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" > > > > > > > which > > > > > > > says > > > > > > > "use the options from qsub". I was going to go back and > > > > > > > see > > > > > > > if > > > > > > > there > > > > > > > was some subtle new mismatch between these qsub and aprun > > > > > > > processor-layout options. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > I tried the -B option: > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > #CoG by class: class > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > #PBS -N Block-1113-1602 > > > > > > > #PBS -m n > > > > > > > #PBS -A CI-DEB000002 > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > #PBS -o > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > #PBS -e > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > cd / && aprun -B /bin/sh -c /bin/date > > > > > > > /bin/echo $? > > > > > > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > > > > > > > > > > > > > > > > And see the same behavior. The exitcode file is indeed > > > > > > > updated > > > > > > > each > > > > > > > time with a code 0. > > > > > > > > > > > > > > > > > > > > > > > > > > > > - I realized that manually testing the swift-generated > > > > > > > submit > > > > > > > file > > > > > > > will give new errors because the swift service is no > > > > > > > longer > > > > > > > alive > > > > > > > and > > > > > > > listening on the port that the worker will try to connect > > > > > > > to. > > > > > > > Also, > > > > > > > it > > > > > > > seemed that the .pl file itself that automatic coaster > > > > > > > bootstrap > > > > > > > places in ~/.globus/coasters was not there. Im assuming > > > > > > > that > > > > > > > Swift > > > > > > > removes these files when it exits, but need to verify that > > > > > > > this > > > > > > > is > > > > > > > true and that the failure is not due to a missing .pl > > > > > > > file. 
> > > > > > > I > > > > > > > suspect > > > > > > > that this is normal and is not the problem, but again, we > > > > > > > need > > > > > > > to > > > > > > > keep > > > > > > > debugging until the root cause is found. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > I tried with a simple /bin/date command at the end of > > > > > > > > the > > > > > > > > submit > > > > > > > > script removing the call to worker.pl : > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > > #CoG by class: class > > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > > #PBS -N Block-1113-1602 > > > > > > > > #PBS -m n > > > > > > > > #PBS -A CI-DEB000002 > > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > > #PBS -o > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > > #PBS -e > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive > > > > > > > > /bin/sh > > > > > > > > -c > > > > > > > > /bin/date > > > > > > > > > > > > > > > > > > > > > > > > ======= > > > > > > > > > > > > > > > > > > > > > > > > This fails too. 
The queue cancels the job as soon as it > > > > > > > > starts > > > > > > > > running, without writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > OK, I dont need these; I can reproduce the problem as > > > > > > > > well. > > > > > > > > > > > > > > > > For some reason, the coaster worker is exiting > > > > > > > > immediately. > > > > > > > > > > > > > > > > I see a few possibilities: > > > > > > > > > > > > > > > > - Beagle networking may have changed, making it no > > > > > > > > longer > > > > > > > > possible > > > > > > > > to > > > > > > > > reach the coaster service from the compute nodes using > > > > > > > > the > > > > > > > > previous > > > > > > > > IP > > > > > > > > address ranges. > > > > > > > > > > > > > > > > - the worker.pl script is not being created in > > > > > > > > $HOME/.globus/coasters > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > > Ketan, can you post the submit script and site file? > > > > > > > > > > > > > > > > > > On 11/12/11, Ketan Maheshwari < > > > > > > > > > ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is > > > > > > > > > > broken > > > > > > > > > > for > > > > > > > > > > swift. 
> > > > > > > > > > I > > > > > > > > > > tried > > > > > > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 > > > > > > > > > > but > > > > > > > > > > getting > > > > > > > > > > the > > > > > > > > > > same > > > > > > > > > > response: > > > > > > > > > > > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > > > > > > site setting workersPerNode has been replaced with > > > > > > > > > > jobsPerNode! > > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > > Active:1 > > > > > > > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > > > > > > > Exception in cat: > > > > > > > > > > Arguments: [data.txt] > > > > > > > > > > Host: pbs > > > > > > > > > > Directory: > > > > > > > > > > catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > > > > > > stderr.txt: > > > > > > > > > > > > > > > > > > > > stdout.txt: > > > > > > > > > > > > > > > > > > > > ---- > > > > > > > > > > > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block > > > > > > > > > > task > > > > > > > > > > ended > > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > > Failed:1 > > > > > > > > > > The following errors have occurred: > > > > > > > > > > 1. Task failed: 1113-160254-000000 Block task ended > > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does > > > > > > > > > > not > > > > > > > > > > seem > > > > > > > > > > to > > > > > > > > > > be > > > > > > > > > > working. 
> > > > > > > > > > The scripts get submitted to the queue and > > > > > > > > > > immediately > > > > > > > > > > exits > > > > > > > > > > without > > > > > > > > > > writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > > > Were there any recent changes that could have > > > > > > > > > > affected > > > > > > > > > > this? > > > > > > > > > > > > > > > > > > > > I remember to have tried this successfully in the > > > > > > > > > > last > > > > > > > > > > week > > > > > > > > > > of > > > > > > > > > > last > > > > > > > > > > month. > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > -- > > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Sent from my mobile device > > > > > > > > > _______________________________________________ > > > > > > > > > Swift-devel mailing list > > > > > > > > > Swift-devel at ci.uchicago.edu > > > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > -- > > > > > > > > Michael Wilde > > > > > > > > Computation Institute, University of Chicago > > > > > > > > Mathematics and Computer Science Division > > > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Ketan > > > > > > > > > > > > > > -- > > > > > > > Michael Wilde > > > > > > > Computation Institute, University of Chicago > > > > > > > Mathematics and Computer Science Division > > > > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Ketan > > > > > > > > > > > > -- > > > > > > Michael Wilde > > > > > > Computation Institute, University of Chicago > > > > > > Mathematics and Computer Science Division > > > > > > Argonne National Laboratory > > > > > > > > > > > > _______________________________________________ > > > > > > Swift-devel mailing list > > > 
> > > Swift-devel at ci.uchicago.edu
> > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
> --
> Ketan

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Sun Nov 13 22:31:35 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sun, 13 Nov 2011 22:31:35 -0600
Subject: [Swift-devel] swift pbs/beagle broken
In-Reply-To:
<24844164.25319.1321238091049.JavaMail.root@zimbra.anl.gov>
References: <24844164.25319.1321238091049.JavaMail.root@zimbra.anl.gov>
Message-ID:

On Sun, Nov 13, 2011 at 8:34 PM, Michael Wilde wrote:
> OK. I tested with a 1-job run and a 1000-job run.
>
> I committed this "feature" to 0.93 as Swift rev 5285. I'm assuming it's safe
> because no one is likely to set SWIFT_USERHOME unless we tell them to.
>
> David, can you build and post a new RC?
> (Do you know how to mark the release as 0.93RC5 per the method Justin
> described in our last meeting? So that it shows up in the Swift log as that
> release name...)
>
> Ketan, can you see if this now gets Fangfang rolling?
>

I too tested with 1000 catsn jobs, and it seemed to work well. I will write to
Fangfang to inform him of the new setup.

>
> Thanks, all.
>
> - Mike
>
>
>
> ----- Original Message -----
> > From: "Ketan Maheshwari"
> > To: "Michael Wilde"
> > Cc: "Swift Devel"
> > Sent: Sunday, November 13, 2011 6:05:15 PM
> > Subject: Re: [Swift-devel] swift pbs/beagle broken
> > This fix works for me. I tested with one catsn job on Beagle.
> >
> >
> > On Sun, Nov 13, 2011 at 7:48 PM, Michael Wilde < wilde at mcs.anl.gov >
> > wrote:
> >
> >
> > OK, here is a simple fix for this problem. Just add the variable
> > "SWIFT_USERHOME" to your swift command; then do:
> >
> > export SWIFT_USERHOME=/lustre/beagle/wilde
> > swift etc
> >
> > This makes swift use $SWIFT_USERHOME instead of $HOME to locate the
> > .globus directory.
> >
> > This will of course mess up if a swift run needs to locate your
> > certificates; possibly you can get around that with a symlink. But I
> > suspect most uses of this will be for local execution on systems like
> > Beagle with non-writeable home dirs.
> > > > Here's the 1-line fix: > > > > login$ pwd > > /home/wilde/swift/src/0.93/cog/modules/swift/bin > > login$ svn diff > > Index: swift > > =================================================================== > > --- swift (revision 5284) > > +++ swift (working copy) > > @@ -86,6 +86,7 @@ > > updateOptions "$X509_USER_PROXY" "X509_USER_PROXY" > > updateOptions "$SWIFT_HOME" "COG_INSTALL_PATH" > > updateOptions "$SWIFT_HOME" "swift.home" > > +updateOptions "$SWIFT_USERHOME" "user.home" > > #Use /dev/urandom instead of /dev/random for seeding RNGs > > #This will lower the randomness of the seed, but avoid > > #large delays if /dev/random does not have enough entropy collected > > login$ > > > > If others can confirm that this works, I'll check it in. > > > > > > - Mike > > > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > Sent: Sunday, November 13, 2011 3:08:36 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > OK, as some of you can see in the mesg I just send to > > > beagle-support: > > > it now looks ot me like the root problem of the swift jobs failing > > > is > > > that our home dirs are not beng seen on the compute nodes, hance the > > > swift-generated PBS script to launch the coaster workers cant find > > > the > > > worker.pl script that swift copied to $HOME/.globus/coasters. > > > > > > This is what I see: > > > > > > The following was run under qsub -I; the line "total 0" shows that > > > /home/wilde was empty as seen by the compute node. 
> > > > > > login1$ aprun /bin/sh -c 'hostname; ls -l /home/wilde/; mount | grep > > > home; ' > > > nid00466 > > > total 0 > > > /autonfs/home on /autonfs/home type dvs > > > > (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5) > > > /autonfs/home on /autonfs/home type dvs > > > > (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5) > > > Application 863284 resources: utime ~0s, stime ~0s > > > login1$ > > > > > > Can anyone verify that they are seeing the same symptom? > > > > > > Thanks, > > > > > > - Mike > > > > > > > > > ----- Original Message ----- > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > Sent: Sunday, November 13, 2011 2:41:36 PM > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > I tracked the message below down to the fact that aprun doesnt > > > > like > > > > "&" in its command string. I vaguely recall reporting something > > > > similar to Cray way back and they agreed its a bug. > > > > > > > > But it seems that the *original* Swift command string did not have > > > > a > > > > "&" in it, so Im back to square one. 
> > > > > > > > - Mike > > > > > > > > ----- Original Message ----- > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > Sent: Sunday, November 13, 2011 1:52:58 PM > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > It's starting to look like some kind of aprun-based failure. I see this > > > > > from more detailed logging I put into the generated script: > > > > > > > > > > IN .submit script > > > > > aprun: Unexpected close of the apsys control connection > > > > > aprun: Exiting due to errors. Application aborted > > > > > aprun rc 1 > > > > > > > > > > I was led off track by the fact that the exitcode file is missing. > > > > > It seems that it's generated but then removed before we can see it. I > > > > > suspect one part of the provider thinks the worker-launch job > > > > > succeeded, and hence removes the exitcode file, but another part > > > > > realizes that the job failed. (conjecture...) > > > > > > > > > > Now that that part is partially explained, I think I can go back to > > > > > debugging this from manual qsubs, which should go faster. > > > > > > > > > > I'm still unsure whether the missing stdout/err files are due to a Beagle > > > > > issue; it's starting to look more like they are due to the weird way in which > > > > > the aprun dies. > > > > > > > > > > Digging deeper... 
> > > > > > > > > > - Mike > > > > > > > > > > ----- Original Message ----- > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > Sent: Sunday, November 13, 2011 7:51:57 AM > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > Ive backed up and just did a test from swift (automatic) > > > > > > > > > > > > I see that in that case I am *not* getting an exitcode file. > > > > > > Are you getting one? > > > > > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > Sent: Sunday, November 13, 2011 7:45:05 AM > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > But if you put an explicit output redirection in the /bin/sh > > > > > > > -c > > > > > > > command, you will see that those commands are indeed > > > > > > > executing > > > > > > > and > > > > > > > generating output. > > > > > > > > > > > > > > So like I mentioned earlier, I dont know if the qsub -o and > > > > > > > -e > > > > > > > flags > > > > > > > have changed behavior (eg they now cant write to /home???), > > > > > > > or > > > > > > > if > > > > > > > we > > > > > > > are using them incorrectly. > > > > > > > > > > > > > > But I think we need to go backwards and see why this is not > > > > > > > working > > > > > > > with the swift-generated qsub files. > > > > > > > > > > > > > > We should next add the two tags to the sites file to obtain > > > > > > > a > > > > > > > log > > > > > > > from > > > > > > > the worker, on the (untested!) 
assumption that the worker is > > > > > > > really > > > > > > > starting in the automatic swift case: > > > > > > > > > > > > > > > > > > > > key="workerLoggingLevel">DEBUG > > > > > > > > > > > > > > key="workerLoggingDirectory">/lustre/beagle/wilde/beagle > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > Sent: Sunday, November 13, 2011 7:35:24 AM > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde < > > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > 2 thoughts here, Ketan: > > > > > > > > > > > > > > > > - when I tried my manual coaster test, I replaced the > > > > > > > > options > > > > > > > > "-n > > > > > > > > 3 > > > > > > > > -N > > > > > > > > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" > > > > > > > > which > > > > > > > > says > > > > > > > > "use the options from qsub". I was going to go back and > > > > > > > > see > > > > > > > > if > > > > > > > > there > > > > > > > > was some subtle new mismatch between these qsub and aprun > > > > > > > > processor-layout options. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I tried the -B option: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > > #CoG by class: class > > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > > #PBS -N Block-1113-1602 > > > > > > > > #PBS -m n > > > > > > > > #PBS -A CI-DEB000002 > > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > > #PBS -o > > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > > #PBS -e > > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > > cd / && aprun -B /bin/sh -c /bin/date > > > > > > > > /bin/echo $? > > > > > > > > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > > > > > > > > > > > > > > > > > > > And see the same behavior. The exitcode file is indeed > > > > > > > > updated > > > > > > > > each > > > > > > > > time with a code 0. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - I realized that manually testing the swift-generated > > > > > > > > submit > > > > > > > > file > > > > > > > > will give new errors because the swift service is no > > > > > > > > longer > > > > > > > > alive > > > > > > > > and > > > > > > > > listening on the port that the worker will try to connect > > > > > > > > to. > > > > > > > > Also, > > > > > > > > it > > > > > > > > seemed that the .pl file itself that automatic coaster > > > > > > > > bootstrap > > > > > > > > places in ~/.globus/coasters was not there. 
Im assuming > > > > > > > > that > > > > > > > > Swift > > > > > > > > removes these files when it exits, but need to verify that > > > > > > > > this > > > > > > > > is > > > > > > > > true and that the failure is not due to a missing .pl > > > > > > > > file. > > > > > > > > I > > > > > > > > suspect > > > > > > > > that this is normal and is not the problem, but again, we > > > > > > > > need > > > > > > > > to > > > > > > > > keep > > > > > > > > debugging until the root cause is found. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > > I tried with a simple /bin/date command at the end of > > > > > > > > > the > > > > > > > > > submit > > > > > > > > > script removing the call to worker.pl : > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > > > #CoG by class: class > > > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > > > #PBS -N Block-1113-1602 > > > > > > > > > #PBS -m n > > > > > > > > > #PBS -A CI-DEB000002 > > > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > > > #PBS -o > > > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > > > #PBS -e > > > > > > > > > > 
/home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive > > > > > > > > > /bin/sh > > > > > > > > > -c > > > > > > > > > /bin/date > > > > > > > > > > > > > > > > > > > > > > > > > > > ======= > > > > > > > > > > > > > > > > > > > > > > > > > > > This fails too. The queue cancels the job as soon as it > > > > > > > > > starts > > > > > > > > > running, without writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > OK, I dont need these; I can reproduce the problem as > > > > > > > > > well. > > > > > > > > > > > > > > > > > > For some reason, the coaster worker is exiting > > > > > > > > > immediately. > > > > > > > > > > > > > > > > > > I see a few possibilities: > > > > > > > > > > > > > > > > > > - Beagle networking may have changed, making it no > > > > > > > > > longer > > > > > > > > > possible > > > > > > > > > to > > > > > > > > > reach the coaster service from the compute nodes using > > > > > > > > > the > > > > > > > > > previous > > > > > > > > > IP > > > > > > > > > address ranges. 
> > > > > > > > > > > > > > > > > > - the worker.pl script is not being created in > > > > > > > > > $HOME/.globus/coasters > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > > > Ketan, can you post the submit script and site file? > > > > > > > > > > > > > > > > > > > > On 11/12/11, Ketan Maheshwari < > > > > > > > > > > ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is > > > > > > > > > > > broken > > > > > > > > > > > for > > > > > > > > > > > swift. > > > > > > > > > > > I > > > > > > > > > > > tried > > > > > > > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 > > > > > > > > > > > but > > > > > > > > > > > getting > > > > > > > > > > > the > > > > > > > > > > > same > > > > > > > > > > > response: > > > > > > > > > > > > > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > > > > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > > > > > > > site setting workersPerNode has been replaced with > > > > > > > > > > > jobsPerNode! 
> > > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > > > Active:1 > > > > > > > > > > > Failed to transfer wrapper log for job cat-1hg8aoik > > > > > > > > > > > Exception in cat: > > > > > > > > > > > Arguments: [data.txt] > > > > > > > > > > > Host: pbs > > > > > > > > > > > Directory: > > > > > > > > > > > catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > > > > > > > stderr.txt: > > > > > > > > > > > > > > > > > > > > > > stdout.txt: > > > > > > > > > > > > > > > > > > > > > > ---- > > > > > > > > > > > > > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block > > > > > > > > > > > task > > > > > > > > > > > ended > > > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > > > Failed:1 > > > > > > > > > > > The following errors have occurred: > > > > > > > > > > > 1. Task failed: 1113-160254-000000 Block task ended > > > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also does > > > > > > > > > > > not > > > > > > > > > > > seem > > > > > > > > > > > to > > > > > > > > > > > be > > > > > > > > > > > working. > > > > > > > > > > > The scripts get submitted to the queue and > > > > > > > > > > > immediately > > > > > > > > > > > exits > > > > > > > > > > > without > > > > > > > > > > > writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > > > > > Were there any recent changes that could have > > > > > > > > > > > affected > > > > > > > > > > > this? > > > > > > > > > > > > > > > > > > > > > > I remember to have tried this successfully in the > > > > > > > > > > > last > > > > > > > > > > > week > > > > > > > > > > > of > > > > > > > > > > > last > > > > > > > > > > > month. 
> > > > > > > > > > > Regards, > > > > > > > > > > > -- > > > > > > > > > > > Ketan > > > > > > > > > > -- > > > > > > > > > > Sent from my mobile device > > > > > > > > > > _______________________________________________ > > > > > > > > > > Swift-devel mailing list > > > > > > > > > > Swift-devel at ci.uchicago.edu > > > > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > -- > > > > > > > > > Michael Wilde > > > > > > > > > Computation Institute, University of Chicago > > > > > > > > > Mathematics and Computer Science Division > > > > > > > > > Argonne National Laboratory -- Ketan 
From ketancmaheshwari at gmail.com Mon Nov 14 08:35:05 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 14 Nov 2011 08:35:05 -0600 Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> <2F4CFB07-EADD-4A05-8D99-AFC3E7D99329@gmail.com> Message-ID: Hello Fangfang, Mike discovered yesterday that there was a problem with how the /home filesystem is mounted on the worker nodes. Later, Glen narrowed it down to the fact that the /home filesystem is being mounted with incorrect permissions (755). We think this issue may have come up after a recent Beagle maintenance. We do not know at the moment whether this is an intentional change. Mike implemented a workaround for this issue, which uses the /lustre/beagle/ filesystem to store and run the PBS submit scripts managed by Swift. To use it, please run the following line before your swift command line: === export SWIFT_USERHOME=/lustre/beagle/fangfang === After this, your /lustre home will be treated as the location for Swift-managed scripts. I've tested this on Beagle and it ran to completion. Thanks for your patience. Best Regards, Ketan On Sat, Nov 12, 2011 at 4:31 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Hello Fangfang, > > I logged out of my screen session before my test completed. Now, I could > see that the run I tested was submitted but in the end I saw the same > message. > > I am looking into this and will get back to you soon. > > Regards, > Ketan > > > On Sat, Nov 12, 2011 at 3:55 PM, Fangfang Xia wrote: > >> Thanks Mike and Ketan. >> >> I think it's nice that Swift is preinstalled on Beagle. 
When I do "module >> load swift", I get "Swift version swift-0.93RC3 loaded"; is that the latest >> Swift? >> >> This time I get a command line error: >> >> login2:catsn > swift -config cf -tc.file tc -sites.file sites.xml >> catsn.swift -n=2 >> Swift svn swift-r5205 cog-r3293 >> >> RunID: 20111112-2151-rbu7ivof >> Progress: time: Sat, 12 Nov 2011 21:51:47 +0000 >> Progress: time: Sat, 12 Nov 2011 21:51:59 +0000 Submitted:1 Active:1 >> Progress: time: Sat, 12 Nov 2011 21:52:10 +0000 Submitted:1 Active:1 >> Exception in cat: >> Arguments: [data.txt] >> Host: pbs >> Directory: catsn-20111112-2151-rbu7ivof/jobs/v/cat-vbgmznik >> - - - >> >> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely >> >> >> Exception in cat: >> Arguments: [data.txt] >> Host: pbs >> Directory: catsn-20111112-2151-rbu7ivof/jobs/t/cat-tbgmznik >> - - - >> >> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely >> >> >> Final status: time: Sat, 12 Nov 2011 21:52:10 +0000 Failed:2 >> The following errors have occurred: >> 1. Task failed: 1112-510947-000001 Block task ended prematurely (2 times) >> >> >> On Nov 12, 2011, at 3:43 PM, Ketan Maheshwari wrote: >> >> Hello Fangfang, >> >> Sorry, I made a mistake in the new line: in place of key="ppn", it should >> be key="providerAttributes". >> >> So the line should be as follows: >> >> > key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 >> >> I just tested this on Beagle and it works now. >> >> Regards, >> Ketan >> >> On Sat, Nov 12, 2011 at 2:30 PM, Fangfang Xia wrote: >> >>> Thanks. The "Illegal value for ppn" line seems to persist in the log. >>> >>> >>> >>> On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote: >>> >>> Hello Fangfang, >>> >>> Could you replace the following line: >>> 24:cray:pack >>> >>> with this one: >>> >> key="ppn">pbs.aprun;pbs.mpp;depth=24 >>> >>> in your sites.xml. >>> >>> The line you have is an obsolete form from the 0.92 version of Swift. >>> >>> It should work now. 
>>> >>> Regards, >>> Ketan >>> >>> >>> On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote: >>> >>>> Hi Ketan, >>>> >>>> Thanks for getting back to me so promptly. I have attached the log >>>> file, and here's the content of sites.xml: >>>> >>>> >>>> >>>> >>>> CI-DEB000002 >>>> >>>> 24:cray:pack >>>> >>>> 24 >>>> 1000 >>>> 1 >>>> 1 >>>> 1 >>>> >>>> .63 >>>> 10000 >>>> >>>> >>>> >>> >/lustre/beagle/fangfang/swift-lab/swift.workdir >>>> >>>> >>>> >>>> There's no error message on the command line. >>>> >>>> >>>> >>>> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: >>>> >>>> Hello Fangfang, >>>> >>>> The log file does not seem to be found. Could you attach it please. >>>> >>>> From this line: >>>> Illegal value for ppn. Must be an integer. >>>> >>>> Looks like the sites file is not configured well for the pbs provider. >>>> Could you post your sites.xml. >>>> >>>> Were there any error messages on commandline? >>>> >>>> Regards, >>>> Ketan >>>> >>>> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote: >>>> >>>>> Can the first person who has time try to address the problem below? >>>>> Im about to head to SC. >>>>> >>>>> Thanks, >>>>> >>>>> - Mike >>>>> >>>>> >>>>> ----- Forwarded Message ----- >>>>> From: "Fangfang Xia" >>>>> To: "Michael Wilde" >>>>> Cc: "Ketan Maheshwari" , "Scott Devoid" < >>>>> devoid at ci.uchicago.edu> >>>>> Sent: Saturday, November 12, 2011 1:27:29 PM >>>>> Subject: Swift question >>>>> >>>>> Hi Mike and Ketan, >>>>> >>>>> Thanks for the guide. I tried to follow the "cat" example, and got the >>>>> following error: >>>>> >>>>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script >>>>> to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit >>>>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: >>>>> Block-1112-210706-000000 is: Block-1112-2107 >>>>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting >>>>> block task: Cannot submit job: Illegal value for ppn. Must be an integer. 
>>>>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: >>>>> 0 >>>>> >>>>> I looked at the PBS script and somehow it's blank. I have attached the >>>>> full log file. Could you please take a look and let me know how to proceed? >>>>> >>>>> Thanks, >>>>> >>>>> Fangfang >>>>> >>>>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: >>>>> >>>>> > Hi Fangfang, Scott, >>>>> > >>>>> > Sorry for the late reply! I think the best roadmap to follow is >>>>> this: >>>>> > >>>>> > - try running the sample tutorial Swift script on Beagle using the >>>>> instructions posted at: >>>>> > >>>>> > >>>>> http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle >>>>> > >>>>> > This tiny tutorial contains a simple Swift script that does N "cat" >>>>> commands in parallel to "process" an input file and create an output file. >>>>> It contains all the related config files you need to run on Beagle, and is >>>>> thus a good "Hello World" application. You can then copy catsn.swift to >>>>> create the first Swift script to run your actual applications. >>>>> > >>>>> > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle >>>>> Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th >>>>> floor near my office, 5141. Ketan can help answer any questions you have, >>>>> and will be your personal contact to help you make good use of Beagle. >>>>> > >>>>> > - then do your first Model-SEED script based on catsn.swift, first >>>>> with N = 1 to just ensure that you have described your app's command >>>>> line(s) correctly to Swift and that the app is getting invoked and >>>>> returning output correctly. >>>>> > >>>>> > - then, with help from Ketan as needed, start scaling up to >>>>> increasingly larger runs. >>>>> > >>>>> > I'll try to stay close in the loop and help out as needed. >>>>> > >>>>> > Do you have any questions I can answer to get started? 
If you are >>>>> at Argonne and available today, perhaps I can join you and Ketan in an >>>>> introductory meeting. I'm free from 3 to 4:40 today or after 5:30. >>>>> Otherwise, please do this at your joint convenience. >>>>> > >>>>> > Regards, >>>>> > >>>>> > - Mike >>>>> > >>>>> > >>>>> > >>>>> > >>>>> > ----- Original Message ----- >>>>> >> From: "Fangfang Xia" >>>>> >> To: "Michael Wilde" >>>>> >> Sent: Monday, October 31, 2011 12:44:23 PM >>>>> >> Subject: Re: How is install/test of Model SEED on Beagle going? >>>>> >> Hi Mike, >>>>> >> >>>>> >> We got two types of flux balance analysis to run on Beagle. I was >>>>> >> wondering if we should test them with Swift to see if things scale. >>>>> >> Both operations take about 40 seconds to run on sandbox. Ideally we >>>>> >> should also test two more expensive computations, "fba single >>>>> knockouts" >>>>> >> and "gapfilling", but I won't be able to resolve the problems with >>>>> >> those until I meet with Chris this week. >>>>> >> >>>>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh >>>>> >> >>>>> >> fbacheckgrowth -model iJR904.16242 >>>>> >> fbafva -model iJR904.16242 >>>>> >> >>>>> >> You can find the descriptions of these tools at: >>>>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED >>>>> >> >>>>> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and >>>>> >> mfatoolkit to compile. And I still seem to be getting the cc1plus >>>>> >> error with gcc which you don't have. So if this version doesn't work >>>>> >> well on multiple processors, I'll need your help with recompiling my >>>>> >> updated mfatoolkit in >>>>> >> >>>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >>>>> >> >>>>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in >>>>> case >>>>> >> you need to test something there. 
>>>>> >> >>>>> >> Thanks, >>>>> >> Fangfang >>>>> >> >>>>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >>>>> >> >>>>> >>> Hi Fangfang, >>>>> >>> >>>>> >>> I was able to build that directory using the gcc module; I past the >>>>> >>> make output below. It gave many warnings, but I did not get the >>>>> >>> cc1plus libmpc.so error that you encountered. >>>>> >>> >>>>> >>> My build is in $HOME/wilde/mfatoolkit >>>>> >>> >>>>> >>> I ran this on sandbox.beagle.ci.uchicago.edu. >>>>> >>> >>>>> >>> - Mike >>>>> >>> >>>>> >>> ---- make output: >>>>> >>> >>>>> >>> sandbox$ make >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>>>> >>> Data*)': >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >>>>> >>> used in arithmetic >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >>>>> >>> used in arithmetic >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, >>>>> OptimizationParameter*, >>>>> >>> bool, std::string&, bool)': >>>>> 
>>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>>>> >>> converting 'false' to pointer type for argument 1 of >>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>>>> >>> converting 'false' to pointer type for argument 1 of >>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>>>> >>> converting 'false' to pointer type for argument 1 of >>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>>>> >>> OptimizationParameter*)': >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>>>> >>> converting 'false' to pointer type for argument 1 of >>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>>>> >>> std::string)': >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>>>> >>> converting to non-pointer type 'int' from NULL >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM 
-DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >>>>> >>> Species::AddpKab(std::string, bool)': >>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>> >>> std::allocator, value_type = int]' >>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>> >>> std::allocator, value_type = int]' >>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>> >>> std::allocator, value_type = int]' >>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>> >>> std::allocator, value_type = int]' >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>>>> >>> multi-character character constant 
>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c 
/home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>>>> >>> /home/wilde/mfatoolkit/Source >>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>> >>> >>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>>>> >>> /home/wilde/mfatoolkit/Source/driver.o >>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >>>>> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >>>>> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>>>> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>>>> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>>>> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>>>> >>> /home/wilde/mfatoolkit/Source/Species.o >>>>> >>> /home/wilde/mfatoolkit/Source/Data.o >>>>> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>>>> >>> /home/wilde/mfatoolkit/Source/Identity.o >>>>> >>> /home/wilde/mfatoolkit/Source/Reaction.o >>>>> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>>>> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>>>> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>>>> >>> /home/wilde/mfatoolkit/Source/AtomType.o >>>>> >>> /home/wilde/mfatoolkit/Source/Gene.o >>>>> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>>>> >>> /home/wilde/mfatoolkit/Source/stringDB.o >>>>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>>>> >>> >>>>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>>>> >>> -lcplex -lm -lpthread -lz >>>>> >>> sandbox$ >>>>> >>> >>>>> >>> >>>>> >>> ----- Original Message ----- >>>>> >>>> From: "Fangfang Xia" >>>>> >>>> To: "Michael Wilde" >>>>> >>>> Cc: "Scott Devoid" >>>>> >>>> Sent: 
Monday, October 24, 2011 5:20:20 PM
>>>>> >>>> Subject: Re: How is install/test of Model SEED on Beagle going?
>>>>> >>>> Hi Mike,
>>>>> >>>>
>>>>> >>>> This is very helpful. Thanks for pointing out the difference between
>>>>> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to
>>>>> >>>> compile our core C++ code.
>>>>> >>>>
>>>>> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus:
>>>>> >>>> error while loading shared libraries: libmpc.so.2: cannot open shared
>>>>> >>>> object file: No such file or directory
>>>>> >>>>
>>>>> >>>> It looks like something is wrong with cc1plus. I suppose it's part of
>>>>> >>>> g++? I don't know what it does.
>>>>> >>>>
>>>>> >>>> So we resolved the perl dependency issues, and we were able to compile
>>>>> >>>> the code with the default PrgEnv-pgi just for testing purposes. It
>>>>> >>>> seems we still have some issues with our new pipeline code. But I
>>>>> >>>> don't think we are very far from giving you a running example.
>>>>> >>>>
>>>>> >>>> Just in case you could help us with the gcc compilation issue, I have
>>>>> >>>> 777'ed my directory and here are the steps to compile the core C++
>>>>> >>>> code:
>>>>> >>>>
>>>>> >>>> source /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh
>>>>> >>>> cd /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux
>>>>> >>>> make
>>>>> >>>>
>>>>> >>>> Thanks,
>>>>> >>>> Fangfang
>>>>> >>>>
>>>>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote:
>>>>> >>>>
>>>>> >>>>> Sounds great, thanks for the update, Fangfang.
>>>>> >>>>>
>>>>> >>>>> One question: what compiler are you using?
>>>>> >>>>>
>>>>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc"
>>>>> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). That's
>>>>> >>>>> because the gcc module will create code that we can run in parallel,
>>>>> >>>>> multiple programs in parallel per compute node. The PrgEnv modules
>>>>> >>>>> all create code that expects to run only one program per node,
>>>>> >>>>> because it's meant for MPI, OpenMP, etc.
>>>>> >>>>>
>>>>> >>>>> Also, I think that the gcc module (which I think includes gcc, g++
>>>>> >>>>> and gfortran) may be more like the traditional Linux gcc than
>>>>> >>>>> PrgEnv-gnu.
>>>>> >>>>>
>>>>> >>>>> The default PrgEnv (at least for me) is pgi. So before I build
>>>>> >>>>> software I do:
>>>>> >>>>>
>>>>> >>>>> module unload PrgEnv-pgi
>>>>> >>>>> module load gcc
>>>>> >>>>>
>>>>> >>>>> Let me know if I can help; if you want I can try to build you a
>>>>> >>>>> libxml2 using gcc.
>>>>> >>>>> Same for Perl, if multiple copies of it need to run per node in
>>>>> >>>>> parallel.
>>>>> >>>>>
>>>>> >>>>> We can discuss more next week, and I'll be working off and on this
>>>>> >>>>> weekend.
>>>>> >>>>>
>>>>> >>>>> Regards,
>>>>> >>>>>
>>>>> >>>>> - Mike
>>>>> >>>>>
>>>>> >>>>> ----- Original Message -----
>>>>> >>>>>> From: "Fangfang Xia"
>>>>> >>>>>> To: "Michael Wilde"
>>>>> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid"
>>>>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM
>>>>> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going?
>>>>> >>>>>> Hi Mike,
>>>>> >>>>>>
>>>>> >>>>>> We encountered some dependency issues while attempting to install some
>>>>> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle systems
>>>>> >>>>>> folks to help install libxml2. I'm also looking into ways to install
>>>>> >>>>>> it in a user directory. I get the feeling that things should be
>>>>> >>>>>> resolved after our group meeting on Monday. So we'll keep you
>>>>> >>>>>> posted.
>>>>> >>>>>>
>>>>> >>>>>> Thanks,
>>>>> >>>>>>
>>>>> >>>>>> Fangfang
>>>>> >>>>>>
>>>>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote:
>>>>> >>>>>>
>>>>> >>>>>>> Hi Fangfang, Scott,
>>>>> >>>>>>>
>>>>> >>>>>>> Any progress - can I try it soon?
>>>>> >>>>>>>
>>>>> >>>>>>> Or, any problems that I can help with? I'm at Argonne today (5141) if
>>>>> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30.
>>>>> >>>>>>>
>>>>> >>>>>>> Regards,
>>>>> >>>>>>>
>>>>> >>>>>>> - Mike
>>>>> >>>>>>>
>>>>> >>>>>>> --
>>>>> >>>>>>> Michael Wilde
>>>>> >>>>>>> Computation Institute, University of Chicago
>>>>> >>>>>>> Mathematics and Computer Science Division
>>>>> >>>>>>> Argonne National Laboratory
>>>>> >>>>>>>
>>>>> >>>>>
>>>>> >>>>> --
>>>>> >>>>> Michael Wilde
>>>>> >>>>> Computation Institute, University of Chicago
>>>>> >>>>> Mathematics and Computer Science Division
>>>>> >>>>> Argonne National Laboratory
>>>>> >>>>>
>>>>> >>>
>>>>> >>> --
>>>>> >>> Michael Wilde
>>>>> >>> Computation Institute, University of Chicago
>>>>> >>> Mathematics and Computer Science Division
>>>>> >>> Argonne National Laboratory
>>>>> >>>
>>>>> >
>>>>> > --
>>>>> > Michael Wilde
>>>>> > Computation Institute, University of Chicago
>>>>> > Mathematics and Computer Science Division
>>>>> > Argonne National Laboratory
>>>>> >
>>>>>
>>>>> --
>>>>> Michael Wilde
>>>>> Computation Institute, University of Chicago
>>>>> Mathematics and Computer Science Division
>>>>> Argonne National Laboratory
>>>>>
>>>>> _______________________________________________
>>>>> Swift-devel mailing list
>>>>> Swift-devel at ci.uchicago.edu
>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>>>>>
>>>>
>>>> --
>>>> Ketan
>>>>
>>>
>>> --
>>> Ketan
>>>
>>
>> --
>> Ketan
>>
>
> --
> Ketan
>

--
Ketan
-------------- next part --------------
An HTML attachment was
scrubbed... URL:

From fangfang.xia at gmail.com Mon Nov 14 08:53:20 2011
From: fangfang.xia at gmail.com (Fangfang Xia)
Date: Mon, 14 Nov 2011 08:53:20 -0600
Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question
In-Reply-To:
References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com>
 <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov>
 <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com>
 <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com>
 <2F4CFB07-EADD-4A05-8D99-AFC3E7D99329@gmail.com>
Message-ID: <1F284688-8A70-4B53-854A-2C5FE294EDFF@gmail.com>

Hi Ketan and Mike,

Thanks for your help. This works wonderfully. Now I'll go ahead and play with the ModelSEED pipelines.

Fangfang

On Nov 14, 2011, at 8:35 AM, Ketan Maheshwari wrote:

> Hello Fangfang,
>
> Mike discovered yesterday that there was a problem with mounting the /home filesystem from the worker nodes. Later, Glen narrowed it down to the fact that the /home filesystem is being mounted with incorrect permissions (755). We think this issue might have come up after a recent Beagle maintenance. We do not know at the moment if this is an intentional change.
>
> Mike implemented a workaround for this issue, which uses the /lustre/beagle/ filesystem to store and run the PBS submit scripts managed by Swift.
>
> To use the workaround, please add the following line before your swift command line:
>
> ===
> export SWIFT_USERHOME=/lustre/beagle/fangfang
> ===
>
> After this, your /lustre home will be treated as the location for Swift-managed scripts.
>
> I've tested this on Beagle and it ran to completion.
>
> Thanks for your patience.
>
> Best Regards,
> Ketan
>
> On Sat, Nov 12, 2011 at 4:31 PM, Ketan Maheshwari wrote:
> Hello Fangfang,
>
> I logged out of my screen session before my test completed. Now I can see that the run I tested was submitted, but in the end I saw the same message.
>
> I am looking into this and will get back to you soon.
>
> Regards,
> Ketan
>
> On Sat, Nov 12, 2011 at 3:55 PM, Fangfang Xia wrote:
> Thanks Mike and Ketan.
>
> I think it's nice that Swift is preinstalled on Beagle. When I do "module load swift", I get "Swift version swift-0.93RC3 loaded"; is that the latest Swift?
>
> This time I get a command line error:
>
> login2:catsn > swift -config cf -tc.file tc -sites.file sites.xml catsn.swift -n=2
> Swift svn swift-r5205 cog-r3293
>
> RunID: 20111112-2151-rbu7ivof
> Progress: time: Sat, 12 Nov 2011 21:51:47 +0000
> Progress: time: Sat, 12 Nov 2011 21:51:59 +0000 Submitted:1 Active:1
> Progress: time: Sat, 12 Nov 2011 21:52:10 +0000 Submitted:1 Active:1
> Exception in cat:
> Arguments: [data.txt]
> Host: pbs
> Directory: catsn-20111112-2151-rbu7ivof/jobs/v/cat-vbgmznik
> - - -
>
> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely
>
> Exception in cat:
> Arguments: [data.txt]
> Host: pbs
> Directory: catsn-20111112-2151-rbu7ivof/jobs/t/cat-tbgmznik
> - - -
>
> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely
>
> Final status: time: Sat, 12 Nov 2011 21:52:10 +0000 Failed:2
> The following errors have occurred:
> 1. Task failed: 1112-510947-000001 Block task ended prematurely (2 times)
>
> On Nov 12, 2011, at 3:43 PM, Ketan Maheshwari wrote:
>
>> Hello Fangfang,
>>
>> Sorry, I made a mistake in the new line: in place of key="ppn", it should be key="providerAttributes".
>>
>> So the line should be as follows:
>>
>> pbs.aprun;pbs.mpp;depth=24
>>
>> I just tested this on Beagle and it works now.
>>
>> Regards,
>> Ketan
>>
>> On Sat, Nov 12, 2011 at 2:30 PM, Fangfang Xia wrote:
>> Thanks. The "Illegal value for ppn" line seems to persist in the log.
>>
>> On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote:
>>
>>> Hello Fangfang,
>>>
>>> Could you replace the following line:
>>> 24:cray:pack
>>>
>>> with this one:
>>> pbs.aprun;pbs.mpp;depth=24
>>>
>>> in your sites.xml.
>>>
>>> The line you have is an obsolete form from the 0.92 version of Swift.
>>>
>>> It should work now.
>>>
>>> Regards,
>>> Ketan
>>>
>>> On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote:
>>> Hi Ketan,
>>>
>>> Thanks for getting back to me so promptly. I have attached the log file, and here's the content of sites.xml:
>>>
>>>
>>>
>>>
>>> CI-DEB000002
>>>
>>> 24:cray:pack
>>>
>>> 24
>>> 1000
>>> 1
>>> 1
>>> 1
>>>
>>> .63
>>> 10000
>>>
>>>
>>> /lustre/beagle/fangfang/swift-lab/swift.workdir
>>>
>>>
>>>
>>> There's no error message on the command line.
>>>
>>> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote:
>>>
>>>> Hello Fangfang,
>>>>
>>>> The log file does not seem to be attached. Could you attach it, please?
>>>>
>>>> From this line:
>>>> Illegal value for ppn. Must be an integer.
>>>>
>>>> It looks like the sites file is not configured correctly for the PBS provider. Could you post your sites.xml?
>>>>
>>>> Were there any error messages on the command line?
>>>>
>>>> Regards,
>>>> Ketan
>>>>
>>>> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote:
>>>> Can the first person who has time try to address the problem below?
>>>> I'm about to head to SC.
>>>>
>>>> Thanks,
>>>>
>>>> - Mike
>>>>
>>>> ----- Forwarded Message -----
>>>> From: "Fangfang Xia"
>>>> To: "Michael Wilde"
>>>> Cc: "Ketan Maheshwari" , "Scott Devoid"
>>>> Sent: Saturday, November 12, 2011 1:27:29 PM
>>>> Subject: Swift question
>>>>
>>>> Hi Mike and Ketan,
>>>>
>>>> Thanks for the guide. I tried to follow the "cat" example, and got the following error:
>>>>
>>>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit
>>>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: Block-1112-210706-000000 is: Block-1112-2107
>>>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error submitting block task: Cannot submit job: Illegal value for ppn. Must be an integer.
>>>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue size: 0
>>>>
>>>> I looked at the PBS script and somehow it's blank. I have attached the full log file. Could you please take a look and let me know how to proceed?
>>>>
>>>> Thanks,
>>>>
>>>> Fangfang
>>>>
>>>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote:
>>>>
>>>> > Hi Fangfang, Scott,
>>>> >
>>>> > Sorry for the late reply! I think the best roadmap to follow is this:
>>>> >
>>>> > - try running the sample tutorial Swift script on Beagle using the instructions posted at:
>>>> >
>>>> > http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle
>>>> >
>>>> > This tiny tutorial contains a simple Swift script that does N "cat" commands in parallel to "process" an input file and create an output file. It contains all the related config files you need to run on Beagle, and is thus a good "Hello World" application. You can then copy catsn.swift to create the first Swift script to run your actual applications.
>>>> >
>>>> > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th floor near my office, 5141. Ketan can help answer any questions you have, and will be your personal contact to help you make good use of Beagle.
>>>> >
>>>> > - then do your first Model-SEED script based on catsn.swift, first with N = 1 to just ensure that you have described your app's command line(s) correctly to Swift and that the app is getting invoked and returning output correctly.
>>>> >
>>>> > - then, with help from Ketan as needed, start scaling up to increasingly larger runs.
>>>> >
>>>> > I'll try to stay close in the loop and help out as needed.
>>>> >
>>>> > Do you have any questions I can answer to get started? If you are at Argonne and available today, perhaps I can join you and Ketan in an introductory meeting. I'm free from 3 to 4:40 today or after 5:30. Otherwise, please do this at your joint convenience.
>>>> >
>>>> > Regards,
>>>> >
>>>> > - Mike
>>>> >
>>>> > ----- Original Message -----
>>>> >> From: "Fangfang Xia"
>>>> >> To: "Michael Wilde"
>>>> >> Sent: Monday, October 31, 2011 12:44:23 PM
>>>> >> Subject: Re: How is install/test of Model SEED on Beagle going?
>>>> >> Hi Mike,
>>>> >>
>>>> >> We got two types of flux balance analysis to run on Beagle. I was
>>>> >> wondering if we should test them with Swift to see if things scale.
>>>> >> Both operations take about 40 seconds to run on sandbox. Ideally we
>>>> >> should also test two more expensive computations, "fba single knockouts"
>>>> >> and "gapfilling", but I won't be able to resolve the problems with
>>>> >> those until I meet with Chris this week.
>>>> >>
>>>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh
>>>> >>
>>>> >> fbacheckgrowth -model iJR904.16242
>>>> >> fbafva -model iJR904.16242
>>>> >>
>>>> >> You can find the descriptions of these tools at:
>>>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED
>>>> >>
>>>> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and
>>>> >> mfatoolkit to compile. And I still seem to be getting the cc1plus
>>>> >> error with gcc which you don't have. So if this version doesn't work
>>>> >> well on multiple processors, I'll need your help with recompiling my
>>>> >> updated mfatoolkit in
>>>> >> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit.
>>>> >>
>>>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in case
>>>> >> you need to test something there.
>>>> >>
>>>> >> Thanks,
>>>> >> Fangfang
>>>> >>
>>>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote:
>>>> >>
>>>> >>> Hi Fangfang,
>>>> >>>
>>>> >>> I was able to build that directory using the gcc module; I paste the
>>>> >>> make output below. It gave many warnings, but I did not get the
>>>> >>> cc1plus libmpc.so error that you encountered.
>>>> >>> >>>> >>> My build is in $HOME/wilde/mfatoolkit >>>> >>> >>>> >>> I ran this on sandbox.beagle.ci.uchicago.edu. >>>> >>> >>>> >>> - Mike >>>> >>> >>>> >>> ---- make output: >>>> >>> >>>> >>> sandbox$ make >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>>> >>> Data*)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: NULL >>>> >>> used in arithmetic >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: NULL >>>> >>> used in arithmetic >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, OptimizationParameter*, >>>> >>> bool, std::string&, bool)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>>> >>> converting 
'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>>> >>> OptimizationParameter*)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>>> >>> converting 'false' to pointer type for argument 1 of >>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>> >>> std::char_traits, _Alloc = std::allocator]' >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>>> >>> std::string)': >>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>>> >>> converting to non-pointer type 'int' from NULL >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function 'void >>>> >>> Species::AddpKab(std::string, bool)': >>>> >>> 
/home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>> >>> std::allocator, value_type = int]' >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>>> >>> multi-character character constant >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG 
-DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 
-fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>>> >>> /home/wilde/mfatoolkit/Source >>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD -DLINUX >>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>>> >>> /home/wilde/mfatoolkit/Source/driver.o >>>> >>> 
/home/wilde/mfatoolkit/Source/MFAProblem.o >>>> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >>>> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>>> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>>> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>>> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>>> >>> /home/wilde/mfatoolkit/Source/Species.o >>>> >>> /home/wilde/mfatoolkit/Source/Data.o >>>> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>>> >>> /home/wilde/mfatoolkit/Source/Identity.o >>>> >>> /home/wilde/mfatoolkit/Source/Reaction.o >>>> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>>> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>>> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>>> >>> /home/wilde/mfatoolkit/Source/AtomType.o >>>> >>> /home/wilde/mfatoolkit/Source/Gene.o >>>> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>>> >>> /home/wilde/mfatoolkit/Source/stringDB.o >>>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>>> >>> -lcplex -lm -lpthread -lz >>>> >>> sandbox$ >>>> >>> >>>> >>> >>>> >>> ----- Original Message ----- >>>> >>>> From: "Fangfang Xia" >>>> >>>> To: "Michael Wilde" >>>> >>>> Cc: "Scott Devoid" >>>> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >>>> >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>> >>>> Hi Mike, >>>> >>>> >>>> >>>> This is very helpful. Thanks for pointing out the difference >>>> >>>> between >>>> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >>>> >>>> compile our core c++ code. >>>> >>>> >>>> >>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >>>> >>>> error while loading shared libraries: libmpc.so.2: cannot open >>>> >>>> shared >>>> >>>> object file: No such file or directory >>>> >>>> >>>> >>>> It looks like something is wrong with cc1plus. 
I suppose it's part >>>> >>>> of >>>> >>>> the g++? I don't know what it does. >>>> >>>> >>>> >>>> So we resolved the perl dependency issues, and we were able to >>>> >>>> compile >>>> >>>> the code with the default PrgEnv-pgi just for testing purposes. It >>>> >>>> seems we still have some issues with our new pipeline code. But I >>>> >>>> don't think we are very far from giving you a running example. >>>> >>>> >>>> >>>> Just in case you could help us with the gcc compilation issue, I >>>> >>>> have >>>> >>>> 777'ed my directory and here's the steps to compile the core C++ >>>> >>>> code: >>>> >>>> >>>> >>>> source >>>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >>>> >>>> cd >>>> >>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >>>> >>>> make >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Fangfang >>>> >>>> >>>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >>>> >>>> >>>> >>>>> Sounds great, thanks for the update, Fangfang. >>>> >>>>> >>>> >>>>> One question: what compiler are you using? >>>> >>>>> >>>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >>>> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats >>>> >>>>> because the GCC module will create code that we can run in >>>> >>>>> parallel, >>>> >>>>> multiple programs in parallel per compute node. The PrgEnv modules >>>> >>>>> all create code that expects to run only one program per node, >>>> >>>>> because its meant for MPI, OpenMP, etc). >>>> >>>>> >>>> >>>>> Also, I think that the gcc module (which I think includes gcc, g++ >>>> >>>>> and gfortran) may be more like the traditional Linux gcc than >>>> >>>>> PrgEnv-gnu. >>>> >>>>> >>>> >>>>> The default PrgEnv (at least for me) is pgi. 
So before i build >>>> >>>>> software I do: >>>> >>>>> >>>> >>>>> module unload PrgEnv-pgi >>>> >>>>> module load gcc >>>> >>>>> >>>> >>>>> Let me know if I can help; if you want i can try to build you a >>>> >>>>> libxml2 using gcc. >>>> >>>>> Same for Perl if it needs to be executed multiple copies per node >>>> >>>>> in >>>> >>>>> parallel. >>>> >>>>> >>>> >>>>> We can discuss more next week, and I'll be working off and on this >>>> >>>>> weekend. >>>> >>>>> >>>> >>>>> Regards, >>>> >>>>> >>>> >>>>> - Mike >>>> >>>>> >>>> >>>>> >>>> >>>>> ----- Original Message ----- >>>> >>>>>> From: "Fangfang Xia" >>>> >>>>>> To: "Michael Wilde" >>>> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >>>> >>>>>> >>>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >>>> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>> >>>>>> Hi Mike, >>>> >>>>>> >>>> >>>>>> We encountered some dependency issues while attempting to install >>>> >>>>>> some >>>> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >>>> >>>>>> systems >>>> >>>>>> folks to help install libxml2. I'm also looking into ways to >>>> >>>>>> install >>>> >>>>>> it in a user directory. I get the feeling that things should be >>>> >>>>>> resolved after our group meeting on Monday. So we'll keep you >>>> >>>>>> posted. >>>> >>>>>> >>>> >>>>>> Thanks, >>>> >>>>>> >>>> >>>>>> Fangfang >>>> >>>>>> >>>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >>>> >>>>>> >>>> >>>>>>> Hi Fangfang, Scott, >>>> >>>>>>> >>>> >>>>>>> Any progress - can I try it soon? >>>> >>>>>>> >>>> >>>>>>> Or, any problems that I can help with? Im at Argonne today >>>> >>>>>>> (5141) >>>> >>>>>>> if >>>> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
>>>> >>>>>>> >>>> >>>>>>> Regards, >>>> >>>>>>> >>>> >>>>>>> - Mike >>>> >>>>>>> >>>> >>>>>>> -- >>>> >>>>>>> Michael Wilde >>>> >>>>>>> Computation Institute, University of Chicago >>>> >>>>>>> Mathematics and Computer Science Division >>>> >>>>>>> Argonne National Laboratory >>>> >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>> >>>> -- >>>> Ketan From davidk at ci.uchicago.edu Mon Nov 14 08:53:33 2011 From: davidk at ci.uchicago.edu (David Kelly) Date: Mon, 14 Nov 2011 08:53:33 -0600 (CST) Subject: [Swift-devel] swift pbs/beagle broken In-Reply-To: <24844164.25319.1321238091049.JavaMail.root@zimbra.anl.gov> Message-ID: <1992201892.14905.1321282413788.JavaMail.root@zimbra-mb2.anl.gov> RC5 has been released and is available in the swift module on beagle David ----- Original Message ----- > David, can you build and post a new RC?
> (Do you know how to mark the release as 0.93RC5 per the method Justin > described in our last meeting? So that it shows up in the Swift log as > that release name...) > > Ketan, can you see if this now gets Fangfang rolling? > > Thanks, all. > > - Mike > > > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Michael Wilde" > > Cc: "Swift Devel" > > Sent: Sunday, November 13, 2011 6:05:15 PM > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > This fix works for me. I tested with one catsn job on Beagle. > > > > > > On Sun, Nov 13, 2011 at 7:48 PM, Michael Wilde < wilde at mcs.anl.gov > > > wrote: > > > > > > OK, here is a simple fix for this problem. Just add the variable > > "SWIFT_USERHOME" to your swift command; then do: > > > > export SWIFT_USERHOME=/lustre/beagle/wilde > > swift etc > > > > This makes swift use $SWIFT_USERHOME instead of $HOME to locate the > > .globus directory. > > > > This will of course mess up if a swift run needs to locate your > > certificates; possibly you can get around that with a symlink. But I > > suspect most uses of this will be for local execution on systems > > like > > Beagle with non-writeable home dirs. > > > > Here's the 1-line fix: > > > > login$ pwd > > /home/wilde/swift/src/0.93/cog/modules/swift/bin > > login$ svn diff > > Index: swift > > =================================================================== > > --- swift (revision 5284) > > +++ swift (working copy) > > @@ -86,6 +86,7 @@ > > updateOptions "$X509_USER_PROXY" "X509_USER_PROXY" > > updateOptions "$SWIFT_HOME" "COG_INSTALL_PATH" > > updateOptions "$SWIFT_HOME" "swift.home" > > +updateOptions "$SWIFT_USERHOME" "user.home" > > #Use /dev/urandom instead of /dev/random for seeding RNGs > > #This will lower the randomness of the seed, but avoid > > #large delays if /dev/random does not have enough entropy collected > > login$ > > > > If others can confirm that this works, I'll check it in. 
> > > > > > - Mike > > > > > > > > ----- Original Message ----- > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Sunday, November 13, 2011 3:08:36 PM > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > OK, as some of you can see in the message I just sent to beagle-support: > > > it now looks to me like the root problem of the swift jobs failing is > > > that our home dirs are not being seen on the compute nodes, hence the > > > swift-generated PBS script to launch the coaster workers can't find the > > > worker.pl script that swift copied to $HOME/.globus/coasters. > > > > > > This is what I see: > > > > > > The following was run under qsub -I; the line "total 0" shows that > > > /home/wilde was empty as seen by the compute node. > > > > > > login1$ aprun /bin/sh -c 'hostname; ls -l /home/wilde/; mount | grep home; ' > > > nid00466 > > > total 0 > > > /autonfs/home on /autonfs/home type dvs > > > (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5) > > > /autonfs/home on /autonfs/home type dvs > > > (ro,blksize=16384,nodename=c1-0c0s7n3:c4-0c0s2n0:c4-0c0s2n1:c4-0c0s2n2:c4-0c0s2n3,attrcache_timeout=14400,cache,nodatasync,noclosesync,retry,failover,userenv,clusterfs,killprocess,nobulk_rw,noatomic,nodeferopens,loadbalance,maxnodes=1,nnodes=5) > > > Application 863284 resources: utime ~0s, stime ~0s > > > login1$ > > > > > > Can anyone verify that they are seeing the same symptom?
> > > > > > Thanks, > > > > > > - Mike > > > > > > > > > ----- Original Message ----- > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > Sent: Sunday, November 13, 2011 2:41:36 PM > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > I tracked the message below down to the fact that aprun doesn't like > > > > "&" in its command string. I vaguely recall reporting something > > > > similar to Cray way back, and they agreed it's a bug. > > > > > > > > But it seems that the *original* Swift command string did not have a > > > > "&" in it, so I'm back to square one. > > > > > > > > - Mike > > > > > > > > ----- Original Message ----- > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > Sent: Sunday, November 13, 2011 1:52:58 PM > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > It's starting to look like some kind of aprun-based failure. I see this > > > > > from more detailed logging I put into the generated script: > > > > > > > > > > IN .submit script > > > > > aprun: Unexpected close of the apsys control connection > > > > > aprun: Exiting due to errors. Application aborted > > > > > aprun rc 1 > > > > > > > > > > I was led off track by the fact that the exitcode file is missing. > > > > > It seems that it's generated but then removed before we can see it. I > > > > > suspect one part of the provider thinks the worker-launch job > > > > > succeeded, and hence removes the exitcode file, but another part > > > > > realizes that the job failed. (conjecture...)
> > > > > > > > > > Now that that part is partially explained, I think I can go back to > > > > > debugging this from manual qsubs, which should go faster. > > > > > > > > > > I'm still unsure if the missing stdout/err files are due to a Beagle > > > > > issue; it's starting to look more like they may be due to the weird way in which > > > > > the aprun dies. > > > > > > > > > > Digging deeper... > > > > > > > > > > - Mike > > > > > > > > > > ----- Original Message ----- > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > Sent: Sunday, November 13, 2011 7:51:57 AM > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > I've backed up and just did a test from swift (automatic) > > > > > > > > > > > > I see that in that case I am *not* getting an exitcode file. > > > > > > Are you getting one? > > > > > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > Sent: Sunday, November 13, 2011 7:45:05 AM > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > But if you put an explicit output redirection in the /bin/sh -c > > > > > > > command, you will see that those commands are indeed executing and > > > > > > > generating output. > > > > > > > > > > > > > > So like I mentioned earlier, I don't know if the qsub -o and -e flags > > > > > > > have changed behavior (e.g. they now can't write to /home???), or if we > > > > > > > are using them incorrectly.
> > > > > > > > > > > > > > But I think we need to go backwards and see why this is > > > > > > > not > > > > > > > working > > > > > > > with the swift-generated qsub files. > > > > > > > > > > > > > > We should next add the two tags to the sites file to > > > > > > > obtain > > > > > > > a > > > > > > > log > > > > > > > from > > > > > > > the worker, on the (untested!) assumption that the worker > > > > > > > is > > > > > > > really > > > > > > > starting in the automatic swift case: > > > > > > > > > > > > > > > > > > > > key="workerLoggingLevel">DEBUG > > > > > > > > > > > > > key="workerLoggingDirectory">/lustre/beagle/wilde/beagle > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > Sent: Sunday, November 13, 2011 7:35:24 AM > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > On Sun, Nov 13, 2011 at 9:28 AM, Michael Wilde < > > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > 2 thoughts here, Ketan: > > > > > > > > > > > > > > > > - when I tried my manual coaster test, I replaced the > > > > > > > > options > > > > > > > > "-n > > > > > > > > 3 > > > > > > > > -N > > > > > > > > 1 -cc none -d 24 -F exclusive" on aprun with simply "-B" > > > > > > > > which > > > > > > > > says > > > > > > > > "use the options from qsub". I was going to go back and > > > > > > > > see > > > > > > > > if > > > > > > > > there > > > > > > > > was some subtle new mismatch between these qsub and > > > > > > > > aprun > > > > > > > > processor-layout options. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I tried the -B option: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > > #CoG by class: class > > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > > #PBS -N Block-1113-1602 > > > > > > > > #PBS -m n > > > > > > > > #PBS -A CI-DEB000002 > > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > > #PBS -o > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > > #PBS -e > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > > cd / && aprun -B /bin/sh -c /bin/date > > > > > > > > /bin/echo $? > > > > > > > > >/home/ketan/.globus/scripts/PBS2583661693904024220.submit.exitcode > > > > > > > > > > > > > > > > > > > > > > > > And see the same behavior. The exitcode file is indeed > > > > > > > > updated > > > > > > > > each > > > > > > > > time with a code 0. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > - I realized that manually testing the swift-generated > > > > > > > > submit > > > > > > > > file > > > > > > > > will give new errors because the swift service is no > > > > > > > > longer > > > > > > > > alive > > > > > > > > and > > > > > > > > listening on the port that the worker will try to > > > > > > > > connect > > > > > > > > to. > > > > > > > > Also, > > > > > > > > it > > > > > > > > seemed that the .pl file itself that automatic coaster > > > > > > > > bootstrap > > > > > > > > places in ~/.globus/coasters was not there. 
Im assuming > > > > > > > > that > > > > > > > > Swift > > > > > > > > removes these files when it exits, but need to verify > > > > > > > > that > > > > > > > > this > > > > > > > > is > > > > > > > > true and that the failure is not due to a missing .pl > > > > > > > > file. > > > > > > > > I > > > > > > > > suspect > > > > > > > > that this is normal and is not the problem, but again, > > > > > > > > we > > > > > > > > need > > > > > > > > to > > > > > > > > keep > > > > > > > > debugging until the root cause is found. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > > > > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Sent: Sunday, November 13, 2011 7:20:25 AM > > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > > I tried with a simple /bin/date command at the end of > > > > > > > > > the > > > > > > > > > submit > > > > > > > > > script removing the call to worker.pl : > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > #CoG This script generated by CoG > > > > > > > > > #CoG by class: class > > > > > > > > > org.globus.cog.abstraction.impl.scheduler.pbs.PBSExecutor > > > > > > > > > #CoG on date: 2011/11/13 02:16:54 > > > > > > > > > > > > > > > > > > > > > > > > > > > #PBS -S /bin/bash > > > > > > > > > #PBS -N Block-1113-1602 > > > > > > > > > #PBS -m n > > > > > > > > > #PBS -A CI-DEB000002 > > > > > > > > > #PBS -l mppwidth=3,mppnppn=1,mppdepth=24 > > > > > > > > > #PBS -l walltime=00:10:00 > > > > > > > > > #PBS -o > > > > > > > > > /home/ketan/.globus/scripts/PBS2583661693904024220.submit.stdout > > > > > > > > > #PBS -e > > > > > > > > > 
/home/ketan/.globus/scripts/PBS2583661693904024220.submit.stderr > > > > > > > > > WORKER_LOGGING_LEVEL=NONE > > > > > > > > > #PBS -v WORKER_LOGGING_LEVEL > > > > > > > > > cd / && aprun -n 3 -N 1 -cc none -d 24 -F exclusive > > > > > > > > > /bin/sh > > > > > > > > > -c > > > > > > > > > /bin/date > > > > > > > > > > > > > > > > > > > > > > > > > > > ======= > > > > > > > > > > > > > > > > > > > > > > > > > > > This fails too. The queue cancels the job as soon as > > > > > > > > > it > > > > > > > > > starts > > > > > > > > > running, without writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Sun, Nov 13, 2011 at 12:54 AM, Michael Wilde < > > > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > OK, I dont need these; I can reproduce the problem as > > > > > > > > > well. > > > > > > > > > > > > > > > > > > For some reason, the coaster worker is exiting > > > > > > > > > immediately. > > > > > > > > > > > > > > > > > > I see a few possibilities: > > > > > > > > > > > > > > > > > > - Beagle networking may have changed, making it no > > > > > > > > > longer > > > > > > > > > possible > > > > > > > > > to > > > > > > > > > reach the coaster service from the compute nodes using > > > > > > > > > the > > > > > > > > > previous > > > > > > > > > IP > > > > > > > > > address ranges. 
> > > > > > > > > > > > > > > > > > - the worker.pl script is not being created in > > > > > > > > > $HOME/.globus/coasters > > > > > > > > > > > > > > > > > > Mike > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ----- Original Message ----- > > > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > > > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > > > > > > Sent: Saturday, November 12, 2011 8:39:36 PM > > > > > > > > > > Subject: Re: [Swift-devel] swift pbs/beagle broken > > > > > > > > > > Ketan, can you post the submit script and site file? > > > > > > > > > > > > > > > > > > > > On 11/12/11, Ketan Maheshwari < > > > > > > > > > > ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > Hi, > > > > > > > > > > > > > > > > > > > > > > It seems the pbs-coaster provider (local:pbs) is > > > > > > > > > > > broken > > > > > > > > > > > for > > > > > > > > > > > swift. > > > > > > > > > > > I > > > > > > > > > > > tried > > > > > > > > > > > swift trunk, 0.93 svn branch, 0.93RC3 and 0.93RC4 > > > > > > > > > > > but > > > > > > > > > > > getting > > > > > > > > > > > the > > > > > > > > > > > same > > > > > > > > > > > response: > > > > > > > > > > > > > > > > > > > > > > Swift svn swift-r5205 cog-r3293 > > > > > > > > > > > > > > > > > > > > > > RunID: 20111113-0216-1d35h7eb > > > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:16:54 +0000 > > > > > > > > > > > site setting workersPerNode has been replaced with > > > > > > > > > > > jobsPerNode! 
> > > > > > > > > > > Progress: time: Sun, 13 Nov 2011 02:17:05 +0000 > > > > > > > > > > > Active:1 > > > > > > > > > > > Failed to transfer wrapper log for job > > > > > > > > > > > cat-1hg8aoik > > > > > > > > > > > Exception in cat: > > > > > > > > > > > Arguments: [data.txt] > > > > > > > > > > > Host: pbs > > > > > > > > > > > Directory: > > > > > > > > > > > catsn-20111113-0216-1d35h7eb/jobs/1/cat-1hg8aoik > > > > > > > > > > > stderr.txt: > > > > > > > > > > > > > > > > > > > > > > stdout.txt: > > > > > > > > > > > > > > > > > > > > > > ---- > > > > > > > > > > > > > > > > > > > > > > Caused by: Task failed: 1113-160254-000000 Block > > > > > > > > > > > task > > > > > > > > > > > ended > > > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Final status: time: Sun, 13 Nov 2011 02:17:05 > > > > > > > > > > > +0000 > > > > > > > > > > > Failed:1 > > > > > > > > > > > The following errors have occurred: > > > > > > > > > > > 1. Task failed: 1113-160254-000000 Block task > > > > > > > > > > > ended > > > > > > > > > > > prematurely > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Trying the submit script outside of swift also > > > > > > > > > > > does > > > > > > > > > > > not > > > > > > > > > > > seem > > > > > > > > > > > to > > > > > > > > > > > be > > > > > > > > > > > working. > > > > > > > > > > > The scripts get submitted to the queue and > > > > > > > > > > > immediately > > > > > > > > > > > exits > > > > > > > > > > > without > > > > > > > > > > > writing anything to stdout or stderr. > > > > > > > > > > > > > > > > > > > > > > Were there any recent changes that could have > > > > > > > > > > > affected > > > > > > > > > > > this? > > > > > > > > > > > > > > > > > > > > > > I remember to have tried this successfully in the > > > > > > > > > > > last > > > > > > > > > > > week > > > > > > > > > > > of > > > > > > > > > > > last > > > > > > > > > > > month. 
> > > > > > > > > > > Regards, > > > > > > > > > > > -- > > > > > > > > > > > Ketan > > > > > > > > > > -- > > > > > > > > > > Sent from my mobile device > > > > > > > > > > _______________________________________________ > > > > > > > > > > Swift-devel mailing list > > > > > > > > > > Swift-devel at ci.uchicago.edu > > > > > > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > -- > > > > > > > > > Michael Wilde > > > > > > > > > Computation Institute, University of Chicago > > > > > > > > > Mathematics and Computer Science Division > > > > > > > > > Argonne National Laboratory From ketancmaheshwari at gmail.com Mon Nov 14 08:55:04 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 14 Nov 2011 08:55:04 -0600 Subject: [Swift-devel] User has problem with PBS provider on Beagle - Fwd: Swift question In-Reply-To: <1F284688-8A70-4B53-854A-2C5FE294EDFF@gmail.com> References: <818D3B69-4716-40B9-B4CD-422A62D56787@gmail.com> <1285627078.23656.1321126262196.JavaMail.root@zimbra.anl.gov> <7C4E0EA1-9DC4-404A-8C4A-16B9B011EADF@gmail.com> <832E99F6-26FF-40E4-81C1-DCB19728D339@gmail.com> <2F4CFB07-EADD-4A05-8D99-AFC3E7D99329@gmail.com>
<1F284688-8A70-4B53-854A-2C5FE294EDFF@gmail.com> Message-ID: Excellent. Feel free to come by my desk for any discussion if you are here at Argonne. I am on fifth floor in front of #5141. Regards, Ketan On Mon, Nov 14, 2011 at 8:53 AM, Fangfang Xia wrote: > Hi Ketan and Mike, > > Thanks for your help. This works wonderfully. Now I'll go ahead and play > with the ModelSEED pipelines. > > Fangfang > > On Nov 14, 2011, at 8:35 AM, Ketan Maheshwari wrote: > > Hello Fangfang, > > Mike discovered yesterday that there was a problem in the mounting of > /home filesystem from workernodes. Later, Glen narrowed it down to the fact > that the /home filesystem is being mounted with incorrect permissions > (755). We think, this issue might have come up after a recent Beagle > maintenance. We do not know at the moment if this is an intentional change. > > Mike implemented a fix for a workaround to this issue which effectively > makes use of /lustre/beagle/ filesystem to store and run the pbs > submit scripts managed by swift. > > To use the fix, kindly add the following line before your swift > commandline: > > === > export SWIFT_USERHOME=/lustre/beagle/fangfang > === > > After this your /lustre home will be treated as the location for swift > managed scripts. > > I've tested this on Beagle and has worked to completion. > > Thanks for your patience. > > Best Regards, > Ketan > > On Sat, Nov 12, 2011 at 4:31 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com> wrote: > >> Hello Fangfang, >> >> I logged out of my screen session before my test completed. Now, I could >> see that the run I tested was submitted but in the end I saw the same >> message. >> >> I am looking into this and will get back to you soon. >> >> Regards, >> Ketan >> >> >> On Sat, Nov 12, 2011 at 3:55 PM, Fangfang Xia wrote: >> >>> Thanks Mike and Ketan. >>> >>> I think it's nice that Swift is preinstalled on Beagle. 
When I do >>> "module load swift", I get "Swift version swift-0.93RC3 loaded"; is that >>> the latest Swift? >>> >>> This time I get a command line error: >>> >>> login2:catsn > swift -config cf -tc.file tc -sites.file sites.xml >>> catsn.swift -n=2 >>> Swift svn swift-r5205 cog-r3293 >>> >>> RunID: 20111112-2151-rbu7ivof >>> Progress: time: Sat, 12 Nov 2011 21:51:47 +0000 >>> Progress: time: Sat, 12 Nov 2011 21:51:59 +0000 Submitted:1 Active:1 >>> Progress: time: Sat, 12 Nov 2011 21:52:10 +0000 Submitted:1 Active:1 >>> Exception in cat: >>> Arguments: [data.txt] >>> Host: pbs >>> Directory: catsn-20111112-2151-rbu7ivof/jobs/v/cat-vbgmznik >>> - - - >>> >>> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely >>> >>> >>> Exception in cat: >>> Arguments: [data.txt] >>> Host: pbs >>> Directory: catsn-20111112-2151-rbu7ivof/jobs/t/cat-tbgmznik >>> - - - >>> >>> Caused by: Task failed: 1112-510947-000001 Block task ended prematurely >>> >>> >>> Final status: time: Sat, 12 Nov 2011 21:52:10 +0000 Failed:2 >>> The following errors have occurred: >>> 1. Task failed: 1112-510947-000001 Block task ended prematurely (2 times) >>> >>> >>> On Nov 12, 2011, at 3:43 PM, Ketan Maheshwari wrote: >>> >>> Hello Fangfang, >>> >>> Sorry, I made a mistake in the new line, in place of key="ppn", it >>> should be key="providerAttributes". >>> >>> So the line should be as follows: >>> >>> >> key="providerAttributes">pbs.aprun;pbs.mpp;depth=24 >>> >>> I just tested this on Beagle and it works now. >>> >>> Regards, >>> Ketan >>> >>> On Sat, Nov 12, 2011 at 2:30 PM, Fangfang Xia wrote: >>> >>>> Thanks. The "Illegal value for ppn" line seems to persist in the log. >>>> >>>> >>>> >>>> On Nov 12, 2011, at 2:21 PM, Ketan Maheshwari wrote: >>>> >>>> Hello Fangfang, >>>> >>>> Could you replace the following line: >>>> 24:cray:pack >>>> >>>> with this one: >>>> >>> key="ppn">pbs.aprun;pbs.mpp;depth=24 >>>> >>>> in your sites.xml. 
>>>> >>>> The line you have is obsoleted form from the 0.92 version of Swift. >>>> >>>> It should work now. >>>> >>>> Regards, >>>> Ketan >>>> >>>> >>>> On Sat, Nov 12, 2011 at 2:07 PM, Fangfang Xia wrote: >>>> >>>>> Hi Ketan, >>>>> >>>>> Thanks for getting back to me so promptly. I have attached the log >>>>> file, and here's the content of sites.xml: >>>>> >>>>> >>>>> >>>>> >>>>> CI-DEB000002 >>>>> >>>>> 24:cray:pack >>>>> >>>>> 24 >>>>> 1000 >>>>> 1 >>>>> 1 >>>>> 1 >>>>> >>>>> .63 >>>>> 10000 >>>>> >>>>> >>>>> >>>> >/lustre/beagle/fangfang/swift-lab/swift.workdir >>>>> >>>>> >>>>> >>>>> There's no error message on the command line. >>>>> >>>>> >>>>> >>>>> On Nov 12, 2011, at 2:02 PM, Ketan Maheshwari wrote: >>>>> >>>>> Hello Fangfang, >>>>> >>>>> The log file does not seem to be found. Could you attach it please. >>>>> >>>>> From this line: >>>>> Illegal value for ppn. Must be an integer. >>>>> >>>>> Looks like the sites file is not configured well for the pbs provider. >>>>> Could you post your sites.xml. >>>>> >>>>> Were there any error messages on commandline? >>>>> >>>>> Regards, >>>>> Ketan >>>>> >>>>> On Sat, Nov 12, 2011 at 1:31 PM, Michael Wilde wrote: >>>>> >>>>>> Can the first person who has time try to address the problem below? >>>>>> Im about to head to SC. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> - Mike >>>>>> >>>>>> >>>>>> ----- Forwarded Message ----- >>>>>> From: "Fangfang Xia" >>>>>> To: "Michael Wilde" >>>>>> Cc: "Ketan Maheshwari" , "Scott Devoid" < >>>>>> devoid at ci.uchicago.edu> >>>>>> Sent: Saturday, November 12, 2011 1:27:29 PM >>>>>> Subject: Swift question >>>>>> >>>>>> Hi Mike and Ketan, >>>>>> >>>>>> Thanks for the guide. 
I tried to follow the "cat" example, and got >>>>>> the following error: >>>>>> >>>>>> 2011-11-12 19:21:06,510+0000 DEBUG AbstractExecutor Writing PBS >>>>>> script to /home/fangfang/.globus/scripts/PBS6954924010553344333.submit >>>>>> 2011-11-12 19:21:06,521+0000 DEBUG PBSExecutor PBS name: for: >>>>>> Block-1112-210706-000000 is: Block-1112-2107 >>>>>> 2011-11-12 19:21:06,521+0000 INFO BlockTaskSubmitter Error >>>>>> submitting block task: Cannot submit job: Illegal value for ppn. Must be an >>>>>> integer. >>>>>> 2011-11-12 19:21:16,429+0000 INFO TaskNotifier Congestion queue >>>>>> size: 0 >>>>>> >>>>>> I looked at the PBS script and somehow it's blank. I have attached >>>>>> the full log file. Could you please take a look and let me know how to >>>>>> proceed? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Fangfang >>>>>> >>>>>> On Nov 8, 2011, at 12:42 PM, Michael Wilde wrote: >>>>>> >>>>>> > Hi Fangfang, Scott, >>>>>> > >>>>>> > Sorry for the late reply! I think the best roadmap to follow is >>>>>> this: >>>>>> > >>>>>> > - try running the sample tutorial Swift script on Beagle using the >>>>>> instructions posted at: >>>>>> > >>>>>> > >>>>>> http://www.ci.uchicago.edu/swift/wwwdev/guides/release-0.93/siteguide/siteguide.html#_beagle >>>>>> > >>>>>> > This tiny tutorial contains a simple Swift script that does N "cat" >>>>>> commands in parallel to "process" an input file and create an output file. >>>>>> It contains all the related config files you need to run on Beagle, and is >>>>>> thus a good "Hello World" application. You can then copy catsn.swift to >>>>>> create the first Swift script to run your actual applications. >>>>>> > >>>>>> > - set up a face-to-face meeting with Ketan Maheshwari, the Beagle >>>>>> Catalyst for Swift applications. Ketan is based here at Argonne, on the 5th >>>>>> floor near my office, 5141. Ketan can help answer any questions you have, >>>>>> and will be your personal contact to help you make good use of Beagle.
>>>>>> > >>>>>> > - then do your first Model-SEED script based on catsn.swift, first >>>>>> with N = 1 to just ensure that you have described your app's command >>>>>> line(s) correctly to Swift and that the app is getting invoked and >>>>>> returning output correctly. >>>>>> > >>>>>> > - then, with help from Ketan as needed, start scaling up to >>>>>> increasingly larger runs. >>>>>> > >>>>>> > I'll try to stay close in the loop and help out as needed. >>>>>> > >>>>>> > Do you have any questions I can answer to get started? If you are >>>>>> at Argonne and available today, perhaps I can join you and Ketan in an >>>>>> introductory meeting. I'm free from 3 to 4:40 today or after 5:30. >>>>>> Otherwise, please do this at your joint convenience. >>>>>> > >>>>>> > Regards, >>>>>> > >>>>>> > - Mike >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > ----- Original Message ----- >>>>>> >> From: "Fangfang Xia" >>>>>> >> To: "Michael Wilde" >>>>>> >> Sent: Monday, October 31, 2011 12:44:23 PM >>>>>> >> Subject: Re: How is install/test of Model SEED on Beagle going? >>>>>> >> Hi Mike, >>>>>> >> >>>>>> >> We got two types of flux balance analysis to run on beagle. I was >>>>>> >> wondering if we should test them with Swift to see if things scale. >>>>>> >> Both operations take about 40 seconds to run on sandbox. Ideally we >>>>>> >> should also test two more expensive computations "fba single >>>>>> knockouts" >>>>>> >> and "gapfilling", but I won't be able to resolve the problems with >>>>>> >> those until I meet with Chris this week. >>>>>> >> >>>>>> >> source /lustre/beagle/fangfang/Model-SEED-core/bin/source-me.sh >>>>>> >> >>>>>> >> fbacheckgrowth -model iJR904.16242 >>>>>> >> fbafva -model iJR904.16242 >>>>>> >> >>>>>> >> You can find the descriptions of these tools at: >>>>>> >> http://bionet.mcs.anl.gov/index.php/Using_the_Model_SEED >>>>>> >> >>>>>> >> I've been switching between PrgEnv-pgi/gcc to get perl modules and >>>>>> >> mfatoolkit to compile.
And I still seem to be getting the cc1plus >>>>>> >> error with gcc which you don't have. So if this version doesn't >>>>>> work >>>>>> >> well on multiple processors, I'll need your help with recompiling >>>>>> my >>>>>> >> updated mfatoolkit in >>>>>> >> >>>>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit. >>>>>> >> >>>>>> >> I have 777'ed my /lustre/beagle/fangfang/ModelSEED/ directory in >>>>>> case >>>>>> >> you need to test something there. >>>>>> >> >>>>>> >> Thanks, >>>>>> >> Fangfang >>>>>> >> >>>>>> >> On Oct 24, 2011, at 11:14 PM, Michael Wilde wrote: >>>>>> >> >>>>>> >>> Hi Fangfang, >>>>>> >>> >>>>>> >>> I was able to build that directory using the gcc module; I past >>>>>> the >>>>>> >>> make output below. It gave many warnings, but I did not get the >>>>>> >>> cc1plus libmpc.so error that you encountered. >>>>>> >>> >>>>>> >>> My build is in $HOME/wilde/mfatoolkit >>>>>> >>> >>>>>> >>> I ran this on sandbox.beagle.ci.uchicago.edu. >>>>>> >>> >>>>>> >>> - Mike >>>>>> >>> >>>>>> >>> ---- make output: >>>>>> >>> >>>>>> >>> sandbox$ make >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/driver.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/MFAProblem.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> 
/home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>>> >>> 'int MFAProblem::ModifyInputConstraints(ConstraintsToModify*, >>>>>> >>> Data*)': >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1828:16: warning: >>>>>> NULL >>>>>> >>> used in arithmetic >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:1832:16: warning: >>>>>> NULL >>>>>> >>> used in arithmetic >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>>> >>> 'int MFAProblem::FluxCouplingAnalysis(Data*, >>>>>> OptimizationParameter*, >>>>>> >>> bool, std::string&, bool)': >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:4900:46: warning: >>>>>> >>> converting 'false' to pointer type for argument 1 of >>>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5023:47: warning: >>>>>> >>> converting 'false' to pointer type for argument 1 of >>>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5212:47: warning: >>>>>> >>> converting 'false' to pointer type for argument 1 of >>>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>>> >>> _CharT*, const _Alloc&) [with _CharT = char, _Traits = >>>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>>> >>> 'int MFAProblem::IdentifyReactionLoops(Data*, >>>>>> >>> OptimizationParameter*)': >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:5989:43: warning: >>>>>> >>> converting 'false' to pointer type for argument 1 of >>>>>> >>> 'std::basic_string<_CharT, _Traits, _Alloc>::basic_string(const >>>>>> >>> _CharT*, 
const _Alloc&) [with _CharT = char, _Traits = >>>>>> >>> std::char_traits, _Alloc = std::allocator]' >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp: In member function >>>>>> >>> 'int MFAProblem::ParseRegExp(OptimizationParameter*, Data*, >>>>>> >>> std::string)': >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.cpp:7984:10: warning: >>>>>> >>> converting to non-pointer type 'int' from NULL >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/CPLEXapi.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/SCIPapi.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/GLPKapi.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/SolverInterface.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/Species.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp: In member function >>>>>> 'void >>>>>> >>> Species::AddpKab(std::string, bool)': >>>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>>> >>> std::allocator, value_type = int]' >>>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:189:28: warning: passing >>>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>>> >>> std::allocator, value_type = int]' >>>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>>> >>> std::allocator, 
value_type = int]' >>>>>> >>> /home/wilde/mfatoolkit/Source/Species.cpp:196:28: warning: passing >>>>>> >>> NULL to non-pointer argument 1 of 'void std::vector<_Tp, >>>>>> >>> _Alloc>::push_back(const value_type&) [with _Tp = int, _Alloc = >>>>>> >>> std::allocator, value_type = int]' >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/Data.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> /home/wilde/mfatoolkit/Source/Data.cpp:2220:43: warning: >>>>>> >>> multi-character character constant >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/InterfaceFunctions.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/Identity.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> 
-I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/Reaction.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/GlobalFunctions.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/AtomCPP.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/UtilityFunctions.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/AtomType.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD 
-DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/Gene.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/GeneInterval.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -c /home/wilde/mfatoolkit/Source/stringDB.cpp; mv *.o >>>>>> >>> /home/wilde/mfatoolkit/Source >>>>>> >>> g++ -O3 -fPIC -fexceptions -DNDEBUG -DIL_STD -DILOSTRICTPOD >>>>>> -DLINUX >>>>>> >>> -I../Include/ -DNOSAFEMEM -DNOBLOCKMEM >>>>>> >>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/include >>>>>> >>> >>>>>> -I/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/include/ilcplex/ >>>>>> >>> -o /home/wilde/mfatoolkit/Linux/mfatoolkit >>>>>> >>> /home/wilde/mfatoolkit/Source/driver.o >>>>>> >>> /home/wilde/mfatoolkit/Source/MFAProblem.o >>>>>> >>> /home/wilde/mfatoolkit/Source/CPLEXapi.o >>>>>> >>> /home/wilde/mfatoolkit/Source/SCIPapi.o >>>>>> >>> /home/wilde/mfatoolkit/Source/GLPKapi.o >>>>>> >>> /home/wilde/mfatoolkit/Source/LINDOapiEMPTY.o >>>>>> >>> /home/wilde/mfatoolkit/Source/SolverInterface.o >>>>>> >>> 
/home/wilde/mfatoolkit/Source/Species.o >>>>>> >>> /home/wilde/mfatoolkit/Source/Data.o >>>>>> >>> /home/wilde/mfatoolkit/Source/InterfaceFunctions.o >>>>>> >>> /home/wilde/mfatoolkit/Source/Identity.o >>>>>> >>> /home/wilde/mfatoolkit/Source/Reaction.o >>>>>> >>> /home/wilde/mfatoolkit/Source/GlobalFunctions.o >>>>>> >>> /home/wilde/mfatoolkit/Source/AtomCPP.o >>>>>> >>> /home/wilde/mfatoolkit/Source/UtilityFunctions.o >>>>>> >>> /home/wilde/mfatoolkit/Source/AtomType.o >>>>>> >>> /home/wilde/mfatoolkit/Source/Gene.o >>>>>> >>> /home/wilde/mfatoolkit/Source/GeneInterval.o >>>>>> >>> /home/wilde/mfatoolkit/Source/stringDB.o >>>>>> >>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/lib -lglpk >>>>>> >>> >>>>>> -L/lustre/beagle/fangfang/ModelSEED/Solvers/ILOG/CPLEX_Studio_AcademicResearch122/cplex/lib/x86-64_sles10_4.1/static_pic >>>>>> >>> -lcplex -lm -lpthread -lz >>>>>> >>> sandbox$ >>>>>> >>> >>>>>> >>> >>>>>> >>> ----- Original Message ----- >>>>>> >>>> From: "Fangfang Xia" >>>>>> >>>> To: "Michael Wilde" >>>>>> >>>> Cc: "Scott Devoid" >>>>>> >>>> Sent: Monday, October 24, 2011 5:20:20 PM >>>>>> >>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>>>> >>>> Hi Mike, >>>>>> >>>> >>>>>> >>>> This is very helpful. Thanks for pointing out the difference >>>>>> >>>> between >>>>>> >>>> PrgEnv-pgi and gcc. Here's an error message we got when trying to >>>>>> >>>> compile our core c++ code. >>>>>> >>>> >>>>>> >>>> >>>>>> /opt/gcc/4.5.2/snos/libexec/gcc/x86_64-suse-linux/default/cc1plus: >>>>>> >>>> error while loading shared libraries: libmpc.so.2: cannot open >>>>>> >>>> shared >>>>>> >>>> object file: No such file or directory >>>>>> >>>> >>>>>> >>>> It looks like something is wrong with cc1plus. I suppose it's >>>>>> part >>>>>> >>>> of >>>>>> >>>> the g++? I don't know what it does. 
>>>>>> >>>> >>>>>> >>>> So we resolved the perl dependency issues, and we were able to >>>>>> >>>> compile >>>>>> >>>> the code with the default PrgEnv-pgi just for testing purposes. >>>>>> It >>>>>> >>>> seems we still have some issues with our new pipeline code. But I >>>>>> >>>> don't think we are very far from giving you a running example. >>>>>> >>>> >>>>>> >>>> Just in case you could help us with the gcc compilation issue, I >>>>>> >>>> have >>>>>> >>>> 777'ed my directory and here's the steps to compile the core C++ >>>>>> >>>> code: >>>>>> >>>> >>>>>> >>>> source >>>>>> >>>> >>>>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/bin/source-me.sh >>>>>> >>>> cd >>>>>> >>>> >>>>>> /lustre/beagle/fangfang/ModelSEED/Model-SEED-core/software/mfatoolkit/Linux >>>>>> >>>> make >>>>>> >>>> >>>>>> >>>> Thanks, >>>>>> >>>> Fangfang >>>>>> >>>> >>>>>> >>>> On Oct 22, 2011, at 1:10 PM, Michael Wilde wrote: >>>>>> >>>> >>>>>> >>>>> Sounds great, thanks for the update, Fangfang. >>>>>> >>>>> >>>>>> >>>>> One question: what compiler are you using? >>>>>> >>>>> >>>>>> >>>>> I'd like to suggest, for the first pass, that you use the "gcc" >>>>>> >>>>> module (rather than the PrgEnv-pgi or PrgEnv-gnu modules). Thats >>>>>> >>>>> because the GCC module will create code that we can run in >>>>>> >>>>> parallel, >>>>>> >>>>> multiple programs in parallel per compute node. The PrgEnv >>>>>> modules >>>>>> >>>>> all create code that expects to run only one program per node, >>>>>> >>>>> because its meant for MPI, OpenMP, etc). >>>>>> >>>>> >>>>>> >>>>> Also, I think that the gcc module (which I think includes gcc, >>>>>> g++ >>>>>> >>>>> and gfortran) may be more like the traditional Linux gcc than >>>>>> >>>>> PrgEnv-gnu. >>>>>> >>>>> >>>>>> >>>>> The default PrgEnv (at least for me) is pgi. 
So before i build >>>>>> >>>>> software I do: >>>>>> >>>>> >>>>>> >>>>> module unload PrgEnv-pgi >>>>>> >>>>> module load gcc >>>>>> >>>>> >>>>>> >>>>> Let me know if I can help; if you want i can try to build you a >>>>>> >>>>> libxml2 using gcc. >>>>>> >>>>> Same for Perl if it needs to be executed multiple copies per >>>>>> node >>>>>> >>>>> in >>>>>> >>>>> parallel. >>>>>> >>>>> >>>>>> >>>>> We can discuss more next week, and I'll be working off and on >>>>>> this >>>>>> >>>>> weekend. >>>>>> >>>>> >>>>>> >>>>> Regards, >>>>>> >>>>> >>>>>> >>>>> - Mike >>>>>> >>>>> >>>>>> >>>>> >>>>>> >>>>> ----- Original Message ----- >>>>>> >>>>>> From: "Fangfang Xia" >>>>>> >>>>>> To: "Michael Wilde" >>>>>> >>>>>> Cc: "Fangfang Xia" , "Scott Devoid" >>>>>> >>>>>> >>>>>> >>>>>> Sent: Saturday, October 22, 2011 12:39:05 PM >>>>>> >>>>>> Subject: Re: How is install/test of Model SEED on Beagle going? >>>>>> >>>>>> Hi Mike, >>>>>> >>>>>> >>>>>> >>>>>> We encountered some dependency issues while attempting to >>>>>> install >>>>>> >>>>>> some >>>>>> >>>>>> additional Perl libraries for ModelSEED. We have asked Beagle >>>>>> >>>>>> systems >>>>>> >>>>>> folks to help install libxml2. I'm also looking into ways to >>>>>> >>>>>> install >>>>>> >>>>>> it in a user directory. I get the feeling that things should be >>>>>> >>>>>> resolved after our group meeting on Monday. So we'll keep you >>>>>> >>>>>> posted. >>>>>> >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> >>>>>> >>>>>> Fangfang >>>>>> >>>>>> >>>>>> >>>>>> On Oct 21, 2011, at 2:08 PM, Michael Wilde wrote: >>>>>> >>>>>> >>>>>> >>>>>>> Hi Fangfang, Scott, >>>>>> >>>>>>> >>>>>> >>>>>>> Any progress - can I try it soon? >>>>>> >>>>>>> >>>>>> >>>>>>> Or, any problems that I can help with? Im at Argonne today >>>>>> >>>>>>> (5141) >>>>>> >>>>>>> if >>>>>> >>>>>>> I can help or you'd like to talk. Free except for 3:30 - 4:30. 
>>>>>> >>>>>>> >>>>>> >>>>>>> Regards, >>>>>> >>>>>>> >>>>>> >>>>>>> - Mike >>>>>> >>>>>>> >>>>>> >>>>>>> >>>>>> >>>>>>> -- >>>>>> >>>>>>> Michael Wilde >>>>>> >>>>>>> Computation Institute, University of Chicago >>>>>> >>>>>>> Mathematics and Computer Science Division >>>>>> >>>>>>> Argonne National Laboratory >>>>>> >>>>>>> >>>>>> >>>>> >>>>>> >>>>> -- >>>>>> >>>>> Michael Wilde >>>>>> >>>>> Computation Institute, University of Chicago >>>>>> >>>>> Mathematics and Computer Science Division >>>>>> >>>>> Argonne National Laboratory >>>>>> >>>>> >>>>>> >>> >>>>>> >>> -- >>>>>> >>> Michael Wilde >>>>>> >>> Computation Institute, University of Chicago >>>>>> >>> Mathematics and Computer Science Division >>>>>> >>> Argonne National Laboratory >>>>>> >>> >>>>>> > >>>>>> > -- >>>>>> > Michael Wilde >>>>>> > Computation Institute, University of Chicago >>>>>> > Mathematics and Computer Science Division >>>>>> > Argonne National Laboratory >>>>>> > >>>>>> >>>>>> >>>>>> -- >>>>>> Michael Wilde >>>>>> Computation Institute, University of Chicago >>>>>> Mathematics and Computer Science Division >>>>>> Argonne National Laboratory >>>>>> >>>>>> _______________________________________________ >>>>>> Swift-devel mailing list >>>>>> Swift-devel at ci.uchicago.edu >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Ketan >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> Ketan >>>> >>>> >>>> >>>> >>>> >>> >>> >>> -- >>> Ketan >>> >>> >>> >>> >> >> >> -- >> Ketan >> >> >> > > > -- > Ketan > > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Mon Nov 14 10:13:48 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 14 Nov 2011 10:13:48 -0600 Subject: [Swift-devel] CDM Tests Message-ID: Hello, I had a discussion with Mike about testing the CDM behavior for the following cases: 1. full versus relative paths for input 2. 
full versus relative paths for output 3. relative versus absolute option in config property: wrapper.invocation.mode In this regard, I made 8 tests for all the above combinations. I used the simple local provider in this first set of tests. From the tests it seems that when the wrapper.invocation.mode config property is set to relative, the script works regardless of whether the input/output paths are full or relative. Detailed results, with stdout and paths to logs, can be found here: https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE The tests are in my CI dir: /home/ketan/cdm_tests Regards, -- Ketan From wilde at mcs.anl.gov Mon Nov 14 10:56:37 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 14 Nov 2011 10:56:37 -0600 (CST) Subject: [Swift-devel] CDM Tests In-Reply-To: Message-ID: <1496102379.26617.1321289797332.JavaMail.root@zimbra.anl.gov> Thanks, Ketan - very nice tests and summary document. I should point out for those that didn't open the doc: the 4 tests with absolute pathnames are failing. Ketan, can you work with Justin to see if this is a bug, or if the CDM directive needs to be coded differently for absolute paths? Then please test a fix and, as we discussed, adapt the tests with annotations to enhance the User Guide section on CDM. - Mike ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Swift Devel" > Sent: Monday, November 14, 2011 8:13:48 AM > Subject: [Swift-devel] CDM Tests > Hello, > > > I had a discussion with Mike about testing the CDM behavior for the > following cases: > 1. full versus relative paths for input > 2. full versus relative paths for output > 3. relative versus absolute option in config property: > wrapper.invocation.mode > > > In this regard, I made 8 tests for all the above combinations. I used > simple local provider in this first set of tests.
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From jonmon at mcs.anl.gov Mon Nov 14 11:13:01 2011
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 14 Nov 2011 11:13:01 -0600
Subject: [Swift-devel] CDM Tests
In-Reply-To: <1496102379.26617.1321289797332.JavaMail.root@zimbra.anl.gov>
References: <1496102379.26617.1321289797332.JavaMail.root@zimbra.anl.gov>
Message-ID: <63B6A148-A00F-4796-8E31-0DEDC44B4258@mcs.anl.gov>

I couldn't tell from the spreadsheet (I'm reading it on my phone), but what was absolute? If I am not mistaken, CDM takes the path specified in the CDM file (which should be absolute) and appends to it the file path used in the Swift script (which should be relative). So if both are absolute, I would expect it to fail.
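Jonathan's description of the join can be sketched in Python. This only illustrates the behavior as he describes it and is not Swift's actual CDM code: `resolve_direct` is a hypothetical name, the sample file name is made up, and treating an absolute script path as an error is an assumption drawn from his expectation that this case fails.

```python
import posixpath


def resolve_direct(cdm_location, script_path):
    """Join the location from the CDM file with the path used in the Swift script.

    Hypothetical helper: the CDM location is expected to be absolute and
    the script path relative, as described in the thread.
    """
    if not posixpath.isabs(cdm_location):
        raise ValueError("CDM location should be absolute: %r" % cdm_location)
    if posixpath.isabs(script_path):
        # Appending one absolute path to another has no sensible result,
        # which is consistent with the expectation that this case fails.
        raise ValueError("script path should be relative: %r" % script_path)
    return posixpath.normpath(posixpath.join(cdm_location, script_path))


if __name__ == "__main__":
    # "data/input.txt" is an invented example path.
    print(resolve_direct("/home/ketan/cdm_tests", "data/input.txt"))
```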
From wilde at mcs.anl.gov Mon Nov 14 12:28:18 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 14 Nov 2011 12:28:18 -0600 (CST)
Subject: [Swift-devel] CDM Tests
In-Reply-To: <63B6A148-A00F-4796-8E31-0DEDC44B4258@mcs.anl.gov>
Message-ID: <526638122.27156.1321295298546.JavaMail.root@zimbra.anl.gov>

I would expect that for absolute path mappings, the user would specify either :: or "/" as the path in the CDM DIRECT pattern. In other words, the user is saying "I have mapped the object to an absolute path name, and I want the app to get exactly this name". Hence I think that the best "location" field for a DIRECT rule is null ("").

Justin, can you clarify what the code expects for this case?

- Mike
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wozniak at mcs.anl.gov Mon Nov 14 12:44:36 2011
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Mon, 14 Nov 2011 10:44:36 -0800 (Pacific Standard Time)
Subject: [Swift-devel] CDM Tests
In-Reply-To: <526638122.27156.1321295298546.JavaMail.root@zimbra.anl.gov>
References: <526638122.27156.1321295298546.JavaMail.root@zimbra.anl.gov>
Message-ID:

CDM DIRECT does not currently support absolute path names. I think we could provide something simple like "" for the field, as you suggest.
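One way the empty-location suggestion could behave, sketched in Python. This assumes that an empty location means "pass the mapped absolute path through unchanged"; neither the function name `direct_target` nor this exact rule comes from the Swift source, so it illustrates the proposal rather than any implemented behavior.

```python
import posixpath


def direct_target(location, mapped_path):
    """Return the path the app would see under a DIRECT rule (hypothetical)."""
    if location == "":
        # Proposed case: no location given, so the mapped path must already
        # be absolute and is handed to the app as-is.
        if not posixpath.isabs(mapped_path):
            raise ValueError("empty location requires an absolute path")
        return mapped_path
    # Case described earlier in the thread: location plus a relative path.
    if posixpath.isabs(mapped_path):
        raise ValueError("absolute path requires an empty location")
    return posixpath.join(location, mapped_path)
```

Keeping the two cases mutually exclusive avoids guessing what an absolute location combined with an absolute path should mean.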
--
Justin M Wozniak

From ketancmaheshwari at gmail.com Mon Nov 14 13:01:35 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 14 Nov 2011 13:01:35 -0600
Subject: [Swift-devel] CDM Tests
In-Reply-To:
References: <526638122.27156.1321295298546.JavaMail.root@zimbra.anl.gov>
Message-ID:

Given CDM's support for relative pathnames, I would expect the case of relative inputs and relative outputs to succeed irrespective of the config. However, from the tests, it seems that the config option is overriding the CDM directives.

--
Ketan

From wilde at mcs.anl.gov Mon Nov 14 13:21:45 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 14 Nov 2011 13:21:45 -0600 (CST)
Subject: [Swift-devel] CDM Tests
In-Reply-To:
Message-ID: <905842519.27456.1321298505668.JavaMail.root@zimbra.anl.gov>

Ketan,

I thought your initial email stated that the 4 cases of relative path names "worked". But this latter comment indicates that there is some kind of problem even in the relative cases.

Can you clarify?

- Mike
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Mon Nov 14 13:29:36 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 14 Nov 2011 13:29:36 -0600
Subject: [Swift-devel] CDM Tests
In-Reply-To: <905842519.27456.1321298505668.JavaMail.root@zimbra.anl.gov>
References: <905842519.27456.1321298505668.JavaMail.root@zimbra.anl.gov>
Message-ID:

Mike,

In my initial email I said:

> ... when specifying the relative option on the config for
> wrapper.invocation.mode property the script works regardless of the
> paths of input/output.

This led me to conclude that the option specified in the config for the wrapper.invocation.mode property overrides the CDM policies.
--
Ketan

From wilde at mcs.anl.gov Mon Nov 14 13:39:10 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 14 Nov 2011 13:39:10 -0600 (CST)
Subject: [Swift-devel] CDM Tests
In-Reply-To:
Message-ID: <257996315.27556.1321299550146.JavaMail.root@zimbra.anl.gov>

Even though the tests worked in each of these 4 cases, two questions remain: "did the app see the same path names?" and "did copies get done?"

So I think you need to dig deeper and explain the difference between what CDM does for the direct case and how (if at all) the wrapper mode affects things.

It's equally possible that CDM overrode the wrapper mode arg. Just specifying the wrapper mode arg will not, I think, eliminate the "copy to shared workflow directory" step the way "direct mode" does.

- Mike
> > >>>> > > >>>> > > >>>> A detailed result with stdout, and paths to logs can be found > > >>>> here: > > >>>> https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE > > >>>> > > >>>> > > >>>> > > >>>> The tests are in my CI dir: /home/ketan/cdm_tests > > >>>> > > >>>> > > >>>> > > >>>> > > >>>> Regards, -- > > >>>> Ketan > > >>>> > > >>>> > > >>>> > > >>>> _______________________________________________ > > >>>> Swift-devel mailing list > > >>>> Swift-devel at ci.uchicago.edu > > >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > >>> > > >>> -- > > >>> Michael Wilde > > >>> Computation Institute, University of Chicago > > >>> Mathematics and Computer Science Division > > >>> Argonne National Laboratory > > >>> > > >>> _______________________________________________ > > >>> Swift-devel mailing list > > >>> Swift-devel at ci.uchicago.edu > > >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > -- > > Justin M Wozniak > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > > > -- > Ketan -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wozniak at mcs.anl.gov Mon Nov 14 13:44:16 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Mon, 14 Nov 2011 11:44:16 -0800 (Pacific Standard Time) Subject: [Swift-devel] CDM Tests In-Reply-To: <257996315.27556.1321299550146.JavaMail.root@zimbra.anl.gov> References: <257996315.27556.1321299550146.JavaMail.root@zimbra.anl.gov> Message-ID: Ketan, if you can add these tests to the test suite in tests/cdm (or 
tests/cdm/absolute) that will be the fastest way to precisely determine what you are running and make these tests reproducible by the group. I should be able to take a look at them later today.

Thanks
Justin

On Mon, 14 Nov 2011, Michael Wilde wrote:
> Even though the tests worked in each of these 4 cases, the question remains of "did the app see the same path names" and "did copies get done"?
>
> So I think you need to dig deeper and explain the difference between what CDM does for the direct case, and how (if at all) the wrapper mode affects things.
>
> It's equally possible that CDM overrode the wrapper mode arg. Just specifying the wrapper mode arg will not, I think, achieve the elimination of the "copy to shared workflow directory" that "direct mode" accomplishes.
>
> - Mike

--
Justin M Wozniak

From wilde at mcs.anl.gov Mon Nov 14 13:46:33 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 14 Nov 2011 13:46:33 -0600 (CST)
Subject: [Swift-devel] CDM Tests
In-Reply-To: <257996315.27556.1321299550146.JavaMail.root@zimbra.anl.gov>
Message-ID: <1056466748.27587.1321299993235.JavaMail.root@zimbra.anl.gov>

Ketan, in these tests, I thought that the names that the app actually receives on its command line are worth documenting. Does that show anything that the script writer should be aware of?
- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Monday, November 14, 2011 11:39:10 AM
> Subject: Re: [Swift-devel] CDM Tests
> Even though the tests worked in each of these 4 cases, the question
> remains of "did the app see the same path names" and "did copies get
> done"?
>
> So I think you need to dig deeper and explain the difference between
> what CDM does for the direct case, and how (if at all) the wrapper
> mode affects things.
>
> It's equally possible that CDM overrode the wrapper mode arg. Just
> specifying the wrapper mode arg will not, I think, achieve the
> elimination of the "copy to shared workflow directory" that "direct
> mode" accomplishes.
>
> - Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Mon Nov 14 13:50:48 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 14 Nov 2011 13:50:48 -0600
Subject: [Swift-devel] CDM Tests
In-Reply-To: <1056466748.27587.1321299993235.JavaMail.root@zimbra.anl.gov>
References: <257996315.27556.1321299550146.JavaMail.root@zimbra.anl.gov> <1056466748.27587.1321299993235.JavaMail.root@zimbra.anl.gov>
Message-ID:

All command lines are batched together in a run.sh script. I am writing a README in the test suite to document this.

On Mon, Nov 14, 2011 at 1:46 PM, Michael Wilde wrote:
> Ketan, in these tests, I thought that the names that the app actually
> receives on its command line are worth documenting. Does that show anything
> that the script writer should be aware of?
>
> - Mike

--
Ketan

From wilde at mcs.anl.gov Mon Nov 14 13:59:22 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 14 Nov 2011 13:59:22 -0600 (CST)
Subject: [Swift-devel] CDM Tests
In-Reply-To:
Message-ID: <1763122897.27683.1321300762635.JavaMail.root@zimbra.anl.gov>

I meant the command line to the app() program, not the command line to the swift command.

- Mike

----- Original Message -----
> From: "Ketan Maheshwari"
> To: "Michael Wilde"
> Cc: "Swift Devel"
> Sent: Monday, November 14, 2011 11:50:48 AM
> Subject: Re: [Swift-devel] CDM Tests
> All command lines are batched together in a run.sh script. I am writing
> a README in the test suite to document this.
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>
> --
> Ketan

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From jonmon at mcs.anl.gov Mon Nov 14 14:05:00 2011
From: jonmon at mcs.anl.gov (Jon Monette)
Date: Mon, 14 Nov 2011 14:05:00 -0600 (CST)
Subject: [Swift-devel] CDM Tests
In-Reply-To: <1763122897.27683.1321300762635.JavaMail.root@zimbra.anl.gov>
Message-ID: <1671120600.15941.1321301100679.JavaMail.root@zimbra-mb2.anl.gov>

Output files do not work correctly in CDM. CDM creates symlinks to the
files that are used in the DIRECT CDM directive. When the app outputs it
overwrites the symlink in the work directory and *doesn't* update the
original file. So the check to see if the expected output is the same as
the correct output will fail. How are you checking to see if the app
correctly ran? Is there an error or is the original file in the cwd empty?

----- Original Message -----
From: "Michael Wilde"
To: "Ketan Maheshwari"
Cc: "Swift Devel"
Sent: Monday, November 14, 2011 1:59:22 PM
Subject: Re: [Swift-devel] CDM Tests

I meant the command line to the app() program, not the commandline to the
swift command.

- Mike
From ketancmaheshwari at gmail.com Mon Nov 14 14:14:08 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 14 Nov 2011 14:14:08 -0600
Subject: [Swift-devel] CDM Tests
In-Reply-To: <1763122897.27683.1321300762635.JavaMail.root@zimbra.anl.gov>
References: <1763122897.27683.1321300762635.JavaMail.root@zimbra.anl.gov>
Message-ID:

According to the logs, in every case where the input path is full, a full
path is passed in the app() arguments.

Case 1, input path full:
jobid=catnap-fmveuqik tr=catnap arguments=[1, home/ketan/cdm_tests/data4.txt]

Case 2, input path relative:
jobid=catnap-li8fuqik tr=catnap arguments=[1, data2.txt]

On Mon, Nov 14, 2011 at 1:59 PM, Michael Wilde wrote:

> I meant the command line to the app() program, not the commandline to the
> swift command.
>
> - Mike

--
Ketan

From ketancmaheshwari at gmail.com Mon Nov 14 14:43:13 2011
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 14 Nov 2011 14:43:13 -0600
Subject: [Swift-devel] CDM Tests
In-Reply-To: <1671120600.15941.1321301100679.JavaMail.root@zimbra-mb2.anl.gov>
References: <1763122897.27683.1321300762635.JavaMail.root@zimbra.anl.gov> <1671120600.15941.1321301100679.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID:

Yes, this is right. I added a rule for the outdir to the CDM file (it had
not been added earlier), and now all runs fail with the message: file not
found output/out?.txt.

On Mon, Nov 14, 2011 at 2:05 PM, Jon Monette wrote:

> Output files do not work correctly in CDM. CDM creates symlinks to the
> files that are used in the DIRECT CDM directive. When the app outputs it
> overwrites the symlink in the work directory and *doesn't* update the
> original file. So the check to see if the expected output is the same as
> the correct output will fail. How are you checking to see if the app
> correctly ran? Is there an error or is the original file in the cwd empty?
>
> ----- Original Message -----
> From: "Michael Wilde"
> To: "Ketan Maheshwari"
> Cc: "Swift Devel"
> Sent: Monday, November 14, 2011 1:59:22 PM
> Subject: Re: [Swift-devel] CDM Tests
>
> I meant the command line to the app() program, not the commandline to the
> swift command.
> > - Mike > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Michael Wilde" > > Cc: "Swift Devel" > > Sent: Monday, November 14, 2011 11:50:48 AM > > Subject: Re: [Swift-devel] CDM Tests > > All commandlines are in a run.sh script batched together. I am writing > > a README in the testsuite to document this. > > > > > > On Mon, Nov 14, 2011 at 1:46 PM, Michael Wilde < wilde at mcs.anl.gov > > > wrote: > > > > > > Ketan, in these tests, I thought that the names that the app actually > > receives on its command line is worth documenting. Does that show > > anything that the script writer should be aware of? > > > > > > - Mike > > > > ----- Original Message ----- > > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Monday, November 14, 2011 11:39:10 AM > > > Subject: Re: [Swift-devel] CDM Tests > > > Even though the tests worked in each of these 4 cases, the question > > > remains of "did the app see the same path names" and "did copies get > > > done"? > > > > > > So I think you need to dig deeper and explain the difference between > > > what CDM does for the direct case, and how (if at all) the wrapper > > > mode affects things. > > > > > > Its equally possible that CDM overrode the wrapper mode arg. Just > > > specifying the wrapper mode arg will not, I think, achieve the > > > elimination of the "copy to shared workflow directory" that "direct > > > mode" accomplishes. 
> > > > > > - Mike > > > > > > ----- Original Message ----- > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Justin M > > > > Wozniak" > > > > < wozniak at mcs.anl.gov > > > > > Sent: Monday, November 14, 2011 11:29:36 AM > > > > Subject: Re: [Swift-devel] CDM Tests > > > > Mike, > > > > > > > > > > > > In my initial email I said: > > > > > > > > > > > > ... when specifying the relative option on the config for > > > > wrapper.invocation.mode property the script works regardless of > > > > the > > > > paths of input/output. > > > > > > > > > > > > Which made me conclude that the option specified in the config for > > > > wrapper.invocation.mode property overrides the CDM policies. > > > > > > > > > > > > On Mon, Nov 14, 2011 at 1:21 PM, Michael Wilde < wilde at mcs.anl.gov > > > > > > > > > wrote: > > > > > > > > > > > > Ketan, > > > > > > > > I thought your initial email stated that the 4 cases of relative > > > > path > > > > names "worked". But this latter comment indicates that there is > > > > some > > > > kind of problem even in the relative cases. > > > > > > > > Can you clarify? > > > > > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > To: "Justin M Wozniak" < wozniak at mcs.anl.gov > > > > > > Cc: "Michael Wilde" < wilde at mcs.anl.gov >, "Swift Devel" < > > > > > swift-devel at ci.uchicago.edu > > > > > > Sent: Monday, November 14, 2011 11:01:35 AM > > > > > Subject: Re: [Swift-devel] CDM Tests > > > > > Given the CDM's support for relative pathnames, I would expect > > > > > the > > > > > case of relative inputs, relative outputs would succeed > > > > > irrespective > > > > > of config. 
However, from the tests, it seems that the config > > > > > option > > > > > is > > > > > overriding CDM directives. > > > > > > > > > > > > > > > On Mon, Nov 14, 2011 at 12:44 PM, Justin M Wozniak < > > > > > wozniak at mcs.anl.gov > wrote: > > > > > > > > > > > > > > > > > > > > CDM DIRECT does not currently support absolute path names. I > > > > > think > > > > > we > > > > > could provide something simple like "" for the field, as you > > > > > suggest. > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 14 Nov 2011, Michael Wilde wrote: > > > > > > > > > > > I would expect that for absolute path mappings, the user would > > > > > > specify > > > > > > either :: or "/" as the path in the CDM DIRECT pattern. In > > > > > > other > > > > > > words, > > > > > > the user is saying "I have mapped the object to an absolute > > > > > > path > > > > > > name, > > > > > > and I want the app to get exactly this name". Hence I think > > > > > > that > > > > > > the > > > > > > best "location" field for a DIRECT rule is null (""). > > > > > > > > > > > > Justin, can you clarify what the code expects for this case? > > > > > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > >> From: "Jonathan Monette" < jonmon at mcs.anl.gov > > > > > > >> To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > >> Cc: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >, "Swift > > > > > >> Devel" < swift-devel at ci.uchicago.edu > > > > > > >> Sent: Monday, November 14, 2011 9:13:01 AM > > > > > >> Subject: Re: [Swift-devel] CDM Tests > > > > > >> I couldn't tell from the excel(using my phone) but what was > > > > > >> absolute? > > > > > >> If I am not mistaken what CDM does is takes the path that is > > > > > >> specified > > > > > >> in the CDM file(which should be absolute) and appends the > > > > > >> file > > > > > >> path > > > > > >> that is used in the swift script to it(which should be > > > > > >> relative). 
> > > > > >> So > > > > > >> if they are both absolute I would expect it to fail. > > > > > >> > > > > > >> > > > > > >> > > > > > >> On Nov 14, 2011, at 10:56 AM, Michael Wilde < > > > > > >> wilde at mcs.anl.gov > > > > > >> > > > > > > >> wrote: > > > > > >> > > > > > >>> Thanks, Ketan - very nice tests and summary document. > > > > > >>> > > > > > >>> I should point out for those that didn't open the doc: the 4 > > > > > >>> tests > > > > > >>> with absolute pathnames are failing. > > > > > >>> > > > > > >>> Ketan, can you work with Justin to see if this is a bug, or > > > > > >>> if > > > > > >>> the > > > > > >>> CDM directive needs to be coded differently for absolute > > > > > >>> paths? > > > > > >>> Then > > > > > >>> please test a fix, and as we discussed adapt the tests with > > > > > >>> annotations to enhance the User Guide section on CDM. > > > > > >>> > > > > > >>> - Mike > > > > > >>> > > > > > >>> > > > > > >>> ----- Original Message ----- > > > > > >>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > >>>> To: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > >>>> Sent: Monday, November 14, 2011 8:13:48 AM > > > > > >>>> Subject: [Swift-devel] CDM Tests > > > > > >>>> Hello, > > > > > >>>> > > > > > >>>> > > > > > >>>> I had a discussion with Mike about testing the CDM behavior > > > > > >>>> for > > > > > >>>> the > > > > > >>>> following cases: > > > > > >>>> 1. full versus relative paths for input > > > > > >>>> 2. full versus relative paths for output > > > > > >>>> 3. relative versus absolute option in config property: > > > > > >>>> wrapper.invocation.mode > > > > > >>>> > > > > > >>>> > > > > > >>>> In this regard, I made 8 tests for all the above > > > > > >>>> combinations. > > > > > >>>> I > > > > > >>>> used > > > > > >>>> simple local provider in this first set of tests. 
> > > > > >>>> > > > > > >>>> > > > > > >>>> From the tests it seems that when specifying the relative > > > > > >>>> option > > > > > >>>> on > > > > > >>>> the config for wrapper.invocation.mode property the script > > > > > >>>> works > > > > > >>>> regardless of the paths of input/output. > > > > > >>>> > > > > > >>>> > > > > > >>>> A detailed result with stdout, and paths to logs can be > > > > > >>>> found > > > > > >>>> here: > > > > > >>>> > https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> The tests are in my CI dir: /home/ketan/cdm_tests > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> Regards, -- > > > > > >>>> Ketan > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> _______________________________________________ > > > > > >>>> Swift-devel mailing list > > > > > >>>> Swift-devel at ci.uchicago.edu > > > > > >>>> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > >>> > > > > > >>> -- > > > > > >>> Michael Wilde > > > > > >>> Computation Institute, University of Chicago > > > > > >>> Mathematics and Computer Science Division > > > > > >>> Argonne National Laboratory > > > > > >>> > > > > > >>> _______________________________________________ > > > > > >>> Swift-devel mailing list > > > > > >>> Swift-devel at ci.uchicago.edu > > > > > >>> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Justin M Wozniak > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Ketan > > > > > > > > -- > > > > Michael Wilde > > > > Computation Institute, University of Chicago > > > > Mathematics and 
Computer Science Division > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Ketan > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -- Ketan From wilde at mcs.anl.gov Mon Nov 14 14:47:25 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 14 Nov 2011 14:47:25 -0600 (CST) Subject: [Swift-devel] CDM Tests In-Reply-To: Message-ID: <143972182.27959.1321303645625.JavaMail.root@zimbra.anl.gov> I think the purpose of these tests was to find a default CDM DIRECT pattern that would match all input and output files and "do the right thing". Tests that match specific pathnames are yet another (but different) useful case. - Mike ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Jon Monette" > Cc: "Michael Wilde" , "Swift Devel" > Sent: Monday, November 14, 2011 12:43:13 PM > Subject: Re: [Swift-devel] CDM Tests > Yes, this is right. I added a rule for the outdir in the cdm file > (which was not added earlier) and all runs fail with the message: > file not found output/out?.txt.
> > > On Mon, Nov 14, 2011 at 2:05 PM, Jon Monette < jonmon at mcs.anl.gov > > wrote: > > > Output files do not work correctly in CDM. CDM creates symlinks to the > files that are used in the DIRECT CDM directive. When the app outputs > it overwrites the symlink in the work directory and *doesn't* update > the original file. So the check to see if the expected output is the > same as the correct output will fail. How are you checking to see if > the app correctly ran? Is there an error or is the original file in > the cwd empty? > > > ----- Original Message ----- > From: "Michael Wilde" < wilde at mcs.anl.gov > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > Sent: Monday, November 14, 2011 1:59:22 PM > Subject: Re: [Swift-devel] CDM Tests > > I meant the command line to the app() program, not the commandline to > the swift command. > > - Mike > > ----- Original Message ----- > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > Sent: Monday, November 14, 2011 11:50:48 AM > > Subject: Re: [Swift-devel] CDM Tests > > All commandlines are in a run.sh script batched together. I am > > writing > > a README in the testsuite to document this. > > > > > > On Mon, Nov 14, 2011 at 1:46 PM, Michael Wilde < wilde at mcs.anl.gov > > > wrote: > > > > > > Ketan, in these tests, I thought that the names that the app > > actually > > receives on its command line is worth documenting. Does that show > > anything that the script writer should be aware of? 
> > > > > > - Mike > > > > ----- Original Message ----- > > > > > > > > > From: "Michael Wilde" < wilde at mcs.anl.gov > > > > To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Monday, November 14, 2011 11:39:10 AM > > > Subject: Re: [Swift-devel] CDM Tests > > > Even though the tests worked in each of these 4 cases, the > > > question > > > remains of "did the app see the same path names" and "did copies > > > get > > > done"? > > > > > > So I think you need to dig deeper and explain the difference > > > between > > > what CDM does for the direct case, and how (if at all) the wrapper > > > mode affects things. > > > > > > Its equally possible that CDM overrode the wrapper mode arg. Just > > > specifying the wrapper mode arg will not, I think, achieve the > > > elimination of the "copy to shared workflow directory" that > > > "direct > > > mode" accomplishes. > > > > > > - Mike > > > > > > ----- Original Message ----- > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Justin M > > > > Wozniak" > > > > < wozniak at mcs.anl.gov > > > > > Sent: Monday, November 14, 2011 11:29:36 AM > > > > Subject: Re: [Swift-devel] CDM Tests > > > > Mike, > > > > > > > > > > > > In my initial email I said: > > > > > > > > > > > > ... when specifying the relative option on the config for > > > > wrapper.invocation.mode property the script works regardless of > > > > the > > > > paths of input/output. > > > > > > > > > > > > Which made me conclude that the option specified in the config > > > > for > > > > wrapper.invocation.mode property overrides the CDM policies. 
> > > > > > > > > > > > On Mon, Nov 14, 2011 at 1:21 PM, Michael Wilde < > > > > wilde at mcs.anl.gov > > > > > > > > > wrote: > > > > > > > > > > > > Ketan, > > > > > > > > I thought your initial email stated that the 4 cases of relative > > > > path > > > > names "worked". But this latter comment indicates that there is > > > > some > > > > kind of problem even in the relative cases. > > > > > > > > Can you clarify? > > > > > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > > > > > > > > > > > > > To: "Justin M Wozniak" < wozniak at mcs.anl.gov > > > > > > Cc: "Michael Wilde" < wilde at mcs.anl.gov >, "Swift Devel" < > > > > > swift-devel at ci.uchicago.edu > > > > > > Sent: Monday, November 14, 2011 11:01:35 AM > > > > > Subject: Re: [Swift-devel] CDM Tests > > > > > Given the CDM's support for relative pathnames, I would expect > > > > > the > > > > > case of relative inputs, relative outputs would succeed > > > > > irrespective > > > > > of config. However, from the tests, it seems that the config > > > > > option > > > > > is > > > > > overriding CDM directives. > > > > > > > > > > > > > > > On Mon, Nov 14, 2011 at 12:44 PM, Justin M Wozniak < > > > > > wozniak at mcs.anl.gov > wrote: > > > > > > > > > > > > > > > > > > > > CDM DIRECT does not currently support absolute path names. I > > > > > think > > > > > we > > > > > could provide something simple like "" for the field, as you > > > > > suggest. > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 14 Nov 2011, Michael Wilde wrote: > > > > > > > > > > > I would expect that for absolute path mappings, the user > > > > > > would > > > > > > specify > > > > > > either :: or "/" as the path in the CDM DIRECT pattern. 
In > > > > > > other > > > > > > words, > > > > > > the user is saying "I have mapped the object to an absolute > > > > > > path > > > > > > name, > > > > > > and I want the app to get exactly this name". Hence I think > > > > > > that > > > > > > the > > > > > > best "location" field for a DIRECT rule is null (""). > > > > > > > > > > > > Justin, can you clarify what the code expects for this case? > > > > > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > >> From: "Jonathan Monette" < jonmon at mcs.anl.gov > > > > > > >> To: "Michael Wilde" < wilde at mcs.anl.gov > > > > > > >> Cc: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >, > > > > > >> "Swift > > > > > >> Devel" < swift-devel at ci.uchicago.edu > > > > > > >> Sent: Monday, November 14, 2011 9:13:01 AM > > > > > >> Subject: Re: [Swift-devel] CDM Tests > > > > > >> I couldn't tell from the excel(using my phone) but what was > > > > > >> absolute? > > > > > >> If I am not mistaken what CDM does is takes the path that > > > > > >> is > > > > > >> specified > > > > > >> in the CDM file(which should be absolute) and appends the > > > > > >> file > > > > > >> path > > > > > >> that is used in the swift script to it(which should be > > > > > >> relative). > > > > > >> So > > > > > >> if they are both absolute I would expect it to fail. > > > > > >> > > > > > >> > > > > > >> > > > > > >> On Nov 14, 2011, at 10:56 AM, Michael Wilde < > > > > > >> wilde at mcs.anl.gov > > > > > >> > > > > > > >> wrote: > > > > > >> > > > > > >>> Thanks, Ketan - very nice tests and summary document. > > > > > >>> > > > > > >>> I should point out for those that didn't open the doc: the > > > > > >>> 4 > > > > > >>> tests > > > > > >>> with absolute pathnames are failing. > > > > > >>> > > > > > >>> Ketan, can you work with Justin to see if this is a bug, > > > > > >>> or > > > > > >>> if > > > > > >>> the > > > > > >>> CDM directive needs to be coded differently for absolute > > > > > >>> paths? 
> > > > > >>> Then > > > > > >>> please test a fix, and as we discussed adapt the tests > > > > > >>> with > > > > > >>> annotations to enhance the User Guide section on CDM. > > > > > >>> > > > > > >>> - Mike > > > > > >>> > > > > > >>> > > > > > >>> ----- Original Message ----- > > > > > >>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > > > > > >>>> To: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > > >>>> Sent: Monday, November 14, 2011 8:13:48 AM > > > > > >>>> Subject: [Swift-devel] CDM Tests > > > > > >>>> Hello, > > > > > >>>> > > > > > >>>> > > > > > >>>> I had a discussion with Mike about testing the CDM > > > > > >>>> behavior > > > > > >>>> for > > > > > >>>> the > > > > > >>>> following cases: > > > > > >>>> 1. full versus relative paths for input > > > > > >>>> 2. full versus relative paths for output > > > > > >>>> 3. relative versus absolute option in config property: > > > > > >>>> wrapper.invocation.mode > > > > > >>>> > > > > > >>>> > > > > > >>>> In this regard, I made 8 tests for all the above > > > > > >>>> combinations. > > > > > >>>> I > > > > > >>>> used > > > > > >>>> simple local provider in this first set of tests. > > > > > >>>> > > > > > >>>> > > > > > >>>> From the tests it seems that when specifying the relative > > > > > >>>> option > > > > > >>>> on > > > > > >>>> the config for wrapper.invocation.mode property the > > > > > >>>> script > > > > > >>>> works > > > > > >>>> regardless of the paths of input/output. 
> > > > > >>>> > > > > > >>>> > > > > > >>>> A detailed result with stdout, and paths to logs can be > > > > > >>>> found > > > > > >>>> here: > > > > > >>>> https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> The tests are in my CI dir: /home/ketan/cdm_tests > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> Regards, -- > > > > > >>>> Ketan > > > > > >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> _______________________________________________ > > > > > >>>> Swift-devel mailing list > > > > > >>>> Swift-devel at ci.uchicago.edu > > > > > >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > >>> > > > > > >>> -- > > > > > >>> Michael Wilde > > > > > >>> Computation Institute, University of Chicago > > > > > >>> Mathematics and Computer Science Division > > > > > >>> Argonne National Laboratory > > > > > >>> > > > > > >>> _______________________________________________ > > > > > >>> Swift-devel mailing list > > > > > >>> Swift-devel at ci.uchicago.edu > > > > > >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Justin M Wozniak > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > > > > Swift-devel mailing list > > > > > Swift-devel at ci.uchicago.edu > > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Ketan > > > > > > > > -- > > > > Michael Wilde > > > > Computation Institute, University of Chicago > > > > Mathematics and Computer Science Division > > > > Argonne National Laboratory > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Ketan > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > 
> > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > > > > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > -- > Ketan -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wilde at mcs.anl.gov Mon Nov 14 14:59:15 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 14 Nov 2011 14:59:15 -0600 (CST) Subject: [Swift-devel] Please avoid heavy use of bridled this week In-Reply-To: <35264080.28024.1321304284577.JavaMail.root@zimbra.anl.gov> Message-ID: <1425794294.28036.1321304355122.JavaMail.root@zimbra.anl.gov> We will be using it for demos at SC. Please go easy on communicado as well, so we have a backup. Thanks, - Mike From jonmon at mcs.anl.gov Mon Nov 14 18:36:32 2011 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Mon, 14 Nov 2011 18:36:32 -0600 Subject: [Swift-devel] CDM Tests In-Reply-To: <143972182.27959.1321303645625.JavaMail.root@zimbra.anl.gov> References: <143972182.27959.1321303645625.JavaMail.root@zimbra.anl.gov> Message-ID: The "do the right thing" is currently broken in CDM for output files. A wrapper needs to be written currently around the app that updates the intended file and not the symlink in the work directory. 
For instance, in the SwiftMontage scripts I wrap all the Montage calls in Python scripts that check whether I am writing to a symlink; if I am, I write to what the symlink points to instead of to the symlink itself. So unless a wrapper is written to handle the output files, CDM currently does not work properly for them. This should probably be documented in the CDM section of the user guide as a caveat of using CDM for output files. On Nov 14, 2011, at 2:47 PM, Michael Wilde wrote: > I think the purpose of these tests was to find a default CDM DIRECT pattern that would match all input and output files and "do the right thing". > > Tests that match specific pathnames are yet another (but different) useful case. > > - Mike > ----- Original Message ----- >> From: "Ketan Maheshwari" >> To: "Jon Monette" >> Cc: "Michael Wilde" , "Swift Devel" >> Sent: Monday, November 14, 2011 12:43:13 PM >> Subject: Re: [Swift-devel] CDM Tests >> Yes, this is right. I added a rule for the outdir in the cdm file >> (which was not added earlier) and all runs fail with the message: >> file not found output/out?.txt. >> >> >> On Mon, Nov 14, 2011 at 2:05 PM, Jon Monette < jonmon at mcs.anl.gov > >> wrote: >> >> >> Output files do not work correctly in CDM. CDM creates symlinks to the >> files that are used in the DIRECT CDM directive. When the app outputs >> it overwrites the symlink in the work directory and *doesn't* update >> the original file. So the check to see if the expected output is the >> same as the correct output will fail. How are you checking to see if >> the app correctly ran? Is there an error or is the original file in >> the cwd empty?
>> >> >> ----- Original Message ----- >> From: "Michael Wilde" < wilde at mcs.anl.gov > >> To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > >> >> >> >> Sent: Monday, November 14, 2011 1:59:22 PM >> Subject: Re: [Swift-devel] CDM Tests >> >> I meant the command line to the app() program, not the commandline to >> the swift command. >> >> - Mike >> >> ----- Original Message ----- >>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > >>> Sent: Monday, November 14, 2011 11:50:48 AM >>> Subject: Re: [Swift-devel] CDM Tests >>> All commandlines are in a run.sh script batched together. I am >>> writing >>> a README in the testsuite to document this. >>> >>> >>> On Mon, Nov 14, 2011 at 1:46 PM, Michael Wilde < wilde at mcs.anl.gov > >>> wrote: >>> >>> >>> Ketan, in these tests, I thought that the names that the app >>> actually >>> receives on its command line is worth documenting. Does that show >>> anything that the script writer should be aware of? >>> >>> >>> - Mike >>> >>> ----- Original Message ----- >>> >>> >>> >>>> From: "Michael Wilde" < wilde at mcs.anl.gov > >>>> To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > >>>> Sent: Monday, November 14, 2011 11:39:10 AM >>>> Subject: Re: [Swift-devel] CDM Tests >>>> Even though the tests worked in each of these 4 cases, the >>>> question >>>> remains of "did the app see the same path names" and "did copies >>>> get >>>> done"? >>>> >>>> So I think you need to dig deeper and explain the difference >>>> between >>>> what CDM does for the direct case, and how (if at all) the wrapper >>>> mode affects things. >>>> >>>> Its equally possible that CDM overrode the wrapper mode arg. 
Just >>>> specifying the wrapper mode arg will not, I think, achieve the >>>> elimination of the "copy to shared workflow directory" that >>>> "direct >>>> mode" accomplishes. >>>> >>>> - Mike >>>> >>>> ----- Original Message ----- >>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Justin M >>>>> Wozniak" >>>>> < wozniak at mcs.anl.gov > >>>>> Sent: Monday, November 14, 2011 11:29:36 AM >>>>> Subject: Re: [Swift-devel] CDM Tests >>>>> Mike, >>>>> >>>>> >>>>> In my initial email I said: >>>>> >>>>> >>>>> ... when specifying the relative option on the config for >>>>> wrapper.invocation.mode property the script works regardless of >>>>> the >>>>> paths of input/output. >>>>> >>>>> >>>>> Which made me conclude that the option specified in the config >>>>> for >>>>> wrapper.invocation.mode property overrides the CDM policies. >>>>> >>>>> >>>>> On Mon, Nov 14, 2011 at 1:21 PM, Michael Wilde < >>>>> wilde at mcs.anl.gov >>>>>> >>>>> wrote: >>>>> >>>>> >>>>> Ketan, >>>>> >>>>> I thought your initial email stated that the 4 cases of relative >>>>> path >>>>> names "worked". But this latter comment indicates that there is >>>>> some >>>>> kind of problem even in the relative cases. >>>>> >>>>> Can you clarify? >>>>> >>>>> >>>>> - Mike >>>>> >>>>> >>>>> ----- Original Message ----- >>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>> >>>>> >>>>> >>>>>> To: "Justin M Wozniak" < wozniak at mcs.anl.gov > >>>>>> Cc: "Michael Wilde" < wilde at mcs.anl.gov >, "Swift Devel" < >>>>>> swift-devel at ci.uchicago.edu > >>>>>> Sent: Monday, November 14, 2011 11:01:35 AM >>>>>> Subject: Re: [Swift-devel] CDM Tests >>>>>> Given the CDM's support for relative pathnames, I would expect >>>>>> the >>>>>> case of relative inputs, relative outputs would succeed >>>>>> irrespective >>>>>> of config. 
However, from the tests, it seems that the config >>>>>> option >>>>>> is >>>>>> overriding CDM directives. >>>>>> >>>>>> >>>>>> On Mon, Nov 14, 2011 at 12:44 PM, Justin M Wozniak < >>>>>> wozniak at mcs.anl.gov > wrote: >>>>>> >>>>>> >>>>>> >>>>>> CDM DIRECT does not currently support absolute path names. I >>>>>> think >>>>>> we >>>>>> could provide something simple like "" for the field, as you >>>>>> suggest. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On Mon, 14 Nov 2011, Michael Wilde wrote: >>>>>> >>>>>>> I would expect that for absolute path mappings, the user >>>>>>> would >>>>>>> specify >>>>>>> either :: or "/" as the path in the CDM DIRECT pattern. In >>>>>>> other >>>>>>> words, >>>>>>> the user is saying "I have mapped the object to an absolute >>>>>>> path >>>>>>> name, >>>>>>> and I want the app to get exactly this name". Hence I think >>>>>>> that >>>>>>> the >>>>>>> best "location" field for a DIRECT rule is null (""). >>>>>>> >>>>>>> Justin, can you clarify what the code expects for this case? >>>>>>> >>>>>>> - Mike >>>>>>> >>>>>>> ----- Original Message ----- >>>>>>>> From: "Jonathan Monette" < jonmon at mcs.anl.gov > >>>>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>>>>>>> Cc: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >, >>>>>>>> "Swift >>>>>>>> Devel" < swift-devel at ci.uchicago.edu > >>>>>>>> Sent: Monday, November 14, 2011 9:13:01 AM >>>>>>>> Subject: Re: [Swift-devel] CDM Tests >>>>>>>> I couldn't tell from the excel(using my phone) but what was >>>>>>>> absolute? >>>>>>>> If I am not mistaken what CDM does is takes the path that >>>>>>>> is >>>>>>>> specified >>>>>>>> in the CDM file(which should be absolute) and appends the >>>>>>>> file >>>>>>>> path >>>>>>>> that is used in the swift script to it(which should be >>>>>>>> relative). >>>>>>>> So >>>>>>>> if they are both absolute I would expect it to fail. 
>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Nov 14, 2011, at 10:56 AM, Michael Wilde < >>>>>>>> wilde at mcs.anl.gov >>>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Thanks, Ketan - very nice tests and summary document. >>>>>>>>> >>>>>>>>> I should point out for those that didn't open the doc: the >>>>>>>>> 4 >>>>>>>>> tests >>>>>>>>> with absolute pathnames are failing. >>>>>>>>> >>>>>>>>> Ketan, can you work with Justin to see if this is a bug, >>>>>>>>> or >>>>>>>>> if >>>>>>>>> the >>>>>>>>> CDM directive needs to be coded differently for absolute >>>>>>>>> paths? >>>>>>>>> Then >>>>>>>>> please test a fix, and as we discussed adapt the tests >>>>>>>>> with >>>>>>>>> annotations to enhance the User Guide section on CDM. >>>>>>>>> >>>>>>>>> - Mike >>>>>>>>> >>>>>>>>> >>>>>>>>> ----- Original Message ----- >>>>>>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>>>>>>> To: "Swift Devel" < swift-devel at ci.uchicago.edu > >>>>>>>>>> Sent: Monday, November 14, 2011 8:13:48 AM >>>>>>>>>> Subject: [Swift-devel] CDM Tests >>>>>>>>>> Hello, >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> I had a discussion with Mike about testing the CDM >>>>>>>>>> behavior >>>>>>>>>> for >>>>>>>>>> the >>>>>>>>>> following cases: >>>>>>>>>> 1. full versus relative paths for input >>>>>>>>>> 2. full versus relative paths for output >>>>>>>>>> 3. relative versus absolute option in config property: >>>>>>>>>> wrapper.invocation.mode >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> In this regard, I made 8 tests for all the above >>>>>>>>>> combinations. >>>>>>>>>> I >>>>>>>>>> used >>>>>>>>>> simple local provider in this first set of tests. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> From the tests it seems that when specifying the relative >>>>>>>>>> option >>>>>>>>>> on >>>>>>>>>> the config for wrapper.invocation.mode property the >>>>>>>>>> script >>>>>>>>>> works >>>>>>>>>> regardless of the paths of input/output. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> A detailed result with stdout, and paths to logs can be >>>>>>>>>> found >>>>>>>>>> here: >>>>>>>>>> https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> The tests are in my CI dir: /home/ketan/cdm_tests >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Regards, -- >>>>>>>>>> Ketan >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> _______________________________________________ >>>>>>>>>> Swift-devel mailing list >>>>>>>>>> Swift-devel at ci.uchicago.edu >>>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>>>>> >>>>>>>>> -- >>>>>>>>> Michael Wilde >>>>>>>>> Computation Institute, University of Chicago >>>>>>>>> Mathematics and Computer Science Division >>>>>>>>> Argonne National Laboratory >>>>>>>>> >>>>>>>>> _______________________________________________ >>>>>>>>> Swift-devel mailing list >>>>>>>>> Swift-devel at ci.uchicago.edu >>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>>> >>>>>>> >>>>>> >>>>>> -- >>>>>> Justin M Wozniak >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Swift-devel mailing list >>>>>> Swift-devel at ci.uchicago.edu >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Ketan >>>>> >>>>> -- >>>>> Michael Wilde >>>>> Computation Institute, University of Chicago >>>>> Mathematics and Computer Science Division >>>>> Argonne National Laboratory >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Ketan >>>> >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >>>> >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >>> -- >>> Michael Wilde >>> 
Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >>> >>> >>> >>> >>> -- >>> Ketan >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> >> >> >> -- >> Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > From ketancmaheshwari at gmail.com Mon Nov 14 19:17:10 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 14 Nov 2011 19:17:10 -0600 Subject: [Swift-devel] CDM Tests In-Reply-To: References: <143972182.27959.1321303645625.JavaMail.root@zimbra.anl.gov> Message-ID: One thing I am not clear about with CDM DIRECT for output: is there any benefit in having the outputs symlinked from the workdir? Wouldn't the outputs be expected at their final place, in the user's application output destination? On Mon, Nov 14, 2011 at 6:36 PM, Jonathan Monette wrote: > The "do the right thing" is currently broken in CDM for output files. A > wrapper needs to be written currently around the app that updates the > intended file and not the symlink in the work directory. For instance, in > the SwiftMontage scripts I wrap all the Montage calls in Python scripts > that check whether I am writing to a symlink; if I am, I write to > what the symlink points to instead of to the symlink > itself. So unless a wrapper is written to handle the output files, CDM > currently does not work properly for them. This should probably be documented in > the CDM section of the user guide as a caveat of using CDM for output files.
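The wrapper idea quoted above could look roughly like the following Python sketch. It is not taken from the SwiftMontage scripts: the function names resolve_output and write_output are hypothetical, and it assumes only that the entry CDM DIRECT places in the work directory is a symlink to the intended output file.

```python
import os


def resolve_output(path):
    """Return the symlink's target if path is a symlink, else path unchanged."""
    if os.path.islink(path):
        # realpath also resolves links whose target does not exist yet,
        # which is the usual state of an output file before the app runs.
        return os.path.realpath(path)
    return path


def write_output(path, data):
    """Write data to the file the symlink points to, leaving the link intact."""
    target = resolve_output(path)
    with open(target, "w") as handle:
        handle.write(data)
    return target
```

A real wrapper would apply resolve_output to each output name on the app's command line before invoking the app, so that the app writes to the intended file rather than replacing the symlink in the work directory.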
> > On Nov 14, 2011, at 2:47 PM, Michael Wilde wrote: > > > I think the purpose of these tests was to find a default CDM DIRECT > patter that would match all inout and output files and "do the right thing". > > > > Tests that match specific pathnames are yet another (but different) > useful case. > > > > - Mike > > ----- Original Message ----- > >> From: "Ketan Maheshwari" > >> To: "Jon Monette" > >> Cc: "Michael Wilde" , "Swift Devel" < > swift-devel at ci.uchicago.edu> > >> Sent: Monday, November 14, 2011 12:43:13 PM > >> Subject: Re: [Swift-devel] CDM Tests > >> Yes, this is right. I added a rule for the outdir in the cdm file > >> (which was not added earlier) and all runs fails with the message: > >> file not found output/out?.txt. > >> > >> > >> On Mon, Nov 14, 2011 at 2:05 PM, Jon Monette < jonmon at mcs.anl.gov > > >> wrote: > >> > >> > >> Output files do not work correctly in CDM. CDM creates symlinks to the > >> files that are used in the DIRECT CDM directive. When the app outputs > >> it overwrites the symlink in the work directory and *doesn't* update > >> the original file. So the check to see if the expected output is the > >> same as the correct output will fail. How are you checking to see if > >> the app correctly ran? Is there an error or is the original file in > >> the cwd empty? > >> > >> > >> ----- Original Message ----- > >> From: "Michael Wilde" < wilde at mcs.anl.gov > > >> To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > >> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > >> > >> > >> > >> Sent: Monday, November 14, 2011 1:59:22 PM > >> Subject: Re: [Swift-devel] CDM Tests > >> > >> I meant the command line to the app() program, not the commandline to > >> the swift command. 
> >> > >> - Mike > >> > >> ----- Original Message ----- > >>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > >>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > >>> Sent: Monday, November 14, 2011 11:50:48 AM > >>> Subject: Re: [Swift-devel] CDM Tests > >>> All commandlines are in a run.sh script batched together. I am > >>> writing > >>> a README in the testsuite to document this. > >>> > >>> > >>> On Mon, Nov 14, 2011 at 1:46 PM, Michael Wilde < wilde at mcs.anl.gov > > >>> wrote: > >>> > >>> > >>> Ketan, in these tests, I thought that the names that the app > >>> actually > >>> receives on its command line is worth documenting. Does that show > >>> anything that the script writer should be aware of? > >>> > >>> > >>> - Mike > >>> > >>> ----- Original Message ----- > >>> > >>> > >>> > >>>> From: "Michael Wilde" < wilde at mcs.anl.gov > > >>>> To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > >>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > >>>> Sent: Monday, November 14, 2011 11:39:10 AM > >>>> Subject: Re: [Swift-devel] CDM Tests > >>>> Even though the tests worked in each of these 4 cases, the > >>>> question > >>>> remains of "did the app see the same path names" and "did copies > >>>> get > >>>> done"? > >>>> > >>>> So I think you need to dig deeper and explain the difference > >>>> between > >>>> what CDM does for the direct case, and how (if at all) the wrapper > >>>> mode affects things. > >>>> > >>>> Its equally possible that CDM overrode the wrapper mode arg. Just > >>>> specifying the wrapper mode arg will not, I think, achieve the > >>>> elimination of the "copy to shared workflow directory" that > >>>> "direct > >>>> mode" accomplishes. 
> >>>> > >>>> - Mike > >>>> > >>>> ----- Original Message ----- > >>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > >>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Justin M > >>>>> Wozniak" > >>>>> < wozniak at mcs.anl.gov > > >>>>> Sent: Monday, November 14, 2011 11:29:36 AM > >>>>> Subject: Re: [Swift-devel] CDM Tests > >>>>> Mike, > >>>>> > >>>>> > >>>>> In my initial email I said: > >>>>> > >>>>> > >>>>> ... when specifying the relative option on the config for > >>>>> wrapper.invocation.mode property the script works regardless of > >>>>> the > >>>>> paths of input/output. > >>>>> > >>>>> > >>>>> Which made me conclude that the option specified in the config > >>>>> for > >>>>> wrapper.invocation.mode property overrides the CDM policies. > >>>>> > >>>>> > >>>>> On Mon, Nov 14, 2011 at 1:21 PM, Michael Wilde < > >>>>> wilde at mcs.anl.gov > >>>>>> > >>>>> wrote: > >>>>> > >>>>> > >>>>> Ketan, > >>>>> > >>>>> I thought your initial email stated that the 4 cases of relative > >>>>> path > >>>>> names "worked". But this latter comment indicates that there is > >>>>> some > >>>>> kind of problem even in the relative cases. > >>>>> > >>>>> Can you clarify? > >>>>> > >>>>> > >>>>> - Mike > >>>>> > >>>>> > >>>>> ----- Original Message ----- > >>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > >>>>> > >>>>> > >>>>> > >>>>>> To: "Justin M Wozniak" < wozniak at mcs.anl.gov > > >>>>>> Cc: "Michael Wilde" < wilde at mcs.anl.gov >, "Swift Devel" < > >>>>>> swift-devel at ci.uchicago.edu > > >>>>>> Sent: Monday, November 14, 2011 11:01:35 AM > >>>>>> Subject: Re: [Swift-devel] CDM Tests > >>>>>> Given the CDM's support for relative pathnames, I would expect > >>>>>> the > >>>>>> case of relative inputs, relative outputs would succeed > >>>>>> irrespective > >>>>>> of config. 
However, from the tests, it seems that the config > >>>>>> option > >>>>>> is > >>>>>> overriding CDM directives. > >>>>>> > >>>>>> > >>>>>> On Mon, Nov 14, 2011 at 12:44 PM, Justin M Wozniak < > >>>>>> wozniak at mcs.anl.gov > wrote: > >>>>>> > >>>>>> > >>>>>> > >>>>>> CDM DIRECT does not currently support absolute path names. I > >>>>>> think > >>>>>> we > >>>>>> could provide something simple like "" for the field, as you > >>>>>> suggest. > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Mon, 14 Nov 2011, Michael Wilde wrote: > >>>>>> > >>>>>>> I would expect that for absolute path mappings, the user > >>>>>>> would > >>>>>>> specify > >>>>>>> either :: or "/" as the path in the CDM DIRECT pattern. In > >>>>>>> other > >>>>>>> words, > >>>>>>> the user is saying "I have mapped the object to an absolute > >>>>>>> path > >>>>>>> name, > >>>>>>> and I want the app to get exactly this name". Hence I think > >>>>>>> that > >>>>>>> the > >>>>>>> best "location" field for a DIRECT rule is null (""). > >>>>>>> > >>>>>>> Justin, can you clarify what the code expects for this case? > >>>>>>> > >>>>>>> - Mike > >>>>>>> > >>>>>>> ----- Original Message ----- > >>>>>>>> From: "Jonathan Monette" < jonmon at mcs.anl.gov > > >>>>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > > >>>>>>>> Cc: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >, > >>>>>>>> "Swift > >>>>>>>> Devel" < swift-devel at ci.uchicago.edu > > >>>>>>>> Sent: Monday, November 14, 2011 9:13:01 AM > >>>>>>>> Subject: Re: [Swift-devel] CDM Tests > >>>>>>>> I couldn't tell from the excel(using my phone) but what was > >>>>>>>> absolute? > >>>>>>>> If I am not mistaken what CDM does is takes the path that > >>>>>>>> is > >>>>>>>> specified > >>>>>>>> in the CDM file(which should be absolute) and appends the > >>>>>>>> file > >>>>>>>> path > >>>>>>>> that is used in the swift script to it(which should be > >>>>>>>> relative). > >>>>>>>> So > >>>>>>>> if they are both absolute I would expect it to fail. 
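The path handling Jonathan guesses at here can be modeled in a few lines. This is a sketch of that hypothesis only, not Swift's CDM implementation; the function direct_target and the example paths are made up for illustration.

```python
def direct_target(cdm_location, script_path):
    """Model of the guessed behavior: append the script's file path
    to the location given in the DIRECT rule."""
    return cdm_location.rstrip("/") + "/" + script_path


# Relative script path: the two parts combine into a sensible path.
print(direct_target("/data/project", "input/in1.txt"))

# Absolute script path: plain concatenation nests one absolute path
# inside the other, which is unlikely to name an existing file.
print(direct_target("/data/project", "/home/user/input/in1.txt"))
```

Under this model a relative script path yields /data/project/input/in1.txt, while an absolute one yields /data/project//home/user/input/in1.txt, which matches the expectation that the absolute case would fail.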
> >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Nov 14, 2011, at 10:56 AM, Michael Wilde < > >>>>>>>> wilde at mcs.anl.gov > >>>>>>>>> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Thanks, Ketan - very nice tests and summary document. > >>>>>>>>> > >>>>>>>>> I should point out for those that didn't open the doc: the > >>>>>>>>> 4 > >>>>>>>>> tests > >>>>>>>>> with absolute pathnames are failing. > >>>>>>>>> > >>>>>>>>> Ketan, can you work with Justin to see if this is a bug, > >>>>>>>>> or > >>>>>>>>> if > >>>>>>>>> the > >>>>>>>>> CDM directive needs to be coded differently for absolute > >>>>>>>>> paths? > >>>>>>>>> Then > >>>>>>>>> please test a fix, and as we discussed adapt the tests > >>>>>>>>> with > >>>>>>>>> annotations to enhance the User Guide section on CDM. > >>>>>>>>> > >>>>>>>>> - Mike > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> ----- Original Message ----- > >>>>>>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > > >>>>>>>>>> To: "Swift Devel" < swift-devel at ci.uchicago.edu > > >>>>>>>>>> Sent: Monday, November 14, 2011 8:13:48 AM > >>>>>>>>>> Subject: [Swift-devel] CDM Tests > >>>>>>>>>> Hello, > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> I had a discussion with Mike about testing the CDM > >>>>>>>>>> behavior > >>>>>>>>>> for > >>>>>>>>>> the > >>>>>>>>>> following cases: > >>>>>>>>>> 1. full versus relative paths for input > >>>>>>>>>> 2. full versus relative paths for output > >>>>>>>>>> 3. relative versus absolute option in config property: > >>>>>>>>>> wrapper.invocation.mode > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> In this regard, I made 8 tests for all the above > >>>>>>>>>> combinations. > >>>>>>>>>> I > >>>>>>>>>> used > >>>>>>>>>> simple local provider in this first set of tests. 
> >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> From the tests it seems that when specifying the relative > >>>>>>>>>> option > >>>>>>>>>> on > >>>>>>>>>> the config for wrapper.invocation.mode property the > >>>>>>>>>> script > >>>>>>>>>> works > >>>>>>>>>> regardless of the paths of input/output. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> A detailed result with stdout, and paths to logs can be > >>>>>>>>>> found > >>>>>>>>>> here: > >>>>>>>>>> > https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> The tests are in my CI dir: /home/ketan/cdm_tests > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Regards, -- > >>>>>>>>>> Ketan > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> _______________________________________________ > >>>>>>>>>> Swift-devel mailing list > >>>>>>>>>> Swift-devel at ci.uchicago.edu > >>>>>>>>>> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Michael Wilde > >>>>>>>>> Computation Institute, University of Chicago > >>>>>>>>> Mathematics and Computer Science Division > >>>>>>>>> Argonne National Laboratory > >>>>>>>>> > >>>>>>>>> _______________________________________________ > >>>>>>>>> Swift-devel mailing list > >>>>>>>>> Swift-devel at ci.uchicago.edu > >>>>>>>>> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >>>>>>> > >>>>>>> > >>>>>> > >>>>>> -- > >>>>>> Justin M Wozniak > >>>>>> > >>>>>> > >>>>>> > >>>>>> _______________________________________________ > >>>>>> Swift-devel mailing list > >>>>>> Swift-devel at ci.uchicago.edu > >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> Ketan > >>>>> > >>>>> -- > >>>>> Michael Wilde > >>>>> Computation Institute, University of Chicago > >>>>> Mathematics and Computer Science Division > >>>>> Argonne National Laboratory > >>>>> > >>>>> > >>>>> > 
>>>>> > >>>>> > >>>>> -- > >>>>> Ketan > >>>> > >>>> -- > >>>> Michael Wilde > >>>> Computation Institute, University of Chicago > >>>> Mathematics and Computer Science Division > >>>> Argonne National Laboratory > >>>> > >>>> _______________________________________________ > >>>> Swift-devel mailing list > >>>> Swift-devel at ci.uchicago.edu > >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >>> > >>> -- > >>> Michael Wilde > >>> Computation Institute, University of Chicago > >>> Mathematics and Computer Science Division > >>> Argonne National Laboratory > >>> > >>> > >>> > >>> > >>> > >>> -- > >>> Ketan > >> > >> -- > >> Michael Wilde > >> Computation Institute, University of Chicago > >> Mathematics and Computer Science Division > >> Argonne National Laboratory > >> > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >> > >> > >> > >> > >> -- > >> Ketan > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jonmon at mcs.anl.gov Mon Nov 14 19:22:29 2011 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Mon, 14 Nov 2011 19:22:29 -0600 Subject: [Swift-devel] CDM Tests In-Reply-To: References: <143972182.27959.1321303645625.JavaMail.root@zimbra.anl.gov> Message-ID: <469EDF4C-0963-4ED2-9F1D-26FB4ADAA673@mcs.anl.gov> I think Justin when I asked him that question was because that was how Swift expected everything. It didn't break how Swift worked and didn't require any modifications to the Swift source to account for this. I think this was just a case that wasn't thought out fully on what exactly would happen. 
I think CDM knows all the information it needs to actually write the data to the output destination; the logic is just not there. I didn't know of this behavior with CDM output until this past summer; until then I had always used CDM for input only.
On Nov 14, 2011, at 7:17 PM, Ketan Maheshwari wrote: > One thing I am not very clear about CDM Direct for output is if there is any benefit in having the outputs symlinked from the workdir? Wouldn't the outputs be expected at the final place in users application output destination? > > On Mon, Nov 14, 2011 at 6:36 PM, Jonathan Monette wrote: > The "do the right thing" is currently broken in CDM for output files. A wrapper needs to be written currently around the app that updates the intended file and not the symlink in the work directory. For instance, in the SwiftMontage scripts I wrap all the Montage calls in python scripts that checks to see if I am writing to a symlink, if I am writing to a symlink I write the what the symlink is pointing to instead of the symlink itself. So unless a wrapper is written to handle the output files, CDM currently does not work properly. This probably should be documented in the CDM section of the user guide as a caveat of using CDM for output files.
From wozniak at mcs.anl.gov Mon Nov 14 20:45:07 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Mon, 14 Nov 2011 18:45:07 -0800 (Pacific Standard Time) Subject: [Swift-devel] CDM Tests In-Reply-To: References: <526638122.27156.1321295298546.JavaMail.root@zimbra.anl.gov> Message-ID:
Actually, CDM DIRECT does work with absolute path names. I just checked in working tests, tests/cdm/205 and tests/cdm/206. Take a look at those; hopefully they will help sort this out. I will add a note about this in the user guide. Justin
On Mon, 14 Nov 2011, Justin M Wozniak wrote: > CDM DIRECT does not currently support absolute path names. I think we > could provide something simple like "" for the field, as you suggest. > > On Mon, 14 Nov 2011, Michael Wilde wrote: > >> I would expect that for absolute path mappings, the user would specify >> either :: or "/" as the path in the CDM DIRECT pattern.
In other words, >> the user is saying "I have mapped the object to an absolute path name, >> and I want the app to get exactly this name". Hence I think that the >> best "location" field for a DIRECT rule is null (""). >> >> Justin, can you clarify what the code expects for this case? >> >> - Mike >> >> ----- Original Message ----- >>> From: "Jonathan Monette" >>> To: "Michael Wilde" >>> Cc: "Ketan Maheshwari" , "Swift Devel" >>> Sent: Monday, November 14, 2011 9:13:01 AM >>> Subject: Re: [Swift-devel] CDM Tests >>> I couldn't tell from the excel(using my phone) but what was absolute? >>> If I am not mistaken what CDM does is takes the path that is specified >>> in the CDM file(which should be absolute) and appends the file path >>> that is used in the swift script to it(which should be relative). So >>> if they are both absolute I would expect it to fail. >>> >>> >>> >>> On Nov 14, 2011, at 10:56 AM, Michael Wilde wrote: >>> >>>> Thanks, Ketan - very nice tests and summary document. >>>> >>>> I should point out for those that didn't open the doc: the 4 tests >>>> with absolute pathnames are failing. >>>> >>>> Ketan, can you work with Justin to see if this is a bug, or if the >>>> CDM directive needs to be coded differently for absolute paths? Then >>>> please test a fix, and as we discussed adapt the tests with >>>> annotations to enhance the User Guide section on CDM. >>>> >>>> - Mike >>>> >>>> >>>> ----- Original Message ----- >>>>> From: "Ketan Maheshwari" >>>>> To: "Swift Devel" >>>>> Sent: Monday, November 14, 2011 8:13:48 AM >>>>> Subject: [Swift-devel] CDM Tests >>>>> Hello, >>>>> >>>>> >>>>> I had a discussion with Mike about testing the CDM behavior for the >>>>> following cases: >>>>> 1. full versus relative paths for input >>>>> 2. full versus relative paths for output >>>>> 3. relative versus absolute option in config property: >>>>> wrapper.invocation.mode >>>>> >>>>> >>>>> In this regard, I made 8 tests for all the above combinations. 
I >>>>> used >>>>> simple local provider in this first set of tests. >>>>> >>>>> >>>>> From the tests it seems that when specifying the relative option on >>>>> the config for wrapper.invocation.mode property the script works >>>>> regardless of the paths of input/output. >>>>> >>>>> >>>>> A detailed result with stdout, and paths to logs can be found here: >>>>> https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE >>>>> >>>>> >>>>> >>>>> The tests are in my CI dir: /home/ketan/cdm_tests >>>>> >>>>> >>>>> >>>>> >>>>> Regards, -- >>>>> Ketan >>>>> >>>>> >>>>> >>>>> _______________________________________________ >>>>> Swift-devel mailing list >>>>> Swift-devel at ci.uchicago.edu >>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>> >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >>>> >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> > > -- Justin M Wozniak From wozniak at mcs.anl.gov Mon Nov 14 20:51:58 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Mon, 14 Nov 2011 18:51:58 -0800 (Pacific Standard Time) Subject: [Swift-devel] CDM Tests In-Reply-To: <469EDF4C-0963-4ED2-9F1D-26FB4ADAA673@mcs.anl.gov> References: <143972182.27959.1321303645625.JavaMail.root@zimbra.anl.gov> <469EDF4C-0963-4ED2-9F1D-26FB4ADAA673@mcs.anl.gov> Message-ID: I added a Bugzilla ticket about this issue. Justin On Mon, 14 Nov 2011, Jonathan Monette wrote: > I think Justin when I asked him that question was because that was how > Swift expected everything. It didn't break how Swift worked and didn't > require any modifications to the Swift source to account for this. I > think this was just a case that wasn't thought out fully on what exactly > would happen. 
I think CDM knows all the information it needs to > actually write the data to the output destination, the logic is just not > there. I didn't know of this behavior with CDM output till this past > summer. I always used CDM for input only until this past summer. > > On Nov 14, 2011, at 7:17 PM, Ketan Maheshwari wrote: > >> One thing I am not very clear about CDM Direct for output is if there is any benefit in having the outputs symlinked from the workdir? Wouldn't the outputs be expected at the final place in users application output destination? >> >> On Mon, Nov 14, 2011 at 6:36 PM, Jonathan Monette wrote: >> The "do the right thing" is currently broken in CDM for output files. A wrapper needs to be written currently around the app that updates the intended file and not the symlink in the work directory. For instance, in the SwiftMontage scripts I wrap all the Montage calls in python scripts that checks to see if I am writing to a symlink, if I am writing to a symlink I write the what the symlink is pointing to instead of the symlink itself. So unless a wrapper is written to handle the output files, CDM currently does not work properly. This probably should be documented in the CDM section of the user guide as a caveat of using CDM for output files. >> >> On Nov 14, 2011, at 2:47 PM, Michael Wilde wrote: >> >>> I think the purpose of these tests was to find a default CDM DIRECT patter that would match all inout and output files and "do the right thing". >>> >>> Tests that match specific pathnames are yet another (but different) useful case. >>> >>> - Mike >>> ----- Original Message ----- >>>> From: "Ketan Maheshwari" >>>> To: "Jon Monette" >>>> Cc: "Michael Wilde" , "Swift Devel" >>>> Sent: Monday, November 14, 2011 12:43:13 PM >>>> Subject: Re: [Swift-devel] CDM Tests >>>> Yes, this is right. I added a rule for the outdir in the cdm file >>>> (which was not added earlier) and all runs fails with the message: >>>> file not found output/out?.txt. 
>>>> >>>> >>>> On Mon, Nov 14, 2011 at 2:05 PM, Jon Monette < jonmon at mcs.anl.gov > >>>> wrote: >>>> >>>> >>>> Output files do not work correctly in CDM. CDM creates symlinks to the >>>> files that are used in the DIRECT CDM directive. When the app outputs >>>> it overwrites the symlink in the work directory and *doesn't* update >>>> the original file. So the check to see if the expected output is the >>>> same as the correct output will fail. How are you checking to see if >>>> the app correctly ran? Is there an error or is the original file in >>>> the cwd empty? >>>> >>>> >>>> ----- Original Message ----- >>>> From: "Michael Wilde" < wilde at mcs.anl.gov > >>>> To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > >>>> >>>> >>>> >>>> Sent: Monday, November 14, 2011 1:59:22 PM >>>> Subject: Re: [Swift-devel] CDM Tests >>>> >>>> I meant the command line to the app() program, not the commandline to >>>> the swift command. >>>> >>>> - Mike >>>> >>>> ----- Original Message ----- >>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > >>>>> Sent: Monday, November 14, 2011 11:50:48 AM >>>>> Subject: Re: [Swift-devel] CDM Tests >>>>> All commandlines are in a run.sh script batched together. I am >>>>> writing >>>>> a README in the testsuite to document this. >>>>> >>>>> >>>>> On Mon, Nov 14, 2011 at 1:46 PM, Michael Wilde < wilde at mcs.anl.gov > >>>>> wrote: >>>>> >>>>> >>>>> Ketan, in these tests, I thought that the names that the app >>>>> actually >>>>> receives on its command line is worth documenting. Does that show >>>>> anything that the script writer should be aware of? 
>>>>> >>>>> >>>>> - Mike >>>>> >>>>> ----- Original Message ----- >>>>> >>>>> >>>>> >>>>>> From: "Michael Wilde" < wilde at mcs.anl.gov > >>>>>> To: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > >>>>>> Sent: Monday, November 14, 2011 11:39:10 AM >>>>>> Subject: Re: [Swift-devel] CDM Tests >>>>>> Even though the tests worked in each of these 4 cases, the >>>>>> question >>>>>> remains of "did the app see the same path names" and "did copies >>>>>> get >>>>>> done"? >>>>>> >>>>>> So I think you need to dig deeper and explain the difference >>>>>> between >>>>>> what CDM does for the direct case, and how (if at all) the wrapper >>>>>> mode affects things. >>>>>> >>>>>> Its equally possible that CDM overrode the wrapper mode arg. Just >>>>>> specifying the wrapper mode arg will not, I think, achieve the >>>>>> elimination of the "copy to shared workflow directory" that >>>>>> "direct >>>>>> mode" accomplishes. >>>>>> >>>>>> - Mike >>>>>> >>>>>> ----- Original Message ----- >>>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>>>>>> Cc: "Swift Devel" < swift-devel at ci.uchicago.edu >, "Justin M >>>>>>> Wozniak" >>>>>>> < wozniak at mcs.anl.gov > >>>>>>> Sent: Monday, November 14, 2011 11:29:36 AM >>>>>>> Subject: Re: [Swift-devel] CDM Tests >>>>>>> Mike, >>>>>>> >>>>>>> >>>>>>> In my initial email I said: >>>>>>> >>>>>>> >>>>>>> ... when specifying the relative option on the config for >>>>>>> wrapper.invocation.mode property the script works regardless of >>>>>>> the >>>>>>> paths of input/output. >>>>>>> >>>>>>> >>>>>>> Which made me conclude that the option specified in the config >>>>>>> for >>>>>>> wrapper.invocation.mode property overrides the CDM policies. 
>>>>>>> >>>>>>> >>>>>>> On Mon, Nov 14, 2011 at 1:21 PM, Michael Wilde < >>>>>>> wilde at mcs.anl.gov >>>>>>>> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Ketan, >>>>>>> >>>>>>> I thought your initial email stated that the 4 cases of relative >>>>>>> path >>>>>>> names "worked". But this latter comment indicates that there is >>>>>>> some >>>>>>> kind of problem even in the relative cases. >>>>>>> >>>>>>> Can you clarify? >>>>>>> >>>>>>> >>>>>>> - Mike >>>>>>> >>>>>>> >>>>>>> ----- Original Message ----- >>>>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>>>> >>>>>>> >>>>>>> >>>>>>>> To: "Justin M Wozniak" < wozniak at mcs.anl.gov > >>>>>>>> Cc: "Michael Wilde" < wilde at mcs.anl.gov >, "Swift Devel" < >>>>>>>> swift-devel at ci.uchicago.edu > >>>>>>>> Sent: Monday, November 14, 2011 11:01:35 AM >>>>>>>> Subject: Re: [Swift-devel] CDM Tests >>>>>>>> Given the CDM's support for relative pathnames, I would expect >>>>>>>> the >>>>>>>> case of relative inputs, relative outputs would succeed >>>>>>>> irrespective >>>>>>>> of config. However, from the tests, it seems that the config >>>>>>>> option >>>>>>>> is >>>>>>>> overriding CDM directives. >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Nov 14, 2011 at 12:44 PM, Justin M Wozniak < >>>>>>>> wozniak at mcs.anl.gov > wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> CDM DIRECT does not currently support absolute path names. I >>>>>>>> think >>>>>>>> we >>>>>>>> could provide something simple like "" for the field, as you >>>>>>>> suggest. >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, 14 Nov 2011, Michael Wilde wrote: >>>>>>>> >>>>>>>>> I would expect that for absolute path mappings, the user >>>>>>>>> would >>>>>>>>> specify >>>>>>>>> either :: or "/" as the path in the CDM DIRECT pattern. In >>>>>>>>> other >>>>>>>>> words, >>>>>>>>> the user is saying "I have mapped the object to an absolute >>>>>>>>> path >>>>>>>>> name, >>>>>>>>> and I want the app to get exactly this name". 
Hence I think >>>>>>>>> that >>>>>>>>> the >>>>>>>>> best "location" field for a DIRECT rule is null (""). >>>>>>>>> >>>>>>>>> Justin, can you clarify what the code expects for this case? >>>>>>>>> >>>>>>>>> - Mike >>>>>>>>> >>>>>>>>> ----- Original Message ----- >>>>>>>>>> From: "Jonathan Monette" < jonmon at mcs.anl.gov > >>>>>>>>>> To: "Michael Wilde" < wilde at mcs.anl.gov > >>>>>>>>>> Cc: "Ketan Maheshwari" < ketancmaheshwari at gmail.com >, >>>>>>>>>> "Swift >>>>>>>>>> Devel" < swift-devel at ci.uchicago.edu > >>>>>>>>>> Sent: Monday, November 14, 2011 9:13:01 AM >>>>>>>>>> Subject: Re: [Swift-devel] CDM Tests >>>>>>>>>> I couldn't tell from the excel(using my phone) but what was >>>>>>>>>> absolute? >>>>>>>>>> If I am not mistaken what CDM does is takes the path that >>>>>>>>>> is >>>>>>>>>> specified >>>>>>>>>> in the CDM file(which should be absolute) and appends the >>>>>>>>>> file >>>>>>>>>> path >>>>>>>>>> that is used in the swift script to it(which should be >>>>>>>>>> relative). >>>>>>>>>> So >>>>>>>>>> if they are both absolute I would expect it to fail. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Nov 14, 2011, at 10:56 AM, Michael Wilde < >>>>>>>>>> wilde at mcs.anl.gov >>>>>>>>>>> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Thanks, Ketan - very nice tests and summary document. >>>>>>>>>>> >>>>>>>>>>> I should point out for those that didn't open the doc: the >>>>>>>>>>> 4 >>>>>>>>>>> tests >>>>>>>>>>> with absolute pathnames are failing. >>>>>>>>>>> >>>>>>>>>>> Ketan, can you work with Justin to see if this is a bug, >>>>>>>>>>> or >>>>>>>>>>> if >>>>>>>>>>> the >>>>>>>>>>> CDM directive needs to be coded differently for absolute >>>>>>>>>>> paths? >>>>>>>>>>> Then >>>>>>>>>>> please test a fix, and as we discussed adapt the tests >>>>>>>>>>> with >>>>>>>>>>> annotations to enhance the User Guide section on CDM. 
>>>>>>>>>>> >>>>>>>>>>> - Mike >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> ----- Original Message ----- >>>>>>>>>>>> From: "Ketan Maheshwari" < ketancmaheshwari at gmail.com > >>>>>>>>>>>> To: "Swift Devel" < swift-devel at ci.uchicago.edu > >>>>>>>>>>>> Sent: Monday, November 14, 2011 8:13:48 AM >>>>>>>>>>>> Subject: [Swift-devel] CDM Tests >>>>>>>>>>>> Hello, >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> I had a discussion with Mike about testing the CDM >>>>>>>>>>>> behavior >>>>>>>>>>>> for >>>>>>>>>>>> the >>>>>>>>>>>> following cases: >>>>>>>>>>>> 1. full versus relative paths for input >>>>>>>>>>>> 2. full versus relative paths for output >>>>>>>>>>>> 3. relative versus absolute option in config property: >>>>>>>>>>>> wrapper.invocation.mode >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> In this regard, I made 8 tests for all the above >>>>>>>>>>>> combinations. >>>>>>>>>>>> I >>>>>>>>>>>> used >>>>>>>>>>>> simple local provider in this first set of tests. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> From the tests it seems that when specifying the relative >>>>>>>>>>>> option >>>>>>>>>>>> on >>>>>>>>>>>> the config for wrapper.invocation.mode property the >>>>>>>>>>>> script >>>>>>>>>>>> works >>>>>>>>>>>> regardless of the paths of input/output. 
>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> A detailed result with stdout, and paths to logs can be >>>>>>>>>>>> found >>>>>>>>>>>> here: >>>>>>>>>>>> https://docs.google.com/spreadsheet/ccc?key=0AmvYSwENKFY9dG44V2VjRXJlUmZLNG9saERFeWZDcUE >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> The tests are in my CI dir: /home/ketan/cdm_tests >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> Regards, -- >>>>>>>>>>>> Ketan >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> _______________________________________________ >>>>>>>>>>>> Swift-devel mailing list >>>>>>>>>>>> Swift-devel at ci.uchicago.edu >>>>>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> Michael Wilde >>>>>>>>>>> Computation Institute, University of Chicago >>>>>>>>>>> Mathematics and Computer Science Division >>>>>>>>>>> Argonne National Laboratory >>>>>>>>>>> >>>>>>>>>>> _______________________________________________ >>>>>>>>>>> Swift-devel mailing list >>>>>>>>>>> Swift-devel at ci.uchicago.edu >>>>>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Justin M Wozniak >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> _______________________________________________ >>>>>>>> Swift-devel mailing list >>>>>>>> Swift-devel at ci.uchicago.edu >>>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Ketan >>>>>>> >>>>>>> -- >>>>>>> Michael Wilde >>>>>>> Computation Institute, University of Chicago >>>>>>> Mathematics and Computer Science Division >>>>>>> Argonne National Laboratory >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> Ketan >>>>>> >>>>>> -- >>>>>> Michael Wilde >>>>>> Computation Institute, University of Chicago >>>>>> Mathematics and Computer Science Division >>>>>> Argonne National Laboratory >>>>>> >>>>>> _______________________________________________ >>>>>> Swift-devel 
mailing list >>>>>> Swift-devel at ci.uchicago.edu >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>> >>>>> -- >>>>> Michael Wilde >>>>> Computation Institute, University of Chicago >>>>> Mathematics and Computer Science Division >>>>> Argonne National Laboratory >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> Ketan >>>> >>>> -- >>>> Michael Wilde >>>> Computation Institute, University of Chicago >>>> Mathematics and Computer Science Division >>>> Argonne National Laboratory >>>> >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>> >>>> >>>> >>>> >>>> -- >>>> Ketan >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >> >> >> >> >> -- >> Ketan >> >> > > -- Justin M Wozniak From wilde at mcs.anl.gov Thu Nov 17 10:49:17 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 17 Nov 2011 10:49:17 -0600 (CST) Subject: [Swift-devel] gridFTP urls for data in provider staging mode In-Reply-To: <1091769972.38154.1321547895379.JavaMail.root@zimbra.anl.gov> Message-ID: <1250188813.38212.1321548557304.JavaMail.root@zimbra.anl.gov> (moving this thread to the swift-devel list) Mihael, can you clarify a few more aspects of this? - the staging mode determines if the file is read by the client ("file") or service ("proxy"), correct? Both should work? - when a gsiftp-protocol URI is read, is it read and sent to the worker in a read loop, with good-sized buffers? Or is it read into local disk and then sent to the worker? Im hoping the former, so we can handle large files more reasonably. - the provider used to read the file is determined by the protocol field in the URI, and not by the data provider listed for the site the job is going to, right? 
The latter is used solely for accessing the workdirectory, and hence does not apply (ie is not needed or ignored) for the case of provider staging, right? Mike ----- Original Message ----- > From: "Michael Wilde" > To: "Mihael Hategan" > Cc: "Ketan Maheshwari" > Sent: Thursday, November 17, 2011 8:38:15 AM > Subject: Re: gridFTP urls for data in provider staging mode > Mihael, I dont understand. Did you mean to say that the *service* wont > accept a URL for which it cant open a stream? > > So how did this work for you? Did you not do a similar test to what > Ketan is doing? > > (Ie, map a scalar file-type variable to a gsiftp:// URI?) > > And I thought the code tried to read the input file via a provider > thats based on the protocol in the mapped file name's URI??? As > opposed to assuming that its a local file and opening a Java stream on > that name. > > Can you clarify what you tested and why you think Ketan's test is > failing? > > Thats why I asked you to send us the test you ran: so we could > replicate it to verify that we had all the needed code pieces. > > Thanks, > > - Mike > > > ----- Original Message ----- > > From: "Mihael Hategan" > > To: "Ketan Maheshwari" > > Cc: "Michael Wilde" > > Sent: Wednesday, November 16, 2011 8:33:39 PM > > Subject: Re: gridFTP urls for data in provider staging mode > > That's a completely different issue. > > > > It seems that URL won't accept a URL for which it cannot open a > > stream. > > I will change the parsing. > > > > On Wed, 2011-11-16 at 17:03 -0600, Ketan Maheshwari wrote: > > > Mihael, > > > > > > I think in this latest run I have got the correct > > > CoGResourceIOProvider. The log is: > > > > > > http://ci.uchicago.edu/~ketan/worker-tmp.log > > > > > > And, this line matches: > > > at > > > org.globus.cog.abstraction.impl.file.coaster.handlers.providers.CoGResourceIOProvider.buildService(CoGResourceIOProvider.java:55) > > > the file you sent. 
> > > > > > > > > The error message is also pertaining to the gsiftp protocol: > > > 2011/11/16 16:57:18.743 DEBUG 000000 1321484238602 getFileCBDataIn > > > FAILED 520 Error staging in file: > > > org.globus.cog.karajan.workflow.service.ProtocolException: > > > org.globus.cog.karajan.workflow.service.ProtocolException: > > > java.net.MalformedURLException: unknown protocol: gsiftp > > > > > > Regards, > > > Ketan > > > > > > > > > On Wed, Nov 16, 2011 at 3:12 PM, Mihael Hategan > > > > > > wrote: > > > On Wed, 2011-11-16 at 14:58 -0600, Michael Wilde wrote: > > > > > > > My theory as that for some reason the Java code in the > > > service is not > > > > recognizing the mapped URL as a gridftp URL, and is > > > > hence > > > treating the > > > > gridftp URI like a file. > > > > > > > > > The stack trace that Ketan sees occurs in a line that does > > > not > > > have > > > anything relevant in the version of CoGResourceIOProvider > > > that > > > I sent. > > > > > > If you look at the worker log, you see this: > > > > > > org.globus.cog.abstraction.impl.file.coaster.handlers.providers.CoGResourceIOProvider$Reader.(CoGResourceIOProvider.java:143) > > > > > > If you look at the CoGResourceIOProvider I sent: > > > > > > 142 public void resume() { > > > 143 } <---- > > > > > > The original, however, has the following: > > > 142 this.cb = cb; > > > 143 fc = new FileInputStream(f).getChannel(); <---- > > > > > > So it's pretty clear to me that whatever Ketan is running > > > is > > > using the > > > old file. > > > > > > That's a build problem of some sort. I suggest checking in > > > the > > > two files > > > into the 0.93.1 branch and then doing a clean build from > > > that > > > branch as > > > well as making sure that no helper scripts that modify the > > > standard > > > environment are used. > > > > > > > > > > > > > > The log should help us see if we running the same swift > > > > as > > > you, and > > > > where the patterns diverge. 
> > > > > > > > I really hope that we can resolve this expeditiously. > > > > > > > > - Mike > > > > > > > > > > > > ----- Original Message ----- > > > > > From: "Mihael Hategan" > > > > > To: "Ketan Maheshwari" > > > > > Cc: "Michael Wilde" > > > > > Sent: Wednesday, November 16, 2011 11:50:09 AM > > > > > Subject: Re: gridFTP urls for data in provider staging > > > mode > > > > > There was nothing special in them. > > > > > > > > > > I used proxy mode (though it shouldn't matter) and, of > > > course, > > > > > provider > > > > > staging. > > > > > > > > > > I generally helps to make it simple and use automatic > > > coasters and > > > > > start > > > > > with local:local. > > > > > > > > > > On Wed, 2011-11-16 at 10:55 -0600, Ketan Maheshwari > > > > > wrote: > > > > > > Mihael, > > > > > > > > > > > > > > > > > > Could you send your sites.xml, cf and source files > > > > > > with > > > which you > > > > > > tested this. > > > > > > > > > > > > Regards, > > > > > > Ketan > > > > > > > > > > > > On Wed, Nov 16, 2011 at 10:19 AM, Ketan Maheshwari > > > > > > wrote: > > > > > > I checked which coaster-service I am using > > > > > > and > > > think it is > > > > > > coming from the right location in 0.93: > > > > > > > > > > > > > > > > > > [communicado:swiftgrid]$ which > > > > > > coaster-service > > > > > > > > > ~/swift-install/0.93/cog/modules/swift/dist/swift-svn/bin/coaster-service > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Nov 15, 2011 at 4:15 PM, Mihael > > > > > > Hategan > > > > > > wrote: > > > > > > Also, this is not an issue with > > > worker.pl. It's > > > > > > really > > > > > > the java part. > > > > > > But it's the one used by the coaster > > > service, so > > > > > > make > > > > > > sure that the > > > > > > coaster service you are starting is > > > > > > from > > > a tree with > > > > > > the updated > > > > > > CoGResourceIOProvider.java. 
> > > > > > > > > > > > On Tue, 2011-11-15 at 15:20 -0600, > > > > > > Ketan > > > Maheshwari > > > > > > wrote: > > > > > > > > > > > > > > > > > > > I am using passive coasters: > > > > > > > > > > > > > > > > > > > > > coaster-service -port 1984 > > > > > > > -localport > > > 35753 -nosec > > > > > > -passive > > > > > > > > > > > > > > > > > > > > > but I will run once again putting > > > > > > > the > > > modified > > > > > > worker.pl in source > > > > > > > tree. > > > > > > > > > > > > > > On Tue, Nov 15, 2011 at 3:18 PM, > > > Michael Wilde > > > > > > > > > > > > > wrote: > > > > > > > If this is automatic > > > > > > > coasters, > > > you need to > > > > > > change worler.pl in > > > > > > > the source tree and re ant > > > dist: its > > > > > > included in the automatic > > > > > > > boot process and not > > > > > > > fetched > > > from bin/ > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > ----- Original Message > > > > > > > ----- > > > > > > > > From: "Ketan Maheshwari" > > > > > > > > > > > > > > To: "Mihael Hategan" > > > > > > > > > > > > > > > > Cc: "Michael Wilde" > > > > > > > > > > > > > > > > > > Sent: Tuesday, November > > > > > > > > 15, > > > 2011 > > > > > > > > 12:53:19 > > > > > > PM > > > > > > > > Subject: Re: gridFTP > > > > > > > > urls > > > for data in > > > > > > provider staging mode > > > > > > > > > > > > > > > > > > > > > > Mihael, > > > > > > > > > > > > > > > > > > > > > > > > I tried to match the > > > > > > > > file > > > you sent that > > > > > > > > I > > > > > > copied into the > > > > > > > swift tree > > > > > > > > using sum and diff: > > > > > > > > > > > > > > > > > > > > > > > > [login:cog]$ sum > > > > > > ~/CoGResourceIOProvider.java > > > > > > > > 63915 8 > > > > > > > > [login:cog]$ sum > > > > > > > > > > > > > > > > > ./modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/handlers/providers/CoGResourceIOProvider.java > > > > > > > > 63915 8 > > > > > > > > [login:cog]$ > > 
> > > > > > > diff /home/ketan/CoGResourceIOProvider.java > > > > > > > > > > > > > > > > > ./modules/provider-coaster/src/org/globus/cog/abstraction/impl/file/coaster/handlers/providers/CoGResourceIOProvider.java > > > > > > > > > > > > > > > > > > > > > > > > I recompiled swift. And > > > replaced the > > > > > > worker.pl in > > > > > > > dist/swift-svn/bin > > > > > > > > with the one you sent. > > > > > > > > > > > > > > > > > > > > > > > > I made sure that I am > > > running the > > > > > > > > correct > > > > > > swift: > > > > > > > > > > > > > > > > [communicado:swiftgrid]$ > > > which swift > > > > > > > > > > > > > > > > > > > > > > > > ~/swift-install/0.93/cog/modules/swift/dist/swift-svn/bin/swift > > > > > > > > > > > > > > > > > > > > > > > > I tried the example > > > > > > > > again > > > and it is > > > > > > > > giving > > > > > > the same failure > > > > > > > message. > > > > > > > > > > > > > > > > On Tue, Nov 15, 2011 at > > > > > > > > 2:19 > > > PM, Mihael > > > > > > Hategan < > > > > > > > hategan at mcs.anl.gov > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > The error in the worker > > > > > > > > logs > > > can only > > > > > > appear in the original > > > > > > > > CoGResourceIOProvider, > > > > > > > > not > > > the one I > > > > > > > > sent > > > > > > you. Make sure you > > > > > > > got that > > > > > > > > in > > > > > > > > the right place, that > > > > > > > > you > > > recompiled, > > > > > > > > and > > > > > > that you are using > > > > > > > the > > > > > > > > modified swift. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Tue, 2011-11-15 at > > > > > > > > 11:23 > > > -0600, Ketan > > > > > > Maheshwari wrote: > > > > > > > > > Mike, > > > > > > > > > > > > > > > > > > > > > > > > > > > The worker log is > > > > > > > > > here: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://ci.uchicago.edu/~ketan/worker-catsn-gridftp.log > > > > > > > > > > > > > > > > > > > > > > > > > > > The problem looks like > > > > > > > > > the > > > gridFTP > > > > > > > > > path > > > > > > is not being > > > > > > > properly > > > > > > > > > captured > > > > > > > > > from this line: > > > > > > > > > > > > > > > > > > > > > > > > > > > 2011/11/08 > > > > > > > > > 16:14:42.650 > > > DEBUG 000000 > > > > > > 1320790482491 > > > > > > > > > getFileCBDataInIndirect > > > error: > > > > > > > > > > > > > > > > > > org.globus.cog.karajan.workflow.service.ProtocolException: > > > > > > > > > > > > java.io.FileNotFoundException: > > > > > > > > > > > > > > > > > > > > > > > > > /scratch/local/ketan/catsn-gridftp-staging-method-file/./gsiftp:/ > > > > > > > > > > > > > > > > > > > > > > > > > gridftp.pads.ci.uchicago.edu/gpfs/pads/swift/ketan/data0000.txt > > > > > > (No > > > > > > > > > such file or > > > > > > > > > directory) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > On Tue, Nov 15, 2011 > > > > > > > > > at > > > 10:42 AM, > > > > > > Michael Wilde < > > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > Ketan, the 520 error > > > > > > > > > means > > > that the > > > > > > worker.pl script died. 
> > > > > > > > > > > > > > > > > > Can you run again, and > > > capture all the > > > > > > logs: swift log, > > > > > > > > > coaster log (if > > > > > > > > > separate), > > > and worker > > > > > > log (with DEBUG set, > > > > > > > > > possibly important). > > > > > > > > > Maybe > > > you can > > > > > > > > > trakc > > > > > > back from the 520 > > > > > > > > > error. The worker log > > > should give some > > > > > > insight into why > > > > > > > the > > > > > > > > > 520 is occurring. I > > > > > > > > > cant > > > recall what > > > > > > > > > the > > > > > > # means but its a > > > > > > > > > common one. Probably > > > "failed to stage > > > > > > file", in which case > > > > > > > the > > > > > > > > > worker log may help > > > clarify why. > > > > > > > > > > > > > > > > > > - Mike > > > > > > > > > > > > > > > > > > ----- Original Message > > > ----- > > > > > > > > > > > > > > > > > > > From: "Ketan > > > > > > > > > > Maheshwari" > > > < > > > > > > ketancmaheshwari at gmail.com > > > > > > > > > > > To: "Mihael Hategan" > > > > > > > > > > < > > > > > > hategan at mcs.anl.gov > > > > > > > > > > > Cc: "Michael Wilde" > > > > > > > > > > < > > > > > > wilde at mcs.anl.gov > > > > > > > > > > > Sent: Tuesday, > > > > > > > > > > November > > > 15, 2011 > > > > > > 8:15:48 AM > > > > > > > > > > Subject: Re: gridFTP > > > urls for data > > > > > > > > > > in > > > > > > provider staging > > > > > > > mode > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi Mihael, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I rebuilt swift with > > > these files > > > > > > included but could not > > > > > > > get > > > > > > > > > the test > > > > > > > > > > run successfully. I > > > tried three > > > > > > > > > > tests, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1. catsn without > > > gridftp: worked > > > > > > > > > > 2. 
catsn with > > > > > > > > > > gridftp > > > staging method > > > > > > file: fails > > > > > > > > > > 3. catsn with > > > > > > > > > > gridftp > > > staging method > > > > > > proxy: fails > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Both failures seem > > > > > > > > > > to > > > > > > > > > > be > > > related to > > > > > > the staging from the > > > > > > > > > following > > > > > > > > > > stderr message: > > > > > > > > > > > > > > > > > > > > Caused by: null > > > > > > > > > > Caused by: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > org.globus.cog.abstraction.impl.common.execution.JobException: > > > > > > > > > Job > > > > > > > > > > failed with an exit > > > > > > > > > > code > > > of 520 > > > > > > > > > > Final status: time: > > > > > > > > > > Tue, > > > 15 Nov 2011 > > > > > > 10:08:37 -0600 > > > > > > > Failed:1 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The logs are as > > > > > > > > > > follows: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://ci.uchicago.edu/~ketan/catsn-gridftp-staging-method-file.log > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > http://ci.uchicago.edu/~ketan/catsn-gridftp-staging-method-proxy.log > > > > > > > > > > > > > > > > > > > http://ci.uchicago.edu/~ketan/catsn-nogridftp.log > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Nov 14, 2011 > > > > > > > > > > at > > > 1:47 PM, > > > > > > Mihael Hategan < > > > > > > > > > hategan at mcs.anl.gov > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, 2011-11-14 > > > > > > > > > > at > > > 08:39 -0600, > > > > > > Michael Wilde wrote: > > > > > > > > > > > Mihael, I see that > > > > > > > > > 
> > you > > > made the > > > > > > branch for this. Did > > > > > > > you > > > > > > > > > commit it? > > > > > > > > > > > If not, can you? > > > > > > > > > > > > > > > > > > > > Not yet. I'm a bit > > > > > > > > > > busy > > > this week > > > > > > seeing that I seem to > > > > > > > have > > > > > > > > > no idea > > > > > > > > > > how > > > > > > > > > > to do any of the > > > homework (which > > > > > > > > > > isn't > > > > > > really a new > > > > > > > thing), > > > > > > > > > but I'll > > > > > > > > > > try > > > > > > > > > > to commit the input > > > stuff as soon as > > > > > > > > > > I > > > > > > can. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Whats needed for > > > testing the > > > > > > > > > > > output > > > > > > side as well? > > > > > > > > > > > > > > > > > > > > Should be equally > > > > > > > > > > easy. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I'd like to push > > > forward with > > > > > > > > > > > this. > > > > > > > > > > > > > > > > > > > > Yes, I know, I'm > > > > > > > > > > sorry. > > > I'll get to > > > > > > > > > > it > > > > > > as soon as I can. > > > > > > > > > > > > > > > > > > > > Well, you know what, > > > > > > > > > > I'm > > > attaching > > > > > > > > > > the > > > > > > modified relevant > > > > > > > > > files. Just > > > > > > > > > > > > > > > > > > > rename worker.pl.pr > > > > > > > > > > to > > > worker.pl . > > > > > > Feel free to commit > > > > > > > them > > > > > > > > > to the new > > > > > > > > > > branch. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Michael Wilde > > > > > > > > > Computation Institute, > > > University of > > > > > > Chicago > > > > > > > > > Mathematics and > > > > > > > > > Computer > > > Science > > > > > > Division > > > > > > > > > Argonne National > > > Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Ketan > > > > > > > > > > > > > > -- > > > > > > > Michael Wilde > > > > > > > Computation Institute, > > > University of > > > > > > > Chicago > > > > > > > Mathematics and Computer > > > Science Division > > > > > > > Argonne National > > > > > > > Laboratory > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Ketan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > Ketan > > > > > > > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Thu Nov 17 11:09:57 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 17 Nov 2011 09:09:57 -0800 Subject: 
[Swift-devel] gridFTP urls for data in provider staging mode
In-Reply-To: <1250188813.38212.1321548557304.JavaMail.root@zimbra.anl.gov>
References: <1250188813.38212.1321548557304.JavaMail.root@zimbra.anl.gov>
Message-ID: <1321549797.6216.6.camel@blabla>

On Thu, 2011-11-17 at 10:49 -0600, Michael Wilde wrote:
> (moving this thread to the swift-devel list)
>
> Mihael, can you clarify a few more aspects of this?
>
> - the staging mode determines if the file is read by the client
> ("file") or service ("proxy"), correct? Both should work?

Yes.

>
> - when a gsiftp-protocol URI is read, is it read and sent to the
> worker in a read loop, with good-sized buffers? Or is it read into
> local disk and then sent to the worker? I'm hoping the former, so we
> can handle large files more reasonably.

The former, of course.

>
> - the provider used to read the file is determined by the protocol
> field in the URI, and not by the data provider listed for the site the
> job is going to, right?

Right.

> The latter is used solely for accessing the work directory, and hence
> does not apply (ie is not needed or ignored) for the case of provider
> staging, right?

That is correct.

From tim.g.armstrong at gmail.com Thu Nov 17 13:58:51 2011
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Thu, 17 Nov 2011 13:58:51 -0600
Subject: [Swift-devel] Semantics of multidimensional arrays in Swift
Message-ID:

I'm trying to understand the semantics of multidimensional arrays in Swift so that I can implement something similar for the ExM project.

I initially assumed that 2d arrays were effectively arrays of references to arrays, so that the semantics would be as follows:

int a[][];
int b[];

a[0] = b; // Pointer to b in first slot of array a
a[0][1] = 2; // insert into b
a[1][1] = 2; // invalid because no array inserted into a[1]

But a[1][1] = 2 actually succeeds.
As far as I can work out, what actually happens is:

- A[i][j] = x;
  - A[i] is uninitialised, a new array C is created, and x assigned to C[j]
  - A[i] is initialised with array C, x is assigned to C[j]
- A[i] = B;
  - A[i] is uninitialised, A[i] is then a reference to B
  - A[i] is initialised and points to array C, all members of B are copied to the corresponding index in C

Is this the intended behaviour? Am I understanding this correctly?

In the Swift implementation I tested it on, this leads to some nondeterminism:

int a[][];
int b[] = [1,2,3];

a[0][3] = 30;
a[0] = b;
a[1][0] = 123;

trace(a[0][0]);
trace(a[0][3]);
trace(a[1][0]);


Nondeterministic outcome 1
====================
SwiftScript trace: 1
SwiftScript trace: 123
Execution failed:
    Invalid path ([3]) for a.[0][]/3

Nondeterministic outcome 2
====================
SwiftScript trace: 1
SwiftScript trace: 30
SwiftScript trace: 123

- Tim

From benc at hawaga.org.uk Fri Nov 18 03:52:53 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 18 Nov 2011 10:52:53 +0100
Subject: [Swift-devel] Semantics of multidimensional arrays in Swift
In-Reply-To:
References:
Message-ID:

The below is based on my gut feeling for how things *should* work, at least assuming the existence of array assignment syntax, not how the implementation does work.

Remember this is a declarative language. So then:

> I initially assumed that 2d arrays were effectively arrays of references to arrays, so that the semantics would be as follows:
>
> int a[][];
> int b[];
>
> a[0] = b; // Pointer to b in first slot of array a
> a[0][1] = 2; // insert into b

First you say "the value of a[0] is fully described by b" and then secondly you say "the value of a[0] is partially described by the statement 'element [1] is 2'". Those are contradictory statements.
Maybe it would be ok to say:

a[0]=b
b[1]=2

If you're using verbs like "insert" it suggests you haven't smoked enough
swift-crack and are still thinking too imperatively...

> a[1][1] = 2; // invalid because no array inserted into a[1]

One interpretation of this working is that declaring an array int a[][]
means that you have a structure that is indexed by pairs of co-ordinates,
rather than a structure indexed by one co-ordinate containing a bunch of
other structures indexed by the other co-ordinate.

So if you have tuple syntax (x,y) then you are really writing:

a[(1,1)] = 2

which is fine, just like a[(0,1)] = 2 is.

But in that view, what does a[0] mean? Something like a[(0,?)]? Mostly I
think it would mean "select the subset of values that have left
co-ordinate 0": if you're using it for extracting a value, you get a 1d
array out of it. If you're using it for setting values, then:

a[0] = b

would be something like:

foreach i in b { a[0][i] = b[i]; }

Whether this is implemented by copying a pointer to b or by copying
individual values shouldn't affect the end result, as long as it's
implemented properly, and as long as you don't write invalid programs which
are awkward to detect.

In my opinion, your program below is invalid, because you are making two
contradictory declarations of the value of a[0] - one that its entire value
is determined by b, and another that one of its values is 30.

If the above model is how things should work, then really you should get a
runtime error when trying to do this:

> a[0][3] = 30;
> a[0] = b;

and non-deterministic outcome 2 below is incorrect.

but that this:

> a[1][0] = 123;

should work just fine - there is no concept of "declaring" a[1] to exist
separate from using it.

A slightly different way for this to work would be for a[0] = b to
explicitly be a copy of b elements into a[0], which *would* allow you to
assign other entries to other a[0][x] positions, as long as they did not
conflict with elements that came from b.
(the usual "can't assign an element twice" rule). But maybe that would be
less efficient to implement.

Sorry that the above sounds rather disconnected from implementation and
abstract - but you are asking about abstract semantics ;) Feel free to
demand I explain my views more...

Ben

> But a[1][1] = 2 actually succeeds. As far as I can work out, what
> actually happens is:
> - A[i][j] = x;
>     - A[i] is uninitialised, a new array C is created, and x assigned to C[j]
>     - A[i] is initialised with array C, x is assigned to C[j]
> - A[i] = B;
>     - A[i] is uninitialised, A[i] is then a reference to B
>     - A[i] is initialised and points to array C, all members of B are
>       copied to the corresponding index in C
> Is this the intended behaviour? Am I understanding this correctly?
>
> In the swift implementation I tested it on, this leads to some
> nondeterminism:
>
> int a[][];
> int b[] = [1,2,3];
>
> a[0][3] = 30;
> a[0] = b;
> a[1][0] = 123;
>
> trace(a[0][0]);
> trace(a[0][3]);
> trace(a[1][0]);
>
> Nondeterministic outcome 1
> ====================
> SwiftScript trace: 1
> SwiftScript trace: 123
> Execution failed:
>     Invalid path ([3]) for a.[0][]/3
>
> Nondeterministic outcome 2
> ====================
> SwiftScript trace: 1
> SwiftScript trace: 30
> SwiftScript trace: 123
>
> - Tim
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From tim.g.armstrong at gmail.com Fri Nov 18 09:24:08 2011
From: tim.g.armstrong at gmail.com (Tim Armstrong)
Date: Fri, 18 Nov 2011 09:24:08 -0600
Subject: [Swift-devel] Semantics of multidimensional arrays in Swift
In-Reply-To:
References:
Message-ID:

Hi Ben, the way you describe it makes a lot more sense to me (I've spent
the last couple of months teaching Haskell to undergrads so have been in
that mindframe a lot). I was trying to reverse engineer the semantics from
the actual behaviour of the implementation.
Am I correct then in thinking that mixing array-wise and cell-wise
assignment for the same element like below is invalid (or at least not an
intended use case for swift)?

int a[][];
a[0] = b;
a[0][1] = 2

In the 1d array case swift sometimes seems ok with mixing the two, but
sometimes is unhappy.
----------
int a[];
a[2] = 1;
a = f();

(int r[]) f () {
    r[0] = 1;
    r[1] = 2;
}

"Compile error in procedure invocation at line 5: variable a has multiple
writers"
-----
but the following compiles and runs fine:

int a[];
a = f();
a[2] = 1;

- Tim

From benc at hawaga.org.uk Fri Nov 18 09:35:13 2011
From: benc at hawaga.org.uk (Ben Clifford)
Date: Fri, 18 Nov 2011 15:35:13 +0000 (GMT)
Subject: [Swift-devel] Semantics of multidimensional arrays in Swift
In-Reply-To:
References:
Message-ID:

> actual behaviour of the implementation. Am I correct then in thinking
> that mixing array-wise and cell-wise assignment for the same element
> like below is invalid (or at least not an intended use case for swift)?

The in-my-head version of swift says that is invalid. The implementation
appears a little more fickle. Not sure what other people have to say.

I found it very hard to develop static checking of array assignment,
because while I believe a[0]=b;a[0][1]=1; to be invalid, I'm fairly happy
that a[0]=b;a[1][1]=1 is valid, because you are not overlapping definitions
of elements.
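The distinction between overlapping and non-overlapping definitions can be sketched in a few lines of Python. This is only an illustration of the strict reading discussed in this thread, not how the Swift implementation works: the class `Array2D`, its methods `set_row`, `set_cell` and `get`, and the exception `SingleAssignmentError` are hypothetical names invented for the example, and a 1d array is modelled as a plain dict of index to value.

```python
class SingleAssignmentError(Exception):
    """Raised when two definitions of the same element overlap."""


class Array2D:
    """Toy model of a sparse 2d single-assignment array (hypothetical, not Swift's code)."""

    def __init__(self):
        self.cells = {}          # (i, j) -> value; sparse, like a hash table
        self.whole_rows = set()  # rows fully defined by an array-wise assignment

    def set_cell(self, i, j, value):
        """Model of a[i][j] = value."""
        if i in self.whole_rows:
            raise SingleAssignmentError(
                f"a[{i}] is already fully defined by an array-wise assignment")
        if (i, j) in self.cells:
            raise SingleAssignmentError(f"a[{i}][{j}] is already assigned")
        self.cells[(i, j)] = value

    def set_row(self, i, row):
        """Model of a[i] = b, with b given as a dict of index -> value."""
        if i in self.whole_rows or any(r == i for r, _ in self.cells):
            raise SingleAssignmentError(f"a[{i}] already has a definition")
        self.whole_rows.add(i)
        for j, value in row.items():
            self.cells[(i, j)] = value

    def get(self, i, j):
        return self.cells[(i, j)]


b = {0: 1, 1: 2, 2: 3}

a = Array2D()
a.set_row(0, b)          # a[0] = b
a.set_cell(1, 1, 1)      # a[1][1] = 1: accepted, no overlap with a[0]

try:
    a.set_cell(0, 1, 1)  # a[0][1] = 1: overlaps the definition of a[0]
    overlap_rejected = False
except SingleAssignmentError:
    overlap_rejected = True

c = Array2D()
c.set_cell(0, 3, 30)     # a[0][3] = 30
try:
    c.set_row(0, b)      # a[0] = b: contradicts the earlier cell-wise definition
    reverse_rejected = False
except SingleAssignmentError:
    reverse_rejected = True
```

In this sketch the check is made per row, so both orderings of the mixed assignment are rejected at run time. Under the copy-based alternative also mentioned in the thread, `set_row` and `set_cell` would only need to reject indices that are already present in `cells`.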
I'm fairly happy to believe that the multiple writer checking doesn't match
up with my beliefs above... this array assignment syntax is pretty horrible.

From wilde at mcs.anl.gov Fri Nov 18 10:17:35 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 18 Nov 2011 10:17:35 -0600 (CST)
Subject: [Swift-devel] Semantics of multidimensional arrays in Swift
In-Reply-To:
Message-ID: <1659929565.41313.1321633055741.JavaMail.root@zimbra.anl.gov>

I agree with Ben's general sense here, but feel we need to define this more
clearly. And there are a few subtleties that I think are not yet captured
in those examples.
Array behavior can be best understood if you follow these rules:

- arrays are really hash tables (hence they can be sparse)
- multi-dimension arrays are simply nested hash tables
- every element of every array follows single-assignment semantics
- all but the leaf elements of multi-dimension arrays are themselves
  hash tables

I think that the semantics needs to define nesting in terms of references
rather than copies, because if you set an element of an array, that value
should be visible through any variable (or collection slot) that refers to
the array object. At least that's the way Swift behaves in most cases I've
seen and scripts I've written, and that seems useful and (largely)
consistent.

At the moment I think some of our test examples are clouded by bugs,
including non-deterministic behavior, lack of appropriate double-assignment
runtime error messages, Java exceptions, and ill-behaved trace() output
(which is perhaps due to one of the above errors).

I filed a ticket and can pay more attention to this when I return from SC.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wozniak at mcs.anl.gov Fri Nov 18 12:41:24 2011
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Fri, 18 Nov 2011 10:41:24 -0800 (Pacific Standard Time)
Subject: [Swift-devel] Semantics of multidimensional arrays in Swift
In-Reply-To:
References:
Message-ID:

On Fri, 18 Nov 2011, Tim Armstrong wrote:

> I was trying to reverse engineer the semantics from the
> actual behaviour of the implementation.

You may want to look at how DSHandle and its subclasses use Path to find
their members. This is associated with mapping: given a variable and a
Path, you can obtain the individual item.
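To make the path idea concrete, here is a small Python sketch of resolving a path against nested hash tables, following the "arrays are nested hash tables" view from earlier in this thread. It does not reproduce the real DSHandle or Path classes, whose interfaces are not shown here: `resolve` and `assign` are hypothetical helpers, and a path is assumed to be just a tuple of indices.

```python
def resolve(root, path):
    """Follow a tuple of indices through nested dicts and return the item found."""
    node = root
    for index in path:
        if not isinstance(node, dict) or index not in node:
            raise KeyError(f"invalid path {list(path)}: no element at index {index!r}")
        node = node[index]
    return node


def assign(root, path, value):
    """Set the leaf at `path`, creating intermediate tables on demand.

    Each leaf may be written only once (single assignment).
    """
    *parents, leaf = path
    node = root
    for index in parents:
        # An inner table such as a[1] comes into existence on first use.
        node = node.setdefault(index, {})
        if not isinstance(node, dict):
            raise ValueError(f"path {list(path)} passes through a leaf value")
    if leaf in node:
        raise ValueError(f"element at path {list(path)} is already assigned")
    node[leaf] = value


a = {}
assign(a, (0, 0), 1)     # a[0][0] = 1
assign(a, (1, 0), 123)   # a[1][0] = 123, no prior declaration of a[1] needed

print(resolve(a, (1, 0)))   # the individual item at a[1][0]
print(resolve(a, (0,)))     # the inner table that a[0] refers to
```

Because `resolve(a, (0,))` returns the inner table itself rather than a copy, a later `assign` through `a` is visible through that reference, which matches the reference-based nesting argued for above. Asking for an index that was never assigned, for example `resolve(a, (0, 3))`, raises `KeyError`, loosely analogous to the "Invalid path" failure reported earlier in the thread.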
Justin -- Justin M Wozniak From iraicu at cs.iit.edu Sun Nov 20 08:52:58 2011 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sun, 20 Nov 2011 08:52:58 -0600 Subject: [Swift-devel] CFP: CCGrid 2012 deadline extension Message-ID: <4EC9144A.60701@cs.iit.edu> **************** DEADLINE EXTENSION (CCGrid 2012) Must submit paper by November 25, but can update submission until December 2, 2011. **************** 12th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2012) Ottawa, Canada May 13-16, 2012 http://www.cloudbus.org/ccgrid2012 CALL FOR PAPERS Rapid advances in processing, communication and systems/middleware technologies are leading to new paradigms and platforms for computing, ranging from computing Clusters to widely distributed Grid and emerging Clouds. CCGrid is a series of very successful conferences, sponsored by the IEEE Computer Society Technical Committee on Scalable Computing (TCSC) and ACM, with the overarching goal of bringing together international researchers, developers, and users and to provide an international forum to present leading research activities and results on a broad range of topics related to these platforms and paradigms and their applications. The conference features keynotes, technical presentations, posters and research demos, workshops, tutorials, as well as the SCALE challenges featuring live demonstrations. In 2012, CCGrid will come to Canada for the first time and will be held in Ottawa, the capital city. CCGrid 2012 will have a focus on important and immediate issues that are significantly influencing all aspects of cluster, cloud and grid computing. Topics of interest include, but are not limited to: * Applications and Experiences: Applications to real and complex problems in science, engineering, business and society; User studies; Experiences with large-scale deployments systems or applications. * Architecture: System architectures, Design and deployment. 
* Autonomic Computing and Cyberinfrastructure: Self managed behavior, models and technologies; Autonomic paradigms and approaches (control-based, bio-inspired, emergent, etc.); Bio-inspired approaches to management; SLA definition and enforcement. * Performance Modeling and Evaluation: Performance models; Monitoring and evaluation tools, Analysis of system/application performance; Benchmarks and testbeds. * Programming Models, Systems, and Fault-Tolerant Computing: Programming models for cluster, clouds and grid computing; fault tolerant infrastructure and algorithms; systems software to enable efficient computing. * Multicore and Accelerator-based Computing: Software and application techniques to utilize multicore architectures and accelerators/heterogeneous computing systems. * Scheduling and Resource Management: Techniques to schedule jobs and resources on clusters, clouds and grid computing platforms. * Cloud Computing: Cloud architectures; Software tools and techniques for clouds. PAPER SUBMISSION Authors are invited to submit papers electronically. Submitted manuscripts should be structured as technical papers and may not exceed 8 letter size (8.5 x 11) pages including figures, tables and references using the IEEE format for conference proceedings: http://www.computer.org/portal/web/cscps/formatting Submissions not conforming to these guidelines may be returned without review. Authors should submit the manuscript in PDF format and make sure that the file will print on a printer that uses letter size (8.5 x 11) paper. The official language of the meeting is English. All manuscripts will be reviewed and will be judged on correctness, originality, technical strength, significance, quality of presentation, and interest and relevance to the conference attendees. Submitted papers must represent original unpublished research that is not currently under review for any other conference or journal. 
Papers not following these guidelines will be rejected without review and further action may be taken, including (but not limited to) notifications sent to the heads of the institutions of the authors and sponsors of the conference. Submissions received after the due date, exceeding the page limit, or not appropriately structured may not be considered. Authors may contact the conference chairs for more information. The proceedings will be published through the IEEE Computer Society Press, USA and will be made available online through the IEEE Digital Library. Submission Link: https://www.easychair.org/account/signin.cgi?conf=ccgrid2012 JOURNAL SPECIAL ISSUE Highly rated Top 6 papers from the CCGrid 2012 conference will be invited to extend for publication in a special issue of the "Future Generation Computer Systems (FGCS)" Journal published by Elsevier Press. CHAIRS General Chair * Shikharesh Majumdar, Carleton University, Canada Honorary Chair * Geoffrey Fox, Indiana University, USA Program Committee Co-Chairs * Rajkumar Buyya, University of Melbourne, Australia * Pavan Balaji, Argonne National Laboratory, USA Program Committee Vice-chairs * Daniel S. Katz (Applications and Experiences) * Dhabaleswar K. 
Panda (Architecture) * Manish Parashar (Middleware, Autonomic Computing, and Cyberinfrastructure) * Ahmad Afsahi (Performance Modeling and Analysis) * Xian-He Sun (Performance Measurement and Evaluation) * William Gropp (Programming Models, Systems, and Fault-Tolerant computing) * David Bader (Multicore and Accelerator-based Computing) * Thomas Fahringer (Scheduling and Resource Management) * Ignacio Martin Llorente and Madhusudhan Govindaraju (Cloud Computing) Cyber Co-Chairs * Anton Beloglazov, The University of Melbourne, Australia * Suraj Pandey, CSIRO, Australia * Trevor Gelowsky, Carleton University, Canada Workshops Co-Chairs * Marin Litiou, York University, Canada * Mukaddim Pathan, Telstra Corporation Limited, Australia Publicity Chairs * Helen Karatza, Aristotle University of Thessaloniki, Greece * Ioan Raicu, Illinois Institute of Technology& Argonne National Labs, USA * Bruno Schulze, National Laboratory for Scientific Computing, Brazil * G Subrahmanya VRK Rao: Cognizant technology Solutions, India Tutorials Co-Chairs * Sushil K. 
Prasad, Georgia State University, USA * Rob Simmonds, Westgrid, Canada Doctoral Symposium Co-Chairs * Carlos Varela, Rensselaer Polytechnic Institute, USA * Yogesh Simmhan, University of Southern California Poster and Research Demo Co-Chairs * Suraj Pandey, CSIRO, Australia SCALE Challenge Coordinator * Shantenu Jha, Rutgers and Loisiana State University Steering Committee * Henri Bal, Vrije University, The Netherlands * Pavan Balaji, Argonne National Laboratory, USA * Rajkumar Buyya, University of Melbourne, Australia (Chair) * Franck Capello, University of Paris-Sud, France * Jack Dongarra, University of Tennessee& ORNL, USA * Dick Epema, Technical University of Delft, The Netherlands * Thomas Fahringer, University of Innsbruck, Austria * Ian Foster, University of Chicago, USA * Wolfgang Gentzsch, DEISA, Germany * Hai Jin, Huazhong University of Science& Technology, China * Craig Lee, The Aerospace Corporation, USA (Co-Chair) * Laurent Lefevre, INRIA, France * Geng Lin, Dell Inc., USA * Manish Parashar, Rutgers: The State University of New Jersey, USA * Shikharesh Majumdar, Carleton University, Canada * Satoshi Matsuoaka, Tokyo Institute of Technology, Japan * Omer Rana, Cardiff University, UK * Paul Roe, Queensland University of Technology, Australia * Bruno Schulze, LNCC, Brazil * Nalini Venkatasubramanian, University of California, USA * Carlos Varela, Rensselaer Polytechnic Institute, USA IMPORTANT DATES Papers Due: 25 November 2011 Papers Final Revisions: 02 December 2011 Notification of Acceptance: 30 January 2012 Camera Ready Papers Due: 27 February 2012 Sponsors: IEEE Computer Society (TCSE)& ACM SIGARCH (approval pending) -- ================================================================= Ioan Raicu, Ph.D. 
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From ketancmaheshwari at gmail.com Sun Nov 20 11:02:02 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Sun, 20 Nov 2011 11:02:02 -0600 Subject: [Swift-devel] Error while parsing XML Message-ID: I am seeing this error message at runtime on the SCEC workflow: Progress: time: Sun, 20 Nov 2011 10:20:51 -0600 Active:2 Finished successfully:4 [Fatal Error] :-1:-1: Premature end of file. org.xml.sax.SAXParseException: Premature end of file. 
at org.apache.xerces.parsers.DOMParser.parse(Unknown Source) at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source) at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) at org.globus.cog.abstraction.impl.scheduler.sge.QueuePoller.processStdout(QueuePoller.java:191) at org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.pollQueue(AbstractQueuePoller.java:170) at org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.run(AbstractQueuePoller.java:82) at java.lang.Thread.run(Thread.java:619) Progress: time: Sun, 20 Nov 2011 10:21:02 -0600 Active:1 Finished successfully:4 Failed but can retry:1 Exception in simsgt: Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, processed/TEST.fdloc, processed/TEST.cordfile, x] Host: SGE Directory: presgt-20111120-0852-n6cze3l8/jobs/d/simsgt-dlogn1jk stderr.txt: stdout.txt: ---- Caused by: java.io.IOException: Error while parsing XML Exception in simsgt: Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, processed/TEST.fdloc, processed/TEST.cordfile, y] Host: SGE Directory: presgt-20111120-0852-n6cze3l8/jobs/e/simsgt-elogn1jk stderr.txt: stdout.txt: ---- Caused by: java.io.IOException: Error while parsing XML Final status: time: Sun, 20 Nov 2011 10:21:02 -0600 Failed:2 Finished successfully:4 The following errors have occurred: 1. java.io.IOException: Error while parsing XML (2 times) The log is: http://www.mcs.anl.gov/~ketan/presgt-20111120-0852-n6cze3l8.log It shows the same error message at the end of the log. Does anyone else see this? -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From davidk at ci.uchicago.edu Sun Nov 20 16:05:12 2011 From: davidk at ci.uchicago.edu (David Kelly) Date: Sun, 20 Nov 2011 16:05:12 -0600 (CST) Subject: [Swift-devel] Error while parsing XML In-Reply-To: Message-ID: <186446546.25524.1321826712551.JavaMail.root@zimbra-mb2.anl.gov> I haven't run into this one yet, but I will take a look at this and the other SGE issues tomorrow. Does this happen every time? David ----- Original Message ----- > From: "Ketan Maheshwari" > To: "Swift Devel" > Sent: Sunday, November 20, 2011 11:02:02 AM > Subject: [Swift-devel] Error while parsing XML > I am seeing this error message at runtime on the SCEC workflow: > > > > Progress: time: Sun, 20 Nov 2011 10:20:51 -0600 Active:2 Finished > successfully:4 > [Fatal Error] :-1:-1: Premature end of file. > org.xml.sax.SAXParseException: Premature end of file. > at org.apache.xerces.parsers.DOMParser.parse(Unknown Source) > at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source) > at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) > at > org.globus.cog.abstraction.impl.scheduler.sge.QueuePoller.processStdout(QueuePoller.java:191) > at > org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.pollQueue(AbstractQueuePoller.java:170) > at > org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.run(AbstractQueuePoller.java:82) > at java.lang.Thread.run(Thread.java:619) > Progress: time: Sun, 20 Nov 2011 10:21:02 -0600 Active:1 Finished > successfully:4 Failed but can retry:1 > Exception in simsgt: > Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, > processed/TEST.fdloc, processed/TEST.cordfile, x] > Host: SGE > Directory: presgt-20111120-0852-n6cze3l8/jobs/d/simsgt-dlogn1jk > stderr.txt: > stdout.txt: > > > ---- > > > Caused by: java.io.IOException: Error while parsing XML > > > > > Exception in simsgt: > Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, > processed/TEST.fdloc, 
processed/TEST.cordfile, y] > Host: SGE > Directory: presgt-20111120-0852-n6cze3l8/jobs/e/simsgt-elogn1jk > stderr.txt: > stdout.txt: > > > ---- > > > Caused by: java.io.IOException: Error while parsing XML > > > > > Final status: time: Sun, 20 Nov 2011 10:21:02 -0600 Failed:2 Finished > successfully:4 > The following errors have occurred: > 1. java.io.IOException: Error while parsing XML (2 times) > > > > > The log is: > http://www.mcs.anl.gov/~ketan/presgt-20111120-0852-n6cze3l8.log > > > It shows the same error message at the end of the log. Does anyone > else see this? > > > > -- > Ketan > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From ketancmaheshwari at gmail.com Mon Nov 21 08:35:41 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 21 Nov 2011 08:35:41 -0600 Subject: [Swift-devel] Error while parsing XML In-Reply-To: <186446546.25524.1321826712551.JavaMail.root@zimbra-mb2.anl.gov> References: <186446546.25524.1321826712551.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: David, Yes, I see this happening every time more than one task is active. I searched the web for this error, and this page seems to be relevant: http://www.danielschneller.com/2008/01/saxparseexception-1-1-premature-end-of.html It seems that the error occurs when the XML stream is reused for probing the job status. The article above suggests making a copy of each task-status stream and reading from the copy. If you are around today, let's look into it together. Regards, Ketan On Sun, Nov 20, 2011 at 4:05 PM, David Kelly wrote: > > I haven't run into this one yet, but I will take a look at this and the > other SGE issues tomorrow. Does this happen every time?
> > David > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Swift Devel" > > Sent: Sunday, November 20, 2011 11:02:02 AM > > Subject: [Swift-devel] Error while parsing XML > > I am seeing this error message at runtime on the SCEC workflow: > > > > > > > > Progress: time: Sun, 20 Nov 2011 10:20:51 -0600 Active:2 Finished > > successfully:4 > > [Fatal Error] :-1:-1: Premature end of file. > > org.xml.sax.SAXParseException: Premature end of file. > > at org.apache.xerces.parsers.DOMParser.parse(Unknown Source) > > at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source) > > at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) > > at > > > org.globus.cog.abstraction.impl.scheduler.sge.QueuePoller.processStdout(QueuePoller.java:191) > > at > > > org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.pollQueue(AbstractQueuePoller.java:170) > > at > > > org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.run(AbstractQueuePoller.java:82) > > at java.lang.Thread.run(Thread.java:619) > > Progress: time: Sun, 20 Nov 2011 10:21:02 -0600 Active:1 Finished > > successfully:4 Failed but can retry:1 > > Exception in simsgt: > > Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, > > processed/TEST.fdloc, processed/TEST.cordfile, x] > > Host: SGE > > Directory: presgt-20111120-0852-n6cze3l8/jobs/d/simsgt-dlogn1jk > > stderr.txt: > > stdout.txt: > > > > > > ---- > > > > > > Caused by: java.io.IOException: Error while parsing XML > > > > > > > > > > Exception in simsgt: > > Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, > > processed/TEST.fdloc, processed/TEST.cordfile, y] > > Host: SGE > > Directory: presgt-20111120-0852-n6cze3l8/jobs/e/simsgt-elogn1jk > > stderr.txt: > > stdout.txt: > > > > > > ---- > > > > > > Caused by: java.io.IOException: Error while parsing XML > > > > > > > > > > Final status: time: Sun, 20 Nov 2011 10:21:02 -0600 Failed:2 Finished > > 
successfully:4 > > The following errors have occurred: > > 1. java.io.IOException: Error while parsing XML (2 times) > > > > > > > > > > The log is: > > http://www.mcs.anl.gov/~ketan/presgt-20111120-0852-n6cze3l8.log > > > > > > It shows the same error message at the end of the log. Does anyone > > else see this? > > > > > > > > -- > > Ketan > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From skenny at uchicago.edu Mon Nov 21 18:01:22 2011 From: skenny at uchicago.edu (Sarah Kenny) Date: Mon, 21 Nov 2011 16:01:22 -0800 Subject: [Swift-devel] [Swift-user] gram on ranger In-Reply-To: <1349889700.12818.1321084389586.JavaMail.root@zimbra-mb2.anl.gov> References: <1349889700.12818.1321084389586.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: hi david, she reran and apparently got the same error. the log file is in /home/skenny/swift_logs/corr_multisubj-20111116-1131-dqy537b3.log ~sk On Fri, Nov 11, 2011 at 11:53 PM, David Kelly wrote: > Sarah, > > I just submitted a fix that might help. There was an issue with the > provider not always correctly detecting when the job was completed. The fix > is in the 0.93 source. Can you give it a try and let me know if you still > see any issues? Thanks. > > David > > > ----- Original Message ----- > > From: "Sarah Kenny" > > To: "Justin M Wozniak" > > Cc: "David Kelly" , "Swift Devel" < > swift-devel at ci.uchicago.edu>, "Anjali Raja" > > > > Sent: Tuesday, November 8, 2011 4:36:42 PM > > Subject: Re: [Swift-devel] [Swift-user] gram on ranger > > thought i'd revisit this since anjali re-ran this workflow with fewer > > jobs (~85K) and perhaps the info would be useful. 
it showed a similar > > pattern in that it finished all jobs but one (that is, we were missing > > a single output file) and hung indefinitely on the last 'finished > > successfully...' > > > > so this discussion seems to have turned mostly to how coasters > > requests cores. however, i have to say that *generally* in the past > > when swift/coasters has requested too many cores for the given queue > > gram complains and you see it in the gram log, which is not the case > > here. > > > > that said, if you want em: the swift log is in /home/skenny/swift_logs > > on ci and the coaster log was too big for my home on ci (and has since > > been appended to so make sure to match the dates with the swift log), > > but if someone has access to ranger it's in /var/tmp/skenny_swift on > > login3 > > > > we're continuing to use the same swift version and sites file since > > it's at least helping us push thru much of the work (doing manual > > resumes/restarts). > > > > ~sk > > > > > > On Fri, Oct 28, 2011 at 11:02 AM, Justin M Wozniak < > > wozniak at mcs.anl.gov > wrote: > > > > > > > > I think count is the number of processes. PBSExecutor uses it, that > > may > > be a good place to look. In the Coasters context, I think it is the > > number of invocations of worker.pl . > > > > > > > > > > On Fri, 28 Oct 2011, David Kelly wrote: > > > > > Just to clarify - when coasters is being used, count represents the > > > number of coaster blocks? Then to get the number of cores to > > > request, I > > > should use count*workersPerNode? > > > > > > What about in the case where coasters is not used? 
> > > > > > ----- Original Message ----- > > >> From: "Mihael Hategan" < hategan at mcs.anl.gov > > > >> To: "David Kelly" < davidk at ci.uchicago.edu > > > >> Cc: "Anjali Raja" < anjraja at gmail.com >, "Swift Devel" < > > >> swift-devel at ci.uchicago.edu >, "Swift User" > > >> < swift-user at ci.uchicago.edu >, "Ketan Maheshwari" < > > >> ketancmaheshwari at gmail.com > > > >> Sent: Thursday, October 20, 2011 9:08:46 PM > > >> Subject: Re: [Swift-devel] [Swift-user] gram on ranger > > >> On Thu, 2011-10-20 at 21:03 -0500, David Kelly wrote: > > >>> Yep, this is using coasters > > >>> > > >> > > >> Then no. Count is whatever the block allocation algorithm decides > > >> it > > >> should be. > > >> > > >>>>> > > >>>>> Should count=32 in the second case? Am I misunderstanding what > > >>>>> 'count' is? Is there any way to get the exact number of > > >>>>> applications? > > >>>> > > >>>> Coasters? > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > -- > > Justin M Wozniak > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > -- > > Sarah Kenny > > Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III > > University of California Irvine, Dept. of Neurology ~ 773-818-8300 > -- Sarah Kenny Programmer ~ Brain Circuits Laboratory ~ Rm 2224 Bio Sci III University of California Irvine, Dept. of Neurology ~ 773-818-8300 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wilde at mcs.anl.gov Tue Nov 22 12:05:55 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 22 Nov 2011 12:05:55 -0600 (CST) Subject: [Swift-devel] arrays of files In-Reply-To: <4FE1A950-DA9F-43B2-80F1-ADAC08BFF476@mcs.anl.gov> Message-ID: <427723534.50745.1321985155467.JavaMail.root@zimbra.anl.gov> Yup, thanks, Mark - it's a bug. If you change the body of bar to read: ls "-lt" @filename(files[0]) @filename(files[1]) stdout=@result; instead of: ls "-lt" @filenames(files) stdout=@result; then it works. I'll file a Bugzilla ticket on this. David, can you add this case to the test suite? Mihael, can you fix it? Thanks, - Mike ----- Original Message ----- > From: "Mark Hereld" > To: "Michael Wilde" > Cc: "Thomas D. Uram" > Sent: Tuesday, November 22, 2011 11:48:17 AM > Subject: arrays of files > Mike, > > > this swift seems to compile without complaint but breaks down > with a null pointer at the last line (invocation of "bar"). i think > i'm > confused about arrays of file objects. ideas?
> > > > > > type file; app ( file result ) foo ( string args[] ) { ls "-lt" args > stdout=@result; } app ( file result ) bar ( file files[] ) { ls "-lt" > @filenames(files) stdout=@result; } file out <"dirlisting.txt">; out = > foo([".",".."]); file out2 <"nother.txt">; out2 = foo(["/"]); file > out3 <"proof.txt">; out3 = bar([out,out2]); > > > > > > > > > > > > > ------------------------------------------------------- > Mark Hereld < hereld at mcs.anl.gov > > Senior Fellow - Computation Institute > Experimental Systems Engineer - Mathematics and Computer Science > Visualization and Analysis Lead - Argonne Leadership Computing > Facility > Argonne National Laboratory > The University of Chicago > > > Cell: 630.327.2088 > Voice: 630.252.4170 -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wilde at mcs.anl.gov Tue Nov 22 14:13:26 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 22 Nov 2011 14:13:26 -0600 (CST) Subject: [Swift-devel] arrays of files In-Reply-To: Message-ID: <1793826491.51609.1321992806080.JavaMail.root@zimbra.anl.gov> Replace your prior implementation of bar() with this, and you then have a real array solution (a workaround) until the @filenames() bug is fixed: app ( file result ) barapp ( file files[], string s[]) { ls "-lt" s stdout=@result; } (file result) bar (file files[] ) { string s[]; foreach f, i in files { s[i] = @filename(f); } result = barapp( files, s ); } - Mike ----- Original Message ----- > From: "Mark Hereld" > To: "Michael Wilde" > Cc: "swift-devel" , "Thomas D. Uram" > Sent: Tuesday, November 22, 2011 1:07:32 PM > Subject: Re: arrays of files > thanks mike! only interested in the array solution. collecting > generic patterns that can be easily reused for wide number of > situations. gpsi. -- mark > > > > On Nov 22, 2011, at 12:05 PM, Michael Wilde wrote: > > > > Yup, thanks, Mark - its a bug.
> > if you change the body of bar to read: > > ls "-lt" @filename(files[0]) @filename(files[1]) stdout=@result; > > instead of: > > ls "-lt" @filenames(files) stdout=@result; > > then it works. > > I'll file a bugzilla ticket on this. David, can you add to the test > suite? Mihael, can you fix it? > > Thanks, > > - Mike > > > ----- Original Message ----- > > > From: "Mark Hereld" < hereld at mcs.anl.gov > > > > To: "Michael Wilde" < wilde at mcs.anl.gov > > > > Cc: "Thomas D. Uram" < turam at mcs.anl.gov > > > > Sent: Tuesday, November 22, 2011 11:48:17 AM > > > Subject: arrays of files > > > Mike, > > > > > > > > > this swift seems to compile without complaint but breaks down > > > with a null pointer at the last line (invocation of "bar"). i think > > > i'm > > > confused about arrays of file objects. ideas? > > > > > > > > > > > > > > > > > > > > > type file; app ( file result ) foo ( string args[] ) { ls "-lt" args > > > stdout=@result; } app ( file result ) bar ( file files[] ) { ls "-lt" > > > @filenames(files) stdout=@result; } file out <"dirlisting.txt">; out = > > > foo([".",".."]); file out2 <"nother.txt">; out2 = foo(["/"]); file > > > out3 <"proof.txt">; out3 = bar([out,out2]); > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ------------------------------------------------------- > > > Mark Hereld < hereld at mcs.anl.gov > > > > Senior Fellow - Computation Institute > > > Experimental Systems Engineer - Mathematics and Computer Science > > > Visualization and Analysis Lead - Argonne Leadership Computing > > > Facility > > > Argonne National Laboratory > > > The University of Chicago > > > > > > > > > Cell: 630.327.2088 > > > Voice: 630.252.4170 > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > > > > > > > > > > ------------------------------------------------------- > Mark Hereld < hereld at mcs.anl.gov > > Senior Fellow - 
Computation Institute > Experimental Systems Engineer - Mathematics and Computer Science > Visualization and Analysis Lead - Argonne Leadership Computing > Facility > Argonne National Laboratory > The University of Chicago > > > Cell: 630.327.2088 > Voice: 630.252.4170 -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hereld at mcs.anl.gov Tue Nov 22 14:15:58 2011 From: hereld at mcs.anl.gov (Mark Hereld) Date: Tue, 22 Nov 2011 14:15:58 -0600 Subject: [Swift-devel] arrays of files In-Reply-To: <1793826491.51609.1321992806080.JavaMail.root@zimbra.anl.gov> References: <1793826491.51609.1321992806080.JavaMail.root@zimbra.anl.gov> Message-ID: oh! that's good! thanks. -- mark On Nov 22, 2011, at 2:13 PM, Michael Wilde wrote: > Replace your prior implementation of bar() with this, and you then have a reall array solution (workaround) until the @filenames() bug is fixed: > > app ( file result ) barapp ( file files[], string s[]) > { > ls "-lt" s stdout=@result; > } > > (file result) bar (file files[] ) > { > string s[]; > foreach f, i in files { > s[i] = @filename(f); > } > result = barapp( files, s ); > } > > - Mike > > ----- Original Message ----- >> From: "Mark Hereld" >> To: "Michael Wilde" >> Cc: "swift-devel" , "Thomas D. Uram" >> Sent: Tuesday, November 22, 2011 1:07:32 PM >> Subject: Re: arrays of files >> thanks mike! only interested in the array solution. collecting >> generic patterns that can be easily reused for wide number of >> situations. gpsi. -- mark >> >> >> >> On Nov 22, 2011, at 12:05 PM, Michael Wilde wrote: >> >> >> >> Yup, thanks, Mark - its a bug. >> >> if you change the body of bar to read: >> >> ls "-lt" @filename(files[0]) @filename(files[1]) stdout=@result; >> >> instead of: >> >> ls "-lt" @filenames(files) stdout=@result; >> >> then it works. >> >> I'll file a bugzilla ticket on this. David, can you add to the test >> suite? Mihael, can you fix it? 
>> >> Thanks, >> >> - Mike >> >> >> ----- Original Message ----- >> >> >> From: "Mark Hereld" < hereld at mcs.anl.gov > >> >> >> To: "Michael Wilde" < wilde at mcs.anl.gov > >> >> >> Cc: "Thomas D. Uram" < turam at mcs.anl.gov > >> >> >> Sent: Tuesday, November 22, 2011 11:48:17 AM >> >> >> Subject: arrays of files >> >> >> Mike, >> >> >> >> >> >> >> >> >> this swift seems to compile without complaint but breaks down >> >> >> with a null pointer at the last line (invocation of "bar"). i think >> >> >> i'm >> >> >> confused about arrays of file objects. ideas? >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> type file; app ( file result ) foo ( string args[] ) { ls "-lt" args >> >> >> stdout=@result; } app ( file result ) bar ( file files[] ) { ls "-lt" >> >> >> @filenames(files) stdout=@result; } file out <"dirlisting.txt">; out = >> >> >> foo([".",".."]); file out2 <"nother.txt">; out2 = foo(["/"]); file >> >> >> out3 <"proof.txt">; out3 = bar([out,out2]); >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> >> ------------------------------------------------------- >> >> >> Mark Hereld < hereld at mcs.anl.gov > >> >> >> Senior Fellow - Computation Institute >> >> >> Experimental Systems Engineer - Mathematics and Computer Science >> >> >> Visualization and Analysis Lead - Argonne Leadership Computing >> >> >> Facility >> >> >> Argonne National Laboratory >> >> >> The University of Chicago >> >> >> >> >> >> >> >> >> Cell: 630.327.2088 >> >> >> Voice: 630.252.4170 >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> >> >> >> >> >> >> >> >> >> >> ------------------------------------------------------- >> Mark Hereld < hereld at mcs.anl.gov > >> Senior Fellow - Computation Institute >> Experimental Systems Engineer - Mathematics and Computer Science >> Visualization and Analysis Lead - Argonne 
Leadership Computing >> Facility >> Argonne National Laboratory >> The University of Chicago >> >> >> Cell: 630.327.2088 >> Voice: 630.252.4170 > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > ------------------------------------------------------- Mark Hereld Senior Fellow - Computation Institute Experimental Systems Engineer - Mathematics and Computer Science Visualization and Analysis Lead - Argonne Leadership Computing Facility Argonne National Laboratory The University of Chicago Cell: 630.327.2088 Voice: 630.252.4170 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Tue Nov 22 19:27:17 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Tue, 22 Nov 2011 19:27:17 -0600 Subject: [Swift-devel] Error while parsing XML In-Reply-To: References: <186446546.25524.1321826712551.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: David, I tested your fix and am getting another kind of error message which seems related to the previous issue: Progress: time: Tue, 22 Nov 2011 18:35:27 -0600 Submitted:1 Active:1 Finished successfully:4 Progress: time: Tue, 22 Nov 2011 18:35:57 -0600 Submitted:1 Active:1 Finished successfully:4 Progress: time: Tue, 22 Nov 2011 18:36:01 -0600 Active:2 Finished successfully:4 Failed to transfer wrapper log for job simsgt-ef6si5jk Exception in simsgt: Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, processed/TEST.fdloc, processed/TEST.cordfile, y] Host: SGE Directory: presgt-20111122-1716-alcaldyb/jobs/e/simsgt-ef6si5jk stderr.txt: stdout.txt: ---- Caused by: java.util.NoSuchElementException Exception in simsgt: Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, processed/TEST.fdloc, processed/TEST.cordfile, x] Host: SGE Directory: presgt-20111122-1716-alcaldyb/jobs/f/simsgt-ff6si5jk stderr.txt: stdout.txt: ---- Caused by: 
java.util.NoSuchElementException Final status: time: Tue, 22 Nov 2011 18:36:01 -0600 Initializing:2 Failed:2 Finished successfully:4 The following errors have occurred: 1. Application mergesgt not executed due to errors in dependencies (2 times) 2. java.util.NoSuchElementException (2 times) One thing to note here is that this error occurred as soon as both of the Swift tasks became active. The complete log is here: http://www.mcs.anl.gov/~ketan/presgt-20111122-1303-btq6sng1.log Regards, Ketan On Mon, Nov 21, 2011 at 8:35 AM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > David, > > Yes, I see this happening everytime more than one task is active. I looked > up the web about this and this page seems to be relavant: > > > http://www.danielschneller.com/2008/01/saxparseexception-1-1-premature-end-of.html > > It seems that while reusing the xml stream for probing the job status, the > error occurs. The above article suggests to make a copy of each task-status > stream and read from it. > > If you are around today, let's look into it together. > > Regards, > Ketan > > On Sun, Nov 20, 2011 at 4:05 PM, David Kelly wrote: > >> >> I haven't run into this one yet, but I will take a look at this and the >> other SGE issues tomorrow. Does this happen every time? >> >> David >> >> ----- Original Message ----- >> > From: "Ketan Maheshwari" >> > To: "Swift Devel" >> > Sent: Sunday, November 20, 2011 11:02:02 AM >> > Subject: [Swift-devel] Error while parsing XML >> > I am seeing this error message at runtime on the SCEC workflow: >> > >> > >> > >> > Progress: time: Sun, 20 Nov 2011 10:20:51 -0600 Active:2 Finished >> > successfully:4 >> > [Fatal Error] :-1:-1: Premature end of file. >> > org.xml.sax.SAXParseException: Premature end of file.
>> > at org.apache.xerces.parsers.DOMParser.parse(Unknown Source) >> > at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source) >> > at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124) >> > at >> > >> org.globus.cog.abstraction.impl.scheduler.sge.QueuePoller.processStdout(QueuePoller.java:191) >> > at >> > >> org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.pollQueue(AbstractQueuePoller.java:170) >> > at >> > >> org.globus.cog.abstraction.impl.scheduler.common.AbstractQueuePoller.run(AbstractQueuePoller.java:82) >> > at java.lang.Thread.run(Thread.java:619) >> > Progress: time: Sun, 20 Nov 2011 10:21:02 -0600 Active:1 Finished >> > successfully:4 Failed but can retry:1 >> > Exception in simsgt: >> > Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, >> > processed/TEST.fdloc, processed/TEST.cordfile, x] >> > Host: SGE >> > Directory: presgt-20111120-0852-n6cze3l8/jobs/d/simsgt-dlogn1jk >> > stderr.txt: >> > stdout.txt: >> > >> > >> > ---- >> > >> > >> > Caused by: java.io.IOException: Error while parsing XML >> > >> > >> > >> > >> > Exception in simsgt: >> > Arguments: [TEST, processed/gridout_TEST, processed/TEST.modelbox, >> > processed/TEST.fdloc, processed/TEST.cordfile, y] >> > Host: SGE >> > Directory: presgt-20111120-0852-n6cze3l8/jobs/e/simsgt-elogn1jk >> > stderr.txt: >> > stdout.txt: >> > >> > >> > ---- >> > >> > >> > Caused by: java.io.IOException: Error while parsing XML >> > >> > >> > >> > >> > Final status: time: Sun, 20 Nov 2011 10:21:02 -0600 Failed:2 Finished >> > successfully:4 >> > The following errors have occurred: >> > 1. java.io.IOException: Error while parsing XML (2 times) >> > >> > >> > >> > >> > The log is: >> > http://www.mcs.anl.gov/~ketan/presgt-20111120-0852-n6cze3l8.log >> > >> > >> > It shows the same error message at the end of the log. Does anyone >> > else see this? 
>> > >> > >> > >> > -- >> > Ketan >> > >> > >> > >> > _______________________________________________ >> > Swift-devel mailing list >> > Swift-devel at ci.uchicago.edu >> > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > > > -- > Ketan > > > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From lgadelha at lncc.br Wed Nov 23 12:27:03 2011 From: lgadelha at lncc.br (Luiz Gadelha) Date: Wed, 23 Nov 2011 16:27:03 -0200 Subject: [Swift-devel] SGE's qstat output parsing failure Message-ID: Hi, There is a small bug in the parsing of SGE's qstat output lines. For the following qstat output: job-ID prior name user state submit/start at queue slots ja-task-ID ----------------------------------------------------------------------------------------------------------------- 543309 0.50500 null lgadelha r 11/22/2011 14:40:38 linux.q at cmm11bl5.hpc.lncc.br 1 543310 0.50500 null lgadelha r 11/22/2011 14:40:38 linux.q at cmm11bl5.hpc.lncc.br 1 ... QueuePoller incorrectly assumes that the job identifier starts at the same column as "job-ID" in the header. That is not the case above: it finds a space character at that position and returns an empty string as the job identifier, and Swift then fails with a "Failed to parse qstat line" message. The attached log file is from a run of a simple test script that shows this problem. The attached patch fixed the problem in my environment. Regards, Luiz -- Luiz Gadelha http://www.lncc.br/~lgadelha -------------- next part -------------- A non-text attachment was scrubbed...
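Note on the report above: the failure mode described, reading the job identifier at the header's column offset, can be avoided by splitting each row on whitespace. The following is a minimal sketch of that idea in Python. It is not the contents of the attached sge_fix.diff (which is not reproduced in this archive) and not the QueuePoller Java code; the function name parse_qstat_line is hypothetical. It assumes the job id is the first field and the state is the fifth, as in the output quoted above, and that the job name contains no spaces. The leading space on the sample rows is an assumption that mirrors the column shift described in the message.

```python
def parse_qstat_line(line):
    """Return (job_id, state) for a job row, or None for any other line.

    Hypothetical helper: fields are split on runs of whitespace, so the
    result does not depend on the column at which the job id starts.
    """
    fields = line.split()
    # Header, separator and blank lines do not begin with a numeric job id.
    if len(fields) < 5 or not fields[0].isdigit():
        return None
    return fields[0], fields[4]


# Rows taken from the report above (host names appear as archived).
sample = """job-ID prior name user state submit/start at queue slots ja-task-ID
-----------------------------------------------------------------
 543309 0.50500 null lgadelha r 11/22/2011 14:40:38 linux.q at cmm11bl5.hpc.lncc.br 1
 543310 0.50500 null lgadelha r 11/22/2011 14:40:38 linux.q at cmm11bl5.hpc.lncc.br 1
"""

jobs = [row for row in map(parse_qstat_line, sample.splitlines()) if row is not None]
```

Whitespace splitting is still sensitive to changes in the column order between qstat versions, so it only addresses the offset problem reported here.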
Name: sge_fix.diff Type: application/octet-stream Size: 3848 bytes Desc: not available URL: -------------- next part -------------- From lgadelha at lncc.br Wed Nov 23 12:46:25 2011 From: lgadelha at lncc.br (Luiz Gadelha) Date: Wed, 23 Nov 2011 16:46:25 -0200 Subject: [Swift-devel] SGE's qstat output parsing failure In-Reply-To: References: Message-ID: <63526224-3071-495B-8944-01ECC8DCC038@lncc.br> I attached the qstat output as a text file; it's easier to see its format there than in the previous message body. On Nov 23, 2011, at 4:27 PM, Luiz Gadelha wrote: > Hi, > > There is a small bug in the parsing of SGE's qstat output lines. For the following qstat output: > > job-ID prior name user state submit/start at queue slots ja-task-ID > ----------------------------------------------------------------------------------------------------------------- > 543309 0.50500 null lgadelha r 11/22/2011 14:40:38 linux.q at cmm11bl5.hpc.lncc.br 1 > 543310 0.50500 null lgadelha r 11/22/2011 14:40:38 linux.q at cmm11bl5.hpc.lncc.br 1 > ... > > QueuePoller incorrectly assumes that the job identifier is in the same position as "job-ID" in the header, which is not the case above, where it finds a space character and returns an empty string as the job identifier, Swift fails with a "Failed to parse qstat line" message. The attached log file is from a simple test script run that presents this problem. > > The attached patch fixed the problem in my environment. > > Regards, > > Luiz > > -- > Luiz Gadelha > http://www.lncc.br/~lgadelha > > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Luiz Gadelha http://www.lncc.br/~lgadelha -------------- next part -------------- An embedded and charset-unspecified text was scrubbed...
Name: qstat_output.txt URL: -------------- next part -------------- From davidk at ci.uchicago.edu Wed Nov 23 12:59:49 2011 From: davidk at ci.uchicago.edu (David Kelly) Date: Wed, 23 Nov 2011 12:59:49 -0600 (CST) Subject: [Swift-devel] SGE's qstat output parsing failure In-Reply-To: Message-ID: <1776167193.30586.1322074789188.JavaMail.root@zimbra-mb2.anl.gov> Thanks Luiz. I've seen some similar issues with 0.92 when parsing qstat output. qstat output tends to change between versions which can cause parsing problems like this. For 0.93, we will be using the XML output to help standardize things a bit. Regards, David ----- Original Message ----- > From: "Luiz Gadelha" > To: "Swift Devel" > Sent: Wednesday, November 23, 2011 12:27:03 PM > Subject: [Swift-devel] SGE's qstat output parsing failure > Hi, > > There is a small bug in the parsing of SGE's qstat output lines. For > the following qstat output: > > job-ID prior name user state submit/start at queue slots ja-task-ID > ----------------------------------------------------------------------------------------------------------------- > 543309 0.50500 null lgadelha r 11/22/2011 14:40:38 > linux.q at cmm11bl5.hpc.lncc.br 1 > 543310 0.50500 null lgadelha r 11/22/2011 14:40:38 > linux.q at cmm11bl5.hpc.lncc.br 1 > ... > > QueuePoller incorrectly assumes that the job identifier is in the same > position as "job-ID" in the header, which is not the case above, where > it finds a space character and returns an empty string as the job > identifier, Swift fails with a "Failed to parse qstat line" message. > The attached log file is from a simple test script run that presents > this problem. > > The attached patch fixed the problem in my environment. 
> > Regards, > > Luiz > > -- > Luiz Gadelha > http://www.lncc.br/~lgadelha > > > > > > > > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From wilde at mcs.anl.gov Wed Nov 23 14:43:24 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 23 Nov 2011 14:43:24 -0600 (CST) Subject: [Swift-devel] SGE's qstat output parsing failure In-Reply-To: <1776167193.30586.1322074789188.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <1459175445.55020.1322081004980.JavaMail.root@zimbra.anl.gov> Luiz, I wonder: can you switch to using the latest release candidate (RC) of 0.93? Or better yet, build 0.93 from the latest rev in the 0.93 SVN branch, to pick up David's latest fixes to the SGE provider? David is currently working on fixing problems that are affecting us and other users on Ranger (an SGE machine at TACC), and it would be good if we could verify that these latest fixes to the SGE provider work for your installation as well. Thanks, - Mike ----- Original Message ----- > From: "David Kelly" > To: "Luiz Gadelha" > Cc: "Swift Devel" > Sent: Wednesday, November 23, 2011 12:59:49 PM > Subject: Re: [Swift-devel] SGE's qstat output parsing failure > Thanks Luiz. I've seen some similar issues with 0.92 when parsing > qstat output. qstat output tends to change between versions which can > cause parsing problems like this. For 0.93, we will be using the XML > output to help standardize things a bit. > > Regards, > David > > ----- Original Message ----- > > From: "Luiz Gadelha" > > To: "Swift Devel" > > Sent: Wednesday, November 23, 2011 12:27:03 PM > > Subject: [Swift-devel] SGE's qstat output parsing failure > > Hi, > > > > There is a small bug in the parsing of SGE's qstat output lines.
For > > the following qstat output: > > > > job-ID prior name user state submit/start at queue slots ja-task-ID > > ----------------------------------------------------------------------------------------------------------------- > > 543309 0.50500 null lgadelha r 11/22/2011 14:40:38 > > linux.q at cmm11bl5.hpc.lncc.br 1 > > 543310 0.50500 null lgadelha r 11/22/2011 14:40:38 > > linux.q at cmm11bl5.hpc.lncc.br 1 > > ... > > > > QueuePoller incorrectly assumes that the job identifier is in the > > same > > position as "job-ID" in the header, which is not the case above, > > where > > it finds a space character and returns an empty string as the job > > identifier, Swift fails with a "Failed to parse qstat line" message. > > The attached log file is from a simple test script run that presents > > this problem. > > > > The attached patch fixed the problem in my environment. > > > > Regards, > > > > Luiz > > > > -- > > Luiz Gadelha > > http://www.lncc.br/~lgadelha > > > > > > > > > > > > > > > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From lgadelha at lncc.br Thu Nov 24 04:53:10 2011 From: lgadelha at lncc.br (Luiz Gadelha) Date: Thu, 24 Nov 2011 08:53:10 -0200 Subject: [Swift-devel] SGE's qstat output parsing failure In-Reply-To: <1459175445.55020.1322081004980.JavaMail.root@zimbra.anl.gov> References: <1459175445.55020.1322081004980.JavaMail.root@zimbra.anl.gov> Message-ID: It worked fine with the 0.93RC5 binary distribution. 
I compiled Swift from SVN but I still get the parsing failure error; I am probably not getting the right cog sources. I got cog and swift from:

https://cogkit.svn.sourceforge.net/svnroot/cogkit/tags/swift-0.93
https://svn.ci.uchicago.edu/svn/vdl2/branches/release-0.93

I also tried https://svn.ci.uchicago.edu/svn/vdl2/branches/release-0.93.1, but it gave me the same error.

Regards,

Luiz

--
Luiz Gadelha
http://www.lncc.br/~lgadelha

From hategan at mcs.anl.gov Thu Nov 24 15:05:44 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 24 Nov 2011 13:05:44 -0800
Subject: [Swift-devel] arrays of files
In-Reply-To: <1793826491.51609.1321992806080.JavaMail.root@zimbra.anl.gov>
References: <1793826491.51609.1321992806080.JavaMail.root@zimbra.anl.gov>
Message-ID: <1322168744.6477.3.camel@blabla>

This is something that should be detected at compile
time. @filename and friends accept any type of argument, but it's clear that only types that have at least one non-primitive component are valid. I think for 0.93 run-time detection is fine, but the long-term solution is a static check.

Mihael

On Tue, 2011-11-22 at 14:13 -0600, Michael Wilde wrote:
> Replace your prior implementation of bar() with this, and you then have a real array solution (workaround) until the @filenames() bug is fixed:
>
> app ( file result ) barapp ( file files[], string s[])
> {
>   ls "-lt" s stdout=@result;
> }
>
> (file result) bar (file files[] )
> {
>   string s[];
>   foreach f, i in files {
>     s[i] = @filename(f);
>   }
>   result = barapp( files, s );
> }
>
> - Mike
>
> ----- Original Message -----
> > From: "Mark Hereld"
> > To: "Michael Wilde"
> > Cc: "swift-devel", "Thomas D. Uram"
> > Sent: Tuesday, November 22, 2011 1:07:32 PM
> > Subject: Re: arrays of files
> > thanks mike! only interested in the array solution. collecting
> > generic patterns that can be easily reused for wide number of
> > situations. gpsi. -- mark
> >
> > On Nov 22, 2011, at 12:05 PM, Michael Wilde wrote:
> >
> > Yup, thanks, Mark - it's a bug.
> >
> > if you change the body of bar to read:
> >
> > ls "-lt" @filename(files[0]) @filename(files[1]) stdout=@result;
> >
> > instead of:
> >
> > ls "-lt" @filenames(files) stdout=@result;
> >
> > then it works.
> >
> > I'll file a bugzilla ticket on this. David, can you add to the test
> > suite? Mihael, can you fix it?
> >
> > Thanks,
> >
> > - Mike
> >
> > ----- Original Message -----
> > > > > > From: "Mark Hereld" < hereld at mcs.anl.gov >
> > > > > > To: "Michael Wilde" < wilde at mcs.anl.gov >
> > > > > > Cc: "Thomas D. Uram" < turam at mcs.anl.gov >
> > > > > > Sent: Tuesday, November 22, 2011 11:48:17 AM
> > > > > > Subject: arrays of files
> > > > > > Mike,
> > > > > >
> > > > > > this swift seems to compile without complaint but breaks down
> > > > > > with a null pointer at the last line (invocation of "bar"). i think
> > > > > > i'm confused about arrays of file objects. ideas?
> > > > > >
> > > > > > type file;
> > > > > >
> > > > > > app ( file result ) foo ( string args[] ) {
> > > > > >   ls "-lt" args stdout=@result;
> > > > > > }
> > > > > >
> > > > > > app ( file result ) bar ( file files[] ) {
> > > > > >   ls "-lt" @filenames(files) stdout=@result;
> > > > > > }
> > > > > >
> > > > > > file out <"dirlisting.txt">;
> > > > > > out = foo([".",".."]);
> > > > > > file out2 <"nother.txt">;
> > > > > > out2 = foo(["/"]);
> > > > > > file out3 <"proof.txt">;
> > > > > > out3 = bar([out,out2]);
> > > > > >
> > > > > > -------------------------------------------------------
> > > > > > Mark Hereld < hereld at mcs.anl.gov >
> > > > > > Senior Fellow - Computation Institute
> > > > > > Experimental Systems Engineer - Mathematics and Computer Science
> > > > > > Visualization and Analysis Lead - Argonne Leadership Computing Facility
> > > > > > Argonne National Laboratory
> > > > > > The University of Chicago
> > > > > >
> > > > > > Cell: 630.327.2088
> > > > > > Voice: 630.252.4170
> >
> > --
> > Michael Wilde
> > Computation Institute, University of Chicago
> > Mathematics and Computer Science Division
> > Argonne National Laboratory
>

From iraicu at cs.iit.edu Fri Nov 25 16:20:22 2011
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Fri, 25 Nov 2011 16:20:22 -0600
Subject: [Swift-devel] Fwd: CFP: SWEET'12 Scalable Workflow Enactment Engines and Technologies
In-Reply-To: <64AFFC51-9640-4D80-B548-A87A0E4EA7BF@TUDELFT.NL>
References: <64AFFC51-9640-4D80-B548-A87A0E4EA7BF@TUDELFT.NL>
Message-ID: <4ED014A6.6080008@cs.iit.edu>

Hi all,
This seems interesting and relevant.

Cheers,
Ioan

-------- Original Message --------
Subject: CFP: SWEET'12 Scalable Workflow Enactment Engines and Technologies
Date: Fri, 25 Nov 2011 22:58:42 +0100
From: Jan Hidders
To: iraicu at cs.iit.edu

Dear Ioan Raicu,

We would like to invite you to submit a paper to SWEET'12, the 1st International Workshop on Scalable Workflow Enactment Engines and Technologies. For more information read the CFP below. We also would like to ask you to forward this information to those potentially interested in making a submission.

On behalf of the organizers of SWEET'12,

Jan Hidders
Jacek Sroka
Paolo Missier

---------------

* Call for Papers *

SWEET'12
1st International Workshop on Scalable Workflow Enactment Engines and Technologies
http://sites.google.com/site/sweetworkshop2012
inquiries: sweet2012 at easychair.org

Held in conjunction with SIGMOD 2012
Scottsdale, Arizona, USA, May 20, 2012
http://www.sigmod.org/2012/

----------------
IMPORTANT DATES:
----------------
Papers submission deadline: February, 19th, 2012
Authors notification: April 8th
Deadline for camera-ready copy: May 13th
Workshop: May 20

-----
FOCUS
-----
The goal of the workshop is to bring together researchers and practitioners to explore the potential of cloud-based computing in facilitating the convergence between workflows and large-scale data processing.
Concretely, the workshop is expected to provide insight into:

- performance issues: efficient data processing using cloud-based workflows,
- modelling issues: best practices in data-intensive workflow modelling and enactment,
- support technology issues: how the potential synergy between large-scale data processing and workflow technology can be exploited in a principled way.

The workshop aims to address issues of (i) Architecture, (ii) Models and Languages, (iii) Applications of cloud-based workflows. Specific topics include (but, as usual, are not limited to):

Architectures:
+ cloud-based, scalable workflow enactment architectures,
+ efficient data storage for data-intensive workflows,
+ optimizing execution of data-intensive workflows,
+ workflow scheduling in cloud computing.

Models, Languages:
+ languages for data-intensive workflows, data processing pipelines and data-mashups,
+ verification and validation of data-intensive workflows,
+ programming models for cloud computing,
+ access control and authorisation models, privacy, security, risk and trust issues,
+ workflow patterns for data-intensive workflows.

Applications of cloud-based workflow:
+ bioinformatics,
+ data mashups,
+ semantic web data management,
+ big data analytics.

----------------
SUBMISSION GUIDELINES
----------------
We invite full research or experience papers (up to 12 pages), or short papers (up to 6 pages) describing research in progress, formatted using the ACM proceedings style (http://www.acm.org/sigs/publications/proceedings-templates).

----------------
PUBLICATION
----------------
The workshop proceedings will be part published by CEUR and will be included in the ACM DL. In addition, we have an agreement with the Fundamenta Informaticae journal to fast-track a few selected papers for further publication.

---------------------------
KEYNOTE
---------------------------
Dr. Pawel Garbacki from Google Inc.: "Data Processing at Scale"

---------------------------
CHAIRS
---------------------------
Jan Hidders, TU Delft, The Netherlands
Jacek Sroka, University of Warsaw, Poland
Paolo Missier, Newcastle University, UK

---------------------------
Program Committee
---------------------------
Sarah Cohen-Boulakia, LRI, Universite Paris-Sud, France
Juliana Freire, NYU Poly, USA
Khalid Belhajjame, University of Manchester, UK
Vasa Curcin, Imperial college, London, UK
Paul Groth, VU University Amsterdam, NL
Paul Watson, Newcastle University, UK
Hugo Hiden, Newcastle University, UK
Matthew Jones, University of California Santa Barbara, USA
Bertram Ludaescher, UC Davis, USA
Marta Mattoso, COPPE- Federal Univ. Rio de Janeiro, Brasil
Norman Paton, University of Manchester, UK
Jelena Pjesivac-Grbovic, Google, USA
Benjamin Reed, Yahoo! Research
Yogesh Simmhan, University of Southern California, USA
Krzysztof Stencel, University of Warsaw, Poland
Wei Tan, J.T. Watson IBM Research, USA
Giovanni Tummarello, DERI, National University of Ireland Galway, Ireland
Jerzy Tyszkiewicz, Institute of Informatics, Warsaw University, PL
Jan Van Den Bussche, Hasselt University & Transnational University of Limburg, Belgium
Aad Van Moorsel, Newcastle University, UK, USA
Simon Woodman, Newcastle University, UK
Suraj Pandey, University of Melbourne, Australia
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wilde at mcs.anl.gov Fri Nov 25 19:52:26 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 25 Nov 2011 19:52:26 -0600 (CST)
Subject: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
In-Reply-To: <20111124211852.42DAC9CCA2@svn.ci.uchicago.edu>
Message-ID: <460741880.56941.1322272346057.JavaMail.root@zimbra.anl.gov>

Mihael, this does not fix the bug that I reported in bug 644, but it does seem to clarify the cause: the test case in bug 644 now gives the message:

?:file[2] - Closed is not a mapped type

@filenames() should only be looking at array elements [0] and [1] - it seems to me that it's going beyond the assigned elements of the array to complain about [2] not being mapped. If I set element [2], it complains about element [3] in the error message above.

Did you try the test case? That case should not generate an error - I believe that it's a valid program. If you replace @filenames() with two calls to @filename(), it works. I.e., all elements of the array passed to @filenames() are properly mapped.
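
A simplified sketch of the guard that r5316 introduces, to make the origin of the "is not a mapped type" message concrete. The class names are hypothetical stand-ins for AbstractDataNode and the root node classes in the commit quoted below, and IllegalStateException stands in for the ExecutionException used there; only the shape of the check is taken from the diff.

```java
// Hypothetical stand-ins; not the real org.griphyn.vdl.mapping classes.
final class MapperGuardSketch {

    interface Mapper {
        String describe();
    }

    // Plays the role of AbstractDataNode: no mapper of its own by default.
    static class Node {
        private final String label;

        Node(String label) {
            this.label = label;
        }

        Mapper getActualMapper() {
            return null;
        }

        @Override
        public String toString() {
            return label;
        }
    }

    // Plays the role of RootDataNode / RootArrayDataNode: returns its mapper.
    static class RootNode extends Node {
        private final Mapper mapper;

        RootNode(String label, Mapper mapper) {
            super(label);
            this.mapper = mapper;
        }

        @Override
        Mapper getActualMapper() {
            return mapper;
        }
    }

    // Same shape as the new check in leavesFileNames(): fail with a readable
    // message instead of a NullPointerException when there is no mapper.
    static Mapper requireMapper(Node var) {
        Mapper mapper = var.getActualMapper();
        if (mapper == null) {
            throw new IllegalStateException(var + " is not a mapped type");
        }
        return mapper;
    }

    public static void main(String[] args) {
        Node mapped = new RootNode("out", () -> "single file: dirlisting.txt");
        System.out.println(requireMapper(mapped).describe());
        try {
            requireMapper(new Node("?:file[2] - Closed"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this shape the message is just the handle's string form followed by a fixed suffix, so what the bracketed number denotes depends on how the handle prints itself, which the diff does not show.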

- Mike

----- Original Message -----
> From: hategan at ci.uchicago.edu
> To: swift-commit at ci.uchicago.edu
> Sent: Thursday, November 24, 2011 3:18:52 PM
> Subject: [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
> Author: hategan
> Date: 2011-11-24 15:18:51 -0600 (Thu, 24 Nov 2011)
> New Revision: 5316
>
> Modified:
> branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java
> branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java
> branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java
> branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java
> Log:
> throw better exception than NPE when @filename is called with a
> non-mapped argument
>
> Modified: branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java
> ===================================================================
> --- branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java 2011-11-23 19:33:26 UTC (rev 5315)
> +++ branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java 2011-11-24 21:18:51 UTC (rev 5316)
> @@ -197,8 +197,17 @@
>
>   private static String[] leavesFileNames(DSHandle var) throws ExecutionException, HandleOpenException {
>       Mapper mapper;
> +
>       synchronized (var.getRoot()) {
> -         mapper = var.getMapper();
> +         if (var instanceof AbstractDataNode) {
> +             mapper = ((AbstractDataNode) var).getActualMapper();
> +             if (mapper == null) {
> +                 throw new ExecutionException(var + " is not a mapped type");
> +             }
> +         }
> +         else {
> +             mapper = var.getMapper();
> +         }
>       }
>       List l = new ArrayList();
>       try {
>
> Modified: branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java
> ===================================================================
> --- branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java 2011-11-23 19:33:26 UTC (rev 5315)
> +++ branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java 2011-11-24 21:18:51 UTC (rev 5316)
> @@ -513,7 +513,7 @@
>       }
>   }
>
> - protected Mapper getActualMapper() {
> + public Mapper getActualMapper() {
>       return null;
>   }
>
>
> Modified: branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java
> ===================================================================
> --- branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java 2011-11-23 19:33:26 UTC (rev 5315)
> +++ branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java 2011-11-24 21:18:51 UTC (rev 5316)
> @@ -103,7 +103,7 @@
>       throw new FutureNotYetAvailable(waitingMapperParam.getFutureWrapper());
>   }
>
> - protected Mapper getActualMapper() {
> + public Mapper getActualMapper() {
>       return mapper;
>   }
>
>
> Modified: branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java
> ===================================================================
> --- branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java 2011-11-23 19:33:26 UTC (rev 5315)
> +++ branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java 2011-11-24 21:18:51 UTC (rev 5316)
> @@ -213,7 +213,7 @@
>       throw new FutureNotYetAvailable(waitingMapperParam.getFutureWrapper());
>   }
>
> - protected Mapper getActualMapper() {
> + public Mapper getActualMapper() {
>       return mapper;
>   }
>
>
> _______________________________________________
> Swift-commit mailing list
> Swift-commit at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-commit

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Fri Nov 25 19:59:19 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 25 Nov 2011 19:59:19 -0600 (CST)
Subject: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
In-Reply-To: <460741880.56941.1322272346057.JavaMail.root@zimbra.anl.gov>
Message-ID: <961553259.56945.1322272759196.JavaMail.root@zimbra.anl.gov> Actually, looking at the error message more closely I am not sure my comment below is correct. The message may be complaining about the array object as a whole, and the [2] may be stating the number of elements (the array does indeed have 2 elements) as opposed to a member of the array that it believes is not mapped. I need to experiment further. Do you see an error in the test case? - Mike ----- Original Message ----- > From: "Michael Wilde" > To: hategan at ci.uchicago.edu > Cc: "Swift Devel" > Sent: Friday, November 25, 2011 7:52:26 PM > Subject: Re: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping > Mihael, this does not fix the bug that I reported in bug 644, but it > does seem to clarify the cause: the test case in bug 644 now gives the > message: > > ?:file[2] - Closed is not a mapped type > > @filenames() should only be looking at array elements [0] and [1] - > seems to me that its going beyond the assigned elements of the array > to complain about [2] not being mapped. If I set element [2] it > complains about element [3] in the error message above. > > Did you try the test case? That case should not generate an error - I > believe that its a valid program. if you replace @filenames() with two > calls to @filename() it works. Ie all elements of the array passed to > @filenames() are properly mapped. 
> > - Mike > > > > > ----- Original Message ----- > > From: hategan at ci.uchicago.edu > > To: swift-commit at ci.uchicago.edu > > Sent: Thursday, November 24, 2011 3:18:52 PM > > Subject: [Swift-commit] r5316 - in > > branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping > > Author: hategan > > Date: 2011-11-24 15:18:51 -0600 (Thu, 24 Nov 2011) > > New Revision: 5316 > > > > Modified: > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > Log: > > throw better exception than NPE when @filename is called with a > > non-mapped argument > > > > Modified: > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > =================================================================== > > --- > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > 2011-11-23 19:33:26 UTC (rev 5315) > > +++ > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > 2011-11-24 21:18:51 UTC (rev 5316) > > @@ -197,8 +197,17 @@ > > > > private static String[] leavesFileNames(DSHandle var) throws > > ExecutionException, HandleOpenException { > > Mapper mapper; > > + > > synchronized (var.getRoot()) { > > - mapper = var.getMapper(); > > + if (var instanceof AbstractDataNode) { > > + mapper = ((AbstractDataNode) var).getActualMapper(); > > + if (mapper == null) { > > + throw new ExecutionException(var + " is not a mapped type"); > > + } > > + } > > + else { > > + mapper = var.getMapper(); > > + } > > } > > List l = new ArrayList(); > > try { > > > > Modified: > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > =================================================================== > > --- > > 
branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > 2011-11-23 19:33:26 UTC (rev 5315) > > +++ > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > 2011-11-24 21:18:51 UTC (rev 5316) > > @@ -513,7 +513,7 @@ > > } > > } > > > > - protected Mapper getActualMapper() { > > + public Mapper getActualMapper() { > > return null; > > } > > > > > > Modified: > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > =================================================================== > > --- > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > 2011-11-23 19:33:26 UTC (rev 5315) > > +++ > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > 2011-11-24 21:18:51 UTC (rev 5316) > > @@ -103,7 +103,7 @@ > > throw new > > FutureNotYetAvailable(waitingMapperParam.getFutureWrapper()); > > } > > > > - protected Mapper getActualMapper() { > > + public Mapper getActualMapper() { > > return mapper; > > } > > > > > > Modified: > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > =================================================================== > > --- > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > 2011-11-23 19:33:26 UTC (rev 5315) > > +++ > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > 2011-11-24 21:18:51 UTC (rev 5316) > > @@ -213,7 +213,7 @@ > > throw new > > FutureNotYetAvailable(waitingMapperParam.getFutureWrapper()); > > } > > > > - protected Mapper getActualMapper() { > > + public Mapper getActualMapper() { > > return mapper; > > } > > > > > > _______________________________________________ > > Swift-commit mailing list > > Swift-commit at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-commit > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > 
_______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From wilde at mcs.anl.gov Fri Nov 25 20:43:00 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 25 Nov 2011 20:43:00 -0600 (CST) Subject: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping In-Reply-To: <961553259.56945.1322272759196.JavaMail.root@zimbra.anl.gov> Message-ID: <713839010.56975.1322275380089.JavaMail.root@zimbra.anl.gov> What seems to be happening here is that when a mapped file variable is assigned to a variable, the mapping is not assigned with it. I need to check, but I think that in older revisions (eg 0.92.1 ?) the mapping *was* assigned along with the value (i.e. the state) of the file variable. Whats happening in this case is that when the array [out,out2] is created, the file-typed members of the constructed array have only default mappings (ie _concurrent/etc) - they have lost their original mappings assigned by the single_file_mapper <"filename">. I *think* this is incorrect behavior, but perhaps that needs discussion. What do you think? - Mike ----- Original Message ----- > From: "Michael Wilde" > To: hategan at ci.uchicago.edu > Cc: "Swift Devel" > Sent: Friday, November 25, 2011 7:59:19 PM > Subject: Re: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping > Actually, looking at the error message more closely I am not sure my > comment below is correct. The message may be complaining about the > array object as a whole, and the [2] may be stating the number of > elements (the array does indeed have 2 elements) as opposed to a > member of the array that it believes is not mapped. > > I need to experiment further. 
Do you see an error in the test case? > > - Mike > > > ----- Original Message ----- > > From: "Michael Wilde" > > To: hategan at ci.uchicago.edu > > Cc: "Swift Devel" > > Sent: Friday, November 25, 2011 7:52:26 PM > > Subject: Re: [Swift-devel] [Swift-commit] r5316 - in > > branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping > > Mihael, this does not fix the bug that I reported in bug 644, but it > > does seem to clarify the cause: the test case in bug 644 now gives > > the > > message: > > > > ?:file[2] - Closed is not a mapped type > > > > @filenames() should only be looking at array elements [0] and [1] - > > seems to me that its going beyond the assigned elements of the array > > to complain about [2] not being mapped. If I set element [2] it > > complains about element [3] in the error message above. > > > > Did you try the test case? That case should not generate an error - > > I > > believe that its a valid program. if you replace @filenames() with > > two > > calls to @filename() it works. Ie all elements of the array passed > > to > > @filenames() are properly mapped. 
> > > > - Mike > > > > > > > > > > ----- Original Message ----- > > > From: hategan at ci.uchicago.edu > > > To: swift-commit at ci.uchicago.edu > > > Sent: Thursday, November 24, 2011 3:18:52 PM > > > Subject: [Swift-commit] r5316 - in > > > branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping > > > Author: hategan > > > Date: 2011-11-24 15:18:51 -0600 (Thu, 24 Nov 2011) > > > New Revision: 5316 > > > > > > Modified: > > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > > Log: > > > throw better exception than NPE when @filename is called with a > > > non-mapped argument > > > > > > Modified: > > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > > =================================================================== > > > --- > > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > > 2011-11-23 19:33:26 UTC (rev 5315) > > > +++ > > > branches/release-0.93/src/org/griphyn/vdl/karajan/lib/VDLFunction.java > > > 2011-11-24 21:18:51 UTC (rev 5316) > > > @@ -197,8 +197,17 @@ > > > > > > private static String[] leavesFileNames(DSHandle var) throws > > > ExecutionException, HandleOpenException { > > > Mapper mapper; > > > + > > > synchronized (var.getRoot()) { > > > - mapper = var.getMapper(); > > > + if (var instanceof AbstractDataNode) { > > > + mapper = ((AbstractDataNode) var).getActualMapper(); > > > + if (mapper == null) { > > > + throw new ExecutionException(var + " is not a mapped type"); > > > + } > > > + } > > > + else { > > > + mapper = var.getMapper(); > > > + } > > > } > > > List l = new ArrayList(); > > > try { > > > > > > Modified: > > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > > 
=================================================================== > > > --- > > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > > 2011-11-23 19:33:26 UTC (rev 5315) > > > +++ > > > branches/release-0.93/src/org/griphyn/vdl/mapping/AbstractDataNode.java > > > 2011-11-24 21:18:51 UTC (rev 5316) > > > @@ -513,7 +513,7 @@ > > > } > > > } > > > > > > - protected Mapper getActualMapper() { > > > + public Mapper getActualMapper() { > > > return null; > > > } > > > > > > > > > Modified: > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > > =================================================================== > > > --- > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > > 2011-11-23 19:33:26 UTC (rev 5315) > > > +++ > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootArrayDataNode.java > > > 2011-11-24 21:18:51 UTC (rev 5316) > > > @@ -103,7 +103,7 @@ > > > throw new > > > FutureNotYetAvailable(waitingMapperParam.getFutureWrapper()); > > > } > > > > > > - protected Mapper getActualMapper() { > > > + public Mapper getActualMapper() { > > > return mapper; > > > } > > > > > > > > > Modified: > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > > =================================================================== > > > --- > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > > 2011-11-23 19:33:26 UTC (rev 5315) > > > +++ > > > branches/release-0.93/src/org/griphyn/vdl/mapping/RootDataNode.java > > > 2011-11-24 21:18:51 UTC (rev 5316) > > > @@ -213,7 +213,7 @@ > > > throw new > > > FutureNotYetAvailable(waitingMapperParam.getFutureWrapper()); > > > } > > > > > > - protected Mapper getActualMapper() { > > > + public Mapper getActualMapper() { > > > return mapper; > > > } > > > > > > > > > _______________________________________________ > > > Swift-commit mailing list > > > Swift-commit at ci.uchicago.edu > > > 
https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-commit

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From wilde at mcs.anl.gov Fri Nov 25 23:00:58 2011
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 25 Nov 2011 23:00:58 -0600 (CST)
Subject: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
In-Reply-To: <713839010.56975.1322275380089.JavaMail.root@zimbra.anl.gov>
Message-ID: <1305441236.57001.1322283658166.JavaMail.root@zimbra.anl.gov>

The test example seems to work in trunk and in 0.91. It fails in 0.92.1 and in the current rev of 0.93.

The behavior of assigning mapped variables is mysterious to me; I need to do more testing. In many cases the mapping is lost on assignment, but in this test program, the mapping is somehow retained (or visible?) when the array to which mapped variables have been assigned is passed as an argument to an app() function.
- Mike

----- Original Message -----
> From: "Michael Wilde"
> To: hategan at ci.uchicago.edu
> Cc: "Swift Devel"
> Sent: Friday, November 25, 2011 8:43:00 PM
> Subject: Re: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
> What seems to be happening here is that when a mapped file variable is
> assigned to a variable, the mapping is not assigned with it.
>
> I need to check, but I think that in older revisions (e.g. 0.92.1?) the
> mapping *was* assigned along with the value (i.e. the state) of the
> file variable.
>
> What's happening in this case is that when the array [out,out2] is
> created, the file-typed members of the constructed array have only
> default mappings (i.e. _concurrent/etc) - they have lost their original
> mappings assigned by the single_file_mapper <"filename">.
>
> I *think* this is incorrect behavior, but perhaps that needs
> discussion. What do you think?
>
> - Mike
>
> ----- Original Message -----
> > From: "Michael Wilde"
> > To: hategan at ci.uchicago.edu
> > Cc: "Swift Devel"
> > Sent: Friday, November 25, 2011 7:59:19 PM
> > Subject: Re: [Swift-devel] [Swift-commit] r5316 - in
> > branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
> > Actually, looking at the error message more closely, I am not sure my
> > comment below is correct. The message may be complaining about the
> > array object as a whole, and the [2] may be stating the number of
> > elements (the array does indeed have 2 elements) as opposed to a
> > member of the array that it believes is not mapped.
> >
> > I need to experiment further. Do you see an error in the test case?
> >
> > - Mike
> >
> > ----- Original Message -----
> > > From: "Michael Wilde"
> > > To: hategan at ci.uchicago.edu
> > > Cc: "Swift Devel"
> > > Sent: Friday, November 25, 2011 7:52:26 PM
> > > Subject: Re: [Swift-devel] [Swift-commit] r5316 - in
> > > branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
> > > Mihael, this does not fix the bug that I reported in bug 644, but it
> > > does seem to clarify the cause: the test case in bug 644 now gives the
> > > message:
> > >
> > > ?:file[2] - Closed is not a mapped type
> > >
> > > @filenames() should only be looking at array elements [0] and [1] -
> > > it seems to me that it's going beyond the assigned elements of the array
> > > to complain about [2] not being mapped. If I set element [2], it
> > > complains about element [3] in the error message above.
> > >
> > > Did you try the test case? That case should not generate an error - I
> > > believe that it's a valid program. If you replace @filenames() with two
> > > calls to @filename(), it works, i.e. all elements of the array passed to
> > > @filenames() are properly mapped.
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Sat Nov 26 01:30:34 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 25 Nov 2011 23:30:34 -0800
Subject: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
In-Reply-To: <713839010.56975.1322275380089.JavaMail.root@zimbra.anl.gov>
References: <713839010.56975.1322275380089.JavaMail.root@zimbra.anl.gov>
Message-ID: <1322292634.19498.7.camel@blabla>

On Fri, 2011-11-25 at 20:43 -0600, Michael Wilde wrote:
> What seems to be happening here is that when a mapped file variable is
> assigned to a variable, the mapping is not assigned with it.
>
> I need to check, but I think that in older revisions (e.g. 0.92.1?) the
> mapping *was* assigned along with the value (i.e. the state) of the
> file variable.
>
> What's happening in this case is that when the array [out,out2] is
> created, the file-typed members of the constructed array have only
> default mappings (i.e. _concurrent/etc) - they have lost their original
> mappings assigned by the single_file_mapper <"filename">.

I think that the problem is in the way the array constructor works. It is compiled to vdl:createArray(), and that function doesn't deal properly with mapping, since it was meant to be used with primitive arrays.

The theoretically equivalent code:

file f[];
f[0] = out; f[1] = out2;

should work properly.

In terms of the mapping, it will follow the standard swift mapped data assignment, which is as follows:
1. if lhs is remappable (e.g. concurrent mapper), set the lhs mapping to that of the rhs
2. if rhs is not remappable but lhs is, set the rhs mapping to that of lhs
3. if neither rhs nor lhs is remappable, do a copy.

What r5316 fixes is code of the form @filename([1, 2, 3]) (i.e. the argument is something that has no mapping), which is what I thought the problem was.

From hategan at mcs.anl.gov Sat Nov 26 15:33:52 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 26 Nov 2011 13:33:52 -0800
Subject: [Swift-devel] [Swift-commit] r5316 - in branches/release-0.93/src/org/griphyn/vdl: karajan/lib mapping
In-Reply-To: <1322292634.19498.7.camel@blabla>
References: <713839010.56975.1322275380089.JavaMail.root@zimbra.anl.gov> <1322292634.19498.7.camel@blabla>
Message-ID: <1322343232.25245.8.camel@blabla>

I guess this has uncovered a fundamental flaw with createArray() (or the way data works in swift). createArray() allows the composition of what otherwise would be data that has no ancestry.

Swift has a distinction between data that is part of a structure and data that isn't.
The latter is represented by either a RootDataNode or a RootArrayDataNode. These two are special in that they allow mappers to be attached to them and in that they assume they are not part of another data structure.

Any time a Root*Node is assigned to a structure/array field, copies are made. That works properly (the example below).

I think there are two possible solutions:
1. compile [e1, e2, ...] to a := newArray(), a[0] = e1, ...
2. relax the Root*Node restrictions to allow them to be part of another structure and have a composite mapper that delegates to the e1, e2, ... mappers.

Thoughts?

On Fri, 2011-11-25 at 23:30 -0800, Mihael Hategan wrote:
> The theoretically equivalent code:
> file f[];
> f[0] = out; f[1] = out2;
>
> should work properly.
>
> In terms of the mapping it will follow the standard swift mapped data
> assignment, which is as follows:
> 1. if lhs is remappable (e.g. concurrent mapper), set the lhs mapping to
> that of the rhs
> 2. if rhs is not remappable but lhs is, set the rhs mapping to that of
> lhs
> 3. if neither rhs nor lhs is remappable, do a copy.
>
> What r5316 fixes is code of the form @filename([1, 2, 3]) (i.e. argument
> is something that has no mapping), which is what I thought the problem
> was.

From hereld at mcs.anl.gov Tue Nov 22 13:07:32 2011
From: hereld at mcs.anl.gov (Mark Hereld)
Date: Tue, 22 Nov 2011 13:07:32 -0600
Subject: [Swift-devel] arrays of files
In-Reply-To: <427723534.50745.1321985155467.JavaMail.root@zimbra.anl.gov>
References: <427723534.50745.1321985155467.JavaMail.root@zimbra.anl.gov>
Message-ID:

thanks mike! only interested in the array solution. collecting generic patterns that can be easily reused for wide number of situations. gpsi.

-- mark

On Nov 22, 2011, at 12:05 PM, Michael Wilde wrote:

> Yup, thanks, Mark - it's a bug.
>
> if you change the body of bar to read:
>
> ls "-lt" @filename(files[0]) @filename(files[1]) stdout=@result;
>
> instead of:
>
> ls "-lt" @filenames(files) stdout=@result;
>
> then it works.
>
> I'll file a bugzilla ticket on this. David, can you add to the test suite? Mihael, can you fix it?
>
> Thanks,
>
> - Mike
>
>
> ----- Original Message -----
>> From: "Mark Hereld"
>> To: "Michael Wilde"
>> Cc: "Thomas D. Uram"
>> Sent: Tuesday, November 22, 2011 11:48:17 AM
>> Subject: arrays of files
>> Mike,
>>
>> this swift seems to compile without complaint but breaks down
>> with a null pointer at the last line (invocation of "bar"). i think
>> i'm confused about arrays of file objects. ideas?
>>
>> type file;
>>
>> app ( file result ) foo ( string args[] ) {
>>     ls "-lt" args stdout=@result;
>> }
>>
>> app ( file result ) bar ( file files[] ) {
>>     ls "-lt" @filenames(files) stdout=@result;
>> }
>>
>> file out <"dirlisting.txt">;
>> out = foo([".",".."]);
>>
>> file out2 <"nother.txt">;
>> out2 = foo(["/"]);
>>
>> file out3 <"proof.txt">;
>> out3 = bar([out,out2]);
>>
>> -------------------------------------------------------
>> Mark Hereld < hereld at mcs.anl.gov >
>> Senior Fellow - Computation Institute
>> Experimental Systems Engineer - Mathematics and Computer Science
>> Visualization and Analysis Lead - Argonne Leadership Computing Facility
>> Argonne National Laboratory
>> The University of Chicago
>>
>> Cell: 630.327.2088
>> Voice: 630.252.4170
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory

-------------------------------------------------------
Mark Hereld
Senior Fellow - Computation Institute
Experimental Systems Engineer - Mathematics and Computer Science
Visualization and Analysis Lead - Argonne Leadership Computing Facility
Argonne National Laboratory
The University of Chicago

Cell: 630.327.2088
Voice: 630.252.4170
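Illustrative aside (not part of the original messages): the failure discussed in this thread can be modeled in a few lines of plain Java. The sketch follows one reading of the reports above, namely that the array built from [out,out2] has no mapper of its own while its elements keep theirs, so a whole-array lookup fails with "is not a mapped type" and per-element lookups succeed. Everything here is an assumption made for illustration: MapperCheckSketch, DataNode, MappedRootNode, Mapper and their methods are hypothetical stand-ins, not Swift's real classes, and IllegalStateException stands in for Swift's ExecutionException. Only the null check on getActualMapper() and the wording of the error come from the r5316 diff.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only: simplified stand-ins, not Swift's real
// AbstractDataNode, RootDataNode, or Mapper classes.
final class MapperCheckSketch {

    /** Hypothetical mapper that knows a single file name. */
    static final class Mapper {
        final String fileName;
        Mapper(String fileName) { this.fileName = fileName; }
    }

    /** Node with no mapper of its own (base getActualMapper() returns null). */
    static class DataNode {
        final String name;
        final List<DataNode> elements = new ArrayList<>();
        DataNode(String name) { this.name = name; }
        Mapper getActualMapper() { return null; }
    }

    /** Root node that carries the mapper declared for a variable. */
    static final class MappedRootNode extends DataNode {
        private final Mapper mapper;
        MappedRootNode(String name, String fileName) {
            super(name);
            this.mapper = new Mapper(fileName);
        }
        @Override Mapper getActualMapper() { return mapper; }
    }

    /** Same idea as the r5316 check: a clear error instead of an NPE. */
    static Mapper requireMapper(DataNode var) {
        Mapper mapper = var.getActualMapper();
        if (mapper == null) {
            // IllegalStateException stands in for Swift's ExecutionException.
            throw new IllegalStateException(var.name + " is not a mapped type");
        }
        return mapper;
    }

    /** Whole-array lookup: needs a mapper on the array node itself. */
    static String wholeArrayLookup(DataNode array) {
        try {
            requireMapper(array);
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    /** Per-element lookup: asks each element for its own mapper. */
    static List<String> perElementLookup(DataNode array) {
        List<String> names = new ArrayList<>();
        for (DataNode element : array.elements) {
            names.add(requireMapper(element).fileName);
        }
        return names;
    }

    /** Models an array literal like [out, out2]: mapped elements, unmapped array. */
    static DataNode arrayLiteral() {
        DataNode array = new DataNode("file[2]");
        array.elements.add(new MappedRootNode("out", "dirlisting.txt"));
        array.elements.add(new MappedRootNode("out2", "nother.txt"));
        return array;
    }

    public static void main(String[] args) {
        DataNode array = arrayLiteral();
        System.out.println(wholeArrayLookup(array));
        System.out.println(perElementLookup(array));
    }
}
```

Run as-is, main prints the "not a mapped type" message for the whole-array lookup and then the two element file names. A composite mapper along the lines of the second solution proposed in the thread would amount to giving the array node a getActualMapper() that delegates to its elements; that is not modeled here.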