From wilde at mcs.anl.gov Mon Apr 2 12:04:39 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 2 Apr 2012 12:04:39 -0500 (CDT)
Subject: [Swift-devel] Application run scripts
Message-ID: <555674621.122476.1333386279217.JavaMail.root@zimbra.anl.gov>

Jon, David,

I am cloning the user run script we did for SciCol for the first CMTS application ("energy landscape").

I'm a bit concerned that if we don't do something soon to create a nice separation between application-independent "Swift run logic" and the application-specific logic, we'll forever be rewriting the common code and making the user code more complex, hard to document, etc.

This is just a heads-up that I'm looking for ideas on how to do this in a nice way that we can immediately test on SciCol, DSSAT, CMTS, Broadband, etc. and ideally leverage in GO-Swift.

Mike

From jon.monette at gmail.com Mon Apr 2 12:21:20 2012
From: jon.monette at gmail.com (Jonathan Monette)
Date: Mon, 2 Apr 2012 12:21:20 -0500
Subject: [Swift-devel] Application run scripts
In-Reply-To: <555674621.122476.1333386279217.JavaMail.root@zimbra.anl.gov>
References: <555674621.122476.1333386279217.JavaMail.root@zimbra.anl.gov>
Message-ID: <2B86DB86-1E8E-48E1-B3C9-151864DE6493@gmail.com>

I have already been working on a run script using the logic we have from SciColSim. It is not checked in for testing as it still only works for SciColSim. I have been looking through other run scripts I have lying around to see what is still missing and how to work with @arg parameters. Right now it just sources a parameter file, but I also want the ability to pass in the parameters on the command line, as it seems kind of clunky to make a parameter file if you only need 1 or 2 parameters set.

So when you finish the CMTS script, please provide a pointer. I can then compare the one I have with the one you have and do a check-in for testing with a wide variety of apps. We can get this into trunk and stabilize the script for a 0.94 release.
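The layering described above (source a parameter file, then let command-line values override it) could look roughly like the sketch below. It is an illustration only, not the swiftopt.sh code: the function names load_params, apply_overrides, and swift_args are hypothetical, and the -name=value output form is an assumption about how a wrapper would hand values to @arg.

```sh
#!/bin/sh
# Hypothetical sketch of a generic run wrapper; NOT the actual swiftopt.sh.
# Precedence, lowest to highest: defaults, parameter file, command line.

# Succeed only if $1 is a plain shell identifier (safe to use with eval).
is_identifier() {
    case $1 in
        '' | [!A-Za-z_]* | *[!A-Za-z0-9_]*) return 1 ;;
    esac
    return 0
}

# Apply name=value overrides given as arguments; skip malformed ones.
apply_overrides() {
    for _rs_kv in "$@"; do
        case $_rs_kv in
            *=*) _rs_name=${_rs_kv%%=*} ;;
            *)   _rs_name= ;;
        esac
        if ! is_identifier "$_rs_name"; then
            echo "ignoring malformed parameter: $_rs_kv" >&2
            continue
        fi
        _rs_value=${_rs_kv#*=}
        eval "$_rs_name=\$_rs_value"
    done
}

# Source the optional parameter file first, then apply the overrides.
load_params() {
    _rs_file=$1
    shift
    if [ -n "$_rs_file" ]; then
        # A bare file name would be looked up on PATH by ".", so anchor it.
        case $_rs_file in
            */*) ;;
            *)   _rs_file=./$_rs_file ;;
        esac
        if [ -f "$_rs_file" ]; then
            . "$_rs_file"
        fi
    fi
    apply_overrides "$@"
}

# Print the named parameters as -name=value, one per line
# (assumed form for passing values through to @arg).
swift_args() {
    for _rs_name in "$@"; do
        is_identifier "$_rs_name" || continue
        eval "_rs_value=\${$_rs_name-}"
        printf '%s\n' "-$_rs_name=$_rs_value"
    done
}
```

With a made-up default such as nsteps=10, calling load_params "" nsteps=25 and then swift_args nsteps prints -nsteps=25, so one or two parameters can be set without creating a parameter file. Validating each name before the eval is what keeps a stray command-line argument from being executed as code.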

On Apr 2, 2012, at 12:04 PM, Michael Wilde wrote:

> Jon, David,
>
> I am cloning the user run script we did for SciCol for the first CMTS application ("energy landscape").
>
> I'm a bit concerned that if we don't do something soon to create a nice separation between application-independent "Swift run logic" and the application-specific logic, we'll forever be rewriting the common code and making the user code more complex, hard to document, etc.
>
> This is just a heads-up that I'm looking for ideas on how to do this in a nice way that we can immediately test on SciCol, DSSAT, CMTS, Broadband, etc. and ideally leverage in GO-Swift.
>
> Mike
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From wozniak at mcs.anl.gov Mon Apr 2 13:38:59 2012
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Mon, 2 Apr 2012 13:38:59 -0500 (CDT)
Subject: [Swift-devel] Application run scripts
In-Reply-To: <2B86DB86-1E8E-48E1-B3C9-151864DE6493@gmail.com>
References: <555674621.122476.1333386279217.JavaMail.root@zimbra.anl.gov> <2B86DB86-1E8E-48E1-B3C9-151864DE6493@gmail.com>
Message-ID:

Is this about swiftopt.sh?

--
Justin M Wozniak

From jon.monette at gmail.com Mon Apr 2 15:54:53 2012
From: jon.monette at gmail.com (Jonathan Monette)
Date: Mon, 2 Apr 2012 15:54:53 -0500
Subject: [Swift-devel] Application run scripts
In-Reply-To:
References: <555674621.122476.1333386279217.JavaMail.root@zimbra.anl.gov> <2B86DB86-1E8E-48E1-B3C9-151864DE6493@gmail.com>
Message-ID: <3A96AC12-B9FE-45BA-B2A1-B53512B8E535@gmail.com>

Yes. I am generalizing swiftopt.sh into a general runswift script that any application can use.

From jonmon at mcs.anl.gov Mon Apr 2 17:11:00 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 2 Apr 2012 17:11:00 -0500
Subject: [Swift-devel] hanging in trunk
Message-ID: <5E45BA7F-A168-4F03-B156-DECD5E4A8063@mcs.anl.gov>

Hello,

I tried running the SciColSim application using trunk to test the fix Mihael added for the sockets and pipes issue, and the application is hanging. I do not see this hang when using 0.93. The compressed tarball of the run that hung is at http://www.ci.uchicago.edu/~jonmon/logs/SciColSim-hang-trunk.tar.gz

There is a jstack output file in that tarball. I am currently testing to see if maybe the fix Mihael added is causing this, but I do not believe this to be the case.

What other debugging information would be helpful? I can gather it.

From hategan at mcs.anl.gov Mon Apr 2 20:57:33 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 02 Apr 2012 18:57:33 -0700
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <5E45BA7F-A168-4F03-B156-DECD5E4A8063@mcs.anl.gov>
References: <5E45BA7F-A168-4F03-B156-DECD5E4A8063@mcs.anl.gov>
Message-ID: <1333418253.6521.1.camel@blabla>

On Mon, 2012-04-02 at 17:11 -0500, Jonathan Monette wrote:
> Hello,
> I tried running the SciColSim application using trunk to test the
> fix Mihael added for the sockets and pipes issue, and the application
> is hanging. I do not see this hang when using 0.93.
> The compressed tarball of the run that hung is at
> http://www.ci.uchicago.edu/~jonmon/logs/SciColSim-hang-trunk.tar.gz
>
> There is a jstack output file in that tarball. I am currently
> testing to see if maybe the fix Mihael added is causing this, but I do
> not believe this to be the case.

It's not.

> What other debugging information would be helpful? I can gather it.

None. This is an actual deadlock and the jstack output is as good as it gets.

Mihael

From jonmon at mcs.anl.gov Mon Apr 2 21:08:11 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 2 Apr 2012 21:08:11 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1333418253.6521.1.camel@blabla>
References: <5E45BA7F-A168-4F03-B156-DECD5E4A8063@mcs.anl.gov> <1333418253.6521.1.camel@blabla>
Message-ID: <1D075ECC-6F56-4B73-81F8-1C129825E705@mcs.anl.gov>

David believes he may have a fix. I am running my tests on Raven to see if it hangs. He is running the test suite to make sure nothing else broke. So David may have a fix already.

From davidk at ci.uchicago.edu Mon Apr 2 21:16:55 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Mon, 2 Apr 2012 21:16:55 -0500 (CDT)
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1D075ECC-6F56-4B73-81F8-1C129825E705@mcs.anl.gov>
Message-ID: <812016097.94412.1333419415377.JavaMail.root@zimbra-mb2.anl.gov>

I remember seeing a very similar deadlock with swat. I went through my emails and found that it was fixed in 0.93 r5143. As a test I applied the same patch to swift/src/org/griphyn/vdl/mapping/ArrayDataNode.java in trunk to see if that may fix the issue.

David

From wilde at mcs.anl.gov Mon Apr 2 21:27:10 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 2 Apr 2012 21:27:10 -0500 (CDT)
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <812016097.94412.1333419415377.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <692167390.123379.1333420030067.JavaMail.root@zimbra.anl.gov>

David, Mihael,

Wasn't that already (supposed to be) integrated into trunk?

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From davidk at ci.uchicago.edu Mon Apr 2 21:39:30 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Mon, 2 Apr 2012 21:39:30 -0500 (CDT)
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <692167390.123379.1333420030067.JavaMail.root@zimbra.anl.gov>
Message-ID: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov>

It looks like that particular fix was only applied to 0.93; perhaps it was missed when doing the merges.

From jonmon at mcs.anl.gov Mon Apr 2 22:48:01 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Mon, 2 Apr 2012 22:48:01 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov>

Yea. I ran small tests to see if this fixed it. Before, it hung almost immediately. The small tests did not hang. I currently have a larger test waiting to be executed on Raven. It seems the machine is in full use right now.

From hategan at mcs.anl.gov Mon Apr 2 22:51:08 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 02 Apr 2012 20:51:08 -0700
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <692167390.123379.1333420030067.JavaMail.root@zimbra.anl.gov>
References: <692167390.123379.1333420030067.JavaMail.root@zimbra.anl.gov>
Message-ID: <1333425068.7212.1.camel@blabla>

On Mon, 2012-04-02 at 21:27 -0500, Michael Wilde wrote:
> David, Mihael,
>
> Wasn't that already (supposed to be) integrated into trunk?

Yes. Which was what I was going to double-check. Sometimes things get messed up if the same code is modified in both branches before the merge. I suspect this is what happened here.

I remember this being a family of fixes that made similar modifications in relevant code. I will double-check all places I know of where this should be and commit as necessary.

Mihael

From davidk at ci.uchicago.edu Mon Apr 2 23:37:08 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Mon, 2 Apr 2012 23:37:08 -0500 (CDT)
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1333425068.7212.1.camel@blabla>
Message-ID: <1046342495.94615.1333427828027.JavaMail.root@zimbra-mb2.anl.gov>

The test version I built fails some tests in the test suite. I think this is because, as you mentioned, there are some other modifications to these classes.

From hategan at mcs.anl.gov Tue Apr 3 02:07:24 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Apr 2012 00:07:24 -0700
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov>
Message-ID: <1333436844.8704.2.camel@blabla>

Hmm, so the deadlock in the jstack file and r5143 do not cross paths. So you're probably seeing things work because they sometimes do, not because of the code change.

From hategan at mcs.anl.gov Tue Apr 3 02:15:09 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Apr 2012 00:15:09 -0700
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1333436844.8704.2.camel@blabla>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla>
Message-ID: <1333437309.8704.3.camel@blabla>

r5141 was relevant though. I committed that to trunk. Please test.

From wilde at mcs.anl.gov Tue Apr 3 09:22:25 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 3 Apr 2012 09:22:25 -0500 (CDT)
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1046342495.94615.1333427828027.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <380423621.123789.1333462945315.JavaMail.root@zimbra.anl.gov>

David, Mihael,

Can you describe what's failing, and what further debugging and/or 0.93-to-trunk integration that indicates is still needed?

Can we also work from the SVN side, from a list of what was not integrated or what failed to auto-integrate?

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From jonmon at mcs.anl.gov Tue Apr 3 10:38:03 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 10:38:03 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1333436844.8704.2.camel@blabla>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla>
Message-ID: <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov>

You were correct. The workflow still hung; it just took longer in the run I ran.

So this is what I will do for the SciColSim app. I will port the socket and pipe fix to the 0.93 branch and compile my own release for it while the issue is worked out in trunk. When trunk is believed to be fixed, let me know and I can do my testing.

From hategan at mcs.anl.gov Tue Apr 3 11:07:28 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Apr 2012 09:07:28 -0700
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla> <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov>
Message-ID: <1333469248.10864.0.camel@blabla>

On Tue, 2012-04-03 at 10:38 -0500, Jonathan Monette wrote:
> You were correct. The workflow still hung; it just took longer in the run I ran.
>
> So this is what I will do for the SciColSim app. I will port the socket and pipe fix to the 0.93 branch and compile my own release for it while the issue is worked out in trunk. When trunk is believed to be fixed, let me know and I can do my testing.

Oh, I committed the fix to trunk last night.

From jonmon at mcs.anl.gov Tue Apr 3 11:08:14 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 11:08:14 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1333469248.10864.0.camel@blabla>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla> <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov> <1333469248.10864.0.camel@blabla>
Message-ID:

Oh. Must have missed that. Let me update and try again.

From jonmon at mcs.anl.gov Tue Apr 3 11:30:21 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 11:30:21 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <1333469248.10864.0.camel@blabla>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla> <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov> <1333469248.10864.0.camel@blabla>
Message-ID: <474BD8A2-00B0-4166-8EDF-46F75FA86040@mcs.anl.gov>

The workflow still hung in trunk. I tarballed the run directory and put it at ~jonmon/SciColSim-hang-trunk2.tar.gz on the CI network (my public_html directory seems to be down). I captured another jstack log.

From jonmon at mcs.anl.gov Tue Apr 3 11:33:37 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 11:33:37 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <474BD8A2-00B0-4166-8EDF-46F75FA86040@mcs.anl.gov>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla> <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov> <1333469248.10864.0.camel@blabla> <474BD8A2-00B0-4166-8EDF-46F75FA86040@mcs.anl.gov>
Message-ID: <98332BB4-9824-4F1F-B537-E8174ABA3502@mcs.anl.gov>

I just realized that this was with OpenJDK. I am not sure if it matters, but I will try again with a Sun Java release.

From jonmon at mcs.anl.gov Tue Apr 3 11:42:48 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 11:42:48 -0500
Subject: [Swift-devel] hanging in trunk
In-Reply-To: <98332BB4-9824-4F1F-B537-E8174ABA3502@mcs.anl.gov>
References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla> <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov> <1333469248.10864.0.camel@blabla> <474BD8A2-00B0-4166-8EDF-46F75FA86040@mcs.anl.gov> <98332BB4-9824-4F1F-B537-E8174ABA3502@mcs.anl.gov>
Message-ID: <72FA2497-06C3-4C1D-8107-229E451106E2@mcs.anl.gov>

Ok. I just reran it with Sun Java. It still hung. The tarball is located at ~jonmon/SciColSim-hang-trunkSun.tar.gz

Again, a jstack log is provided.
>>> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Apr 3 11:59:20 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 03 Apr 2012 09:59:20 -0700 Subject: [Swift-devel] hanging in trunk In-Reply-To: <72FA2497-06C3-4C1D-8107-229E451106E2@mcs.anl.gov> References: <406919819.94466.1333420770287.JavaMail.root@zimbra-mb2.anl.gov> <7CF8BAC8-E438-44C1-B4FF-7D3AEBB408C5@mcs.anl.gov> <1333436844.8704.2.camel@blabla> <9CC405A2-0D2D-4455-A6F0-50478216C4A4@mcs.anl.gov> <1333469248.10864.0.camel@blabla> <474BD8A2-00B0-4166-8EDF-46F75FA86040@mcs.anl.gov> <98332BB4-9824-4F1F-B537-E8174ABA3502@mcs.anl.gov> <72FA2497-06C3-4C1D-8107-229E451106E2@mcs.anl.gov> Message-ID: <1333472360.11261.1.camel@blabla> On Tue, 2012-04-03 at 11:42 -0500, Jonathan Monette wrote: > Ok. Just reran it with Sun Java. Still hung. Tar ball located at ~jonmon/SciColSim-hang-trunkSun.tar.gz > > Again, jstack log provided. The jstack shows no deadlock so the hang is due to something else. From hategan at mcs.anl.gov Tue Apr 3 12:00:55 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 03 Apr 2012 10:00:55 -0700 Subject: [Swift-devel] hanging in trunk In-Reply-To: <380423621.123789.1333462945315.JavaMail.root@zimbra.anl.gov> References: <380423621.123789.1333462945315.JavaMail.root@zimbra.anl.gov> Message-ID: <1333472455.11261.3.camel@blabla> On Tue, 2012-04-03 at 09:22 -0500, Michael Wilde wrote: > Can we also work from the SVN side from a list of what was not integrated, or what failed to auto-integrate? That's a good point to which I don't know the answer. Essentially one should do a diff between 0.93 and trunk. 
But that's essentially part of a merge (things that SVN thinks are straightforward to integrate it will integrate, and other things will show up as conflicts that need to be manually addressed).

From davidk at ci.uchicago.edu Tue Apr 3 13:02:24 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Tue, 3 Apr 2012 13:02:24 -0500 (CDT)
Subject: [Swift-devel] Join function?
In-Reply-To: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov>

Hello,

Is there a way to concatenate all elements of an array into a single string? I would also like to define a separator between elements. I was thinking of something similar to Perl's join function.

If I have:

string a[];
a[0] = "this";
a[1] = "is";
a[2] = "a";
a[3] = "test";

How can I get it into "this is a test" or "this:is:a:test"? @strcat returns a reference. I can tracef with %q and get "[this,is,a,test]", but it doesn't give me any control over the formatting as far as I know.

I could call a shell script to do this, just wondering if there was another way.

David

From jonmon at mcs.anl.gov Tue Apr 3 13:21:32 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 13:21:32 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov>
References: <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID:

So @strcat(a[0],":", a[1], ":", a[2], ":", a[3]); does not work? That is the only Swift way I can think of to make it work. The only other approach is to use an app call to a Perl/Python script.

From davidk at ci.uchicago.edu Tue Apr 3 13:33:06 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Tue, 3 Apr 2012 13:33:06 -0500 (CDT)
Subject: [Swift-devel] Join function?
In-Reply-To:
Message-ID: <2118928367.97253.1333477986033.JavaMail.root@zimbra-mb2.anl.gov>

I think that would work, but it requires knowing the number of elements.

From wilde at mcs.anl.gov Tue Apr 3 13:40:12 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 3 Apr 2012 13:40:12 -0500 (CDT)
Subject: [Swift-devel] Join function?
In-Reply-To: <2118928367.97253.1333477986033.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <919246425.124461.1333478412192.JavaMail.root@zimbra.anl.gov>

You might be able to write strjoin() as a recursive Swift function. It would also be a good exercise for you to add it as a new primitive, David - as we are rather scanty on string and math functions.

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From jonmon at mcs.anl.gov Tue Apr 3 13:42:47 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 13:42:47 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <919246425.124461.1333478412192.JavaMail.root@zimbra.anl.gov>
References: <919246425.124461.1333478412192.JavaMail.root@zimbra.anl.gov>
Message-ID: <43FAD6DA-A444-452D-B077-FD8371BF3AF3@mcs.anl.gov>

I was just looking into how hard it would be to add an @join function to Swift that does what David describes. I do not think it will be too difficult. So if David wants to try, I can help if he would like.

From wilde at mcs.anl.gov Tue Apr 3 13:45:19 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 3 Apr 2012 13:45:19 -0500 (CDT)
Subject: [Swift-devel] Join function?
In-Reply-To: <43FAD6DA-A444-452D-B077-FD8371BF3AF3@mcs.anl.gov>
Message-ID: <97529895.124485.1333478719061.JavaMail.root@zimbra.anl.gov>

The best way to help is to create a developer note on the swiftdevel site describing what you know about how to do this, with an existing example.

Thanks,

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From jonmon at mcs.anl.gov Tue Apr 3 13:47:10 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 13:47:10 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <97529895.124485.1333478719061.JavaMail.root@zimbra.anl.gov>
References: <97529895.124485.1333478719061.JavaMail.root@zimbra.anl.gov>
Message-ID: <36D50854-25C3-4350-8AA5-AD11FD99DBD5@mcs.anl.gov>

Ok. I will add what I know.

From wozniak at mcs.anl.gov Tue Apr 3 13:57:30 2012
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Tue, 3 Apr 2012 13:57:30 -0500 (CDT)
Subject: [Swift-devel] Join function?
In-Reply-To: <97529895.124485.1333478719061.JavaMail.root@zimbra.anl.gov>
References: <97529895.124485.1333478719061.JavaMail.root@zimbra.anl.gov>
Message-ID:

The existing outline is at:

https://sites.google.com/site/swiftdevel/internals/builtins

Justin

--
Justin M Wozniak

From glen842 at uchicago.edu Tue Apr 3 14:09:34 2012
From: glen842 at uchicago.edu (Glen Hocky)
Date: Tue, 3 Apr 2012 15:09:34 -0400
Subject: [Swift-devel] Join function?
In-Reply-To: <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov>
References: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov> <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID:

This seemed like an interesting challenge. The following works, assuming Swift knows how many elements you have. However, there are many simpler things I would /think/ should work that do not. There seem to possibly be some bugs (i.e. things that don't seem intentional) about declaring a variable before an iterate statement and then writing to it on each iterate statement. Hence the use of lists below. I will also put a second version afterwards, which I would think should work but does not.

Working:

$ swift test.swift
Swift svn swift-r3826 cog-r2988
RunID: 20120403-1509-3xnsrx10
Progress:
SwiftScript trace: this:is:a:test
Final status:

(string j[]) join(string s[], string c, int n){
    j[0]="";
    iterate i {
        if(i<n-1){
            j[i+1]=@strcat( j[i], s[i]+c );
        }
        else{
            j[i+1]=@strcat( j[i], s[i] );
        }
    } until(i==n-1);
}
string a[];
a[0] = "this";
a[1] = "is";
a[2] = "a";
a[3] = "test";
int nelements=4;
string j[];
j = join( a, ":", nelements );
trace(j[nelements]);

Not working (error, jlist has multiple writers):

(string j) join(string s[], string c, int n){
    string jlist[];
    jlist[0] = "";
    iterate i {
        if(i<n-1){
            jlist[i+1]=@strcat( jlist[i], s[i]+c );
        }
        else{
            jlist[i+1]=@strcat( jlist[i], s[i] );
        }
    } until(i==n-1);
    j = jlist[n];
}
string a[];
a[0] = "this";
a[1] = "is";
a[2] = "a";
a[3] = "test";
int nelements=4;
string j;
j = join( a, ":", nelements );
trace(j);
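
For comparison, the app-call fallback mentioned earlier in the thread (a Perl/Python script invoked from Swift) might look roughly like the Python sketch below. It is only an illustration of the join semantics David asks for, not a Swift builtin: the function names join_elements and join_recursive and the command-line convention (separator first, elements after) are assumptions, and the Swift app declaration that would call such a script is not shown. The second function mirrors the recursive formulation Mike suggests, written in Python rather than SwiftScript.

```python
import sys


def join_elements(separator, items):
    """Join a list of strings with a separator, similar to Perl's join."""
    return separator.join(items)


def join_recursive(separator, items):
    """Same result as join_elements, built by recursion on the list."""
    if not items:            # empty list: nothing to join
        return ""
    if len(items) == 1:      # last element: no trailing separator
        return items[0]
    # First element, separator, then the joined remainder of the list.
    return items[0] + separator + join_recursive(separator, items[1:])


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical command line: the first argument is the separator,
    # the remaining arguments are the array elements.
    print(join_elements(sys.argv[1], sys.argv[2:]))
```

Saved under a hypothetical name such as join_args.py, running "python join_args.py : this is a test" would print this:is:a:test. Unlike the iterate-based version, neither function needs the element count passed in separately.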
URL: From glen842 at uchicago.edu Tue Apr 3 14:17:04 2012 From: glen842 at uchicago.edu (Glen Hocky) Date: Tue, 3 Apr 2012 15:17:04 -0400 Subject: [Swift-devel] Join function? In-Reply-To: References: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov> <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: p.s. With regards to my previous email. I seemed to be getting this error, which looks like it was/should have been address a long time ago http://lists.ci.uchicago.edu/pipermail/swift-devel/2009-June/005724.html On Tue, Apr 3, 2012 at 3:09 PM, Glen Hocky wrote: > This seemed like an interesting challenge. The following works assuming > swift knows how many elements you have. However, there are many simpler > things I would /think/ should work that do not. There seem to possibly be > some bugs (i.e. things that don't seem intentional) about declaring a > variable before an iterate statement and then writing to it on each iterate > statement. Hence the use of lists below. 
I will also put a second version > after which I would think should work but does not > > Working: > > $ swift test.swift >> Swift svn swift-r3826 cog-r2988 >> RunID: 20120403-1509-3xnsrx10 >> Progress: >> SwiftScript trace: this:is:a:test >> Final status: > > > > (string j[]) join(string s[], string c, int n){ >> j[0]=""; >> iterate i { >> if(i> j[i+1]=@strcat( j[i], s[i]+c ); >> } >> else{ >> j[i+1]=@strcat( j[i], s[i] ); >> } >> } until(i==n-1); >> >> } >> string a[]; >> a[0] = "this"; >> a[1] = "is"; >> a[2] = "a"; >> a[3] = "test"; >> int nelements=4; >> string j[]; >> j = join( a, ":", nelements ); >> trace(j[nelements]); > > > Not working (error, jlist has multiple writers): > >> (string j) join(string s[], string c, int n){ >> string jlist[]; >> jlist[0] = ""; >> iterate i { >> if(i> jlist[i+1]=@strcat( jlist[i], s[i]+c ); >> } >> else{ >> jlist[i+1]=@strcat( jlist[i], s[i] ); >> } >> } until(i==n-1); >> j = jlist[n]; >> >> } >> string a[]; >> a[0] = "this"; >> a[1] = "is"; >> a[2] = "a"; >> a[3] = "test"; >> int nelements=4; >> string j; >> j = join( a, ":", nelements ); >> trace(j); > > > > > On Tue, Apr 3, 2012 at 2:02 PM, David Kelly wrote: > >> Hello, >> >> Is there a way to concatenate all elements of an array into a single >> string? I would also like to define a separator between elements. I was >> thinking of something similar to Perl's join function. >> >> If I have: >> >> string a[]; >> a[0] = "this"; >> a[1] = "is"; >> a[2] = "a"; >> a[3] = "test"; >> >> How can I get it into "this is a test" or "this:is:a:test"? @strcat >> returns a reference. I can tracef with %q and get "[this,is,a,test]", but >> it doesn't give me any control over the formatting as far as I know. >> >> I could call a shell script to do this, just wondering if there was >> another way. 
>> >> David >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Tue Apr 3 14:44:33 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 03 Apr 2012 12:44:33 -0700 Subject: [Swift-devel] Join function? In-Reply-To: References: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov> <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <1333482273.12679.5.camel@blabla> Nice! I think though, fundamentally, there is no way to write this without either of: 1. length(array) 2. head/tail(array) (with the ability compare with [] - isEmpty(array)). But there might be ways in which the above could be accomplished. Mihael On Tue, 2012-04-03 at 15:09 -0400, Glen Hocky wrote: > This seemed like an interesting challenge. The following works > assuming swift knows how many elements you have. However, there are > many simpler things I would /think/ should work that do not. There > seem to possibly be some bugs (i.e. things that don't seem > intentional) about declaring a variable before an iterate statement > and then writing to it on each iterate statement. Hence the use of > lists below. 
From jonmon at mcs.anl.gov Tue Apr 3 15:43:39 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 15:43:39 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <1333482273.12679.5.camel@blabla>
References: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov> <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov> <1333482273.12679.5.camel@blabla>
Message-ID: <98AD1D87-AEF7-4013-B857-C09ABDA602B4@mcs.anl.gov>

So Swift does have an @length built-in function that returns the length of an array. Mike said he also tried coming up with a Swift function to do this same thing and said that @length does not always behave as he thought it should. I am going to look into fixing @length so we do have a way of finding the length of an array.

From hategan at mcs.anl.gov Tue Apr 3 15:46:10 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Apr 2012 13:46:10 -0700
Subject: [Swift-devel] Join function?
In-Reply-To: <98AD1D87-AEF7-4013-B857-C09ABDA602B4@mcs.anl.gov>
References: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov> <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov> <1333482273.12679.5.camel@blabla> <98AD1D87-AEF7-4013-B857-C09ABDA602B4@mcs.anl.gov>
Message-ID: <1333485970.14369.0.camel@blabla>

On Tue, 2012-04-03 at 15:43 -0500, Jonathan Monette wrote:
> So Swift does have an @length built-in function that returns the
> length of an array. Mike said he also tried coming up with a Swift
> function to do this same thing and said that @length does not always
> behave as he thought it should. I am going to look into fixing
> @length so we do have a way of finding the length of an array.

Can you be more specific as to what's wrong with @length?

From jonmon at mcs.anl.gov Tue Apr 3 15:47:18 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 15:47:18 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <1333485970.14369.0.camel@blabla>
References: <355549853.97008.1333475393275.JavaMail.root@zimbra-mb2.anl.gov> <492858065.97114.1333476144147.JavaMail.root@zimbra-mb2.anl.gov> <1333482273.12679.5.camel@blabla> <98AD1D87-AEF7-4013-B857-C09ABDA602B4@mcs.anl.gov> <1333485970.14369.0.camel@blabla>
Message-ID: <0EDF5AE3-4AA7-460D-9778-EF7452C9AB8F@mcs.anl.gov>

I do not know what Mike was experiencing. I was going to investigate myself and ask what his example was. Perhaps he can elaborate more on what he was witnessing.

From wilde at mcs.anl.gov Tue Apr 3 16:13:44 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 3 Apr 2012 16:13:44 -0500 (CDT)
Subject: [Swift-devel] Join function?
In-Reply-To: <0EDF5AE3-4AA7-460D-9778-EF7452C9AB8F@mcs.anl.gov>
Message-ID: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov>

For an array created like this:

string a[] = ["a","b","c"];

@length(a) returned 3 when called from open code, but when applied to the argument s[] inside a function, to which a was passed, it returned 0 inside that function.

In trying to replicate this without divulging the source code to recursive strjoin() ;) which I was leaving as an exercise to the (email) reader, I see that there is further weirdness, likely due to confusion/race between @length() and array closing semantics.
For example:

com$ cat length.swift
string a[] = ["a","b","c"];

string b[];
b[0] = "a";
b[1] = "b";
b[2] = "c";

(string o) strjoin(string s[], string sep)
{
  # o = strjoinf(s, sep, @length(s)); # length returns 0 here!
  trace("len inside", @length(s));
}

trace("len a outside", @length(a));
trace("len b outside", @length(b));

string js = strjoin(a,"---");

####

Gives this non-deterministic output:

com$ swift length.swift

no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
Swift trunk swift-r5739 cog-r3368 (cog modified locally)

RunID: 20120403-1559-ulurhdrf
Progress: time: Tue, 03 Apr 2012 15:59:31 -0500
SwiftScript trace: len b outside, 2
SwiftScript trace: len inside, 3
SwiftScript trace: len a outside, 3
Final status: Tue, 03 Apr 2012 15:59:31 -0500

com$ swift length.swift

no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
Swift trunk swift-r5739 cog-r3368 (cog modified locally)

RunID: 20120403-1559-x3ovo0i6
Progress: time: Tue, 03 Apr 2012 15:59:39 -0500
SwiftScript trace: len b outside, 2
SwiftScript trace: len a outside, 0
SwiftScript trace: len inside, 0
Final status: Tue, 03 Apr 2012 15:59:39 -0500

com$ swift length.swift

no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
Swift trunk swift-r5739 cog-r3368 (cog modified locally)

RunID: 20120403-1559-zkglev13
Progress: time: Tue, 03 Apr 2012 15:59:42 -0500
SwiftScript trace: len b outside, 3
SwiftScript trace: len a outside, 0
SwiftScript trace: len inside, 0
Final status: Tue, 03 Apr 2012 15:59:42 -0500

com$ swift length.swift

no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
Swift trunk swift-r5739 cog-r3368 (cog modified locally)

RunID: 20120403-1559-5ttgglqc
Progress: time: Tue, 03 Apr 2012 15:59:48 -0500
SwiftScript trace: len a outside, 0
SwiftScript trace: len b outside, 3
SwiftScript trace: len inside, 3
Final status: Tue, 03 Apr 2012 15:59:48 -0500
com$

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Tue Apr 3 16:59:11 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Apr 2012 14:59:11 -0700
Subject: [Swift-devel] Join function?
In-Reply-To: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov>
References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov>
Message-ID: <1333490351.16398.0.camel@blabla>

Maybe it didn't wait for the array to be closed?

From jonmon at mcs.anl.gov Tue Apr 3 17:04:00 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 17:04:00 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <1333490351.16398.0.camel@blabla>
References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla>
Message-ID:

That's exactly what is happening. I am looking at it right now.
I am going to try a fix and do some tests. I'll report back when it is fixed.

From hategan at mcs.anl.gov Tue Apr 3 17:10:31 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Apr 2012 15:10:31 -0700
Subject: [Swift-devel] Join function?
In-Reply-To:
References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla>
Message-ID: <1333491031.16398.1.camel@blabla>

Just add "handle.waitFor();" before getting the length.

From jonmon at mcs.anl.gov Tue Apr 3 17:13:35 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 17:13:35 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <1333491031.16398.1.camel@blabla>
References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla>
Message-ID: <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov>

When length was first added, it created a handle and would throw a Future exception if the array that was passed as an argument was not closed. Now the code gets a map and returns the map size. I am looking through the svn log to find out why that change was made.

Do you know why?

From jonmon at mcs.anl.gov Tue Apr 3 17:17:40 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 3 Apr 2012 17:17:40 -0500
Subject: [Swift-devel] Join function?
In-Reply-To: <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla> <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> Message-ID: <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> So swift r4755 has code that will wait for the array to be closed using waitFor(). However, swift r4756 has the new code which uses the size of a map. The svn log does not say why. I am reverting the code for the length function to r4755. On Apr 3, 2012, at 5:13 PM, Jonathan Monette wrote: > When length was first added it created handle and would throw a Future exception if the array that was passed as an argument was not closed. Now the code gets a map and returns the map size. I am looking through the svn log to find out why that change was made. > > Do you know why? > > On Apr 3, 2012, at 5:10 PM, Mihael Hategan wrote: > >> Just add "handle.waitFor();" before getting the length. >> >> On Tue, 2012-04-03 at 17:04 -0500, Jonathan Monette wrote: >>> That's exactly what is happening. I am looking at it right now. I am going to try a fix and do some tests. I'll report back when it is fixed. >>> >>> On Apr 3, 2012, at 4:59 PM, Mihael Hategan wrote: >>> >>>> Maybe it didn't wait for the array to be closed? >>>> >>>> On Tue, 2012-04-03 at 16:13 -0500, Michael Wilde wrote: >>>>> for an array created like this: >>>>> >>>>> string a[] = ["a","b","c"]; >>>>> >>>>> @length(a) returned 3 when called from open code, but when applied to the argument s[] inside a function, to which a was passed, returned 0 inside that function. >>>>> >>>>> In trying to replicate this without divulging the source code to recursive strjoin() ;) which I was leaving as an exercise to the (email) reader, I see that there is further weirdness, likely due to confusion/race between @length() and array closing semantics. 
>>>>>
>>>>> For example:
>>>>>
>>>>> com$ cat length.swift
>>>>> string a[] = ["a","b","c"];
>>>>>
>>>>> string b[];
>>>>> b[0] = "a";
>>>>> b[1] = "b";
>>>>> b[2] = "c";
>>>>>
>>>>> (string o) strjoin(string s[], string sep)
>>>>> {
>>>>> # o = strjoinf(s, sep, @length(s)); # length returns 0 here!
>>>>> trace("len inside", @length(s));
>>>>>
>>>>> }
>>>>>
>>>>> trace("len a outside", @length(a));
>>>>> trace("len b outside", @length(b));
>>>>>
>>>>> string js = strjoin(a,"---");
>>>>>
>>>>> ####
>>>>>
>>>>> Gives this non-deterministic output:
>>>>>
>>>>> com$ swift length.swift
>>>>>
>>>>> no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
>>>>> Swift trunk swift-r5739 cog-r3368 (cog modified locally)
>>>>>
>>>>> RunID: 20120403-1559-ulurhdrf
>>>>> Progress: time: Tue, 03 Apr 2012 15:59:31 -0500
>>>>> SwiftScript trace: len b outside, 2
>>>>> SwiftScript trace: len inside, 3
>>>>> SwiftScript trace: len a outside, 3
>>>>> Final status: Tue, 03 Apr 2012 15:59:31 -0500
>>>>>
>>>>> com$ swift length.swift
>>>>>
>>>>> no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
>>>>> Swift trunk swift-r5739 cog-r3368 (cog modified locally)
>>>>>
>>>>> RunID: 20120403-1559-x3ovo0i6
>>>>> Progress: time: Tue, 03 Apr 2012 15:59:39 -0500
>>>>> SwiftScript trace: len b outside, 2
>>>>> SwiftScript trace: len a outside, 0
>>>>> SwiftScript trace: len inside, 0
>>>>> Final status: Tue, 03 Apr 2012 15:59:39 -0500
>>>>>
>>>>> com$ swift length.swift
>>>>>
>>>>> no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml
>>>>> Swift trunk swift-r5739 cog-r3368 (cog modified locally)
>>>>>
>>>>> RunID: 20120403-1559-zkglev13
>>>>> Progress: time: Tue, 03 Apr 2012 15:59:42 -0500
>>>>> SwiftScript trace: len b outside, 3
>>>>> SwiftScript trace: len a outside, 0
>>>>> SwiftScript trace: len inside, 0
>>>>> Final status: Tue, 03 Apr 2012 15:59:42 -0500
>>>>>
>>>>> com$ swift length.swift
>>>>> >>>>> no sites file specified, setting to default: /home/wilde/swift/rev/trunk/etc/sites.xml >>>>> Swift trunk swift-r5739 cog-r3368 (cog modified locally) >>>>> >>>>> RunID: 20120403-1559-5ttgglqc >>>>> Progress: time: Tue, 03 Apr 2012 15:59:48 -0500 >>>>> SwiftScript trace: len a outside, 0 >>>>> SwiftScript trace: len b outside, 3 >>>>> SwiftScript trace: len inside, 3 >>>>> Final status: Tue, 03 Apr 2012 15:59:48 -0500 >>>>> com$ >>>>> >>>>> >>>>> ----- Original Message ----- >>>>>> From: "Jonathan Monette" >>>>>> To: "Mihael Hategan" >>>>>> Cc: "Glen Hocky" , "swift-devel at ci.uchicago.edu Devel" >>>>>> Sent: Tuesday, April 3, 2012 3:47:18 PM >>>>>> Subject: Re: [Swift-devel] Join function? >>>>>> I do not know what Mike was experiencing. I was going to investigate >>>>>> myself and ask what his example was. Perhaps he can elaborate more on >>>>>> what he was witnessing. >>>>>> >>>>>> On Apr 3, 2012, at 3:46 PM, Mihael Hategan wrote: >>>>>> >>>>>>> On Tue, 2012-04-03 at 15:43 -0500, Jonathan Monette wrote: >>>>>>>> So Swift does have an @length built-in function that returns the >>>>>>>> length of an array. Mike said he also tried coming up with a Swift >>>>>>>> function to this same think and said that @length does not always >>>>>>>> behave as he thought it should. I am going to look into fixing >>>>>>>> @length so we do have a way finding the length of an array. >>>>>>> >>>>>>> Can you be more specific as to what's wrong with @length? 
>>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Swift-devel mailing list >>>>>> Swift-devel at ci.uchicago.edu >>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>>> >>>> >>>> >>> >> >> > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Apr 3 17:19:50 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 03 Apr 2012 15:19:50 -0700 Subject: [Swift-devel] Join function? In-Reply-To: <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla> <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> Message-ID: <1333491590.16834.0.camel@blabla> On Tue, 2012-04-03 at 17:13 -0500, Jonathan Monette wrote: > When length was first added it created handle and would throw a Future > exception if the array that was passed as an argument was not closed. > Now the code gets a map and returns the map size. I am looking > through the svn log to find out why that change was made. > > Do you know why? Not sure. I know I cleaned up Misc at some point, so maybe that got introduced at that time. Anyway, r5742 has the fix. From hategan at mcs.anl.gov Tue Apr 3 17:24:14 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 03 Apr 2012 15:24:14 -0700 Subject: [Swift-devel] Join function? 
In-Reply-To: <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla> <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> Message-ID: <1333491854.16834.2.camel@blabla> On Tue, 2012-04-03 at 17:17 -0500, Jonathan Monette wrote: > So swift r4755 has code that will wait for the array to be closed > using waitFor(). However, swift r4756 has the new code which uses the > size of a map. The svn log does not say why. I am reverting the code > for the length function to r4755. It does look like a commit of mine that wasn't supposed to go in. Though back to the join() issue, it should probably be a built-in function. From hategan at mcs.anl.gov Tue Apr 3 17:26:43 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 03 Apr 2012 15:26:43 -0700 Subject: [Swift-devel] Join function? In-Reply-To: <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla> <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> Message-ID: <1333492003.16834.4.camel@blabla> On Tue, 2012-04-03 at 17:17 -0500, Jonathan Monette wrote: > However, swift r4756 has the new code which uses the size of a map. The svn log does not say why. I think the reason for it was that previously @length(some_non_array) would give a weird error, whereas the current code complains more nicely that length must be applied to an array. From jonmon at mcs.anl.gov Tue Apr 3 17:28:12 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Tue, 3 Apr 2012 17:28:12 -0500 Subject: [Swift-devel] Join function? 
In-Reply-To: <1333492003.16834.4.camel@blabla> References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla> <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> <1333492003.16834.4.camel@blabla> Message-ID: Right, I got that from the new code. The svn log just did not shed light as to why the change was made. On Apr 3, 2012, at 5:26 PM, Mihael Hategan wrote: > On Tue, 2012-04-03 at 17:17 -0500, Jonathan Monette wrote: >> However, swift r4756 has the new code which uses the size of a map. The svn log does not say why. > > I think the reason for it was that previously @length(some_non_array) > would give a weird error, whereas the current code complains more nicely > that length must be applied to an array. > From benc at hawaga.org.uk Tue Apr 3 18:44:13 2012 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 4 Apr 2012 09:44:13 +1000 Subject: [Swift-devel] Join function? In-Reply-To: <1333492003.16834.4.camel@blabla> References: <1355358185.124847.1333487624872.JavaMail.root@zimbra.anl.gov> <1333490351.16398.0.camel@blabla> <1333491031.16398.1.camel@blabla> <3D9B00EC-9B72-4B95-BED8-5A8541C052DA@mcs.anl.gov> <40A334BC-3782-4115-A173-7C81CB71D782@mcs.anl.gov> <1333492003.16834.4.camel@blabla> Message-ID: On Apr 4, 2012, at 8:26 AM, Mihael Hategan wrote: > On Tue, 2012-04-03 at 17:17 -0500, Jonathan Monette wrote: >> However, swift r4756 has the new code which uses the size of a map. The svn log does not say why. > > I think the reason for it was that previously @length(some_non_array) > would give a weird error, whereas the current code complains more nicely > that length must be applied to an array. That should be checkable at compile time, though? 
(at least in theory - there was something fuzzy about @functions and types maybe to do with variable argument lists)

--

From hategan at mcs.anl.gov Mon Apr 9 10:06:56 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 09 Apr 2012 08:06:56 -0700
Subject: [Swift-devel] coaster io with NIO.
Message-ID: <1333984016.31238.5.camel@blabla>

So I wrote this NIO version of the coaster send/multiplexer loops. It uses selectors instead of whatever was there before.

The multiplexer (i.e. reading) wasn't an issue because there is a way to test whether a read will block or not, but writing data to a collection of sockets without NIO suffers from one socket blocking everything because there is no way to test that a write on a socket will not block.

The initial implementation was not NIO based because there is no easy way to convince the jglobus GSI libraries to work with NIO, so my NIO thing only applies to plain TCP sockets. I should probably deal with that, although it's not an issue when you have a single coaster service.

I'll commit the changes today so that you guys can test it. Works fine on localhost so far.

Mihael

From ketancmaheshwari at gmail.com Mon Apr 9 10:40:03 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 9 Apr 2012 11:40:03 -0400
Subject: [Swift-devel] we has one of our own
Message-ID:

http://www.cs.cornell.edu/jif/swift

--
Ketan

From wilde at mcs.anl.gov Mon Apr 9 10:48:39 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 9 Apr 2012 10:48:39 -0500 (CDT)
Subject: [Swift-devel] coaster io with NIO.
In-Reply-To: <1333984016.31238.5.camel@blabla>
Message-ID: <2061521035.132139.1333986519728.JavaMail.root@zimbra.anl.gov>

Mihael, this sounds very good. I replied with a comment on Bug 690 regarding fully debugging the root cause of the timeout problem.
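
For anyone following the thread who has not used NIO, here is a minimal sketch of the selector idea described in the "coaster io with NIO" message. It is not the coaster code: the class name SelectorWriteSketch, the use of java.nio.channels.Pipe in place of TCP sockets, and the 4 MiB payload are all assumptions made so the example is self-contained. One channel stands in for a peer that never reads and the other for a healthy peer. Because the channels are non-blocking and a write is attempted only on keys the selector reports as writable, the stalled channel does not hold up the healthy one.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.WritableByteChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

// Hypothetical illustration only; none of these names come from the coaster sources.
final class SelectorWriteSketch {

    /** What the healthy peer received, and how many bytes are still queued for the stalled peer. */
    static final class Result {
        final String received;
        final int stalledBytesLeft;

        Result(String received, int stalledBytesLeft) {
            this.received = received;
            this.stalledBytesLeft = stalledBytesLeft;
        }
    }

    static Result run() {
        try {
            return runChecked();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static Result runChecked() throws IOException {
        Pipe stalled = Pipe.open(); // stands in for a peer that never reads
        Pipe healthy = Pipe.open(); // stands in for a responsive peer
        try (Selector selector = Selector.open()) {
            stalled.sink().configureBlocking(false);
            healthy.sink().configureBlocking(false);

            // Assumption: 4 MiB exceeds the pipe's kernel buffer, so this send cannot complete.
            ByteBuffer big = ByteBuffer.allocate(4 * 1024 * 1024);
            ByteBuffer small = ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII));

            // The pending data travels with each key as its attachment.
            stalled.sink().register(selector, SelectionKey.OP_WRITE, big);
            healthy.sink().register(selector, SelectionKey.OP_WRITE, small);

            // Bounded loop: stop as soon as the small message has been sent.
            for (int round = 0; round < 100 && small.hasRemaining(); round++) {
                selector.select(100);
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid() || !key.isWritable()) {
                        continue;
                    }
                    ByteBuffer pending = (ByteBuffer) key.attachment();
                    // Non-blocking write: sends what fits and returns immediately.
                    ((WritableByteChannel) key.channel()).write(pending);
                    if (!pending.hasRemaining()) {
                        key.cancel(); // nothing left to send on this channel
                    }
                }
            }

            // Read back exactly as many bytes as were sent on the healthy pipe.
            ByteBuffer in = ByteBuffer.allocate(small.position());
            while (in.hasRemaining() && healthy.source().read(in) >= 0) {
                // keep reading until the buffer is full
            }
            in.flip();
            String received = StandardCharsets.US_ASCII.decode(in).toString();
            return new Result(received, big.remaining());
        } finally {
            stalled.sink().close();
            stalled.source().close();
            healthy.sink().close();
            healthy.source().close();
        }
    }

    public static void main(String[] args) {
        Result r = run();
        System.out.println("healthy peer received: " + r.received);
        System.out.println("bytes still queued for stalled peer: " + r.stalledBytesLeft);
    }
}
```

With blocking streams, a loop that happened to write to the stalled peer first would sit in write() until that peer drained its buffer. Here the large write returns after a partial send and the small message still goes out.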
But regarding the limitations between the new NIO code and GSI: the current secure socket solution for coasters has long been a usability obstacle. This sounds like a good time to design a highly usable solution that will last us into the future. What alternatives can you suggest for this? - Mike ----- Original Message ----- > From: "Mihael Hategan" > To: "Swift Devel" > Sent: Monday, April 9, 2012 10:06:56 AM > Subject: [Swift-devel] coaster io with NIO. > So I wrote this NIO version of the coaster send/multiplexer loops. It > uses selectors instead of whatever was there before. > > The multiplexer (i.e. reading) wasn't and issue because there is a way > to test whether a read will block or not, but writing data to a > collection of sockets without NIO suffers from one socket blocking > everything because there is no way to test that a write on a socket > will > not block. > > The initial implementation was not NIO based because there is no easy > way to convince the jglobus GSI libraries to work with NIO, so my NIO > thing only applies to plain TCP sockets. I should probably deal with > that, although it's not an issue when you have a single coaster > service. > > I'll commit the changes today so that you guys can test it. Works fine > on localhost so far. > > Mihael > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Mon Apr 9 12:42:19 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 09 Apr 2012 10:42:19 -0700 Subject: [Swift-devel] coaster io with NIO. 
In-Reply-To: <2061521035.132139.1333986519728.JavaMail.root@zimbra.anl.gov> References: <2061521035.132139.1333986519728.JavaMail.root@zimbra.anl.gov> Message-ID: <1333993339.32425.3.camel@blabla> On Mon, 2012-04-09 at 10:48 -0500, Michael Wilde wrote: > But regarding the limitations between the new NIO code and GSI: the > current secure socket solution for coasters has long been a usability > obstacle. This sounds like a good time to design a highly usable > solution that will last us into the future. What alternatives can you > suggest for this? I know this looks like the right time to deal with that. However I want to mention that while the jglobus NIO issue is a technical one, the security part is not something that I have a solution for but missing the code. I do however welcome ideas on how to do security nicely in this case (though I think it is to a large extent the same as how to do security in a distributed/grid system) Mihael From wilde at mcs.anl.gov Mon Apr 9 13:05:21 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 9 Apr 2012 13:05:21 -0500 (CDT) Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <1333993339.32425.3.camel@blabla> Message-ID: <723912979.132444.1333994721678.JavaMail.root@zimbra.anl.gov> I think the last time we discussed this, we determined that adding a -nosec mode for automatic coasters would be both safe and useful. Is that a reasonable approach for now, including making -nosec the default? - Mike ----- Original Message ----- > From: "Mihael Hategan" > To: "Michael Wilde" > Cc: "Swift Devel" > Sent: Monday, April 9, 2012 12:42:19 PM > Subject: Re: [Swift-devel] coaster io with NIO. > On Mon, 2012-04-09 at 10:48 -0500, Michael Wilde wrote: > > > But regarding the limitations between the new NIO code and GSI: the > > current secure socket solution for coasters has long been a > > usability > > obstacle. This sounds like a good time to design a highly usable > > solution that will last us into the future. 
What alternatives can > > you > > suggest for this? > > I know this looks like the right time to deal with that. However I > want > to mention that while the jglobus NIO issue is a technical one, the > security part is not something that I have a solution for but missing > the code. > > I do however welcome ideas on how to do security nicely in this case > (though I think it is to a large extent the same as how to do security > in a distributed/grid system) > > Mihael -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Mon Apr 9 13:11:59 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 09 Apr 2012 11:11:59 -0700 Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <723912979.132444.1333994721678.JavaMail.root@zimbra.anl.gov> References: <723912979.132444.1333994721678.JavaMail.root@zimbra.anl.gov> Message-ID: <1333995119.889.5.camel@blabla> On Mon, 2012-04-09 at 13:05 -0500, Michael Wilde wrote: > I think the last time we discussed this, we determined that adding a > -nosec mode for automatic coasters would be both safe and useful. Is > that a reasonable approach for now, including making -nosec the > default? Right. Nosec on automatic coasters is a solution. We decided that the worst that can happen there is that there could be a rogue coaster service that could steal jobs and data and/or provide fake results. I wouldn't call that a solution though. It's a step backwards from a security perspective. But it may be an option to consider, in particular given that it's already possible to do this by having rogue workers. So yes. I can add that option. Mihael From hategan at mcs.anl.gov Tue Apr 10 02:17:08 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 10 Apr 2012 00:17:08 -0700 Subject: [Swift-devel] coaster io with NIO. 
In-Reply-To: <1333984016.31238.5.camel@blabla>
References: <1333984016.31238.5.camel@blabla>
Message-ID: <1334042228.4063.1.camel@blabla>

On Mon, 2012-04-09 at 08:06 -0700, Mihael Hategan wrote:

> I'll commit the changes today so that you guys can test it. Works fine
> on localhost so far.

Darn. I was 15 minutes late on that.

Anyway, cog trunk r3370.

Let me know how this works out. As I said, I did some preliminary testing locally, but various other combinations may be a bit shaky at this point.

Mihael

From davidk at ci.uchicago.edu Tue Apr 10 03:56:11 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Tue, 10 Apr 2012 03:56:11 -0500 (CDT)
Subject: [Swift-devel] coaster io with NIO.
In-Reply-To: <1334042228.4063.1.camel@blabla>
Message-ID: <361193860.123071.1334048171829.JavaMail.root@zimbra-mb2.anl.gov>

I seem to be having some issues with this version. When I run "coaster-service -nosec -portfile /tmp/tmp.zdo0L0LNoK -localportfile /tmp/tmp.ZKOZqdZy9L -passive", I get this:

Error starting coaster service: null
Error starting coaster service
java.lang.NullPointerException
    at org.globus.cog.abstraction.coaster.service.LocalTCPService.getPort(LocalTCPService.java:158)
    at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.writePorts(CoasterPersistentService.java:188)
    at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:145)

When I specify the port numbers, I can connect, but nothing seems to happen.

David

----- Original Message -----
> From: "Mihael Hategan"
> To: "Swift Devel"
> Sent: Tuesday, April 10, 2012 2:17:08 AM
> Subject: Re: [Swift-devel] coaster io with NIO.
> On Mon, 2012-04-09 at 08:06 -0700, Mihael Hategan wrote:
>
> > I'll commit the changes today so that you guys can test it. Works
> > fine
> > on localhost so far.
>
> Darn. I was 15 minutes late on that.
>
> Anyway, cog trunk r3370.
>
> Let me know how this works out.
> As I said, I did some preliminary
> testing locally, but various other combinations may be a bit shaky at
> this point.
>
> Mihael
>
> _______________________________________________
> Swift-devel mailing list
> Swift-devel at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel

From svemalayan at yahoo.com Tue Apr 10 13:41:43 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Tue, 10 Apr 2012 11:41:43 -0700 (PDT)
Subject: [Swift-devel] Swift did not make progress with high throtteling rate
Message-ID: <1334083303.40815.YahooMailNeo@web39501.mail.mud.yahoo.com>

Hi All,

I tried to run my pipeline-swift benchmark on GPFS+PVFS with 128 compute nodes (Surveyor), JOB_THROTTLE = 1000 and JOBS_PER_NODE = 4.

I used GPFS as the central storage and PVFS as the intermediate storage. The benchmark did not make any progress and I found the following messages in the log file. (This happened even with MosaStore.)

2012-04-10 18:18:36,710+0000 WARN HangChecke No events in 10s.
2012-04-10 18:18:36,717+0000 WARN HangChecker
Registered futures:
file stage_2_output - F/stage_2_output[95]:file - Open
file stage_1_output - F/stage_1_output[85]:file - Open
file stage_3_output - F/stage_3_output[62]:file - Open
file stage_3_output - F/stage_3_output[44]:file - Open
file stage_1_output - F/stage_1_output[4]:file - Open
file stage_2_output - F/stage_2_output[3]:file - Open
file input_data - F/input_data[121]:file - Open
file stage_1_output - F/stage_1_output[113]:file - Open
file stage_1_output - F/stage_1_output[98]:file - Open

I am using the Swift version that I took from Justin's home directory 3 weeks ago.

Do you have any idea? Does Swift have a problem with a high throttling rate / jobs-per-node? I have attached the Swift log file and the benchmark to this mail. I would highly appreciate your suggestions.

Thank you
Emalayan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: attachment.tar.gz Type: application/x-gzip Size: 75954 bytes Desc: not available URL: From wozniak at mcs.anl.gov Tue Apr 10 14:22:29 2012 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Tue, 10 Apr 2012 14:22:29 -0500 (CDT) Subject: [Swift-devel] Swift did not make progress with high throtteling rate In-Reply-To: <1334083303.40815.YahooMailNeo@web39501.mail.mud.yahoo.com> References: <1334083303.40815.YahooMailNeo@web39501.mail.mud.yahoo.com> Message-ID: Hi Emalayan Are you saying that this case does run with the default throttles and fails with jobThrottle=1000? I just took a look at the log file. It looks like the jobs do get scheduled. Are there any -info files to look at? Justin On Tue, 10 Apr 2012, Emalayan Vairavanathan wrote: > Hi All, > > I tired to run my pipeline-swift benchmark on GPFS+PVFS with 128 compute > nodes (Surveyor), JOB_THROTTLE = 1000 and JOBS_PER_NODE = 4. > > I used GPFS as the central storage and PVFS as the intermediate storage. > The benchmark did not make any progress and I found the following > messages in the log file. (This happened even with MosaStore) > > > 2012-04-10 18:18:36,710+0000 WARN? HangChecke No events in 10s. > 2012-04-10 18:18:36,717+0000 WARN? HangChecker > Registered futures: > file stage_2_output - F/stage_2_output[95]:file - Open > file stage_1_output - F/stage_1_output[85]:file - Open > file stage_3_output - F/stage_3_output[62]:file - Open > file stage_3_output - F/stage_3_output[44]:file - Open > file stage_1_output - F/stage_1_output[4]:file - Open > file stage_2_output - F/stage_2_output[3]:file - Open > file input_data - F/input_data[121]:file - Open > file stage_1_output - F/stage_1_output[113]:file - Open > file stage_1_output - F/stage_1_output[98]:file - Open > > I am using the swift version that I took from Justin's home directory 3 weeks before. > > > Do you have any idea ? 
> Does swift has problem with high throttling rate / jobs-per-node ? I have attached swift log file and the benchmark with this mail. I highly appreciate your suggestions.
>
>
> Thank you
> Emalayan

--
Justin M Wozniak

From jonmon at mcs.anl.gov Tue Apr 10 15:27:07 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 10 Apr 2012 15:27:07 -0500
Subject: [Swift-devel] Coaster socket issue
In-Reply-To: <690810736.117300.1332988205375.JavaMail.root@zimbra.anl.gov>
References: <690810736.117300.1332988205375.JavaMail.root@zimbra.anl.gov>
Message-ID:

Mihael,
So the fix for the socket issue for bug 762 did not fix the issue. Over the weekend I ran a large scale run and encountered the same IOException for too many open file descriptors. Upon checking /proc//fd, there were 1017 open file descriptors. The limit on the machine was 1024 so 1017 is dangerously close to the limit. I am assuming that once the limit was reached some fds closed.

Most of the fds in that directory were sockets. I then checked netstat -a and found several sockets in the CLOSE_WAIT state. They had the form:
tcp 1 0 nid00008:51313 nid00014:58012 CLOSE_WAIT

The run times for the PBS jobs were short (only about 30 mins) and the Swift run was running for over 12 hours. Even doing the math for that does not explain why ~900 of the fds in /proc were sockets.

Upon researching the "CLOSE_WAIT" state issue I found several posts about this. They all say that this is bad but they also had different reasons why this would show up. One thing that all these CLOSE_WAIT sockets reported by netstat have in common is that they have a message in the receive queue (the second column in the output I pasted above). My current theory is that the socket is waiting for that message to be read before actually closing. Do you think that is possible? I do not have any other evidence or data about this issue but I will be gathering data very soon.
If you have any specific data you would like to see please let me know and I can gather that for you. What are your thoughts on this issue? On Mar 28, 2012, at 9:30 PM, Michael Wilde wrote: > Does > ls -l /proc/14598/fd > tell you anything more? > > Sounds to me like swift is trying to qstat a qsub'ed job. Perhaps some incompatibility between the SGE provider and the local SGE release? We've seen similar things with older (or newer) SGE releases. (I think you in fact diagnosed some of these issues as I recall...) > > - Mike > > ----- Original Message ----- >> From: "David Kelly" >> To: "Jonathan Monette" >> Cc: "swift-devel at ci.uchicago.edu Devel" >> Sent: Wednesday, March 28, 2012 9:11:31 PM >> Subject: Re: [Swift-devel] Coaster socket issue >> The limit here seems to be 1024. >> >> Just curious, what happens when you run 'lsof -u jonmon'? For me, I >> see lines like this that grow over time: >> >> java 14589 dkelly 220r FIFO 0,6 601514288 pipe >> java 14589 dkelly 221r FIFO 0,6 601514581 pipe >> java 14589 dkelly 222w FIFO 0,6 601514852 pipe >> java 14589 dkelly 223r FIFO 0,6 601514582 pipe >> >> >> ----- Original Message ----- >>> From: "Jonathan Monette" >>> To: "David Kelly" >>> Cc: "swift-devel at ci.uchicago.edu Devel" >>> >>> Sent: Wednesday, March 28, 2012 8:57:03 PM >>> Subject: Re: [Swift-devel] Coaster socket issue >>> What is the open files limit on that machine(ulimit -n)? I have >>> never >>> witnessed this issue before so it may only appear on machines with >>> relatively low open file limits(raven has 1K but beagle has 60K). >>> This >>> is still something we should look into though. >>> >>> On Mar 28, 2012, at 8:49 PM, David Kelly wrote: >>> >>>> >>>> Strange, I just ran into a similar issues tonight while running on >>>> ibicluster (SGE). I saw the "too many open files" error after >>>> sitting in the queue waiting for a job to start. 
I restarted the >>>> job >>>> and then periodically ran 'lsof' to see the number of java pipes >>>> increasing over time. I thought at first this might be SGE >>>> specific, >>>> but perhaps it is something else. (This was with 0.93) >>>> >>>> ----- Original Message ----- >>>>> From: "Jonathan Monette" >>>>> To: "swift-devel at ci.uchicago.edu Devel" >>>>> >>>>> Sent: Wednesday, March 28, 2012 8:30:52 PM >>>>> Subject: [Swift-devel] Coaster socket issue >>>>> Hello, >>>>> In running the SciColSim app on raven(which is a cluster similar >>>>> to >>>>> Beagle) I noticed that the app hung. It was not hung where the >>>>> hang >>>>> checker kicked in but Swift was waiting for jobs to be active but >>>>> there was none submitted to PBS. I took a look at the log file >>>>> and >>>>> noticed that I had a java.io.IOException thrown for "too many >>>>> open >>>>> files". Since I killed it I couldn't probe the run but I had the >>>>> same >>>>> run running on Beagle. Upon Mike's suggestion I took a look at >>>>> the >>>>> /proc//fd directory. There were over 2000 sockets in the >>>>> CLOSE_WAIT state with a single message in the receive queue. >>>>> Raven >>>>> has >>>>> a limit of 1024 open files at a time while Beagle has a limit >>>>> around >>>>> 60K number of files open. I got this limit using ulimit -n. >>>>> >>>>> So my question is, why is there so many sockets waiting to be >>>>> closed? >>>>> I did some reading about the CLOSE_WAIT state and it seems this >>>>> happens when one of the ends closes there socket but the other >>>>> does >>>>> not. Is Coaster not closing the socket when a worker shuts down? >>>>> What >>>>> other information should I be looking for to help debug the >>>>> issue. 
>>>>> _______________________________________________ >>>>> Swift-devel mailing list >>>>> Swift-devel at ci.uchicago.edu >>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > From hategan at mcs.anl.gov Tue Apr 10 16:32:06 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 10 Apr 2012 14:32:06 -0700 Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <361193860.123071.1334048171829.JavaMail.root@zimbra-mb2.anl.gov> References: <361193860.123071.1334048171829.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <1334093526.7475.0.camel@blabla> I'll fix that. In the mean time, can you try with automatic coasters? On Tue, 2012-04-10 at 03:56 -0500, David Kelly wrote: > I seem to having some issues with this version. When I run "coaster-service -nosec -portfile /tmp/tmp.zdo0L0LNoK -localportfile /tmp/tmp.ZKOZqdZy9L -passive", I get this: > > Error starting coaster service: null > Error starting coaster service > java.lang.NullPointerException > at org.globus.cog.abstraction.coaster.service.LocalTCPService.getPort(LocalTCPService.java:158) > at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.writePorts(CoasterPersistentService.java:188) > at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:145) > > When I specify the port numbers, I can connect, but nothing seems to happen. > > David > > ----- Original Message ----- > > From: "Mihael Hategan" > > To: "Swift Devel" > > Sent: Tuesday, April 10, 2012 2:17:08 AM > > Subject: Re: [Swift-devel] coaster io with NIO. 
> > On Mon, 2012-04-09 at 08:06 -0700, Mihael Hategan wrote: > > > > > I'll commit the changes today so that you guys can test it. Works > > > fine > > > on localhost so far. > > > > Darn. I was 15 minutes late on that. > > > > Anyway, cog trunk r3370. > > > > Let me know how this works out. As I said, I did some preliminary > > testing locally, but various other combinations may be a bit shaky at > > this point. > > > > Mihael > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Apr 10 16:35:34 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 10 Apr 2012 14:35:34 -0700 Subject: [Swift-devel] Coaster socket issue In-Reply-To: References: <690810736.117300.1332988205375.JavaMail.root@zimbra.anl.gov> Message-ID: <1334093734.7475.1.camel@blabla> Passive workers or automatic? On Tue, 2012-04-10 at 15:27 -0500, Jonathan Monette wrote: > Mihael, > So the fix for the socket issue for bug 762 did not fix the issue. Over the weekend I ran a large scale run and encountered the same IOException for too many open file descriptors. Upon checking /proc//fd, there were 1017 open file descriptors. The limit on the machine was 1024 so 1017 is dangerously close to the limit. I am assuming that once the limit was reached some fds closed. > > Most of the fds in that directory were sockets. I then checked netstat -a and found several sockets in the CLOSE_WAIT state. They had the form: > tcp 1 0 nid00008:51313 nid00014:58012 CLOSE_WAIT > > They run time for the PBS jobs were short(only about 30 mins) and the swift run was running for over 12 hours. Even doing the math for that does not explain why ~900 of the fds in /proc were sockets. > > Upon researching the "CLOSE_WAIT" state issue I found several posts about this. They all say that this is bad but they also had different reason why this would show up. 
One thing that all these CLOSE_WAIT sockets reported by netstat have in common is that the have message in the receive queue(the second column in the output I pasted above). My current theory is that the socket is waiting for that message to be read before actually closing. Do you think that is possible? I do not have any other evidence or data about this issue but I will be gathering data very soon. If you have any specific data you would like to see please let me know and I can gather that for you. > > What are your thoughts on this issue? > > On Mar 28, 2012, at 9:30 PM, Michael Wilde wrote: > > > Does > > ls -l /proc/14598/fd > > tell you anything more? > > > > Sounds to me like swift is trying to qstat a qsub'ed job. Perhaps some incompatibility between the SGE provider and the local SGE release? We've seen similar things with older (or newer) SGE releases. (I think you in fact diagnosed some of these issues as I recall...) > > > > - Mike > > > > ----- Original Message ----- > >> From: "David Kelly" > >> To: "Jonathan Monette" > >> Cc: "swift-devel at ci.uchicago.edu Devel" > >> Sent: Wednesday, March 28, 2012 9:11:31 PM > >> Subject: Re: [Swift-devel] Coaster socket issue > >> The limit here seems to be 1024. > >> > >> Just curious, what happens when you run 'lsof -u jonmon'? For me, I > >> see lines like this that grow over time: > >> > >> java 14589 dkelly 220r FIFO 0,6 601514288 pipe > >> java 14589 dkelly 221r FIFO 0,6 601514581 pipe > >> java 14589 dkelly 222w FIFO 0,6 601514852 pipe > >> java 14589 dkelly 223r FIFO 0,6 601514582 pipe > >> > >> > >> ----- Original Message ----- > >>> From: "Jonathan Monette" > >>> To: "David Kelly" > >>> Cc: "swift-devel at ci.uchicago.edu Devel" > >>> > >>> Sent: Wednesday, March 28, 2012 8:57:03 PM > >>> Subject: Re: [Swift-devel] Coaster socket issue > >>> What is the open files limit on that machine(ulimit -n)? 
I have > >>> never > >>> witnessed this issue before so it may only appear on machines with > >>> relatively low open file limits(raven has 1K but beagle has 60K). > >>> This > >>> is still something we should look into though. > >>> > >>> On Mar 28, 2012, at 8:49 PM, David Kelly wrote: > >>> > >>>> > >>>> Strange, I just ran into a similar issues tonight while running on > >>>> ibicluster (SGE). I saw the "too many open files" error after > >>>> sitting in the queue waiting for a job to start. I restarted the > >>>> job > >>>> and then periodically ran 'lsof' to see the number of java pipes > >>>> increasing over time. I thought at first this might be SGE > >>>> specific, > >>>> but perhaps it is something else. (This was with 0.93) > >>>> > >>>> ----- Original Message ----- > >>>>> From: "Jonathan Monette" > >>>>> To: "swift-devel at ci.uchicago.edu Devel" > >>>>> > >>>>> Sent: Wednesday, March 28, 2012 8:30:52 PM > >>>>> Subject: [Swift-devel] Coaster socket issue > >>>>> Hello, > >>>>> In running the SciColSim app on raven(which is a cluster similar > >>>>> to > >>>>> Beagle) I noticed that the app hung. It was not hung where the > >>>>> hang > >>>>> checker kicked in but Swift was waiting for jobs to be active but > >>>>> there was none submitted to PBS. I took a look at the log file > >>>>> and > >>>>> noticed that I had a java.io.IOException thrown for "too many > >>>>> open > >>>>> files". Since I killed it I couldn't probe the run but I had the > >>>>> same > >>>>> run running on Beagle. Upon Mike's suggestion I took a look at > >>>>> the > >>>>> /proc//fd directory. There were over 2000 sockets in the > >>>>> CLOSE_WAIT state with a single message in the receive queue. > >>>>> Raven > >>>>> has > >>>>> a limit of 1024 open files at a time while Beagle has a limit > >>>>> around > >>>>> 60K number of files open. I got this limit using ulimit -n. > >>>>> > >>>>> So my question is, why is there so many sockets waiting to be > >>>>> closed? 
> >>>>> I did some reading about the CLOSE_WAIT state and it seems this > >>>>> happens when one of the ends closes there socket but the other > >>>>> does > >>>>> not. Is Coaster not closing the socket when a worker shuts down? > >>>>> What > >>>>> other information should I be looking for to help debug the > >>>>> issue. > >>>>> _______________________________________________ > >>>>> Swift-devel mailing list > >>>>> Swift-devel at ci.uchicago.edu > >>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > >> _______________________________________________ > >> Swift-devel mailing list > >> Swift-devel at ci.uchicago.edu > >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > From jonmon at mcs.anl.gov Tue Apr 10 16:35:56 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Tue, 10 Apr 2012 16:35:56 -0500 Subject: [Swift-devel] Coaster socket issue In-Reply-To: <1334093734.7475.1.camel@blabla> References: <690810736.117300.1332988205375.JavaMail.root@zimbra.anl.gov> <1334093734.7475.1.camel@blabla> Message-ID: <1677CD93-A6C2-4409-BF26-2EDACBC0214B@mcs.anl.gov> Automatic On Apr 10, 2012, at 16:35, Mihael Hategan wrote: > Passive workers or automatic? > > On Tue, 2012-04-10 at 15:27 -0500, Jonathan Monette wrote: >> Mihael, >> So the fix for the socket issue for bug 762 did not fix the issue. Over the weekend I ran a large scale run and encountered the same IOException for too many open file descriptors. Upon checking /proc//fd, there were 1017 open file descriptors. The limit on the machine was 1024 so 1017 is dangerously close to the limit. I am assuming that once the limit was reached some fds closed. >> >> Most of the fds in that directory were sockets. I then checked netstat -a and found several sockets in the CLOSE_WAIT state. 
They had the form: >> tcp 1 0 nid00008:51313 nid00014:58012 CLOSE_WAIT >> >> They run time for the PBS jobs were short(only about 30 mins) and the swift run was running for over 12 hours. Even doing the math for that does not explain why ~900 of the fds in /proc were sockets. >> >> Upon researching the "CLOSE_WAIT" state issue I found several posts about this. They all say that this is bad but they also had different reason why this would show up. One thing that all these CLOSE_WAIT sockets reported by netstat have in common is that the have message in the receive queue(the second column in the output I pasted above). My current theory is that the socket is waiting for that message to be read before actually closing. Do you think that is possible? I do not have any other evidence or data about this issue but I will be gathering data very soon. If you have any specific data you would like to see please let me know and I can gather that for you. >> >> What are your thoughts on this issue? >> >> On Mar 28, 2012, at 9:30 PM, Michael Wilde wrote: >> >>> Does >>> ls -l /proc/14598/fd >>> tell you anything more? >>> >>> Sounds to me like swift is trying to qstat a qsub'ed job. Perhaps some incompatibility between the SGE provider and the local SGE release? We've seen similar things with older (or newer) SGE releases. (I think you in fact diagnosed some of these issues as I recall...) >>> >>> - Mike >>> >>> ----- Original Message ----- >>>> From: "David Kelly" >>>> To: "Jonathan Monette" >>>> Cc: "swift-devel at ci.uchicago.edu Devel" >>>> Sent: Wednesday, March 28, 2012 9:11:31 PM >>>> Subject: Re: [Swift-devel] Coaster socket issue >>>> The limit here seems to be 1024. >>>> >>>> Just curious, what happens when you run 'lsof -u jonmon'? 
For me, I >>>> see lines like this that grow over time: >>>> >>>> java 14589 dkelly 220r FIFO 0,6 601514288 pipe >>>> java 14589 dkelly 221r FIFO 0,6 601514581 pipe >>>> java 14589 dkelly 222w FIFO 0,6 601514852 pipe >>>> java 14589 dkelly 223r FIFO 0,6 601514582 pipe >>>> >>>> >>>> ----- Original Message ----- >>>>> From: "Jonathan Monette" >>>>> To: "David Kelly" >>>>> Cc: "swift-devel at ci.uchicago.edu Devel" >>>>> >>>>> Sent: Wednesday, March 28, 2012 8:57:03 PM >>>>> Subject: Re: [Swift-devel] Coaster socket issue >>>>> What is the open files limit on that machine(ulimit -n)? I have >>>>> never >>>>> witnessed this issue before so it may only appear on machines with >>>>> relatively low open file limits(raven has 1K but beagle has 60K). >>>>> This >>>>> is still something we should look into though. >>>>> >>>>> On Mar 28, 2012, at 8:49 PM, David Kelly wrote: >>>>> >>>>>> >>>>>> Strange, I just ran into a similar issues tonight while running on >>>>>> ibicluster (SGE). I saw the "too many open files" error after >>>>>> sitting in the queue waiting for a job to start. I restarted the >>>>>> job >>>>>> and then periodically ran 'lsof' to see the number of java pipes >>>>>> increasing over time. I thought at first this might be SGE >>>>>> specific, >>>>>> but perhaps it is something else. (This was with 0.93) >>>>>> >>>>>> ----- Original Message ----- >>>>>>> From: "Jonathan Monette" >>>>>>> To: "swift-devel at ci.uchicago.edu Devel" >>>>>>> >>>>>>> Sent: Wednesday, March 28, 2012 8:30:52 PM >>>>>>> Subject: [Swift-devel] Coaster socket issue >>>>>>> Hello, >>>>>>> In running the SciColSim app on raven(which is a cluster similar >>>>>>> to >>>>>>> Beagle) I noticed that the app hung. It was not hung where the >>>>>>> hang >>>>>>> checker kicked in but Swift was waiting for jobs to be active but >>>>>>> there was none submitted to PBS. 
I took a look at the log file >>>>>>> and >>>>>>> noticed that I had a java.io.IOException thrown for "too many >>>>>>> open >>>>>>> files". Since I killed it I couldn't probe the run but I had the >>>>>>> same >>>>>>> run running on Beagle. Upon Mike's suggestion I took a look at >>>>>>> the >>>>>>> /proc//fd directory. There were over 2000 sockets in the >>>>>>> CLOSE_WAIT state with a single message in the receive queue. >>>>>>> Raven >>>>>>> has >>>>>>> a limit of 1024 open files at a time while Beagle has a limit >>>>>>> around >>>>>>> 60K number of files open. I got this limit using ulimit -n. >>>>>>> >>>>>>> So my question is, why is there so many sockets waiting to be >>>>>>> closed? >>>>>>> I did some reading about the CLOSE_WAIT state and it seems this >>>>>>> happens when one of the ends closes there socket but the other >>>>>>> does >>>>>>> not. Is Coaster not closing the socket when a worker shuts down? >>>>>>> What >>>>>>> other information should I be looking for to help debug the >>>>>>> issue. >>>>>>> _______________________________________________ >>>>>>> Swift-devel mailing list >>>>>>> Swift-devel at ci.uchicago.edu >>>>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>>> _______________________________________________ >>>> Swift-devel mailing list >>>> Swift-devel at ci.uchicago.edu >>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >>> >>> -- >>> Michael Wilde >>> Computation Institute, University of Chicago >>> Mathematics and Computer Science Division >>> Argonne National Laboratory >>> >> > > From davidk at ci.uchicago.edu Tue Apr 10 17:25:59 2012 From: davidk at ci.uchicago.edu (David Kelly) Date: Tue, 10 Apr 2012 17:25:59 -0500 (CDT) Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <1334093526.7475.0.camel@blabla> Message-ID: <1457600549.127034.1334096759159.JavaMail.root@zimbra-mb2.anl.gov> Yep, I gave it a try with automatic coasters, but am still seeing the timeouts. 
----- Original Message ----- > From: "Mihael Hategan" > To: "David Kelly" > Cc: "Swift Devel" > Sent: Tuesday, April 10, 2012 4:32:06 PM > Subject: Re: [Swift-devel] coaster io with NIO. > I'll fix that. In the mean time, can you try with automatic coasters? > > On Tue, 2012-04-10 at 03:56 -0500, David Kelly wrote: > > I seem to having some issues with this version. When I run > > "coaster-service -nosec -portfile /tmp/tmp.zdo0L0LNoK -localportfile > > /tmp/tmp.ZKOZqdZy9L -passive", I get this: > > > > Error starting coaster service: null > > Error starting coaster service > > java.lang.NullPointerException > > at > > org.globus.cog.abstraction.coaster.service.LocalTCPService.getPort(LocalTCPService.java:158) > > at > > org.globus.cog.abstraction.coaster.service.CoasterPersistentService.writePorts(CoasterPersistentService.java:188) > > at > > org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:145) > > > > When I specify the port numbers, I can connect, but nothing seems to > > happen. > > > > David > > > > ----- Original Message ----- > > > From: "Mihael Hategan" > > > To: "Swift Devel" > > > Sent: Tuesday, April 10, 2012 2:17:08 AM > > > Subject: Re: [Swift-devel] coaster io with NIO. > > > On Mon, 2012-04-09 at 08:06 -0700, Mihael Hategan wrote: > > > > > > > I'll commit the changes today so that you guys can test it. > > > > Works > > > > fine > > > > on localhost so far. > > > > > > Darn. I was 15 minutes late on that. > > > > > > Anyway, cog trunk r3370. > > > > > > Let me know how this works out. As I said, I did some preliminary > > > testing locally, but various other combinations may be a bit shaky > > > at > > > this point. 
> > > > > > Mihael > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Apr 10 18:45:10 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 10 Apr 2012 16:45:10 -0700 Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <1457600549.127034.1334096759159.JavaMail.root@zimbra-mb2.anl.gov> References: <1457600549.127034.1334096759159.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <1334101510.8559.1.camel@blabla> On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote: > Yep, I gave it a try with automatic coasters, but am still seeing the timeouts. Right. Though are all the transfers hanging? Can you post a link to a jstack output? > > ----- Original Message ----- > > From: "Mihael Hategan" > > To: "David Kelly" > > Cc: "Swift Devel" > > Sent: Tuesday, April 10, 2012 4:32:06 PM > > Subject: Re: [Swift-devel] coaster io with NIO. > > I'll fix that. In the mean time, can you try with automatic coasters? > > > > On Tue, 2012-04-10 at 03:56 -0500, David Kelly wrote: > > > I seem to having some issues with this version. When I run > > > "coaster-service -nosec -portfile /tmp/tmp.zdo0L0LNoK -localportfile > > > /tmp/tmp.ZKOZqdZy9L -passive", I get this: > > > > > > Error starting coaster service: null > > > Error starting coaster service > > > java.lang.NullPointerException > > > at > > > org.globus.cog.abstraction.coaster.service.LocalTCPService.getPort(LocalTCPService.java:158) > > > at > > > org.globus.cog.abstraction.coaster.service.CoasterPersistentService.writePorts(CoasterPersistentService.java:188) > > > at > > > org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:145) > > > > > > When I specify the port numbers, I can connect, but nothing seems to > > > happen. 
> > > > > > David > > > > > > ----- Original Message ----- > > > > From: "Mihael Hategan" > > > > To: "Swift Devel" > > > > Sent: Tuesday, April 10, 2012 2:17:08 AM > > > > Subject: Re: [Swift-devel] coaster io with NIO. > > > > On Mon, 2012-04-09 at 08:06 -0700, Mihael Hategan wrote: > > > > > > > > > I'll commit the changes today so that you guys can test it. > > > > > Works > > > > > fine > > > > > on localhost so far. > > > > > > > > Darn. I was 15 minutes late on that. > > > > > > > > Anyway, cog trunk r3370. > > > > > > > > Let me know how this works out. As I said, I did some preliminary > > > > testing locally, but various other combinations may be a bit shaky > > > > at > > > > this point. > > > > > > > > Mihael > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From hategan at mcs.anl.gov Tue Apr 10 18:57:48 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 10 Apr 2012 16:57:48 -0700 Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <1334093526.7475.0.camel@blabla> References: <361193860.123071.1334048171829.JavaMail.root@zimbra-mb2.anl.gov> <1334093526.7475.0.camel@blabla> Message-ID: <1334102268.9748.0.camel@blabla> On Tue, 2012-04-10 at 14:32 -0700, Mihael Hategan wrote: > I'll fix that. Fixed. Cog trunk r3371. > On Tue, 2012-04-10 at 03:56 -0500, David Kelly wrote: > > I seem to having some issues with this version. 
When I run "coaster-service -nosec -portfile /tmp/tmp.zdo0L0LNoK -localportfile /tmp/tmp.ZKOZqdZy9L -passive", I get this:
> >
> > Error starting coaster service: null
> > Error starting coaster service
> > java.lang.NullPointerException
> > at org.globus.cog.abstraction.coaster.service.LocalTCPService.getPort(LocalTCPService.java:158)
> > at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.writePorts(CoasterPersistentService.java:188)
> > at org.globus.cog.abstraction.coaster.service.CoasterPersistentService.main(CoasterPersistentService.java:145)

From hategan at mcs.anl.gov Tue Apr 10 19:04:56 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 10 Apr 2012 17:04:56 -0700
Subject: [Swift-devel] coaster io with NIO.
In-Reply-To: <1457600549.127034.1334096759159.JavaMail.root@zimbra-mb2.anl.gov>
References: <1457600549.127034.1334096759159.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <1334102696.9748.3.camel@blabla>

On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote:
> Yep, I gave it a try with automatic coasters, but am still seeing the timeouts.
>

I think I see the problem. With multiple jobs per worker, the situation may be such that both a stagein and a stageout happen at the same time (on the same TCP connection). If the stageout runs out of buffers, the write to the socket on the worker side blocks, so the read loop does not run. This eventually fills the other direction on the TCP link and everything deadlocks.

From hategan at mcs.anl.gov Tue Apr 10 19:22:36 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 10 Apr 2012 17:22:36 -0700
Subject: [Swift-devel] coaster io with NIO.
In-Reply-To: <1334102696.9748.3.camel@blabla> References: <1457600549.127034.1334096759159.JavaMail.root@zimbra-mb2.anl.gov> <1334102696.9748.3.camel@blabla> Message-ID: <1334103756.9748.4.camel@blabla> On Tue, 2012-04-10 at 17:04 -0700, Mihael Hategan wrote: > On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote: > > Yep, I gave it a try with automatic coasters, but am still seeing the timeouts. > > > > I think I see the problem. With multiple jobs per worker the situation > may such be that both a stagein and a stageout happen at the same time > (on the same TCP connection). If the stageout runs out of buffers the > writing to the socket on the worker side blocks causing the read loop to > not happen. This eventually fills the other direction on the TCP link > and everything deadlocks. This shouldn't happen if the worker socket write was non-blocking. Let me play with that a bit. From wilde at mcs.anl.gov Tue Apr 10 21:07:08 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 10 Apr 2012 21:07:08 -0500 (CDT) Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <1334102696.9748.3.camel@blabla> Message-ID: <197573218.134869.1334110028043.JavaMail.root@zimbra.anl.gov> Mihael, while the scenario below seems plausible, I thought that the timeout problem was first detected on OSG nodes, which should have been running with jobsPerNode=1. David, Ketan, can you comment on the jobsPerNode settings for the many tests you have done which encountered this problem? - Mike ----- Original Message ----- > From: "Mihael Hategan" > To: "David Kelly" > Cc: "Swift Devel" > Sent: Tuesday, April 10, 2012 7:04:56 PM > Subject: Re: [Swift-devel] coaster io with NIO. > On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote: > > Yep, I gave it a try with automatic coasters, but am still seeing > > the timeouts. > > > > I think I see the problem. 
With multiple jobs per worker the situation
> may be such that both a stagein and a stageout happen at the same time
> (on the same TCP connection). If the stageout runs out of buffers the
> writing to the socket on the worker side blocks causing the read loop to
> not happen. This eventually fills the other direction on the TCP link
> and everything deadlocks.

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Tue Apr 10 21:13:03 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 10 Apr 2012 19:13:03 -0700
Subject: [Swift-devel] coaster io with NIO.
In-Reply-To: <197573218.134869.1334110028043.JavaMail.root@zimbra.anl.gov>
References: <197573218.134869.1334110028043.JavaMail.root@zimbra.anl.gov>
Message-ID: <1334110383.12993.2.camel@blabla>

On Tue, 2012-04-10 at 21:07 -0500, Michael Wilde wrote:
> Mihael, while the scenario below seems plausible, I thought that the
> timeout problem was first detected on OSG nodes, which should have
> been running with jobsPerNode=1.

It's possible that there is another problem. However, David's latest logs (posted in bug 690) show this:
- read from worker socket thread blocked on allocation of buffers
- worker blocked in send() (thus unable to recv()).
- service blocked in send() to worker

Part of this should be addressed by the NIO stuff, so we'll need new logs.
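
The situation in the logs above (both ends of one TCP connection stuck in send(), so neither gets to recv()) can be illustrated with a small sketch. This is Python, not the Java of the Coaster service, and it is not taken from the cog code base; the function name exchange and its parameters are hypothetical and exist only for this illustration. It assumes two endpoints that each have a large payload to push to the other over a single connection, like a stagein and a stageout running at once. With blocking sends and payloads larger than the socket buffers, both sides would stall. With non-blocking sockets and a selector, each side only writes when there is room and keeps draining what the peer sent, so the exchange completes.

```python
import selectors
import socket


def exchange(payload_a: bytes, payload_b: bytes, chunk: int = 65536):
    """Send payload_a from A to B and payload_b from B to A over ONE
    connection, interleaving reads and writes so neither side stalls.

    Returns (bytes received by A, bytes received by B).
    """
    sock_a, sock_b = socket.socketpair()
    sel = selectors.DefaultSelector()
    # Per-endpoint state: data still to send, data received, bytes expected.
    state = {
        sock_a: {"out": memoryview(payload_a), "in": bytearray(),
                 "want": len(payload_b)},
        sock_b: {"out": memoryview(payload_b), "in": bytearray(),
                 "want": len(payload_a)},
    }
    for sock in state:
        sock.setblocking(False)  # a full send buffer must not block the loop
        sel.register(sock, selectors.EVENT_READ | selectors.EVENT_WRITE)

    def finished(st):
        return len(st["out"]) == 0 and len(st["in"]) == st["want"]

    try:
        while not all(finished(st) for st in state.values()):
            for key, events in sel.select(timeout=5):
                sock, st = key.fileobj, state[key.fileobj]
                if events & selectors.EVENT_READ:
                    try:
                        st["in"] += sock.recv(chunk)  # drain the peer's data
                    except BlockingIOError:
                        pass
                if events & selectors.EVENT_WRITE and len(st["out"]) > 0:
                    try:
                        sent = sock.send(st["out"][:chunk])  # may be partial
                        st["out"] = st["out"][sent:]
                    except BlockingIOError:
                        pass
                # Nothing left to send: stop asking for write readiness.
                if len(st["out"]) == 0 and key.events & selectors.EVENT_WRITE:
                    sel.modify(sock, selectors.EVENT_READ)
        return bytes(state[sock_a]["in"]), bytes(state[sock_b]["in"])
    finally:
        sel.close()
        sock_a.close()
        sock_b.close()
```

For example, exchange(b"i" * 2_000_000, b"o" * 3_000_000) moves both payloads in opposite directions at the same time; replacing the selector loop with a blocking sendall() on each side before any recv() is the pattern that would hang once both socket buffers fill.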
From svemalayan at yahoo.com Tue Apr 10 22:22:05 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Tue, 10 Apr 2012 20:22:05 -0700 (PDT)
Subject: [Swift-devel] Swift did not make progress with high throtteling rate
In-Reply-To:
References: <1334083303.40815.YahooMailNeo@web39501.mail.mud.yahoo.com>
Message-ID: <1334114525.38137.YahooMailNeo@web39506.mail.mud.yahoo.com>

Hi Justin,

Thank you for looking at the issue.

The benchmark was working when JOB_THROTTLE = 0.05 at different scales (nodes = 64, 128, 256), but it did not make any progress for a long time at high rates. I think this is due to the storage slowdown? (I was using GPFS both for the worker directory and to stage out the files.)

I have now changed my setup to stage out to PVFS, and the benchmark works at different scales (nodes = 64, 128, 256) at a high job throttle rate (JOB_THROTTLE = 1000).

Thank you again.

Regards
Emalayan

________________________________
From: Justin M Wozniak
To: Emalayan Vairavanathan
Cc: "swift-devel at ci.uchicago.edu"
Sent: Tuesday, 10 April 2012 12:22 PM
Subject: Re: [Swift-devel] Swift did not make progress with high throtteling rate

Hi Emalayan

Are you saying that this case does run with the default throttles and fails with jobThrottle=1000?

I just took a look at the log file. It looks like the jobs do get scheduled. Are there any -info files to look at?

Justin

On Tue, 10 Apr 2012, Emalayan Vairavanathan wrote:

> Hi All,
>
> I tried to run my pipeline-swift benchmark on GPFS+PVFS with 128 compute nodes (Surveyor), JOB_THROTTLE = 1000 and JOBS_PER_NODE = 4.
>
> I used GPFS as the central storage and PVFS as the intermediate storage. The benchmark did not make any progress and I found the following messages in the log file. (This happened even with MosaStore)
>
>
> 2012-04-10 18:18:36,710+0000 WARN  HangChecke No events in 10s.
> 2012-04-10 18:18:36,717+0000 WARN
HangChecker
> Registered futures:
> file stage_2_output - F/stage_2_output[95]:file - Open
> file stage_1_output - F/stage_1_output[85]:file - Open
> file stage_3_output - F/stage_3_output[62]:file - Open
> file stage_3_output - F/stage_3_output[44]:file - Open
> file stage_1_output - F/stage_1_output[4]:file - Open
> file stage_2_output - F/stage_2_output[3]:file - Open
> file input_data - F/input_data[121]:file - Open
> file stage_1_output - F/stage_1_output[113]:file - Open
> file stage_1_output - F/stage_1_output[98]:file - Open
>
> I am using the swift version that I took from Justin's home directory 3 weeks before.
>
> Do you have any idea? Does Swift have a problem with high throttling rate / jobs-per-node? I have attached the swift log file and the benchmark with this mail. I highly appreciate your suggestions.
>
>
> Thank you
> Emalayan

--
Justin M Wozniak

From wilde at mcs.anl.gov Tue Apr 10 22:51:05 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 10 Apr 2012 22:51:05 -0500 (CDT)
Subject: [Swift-devel] coaster io with NIO.
In-Reply-To:
Message-ID: <1987971302.134917.1334116265742.JavaMail.root@zimbra.anl.gov>

Thanks, Ketan. David, can you try to reproduce the problem with jobsPerNode=1?

- Mike

----- Original Message -----
> From: "Ketan Maheshwari"
> To: "Michael Wilde"
> Sent: Tuesday, April 10, 2012 9:31:34 PM
> Subject: Re: [Swift-devel] coaster io with NIO.
> The jobsPerNode settings were indeed 1 on the tests done on OSG.
>
>
> I do not recall seeing the blocking messages seen by David's
> current/recent tests.
>
>
> On Tuesday, April 10, 2012, Michael Wilde wrote:
>
>
> Mihael, while the scenario below seems plausible, I thought that the
> timeout problem was first detected on OSG nodes, which should have
> been running with jobsPerNode=1.
>
> David, Ketan, can you comment on the jobsPerNode settings for the many
> tests you have done which encountered this problem?
> > - Mike > > ----- Original Message ----- > > From: "Mihael Hategan" < hategan at mcs.anl.gov > > > To: "David Kelly" < davidk at ci.uchicago.edu > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > Sent: Tuesday, April 10, 2012 7:04:56 PM > > Subject: Re: [Swift-devel] coaster io with NIO. > > On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote: > > > Yep, I gave it a try with automatic coasters, but am still seeing > > > the timeouts. > > > > > > > I think I see the problem. With multiple jobs per worker the > > situation > > may such be that both a stagein and a stageout happen at the same > > time > > (on the same TCP connection). If the stageout runs out of buffers > > the > > writing to the socket on the worker side blocks causing the read > > loop > > to > > not happen. This eventually fills the other direction on the TCP > > link > > and everything deadlocks. > > > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > -- > Ketan -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From davidk at ci.uchicago.edu Tue Apr 10 23:33:20 2012 From: davidk at ci.uchicago.edu (David Kelly) Date: Tue, 10 Apr 2012 23:33:20 -0500 (CDT) Subject: [Swift-devel] coaster io with NIO. 
In-Reply-To: <1987971302.134917.1334116265742.JavaMail.root@zimbra.anl.gov> Message-ID: <1848963744.127741.1334118800448.JavaMail.root@zimbra-mb2.anl.gov> Since the latest update which fixes coaster-service, I have tested with two configurations: 1 machine only, 4 jobs per node, 100 200MB files (ran twice, passed twice) 2 MCS machines - swift and coaster-service running on one machine, 1 worker, 4 jobs per node, 500 20MB files (also ran twice, passed twice) These tests were failing pretty consistently yesterday. I am not positive it is completely fixed yet, but things have definitely improved. I have never been able to reproduce provider staging problems using jobs per node set of 1. It was only when I got to a value of 4 that I started seeing issues. I will write a test tonight that runs on OSG and let you know what happens. David ----- Original Message ----- > From: "Michael Wilde" > To: "Ketan Maheshwari" > Cc: "Swift Devel" > Sent: Tuesday, April 10, 2012 10:51:05 PM > Subject: Re: [Swift-devel] coaster io with NIO. > Thanks, Ketan. David, can you try to reproduce the problem with > jobsPerNode=1? > > - Mike > > ----- Original Message ----- > > From: "Ketan Maheshwari" > > To: "Michael Wilde" > > Sent: Tuesday, April 10, 2012 9:31:34 PM > > Subject: Re: [Swift-devel] coaster io with NIO. > > Jobspernode setting were indeed 1 on the tests done on osg. > > > > > > I do not recall seeing the blocking messages seen by David's > > current/recent tests. > > > > > > On Tuesday, April 10, 2012, Michael Wilde wrote: > > > > > > Mihael, while the scenario below seems plausible, I thought that the > > timeout problem was first detected on OSG nodes, which should have > > been running with jobsPerNode=1. > > > > David, Ketan, can you comment on the jobsPerNode settings for the > > many > > tests you have done which encountered this problem? 
> > > > - Mike > > > > ----- Original Message ----- > > > From: "Mihael Hategan" < hategan at mcs.anl.gov > > > > To: "David Kelly" < davidk at ci.uchicago.edu > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > Sent: Tuesday, April 10, 2012 7:04:56 PM > > > Subject: Re: [Swift-devel] coaster io with NIO. > > > On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote: > > > > Yep, I gave it a try with automatic coasters, but am still > > > > seeing > > > > the timeouts. > > > > > > > > > > I think I see the problem. With multiple jobs per worker the > > > situation > > > may such be that both a stagein and a stageout happen at the same > > > time > > > (on the same TCP connection). If the stageout runs out of buffers > > > the > > > writing to the socket on the worker side blocks causing the read > > > loop > > > to > > > not happen. This eventually fills the other direction on the TCP > > > link > > > and everything deadlocks. > > > > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > -- > > Ketan > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From davidk at ci.uchicago.edu Wed Apr 11 01:00:36 2012 From: davidk at ci.uchicago.edu (David Kelly) Date: Wed, 11 Apr 2012 01:00:36 -0500 
(CDT) Subject: [Swift-devel] coaster io with NIO. In-Reply-To: <1848963744.127741.1334118800448.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <453980153.127884.1334124036782.JavaMail.root@zimbra-mb2.anl.gov> I just ran a test on OSG similar to what Ketan described in the initial entry for ticket #690: Submit host: communicado using data on GPFS 100 nodes using Condor GlideinWMS 500 jobs 10MB data files The test completed without errors. David ----- Original Message ----- > From: "David Kelly" > To: "Michael Wilde" > Cc: "Swift Devel" > Sent: Tuesday, April 10, 2012 11:33:20 PM > Subject: Re: [Swift-devel] coaster io with NIO. > Since the latest update which fixes coaster-service, I have tested > with two configurations: > > 1 machine only, 4 jobs per node, 100 200MB files (ran twice, passed > twice) > 2 MCS machines - swift and coaster-service running on one machine, 1 > worker, 4 jobs per node, 500 20MB files (also ran twice, passed twice) > > These tests were failing pretty consistently yesterday. I am not > positive it is completely fixed yet, but things have definitely > improved. > > I have never been able to reproduce provider staging problems using > jobs per node set of 1. It was only when I got to a value of 4 that I > started seeing issues. > > I will write a test tonight that runs on OSG and let you know what > happens. > > David > > > ----- Original Message ----- > > From: "Michael Wilde" > > To: "Ketan Maheshwari" > > Cc: "Swift Devel" > > Sent: Tuesday, April 10, 2012 10:51:05 PM > > Subject: Re: [Swift-devel] coaster io with NIO. > > Thanks, Ketan. David, can you try to reproduce the problem with > > jobsPerNode=1? > > > > - Mike > > > > ----- Original Message ----- > > > From: "Ketan Maheshwari" > > > To: "Michael Wilde" > > > Sent: Tuesday, April 10, 2012 9:31:34 PM > > > Subject: Re: [Swift-devel] coaster io with NIO. > > > Jobspernode setting were indeed 1 on the tests done on osg. 
> > > > > > > > > I do not recall seeing the blocking messages seen by David's > > > current/recent tests. > > > > > > > > > On Tuesday, April 10, 2012, Michael Wilde wrote: > > > > > > > > > Mihael, while the scenario below seems plausible, I thought that > > > the > > > timeout problem was first detected on OSG nodes, which should have > > > been running with jobsPerNode=1. > > > > > > David, Ketan, can you comment on the jobsPerNode settings for the > > > many > > > tests you have done which encountered this problem? > > > > > > - Mike > > > > > > ----- Original Message ----- > > > > From: "Mihael Hategan" < hategan at mcs.anl.gov > > > > > To: "David Kelly" < davidk at ci.uchicago.edu > > > > > Cc: "Swift Devel" < swift-devel at ci.uchicago.edu > > > > > Sent: Tuesday, April 10, 2012 7:04:56 PM > > > > Subject: Re: [Swift-devel] coaster io with NIO. > > > > On Tue, 2012-04-10 at 17:25 -0500, David Kelly wrote: > > > > > Yep, I gave it a try with automatic coasters, but am still > > > > > seeing > > > > > the timeouts. > > > > > > > > > > > > > I think I see the problem. With multiple jobs per worker the > > > > situation > > > > may such be that both a stagein and a stageout happen at the > > > > same > > > > time > > > > (on the same TCP connection). If the stageout runs out of > > > > buffers > > > > the > > > > writing to the socket on the worker side blocks causing the read > > > > loop > > > > to > > > > not happen. This eventually fills the other direction on the TCP > > > > link > > > > and everything deadlocks. 
> > > > > > > > > > > > _______________________________________________ > > > > Swift-devel mailing list > > > > Swift-devel at ci.uchicago.edu > > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > -- > > > Michael Wilde > > > Computation Institute, University of Chicago > > > Mathematics and Computer Science Division > > > Argonne National Laboratory > > > > > > _______________________________________________ > > > Swift-devel mailing list > > > Swift-devel at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > > > > > > > -- > > > Ketan > > > > -- > > Michael Wilde > > Computation Institute, University of Chicago > > Mathematics and Computer Science Division > > Argonne National Laboratory > > > > _______________________________________________ > > Swift-devel mailing list > > Swift-devel at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
From svemalayan at yahoo.com Thu Apr 12 00:44:45 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Wed, 11 Apr 2012 22:44:45 -0700 (PDT)
Subject: [Swift-devel] Montage workload
Message-ID: <1334209485.88512.YahooMailNeo@web39503.mail.mud.yahoo.com>

Hi Jon,

I tried to run the large Montage workload, which I got from you recently, on both PVFS and MosaStore. With both systems the workload failed (I copied the standard output messages below). I guess this is due to a problem with the workload (because the system works with the small workloads).

Do you have any idea? Did this workload work for you?

Thank you
Emalayan

Swift trunk swift-r5704 (swift modified locally) cog-r3361 (cog modified locally)

RunID: 20120412-0530-vj96mfz5
No events in 10s.
Registered futures:
----

Waiting threads:
----

No events in 10s.

Registered futures:
----

Waiting threads:
----

No events in 10s.

Registered futures:
----

Waiting threads:
----

(input): found 4116 files
No events in 10s.

Registered futures:
----

Waiting threads:
----

Failed to acquire exclusive lock on log file.
Progress: time: Thu, 12 Apr 2012 05:31:02 +0000
Progress: time: Progress: time: Thu, 12 Apr 2012 05:31:11 +0000Thu, 12 Apr 2012 05:31:11 +0000 Initializing:2 Initializing:2
Progress: time: Thu, 12 Apr 2012 05:31:12 +0000 Initializing:1023 Selecting site:1
Progress: time: Thu, 12 Apr 2012 05:31:13 +0000 Selecting site:1020 Initializing site shared directory:1 Stage in:3
Progress: time: Thu, 12 Apr 2012 05:31:15 +0000 Selecting site:1018 Stage in:5 Submitting:1
Find: http://172.17.3.12:12346
Find: keepalive(120), reconnect - http://172.17.3.12:12346
Passive queue processor initialized. Callback URI is http://172.17.3.12:12345
Progress: time: Thu, 12 Apr 2012 05:31:16 +0000 Selecting site:1018 Active:6
Progress: time: Thu, 12 Apr 2012 05:31:24 +0000 Selecting site:1018 Active:5 Failed but can retry:1
EXCEPTION Exception in mProjectPP_wrap:
Arguments: [-X, raw_dir/2mass-atlas-991207s-j1130256.fits, proj_dir/proj_2mass-atlas-991207s-j1130256.fits, header.hdr]
Host: persistent-coasters
Directory: SwiftMontage-20120412-0530-vj96mfz5/jobs/e/mProjectPP_wrap-eozxvrpk
stderr.txt:
stdout.txt: [struct stat="ERROR", msg="All pixels are blank."]

From jonmon at mcs.anl.gov Thu Apr 12 09:12:10 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 12 Apr 2012 09:12:10 -0500
Subject: [Swift-devel] Montage workload
In-Reply-To: <1334209485.88512.YahooMailNeo@web39503.mail.mud.yahoo.com>
References: <1334209485.88512.YahooMailNeo@web39503.mail.mud.yahoo.com>
Message-ID:

So this looks like a problem in the Swift code.
The hang checker is activated at the start of the execution which is not good. Could you point me to where you ran this? Was this on surveyor? If it was not on surveyor I can give it a try. It looks like the projection phase is trying to project empty files. This could be due to the files actually being empty(I sent corrupted data) or Swift cannot find the files but ran mProjectPP anyways. On Apr 12, 2012, at 12:44 AM, Emalayan Vairavanathan wrote: > Hi Jon, > > I tired to run the large Montage-workload which I got from you recently on both PVFS and MosaStore. With both systems the workload failed (I copied the standard output messages below). I guess this is due to the problem with the workload (because the system works with the small workloads). > Do you have any idea ? Did this workload work for you ? > > Thank you > Emalayan > > > Swift trunk swift-r5704 (swift modified locally) cog-r3361 (cog modified locally) > > RunID: 20120412-0530-vj96mfz5 > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > (input): found 4116 files > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > Failed to acquire exclusive lock on log file. 
> Progress: time: Thu, 12 Apr 2012 05:31:02 +0000 > Progress: time: Progress: time: Thu, 12 Apr 2012 05:31:11 +0000Thu, 12 Apr 2012 05:31:11 +0000 Initializing:2 Initializing:2 > > Progress: time: Thu, 12 Apr 2012 05:31:12 +0000 Initializing:1023 Selecting site:1 > Progress: time: Thu, 12 Apr 2012 05:31:13 +0000 Selecting site:1020 Initializing site shared directory:1 Stage in:3 > Progress: time: Thu, 12 Apr 2012 05:31:15 +0000 Selecting site:1018 Stage in:5 Submitting:1 > Find: http://172.17.3.12:12346 > Find: keepalive(120), reconnect - http://172.17.3.12:12346 > Passive queue processor initialized. Callback URI is http://172.17.3.12:12345 > Progress: time: Thu, 12 Apr 2012 05:31:16 +0000 Selecting site:1018 Active:6 > Progress: time: Thu, 12 Apr 2012 05:31:24 +0000 Selecting site:1018 Active:5 Failed but can retry:1 > EXCEPTION Exception in mProjectPP_wrap: > Arguments: [-X, raw_dir/2mass-atlas-991207s-j1130256.fits, proj_dir/proj_2mass-atlas-991207s-j1130256.fits, header.hdr] > Host: persistent-coasters > Directory: SwiftMontage-20120412-0530-vj96mfz5/jobs/e/mProjectPP_wrap-eozxvrpk > stderr.txt: > stdout.txt: [struct stat="ERROR", msg="All pixels are blank."] > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidk at ci.uchicago.edu Thu Apr 12 09:27:10 2012 From: davidk at ci.uchicago.edu (David Kelly) Date: Thu, 12 Apr 2012 09:27:10 -0500 (CDT) Subject: [Swift-devel] Montage workload In-Reply-To: Message-ID: <717467641.133722.1334240830924.JavaMail.root@zimbra-mb2.anl.gov> For what it's worth, I see the same hang checker messages early on in an unrelated script I am working on. It seems to be triggered by reading a large number of input files from a slower shared filesystem. 
In my case, once it finds all the input files, the hang checker messages stop and the job continues as normal. [davidk at communicado scec-sim]$ swift -sites.file sites.grid-ps.xml -tc.file tc.data -config cf scec-sim.swift Swift trunk swift-r5746 cog-r3371 RunID: 20120412-0914-0dsnyia7 No events in 10s. Registered futures: ---- Waiting threads: ---- (input): found 5938 files Progress: time: Thu, 12 Apr 2012 09:14:34 -0500 Progress: time: Thu, 12 Apr 2012 09:14:40 -0500 Initializing:1 Find: http://localhost:50000 Find: keepalive(120), reconnect - http://localhost:50000 Passive queue processor initialized. Callback URI is null Progress: time: Thu, 12 Apr 2012 09:14:42 -0500 Selecting site:25 Submitting:998 Submitted:1 ----- Original Message ----- > From: "Jonathan Monette" > To: "Emalayan Vairavanathan" > Cc: swift-devel at ci.uchicago.edu, "MosaStore" > Sent: Thursday, April 12, 2012 9:12:10 AM > Subject: Re: [Swift-devel] Montage workload > So this looks like a problem in the Swift code. The hang checker is > activated at the start of the execution which is not good. Could you > point me to where you ran this? Was this on surveyor? If it was not on > surveyor I can give it a try. It looks like the projection phase is > trying to project empty files. This could be due to the files actually > being empty(I sent corrupted data) or Swift cannot find the files but > ran mProjectPP anyways. > > > > On Apr 12, 2012, at 12:44 AM, Emalayan Vairavanathan wrote: > > > > > > Hi Jon, > > > I tired to run the large Montage-workload which I got from you > recently on both PVFS and MosaStore. With both systems the workload > failed (I copied the standard output messages below). I guess this is > due to the problem with the workload (because the system works with > the small workloads). > > Do you have any idea ? Did this workload work for you ? 
> > > > Thank you > Emalayan > > > > > > Swift trunk swift-r5704 (swift modified locally) cog-r3361 (cog > modified locally) > > RunID: 20120412-0530-vj96mfz5 > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > (input): found 4116 files > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > Failed to acquire exclusive lock on log file. > Progress: time: Thu, 12 Apr 2012 05:31:02 +0000 > Progress: time: Progress: time: Thu, 12 Apr 2012 05:31:11 +0000Thu, 12 > Apr 2012 05:31:11 +0000 Initializing:2 Initializing:2 > > Progress: time: Thu, 12 Apr 2012 05:31:12 +0000 Initializing:1023 > Selecting site:1 > Progress: time: Thu, 12 Apr 2012 05:31:13 +0000 Selecting site:1020 > Initializing site shared directory:1 Stage in:3 > Progress: time: Thu, 12 Apr 2012 05:31:15 +0000 Selecting site:1018 > Stage in:5 Submitting:1 > Find: http://172.17.3.12:12346 > Find: keepalive(120), reconnect - http://172.17.3.12:12346 > Passive queue processor initialized. 
Callback URI is > http://172.17.3.12:12345 > Progress: time: Thu, 12 Apr 2012 05:31:16 +0000 Selecting site:1018 > Active:6 > Progress: time: Thu, 12 Apr 2012 05:31:24 +0000 Selecting site:1018 > Active:5 Failed but can retry:1 > EXCEPTION Exception in mProjectPP_wrap: > Arguments: [-X, raw_dir/2mass-atlas-991207s-j1130256.fits, > proj_dir/proj_2mass-atlas-991207s-j1130256.fits, header.hdr] > Host: persistent-coasters > Directory: > SwiftMontage-20120412-0530-vj96mfz5/jobs/e/mProjectPP_wrap-eozxvrpk > stderr.txt: > stdout.txt: [struct stat="ERROR", msg="All pixels are blank."] > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From jonmon at mcs.anl.gov Thu Apr 12 09:29:42 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Thu, 12 Apr 2012 09:29:42 -0500 Subject: [Swift-devel] Montage workload In-Reply-To: <717467641.133722.1334240830924.JavaMail.root@zimbra-mb2.anl.gov> References: <717467641.133722.1334240830924.JavaMail.root@zimbra-mb2.anl.gov> Message-ID: <85C159DE-D3B3-4A51-B41E-3CD2B0A89798@mcs.anl.gov> So that is my conclusion for the hang checker part(I think I saw this before when I ran on surveyor but that was a long time ago). I am not sure about the app failing though. When I sent the tarball I may have sent corrupted data. That is what I am checking right now. On Apr 12, 2012, at 9:27 AM, David Kelly wrote: > For what it's worth, I see the same hang checker messages early on in an unrelated script I am working on. It seems to be triggered by reading a large number of input files from a slower shared filesystem. In my case, once it finds all the input files, the hang checker messages stop and the job continues as normal. 
> > [davidk at communicado scec-sim]$ swift -sites.file sites.grid-ps.xml -tc.file tc.data -config cf scec-sim.swift > Swift trunk swift-r5746 cog-r3371 > > RunID: 20120412-0914-0dsnyia7 > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > (input): found 5938 files > Progress: time: Thu, 12 Apr 2012 09:14:34 -0500 > Progress: time: Thu, 12 Apr 2012 09:14:40 -0500 Initializing:1 > Find: http://localhost:50000 > Find: keepalive(120), reconnect - http://localhost:50000 > Passive queue processor initialized. Callback URI is null > Progress: time: Thu, 12 Apr 2012 09:14:42 -0500 Selecting site:25 Submitting:998 Submitted:1 > > > ----- Original Message ----- >> From: "Jonathan Monette" >> To: "Emalayan Vairavanathan" >> Cc: swift-devel at ci.uchicago.edu, "MosaStore" >> Sent: Thursday, April 12, 2012 9:12:10 AM >> Subject: Re: [Swift-devel] Montage workload >> So this looks like a problem in the Swift code. The hang checker is >> activated at the start of the execution which is not good. Could you >> point me to where you ran this? Was this on surveyor? If it was not on >> surveyor I can give it a try. It looks like the projection phase is >> trying to project empty files. This could be due to the files actually >> being empty(I sent corrupted data) or Swift cannot find the files but >> ran mProjectPP anyways. >> >> >> >> On Apr 12, 2012, at 12:44 AM, Emalayan Vairavanathan wrote: >> >> >> >> >> >> Hi Jon, >> >> >> I tired to run the large Montage-workload which I got from you >> recently on both PVFS and MosaStore. With both systems the workload >> failed (I copied the standard output messages below). I guess this is >> due to the problem with the workload (because the system works with >> the small workloads). >> >> Do you have any idea ? Did this workload work for you ? 
>> >> >> >> Thank you >> Emalayan >> >> >> >> >> >> Swift trunk swift-r5704 (swift modified locally) cog-r3361 (cog >> modified locally) >> >> RunID: 20120412-0530-vj96mfz5 >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> (input): found 4116 files >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> Failed to acquire exclusive lock on log file. >> Progress: time: Thu, 12 Apr 2012 05:31:02 +0000 >> Progress: time: Progress: time: Thu, 12 Apr 2012 05:31:11 +0000Thu, 12 >> Apr 2012 05:31:11 +0000 Initializing:2 Initializing:2 >> >> Progress: time: Thu, 12 Apr 2012 05:31:12 +0000 Initializing:1023 >> Selecting site:1 >> Progress: time: Thu, 12 Apr 2012 05:31:13 +0000 Selecting site:1020 >> Initializing site shared directory:1 Stage in:3 >> Progress: time: Thu, 12 Apr 2012 05:31:15 +0000 Selecting site:1018 >> Stage in:5 Submitting:1 >> Find: http://172.17.3.12:12346 >> Find: keepalive(120), reconnect - http://172.17.3.12:12346 >> Passive queue processor initialized. 
Callback URI is >> http://172.17.3.12:12345 >> Progress: time: Thu, 12 Apr 2012 05:31:16 +0000 Selecting site:1018 >> Active:6 >> Progress: time: Thu, 12 Apr 2012 05:31:24 +0000 Selecting site:1018 >> Active:5 Failed but can retry:1 >> EXCEPTION Exception in mProjectPP_wrap: >> Arguments: [-X, raw_dir/2mass-atlas-991207s-j1130256.fits, >> proj_dir/proj_2mass-atlas-991207s-j1130256.fits, header.hdr] >> Host: persistent-coasters >> Directory: >> SwiftMontage-20120412-0530-vj96mfz5/jobs/e/mProjectPP_wrap-eozxvrpk >> stderr.txt: >> stdout.txt: [struct stat="ERROR", msg="All pixels are blank."] >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel
From svemalayan at yahoo.com Thu Apr 12 09:39:25 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Thu, 12 Apr 2012 07:39:25 -0700 (PDT)
Subject: [Swift-devel] Montage workload
In-Reply-To: <85C159DE-D3B3-4A51-B41E-3CD2B0A89798@mcs.anl.gov>
References: <717467641.133722.1334240830924.JavaMail.root@zimbra-mb2.anl.gov> <85C159DE-D3B3-4A51-B41E-3CD2B0A89798@mcs.anl.gov>
Message-ID: <1334241565.70564.YahooMailNeo@web39507.mail.mud.yahoo.com>

Hi Jon and David,

I was running this on Surveyor, and I got the workload from your home directory. Could you please check the workload for data corruption (maybe using an MD5 sum)?

Regarding the hang checker: it kicked in with the swift-pipeline benchmarks too. I agree with David; I too think this is due to a storage slowdown.
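
The MD5 check suggested above could be sketched as follows. This is only an illustration, not a script from this thread: the function names, the `*.fits` pattern, and the idea of comparing the original input directory against the copy on the target filesystem are all assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of one file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def md5_manifest(root: Path, pattern: str = "*.fits") -> Dict[str, str]:
    """Map each matching file (path relative to root) to its MD5 digest.

    The "*.fits" default is an assumption based on the raw_dir/*.fits
    names that appear in the log output.
    """
    root = Path(root)
    return {
        str(p.relative_to(root)): md5_of_file(p)
        for p in sorted(root.rglob(pattern))
        if p.is_file()
    }


def differing_files(source: Dict[str, str], copy: Dict[str, str]) -> List[str]:
    """List names that are missing on either side or whose digests differ."""
    return sorted(
        name
        for name in source.keys() | copy.keys()
        if source.get(name) != copy.get(name)
    )
```

Building one manifest for the original data and one for the staged copy, then calling `differing_files` on the two, would show whether any input was truncated or altered in transit. It would not by itself detect files that were already bad at the source.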
Thank you Emalayan ________________________________ From: Jonathan Monette To: David Kelly Cc: swift-devel at ci.uchicago.edu; MosaStore ; Emalayan Vairavanathan Sent: Thursday, 12 April 2012 7:29 AM Subject: Re: [Swift-devel] Montage workload So that is my conclusion for the hang checker part (I think I saw this before when I ran on surveyor but that was a long time ago). I am not sure about the app failing though. When I sent the tarball I may have sent corrupted data. That is what I am checking right now. On Apr 12, 2012, at 9:27 AM, David Kelly wrote: > For what it's worth, I see the same hang checker messages early on in an unrelated script I am working on. It seems to be triggered by reading a large number of input files from a slower shared filesystem. In my case, once it finds all the input files, the hang checker messages stop and the job continues as normal. > > [davidk at communicado scec-sim]$ swift -sites.file sites.grid-ps.xml -tc.file tc.data -config cf scec-sim.swift > Swift trunk swift-r5746 cog-r3371 > > RunID: 20120412-0914-0dsnyia7 > No events in 10s. > > Registered futures: > ---- > > Waiting threads: > ---- > > (input): found 5938 files > Progress: time: Thu, 12 Apr 2012 09:14:34 -0500 > Progress: time: Thu, 12 Apr 2012 09:14:40 -0500 Initializing:1 > Find: http://localhost:50000 > Find: keepalive(120), reconnect - http://localhost:50000 > Passive queue processor initialized. Callback URI is null > Progress: time: Thu, 12 Apr 2012 09:14:42 -0500 Selecting site:25 Submitting:998 Submitted:1 > > > ----- Original Message ----- >> From: "Jonathan Monette" >> To: "Emalayan Vairavanathan" >> Cc: swift-devel at ci.uchicago.edu, "MosaStore" >> Sent: Thursday, April 12, 2012 9:12:10 AM >> Subject: Re: [Swift-devel] Montage workload >> So this looks like a problem in the Swift code. The hang checker is >> activated at the start of the execution which is not good. Could you >> point me to where you ran this? Was this on surveyor?
If it was not on >> surveyor I can give it a try. It looks like the projection phase is >> trying to project empty files. This could be due to the files actually >> being empty(I sent corrupted data) or Swift cannot find the files but >> ran mProjectPP anyways. >> >> >> >> On Apr 12, 2012, at 12:44 AM, Emalayan Vairavanathan wrote: >> >> >> >> >> >> Hi Jon, >> >> >> I tired to run the large Montage-workload which I got from you >> recently on both PVFS and MosaStore. With both systems the workload >> failed (I copied the standard output messages below). I guess this is >> due to the problem with the workload (because the system works with >> the small workloads). >> >> Do you have any idea ? Did this workload work for you ? >> >> >> >> Thank you >> Emalayan >> >> >> >> >> >> Swift trunk swift-r5704 (swift modified locally) cog-r3361 (cog >> modified locally) >> >> RunID: 20120412-0530-vj96mfz5 >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> (input): found 4116 files >> No events in 10s. >> >> Registered futures: >> ---- >> >> Waiting threads: >> ---- >> >> Failed to acquire exclusive lock on log file. 
>> Progress: time: Thu, 12 Apr 2012 05:31:02 +0000 >> Progress: time: Progress: time: Thu, 12 Apr 2012 05:31:11 +0000Thu, 12 >> Apr 2012 05:31:11 +0000 Initializing:2 Initializing:2 >> >> Progress: time: Thu, 12 Apr 2012 05:31:12 +0000 Initializing:1023 >> Selecting site:1 >> Progress: time: Thu, 12 Apr 2012 05:31:13 +0000 Selecting site:1020 >> Initializing site shared directory:1 Stage in:3 >> Progress: time: Thu, 12 Apr 2012 05:31:15 +0000 Selecting site:1018 >> Stage in:5 Submitting:1 >> Find: http://172.17.3.12:12346 >> Find: keepalive(120), reconnect - http://172.17.3.12:12346 >> Passive queue processor initialized. Callback URI is >> http://172.17.3.12:12345 >> Progress: time: Thu, 12 Apr 2012 05:31:16 +0000 Selecting site:1018 >> Active:6 >> Progress: time: Thu, 12 Apr 2012 05:31:24 +0000 Selecting site:1018 >> Active:5 Failed but can retry:1 >> EXCEPTION Exception in mProjectPP_wrap: >> Arguments: [-X, raw_dir/2mass-atlas-991207s-j1130256.fits, >> proj_dir/proj_2mass-atlas-991207s-j1130256.fits, header.hdr] >> Host: persistent-coasters >> Directory: >> SwiftMontage-20120412-0530-vj96mfz5/jobs/e/mProjectPP_wrap-eozxvrpk >> stderr.txt: >> stdout.txt: [struct stat="ERROR", msg="All pixels are blank."] >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel >> >> >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- You received this message because you are subscribed to the Google Groups "MosaStore" group. To post to this group, send email to mosastore at googlegroups.com. To unsubscribe from this group, send email to mosastore+unsubscribe at googlegroups.com. For more options, visit this group at http://groups.google.com/group/mosastore?hl=en. 
From jonmon at mcs.anl.gov Fri Apr 13 11:37:19 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Fri, 13 Apr 2012 11:37:19 -0500
Subject: [Swift-devel] Plotting
Message-ID: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov>

Hello,

I am trying to generate plots from the logs for the SciColSim application I have been working on. The problem is that when I work through the post-log-processing plotting steps, the output has so many data points that the EPS file is quite large (~300M). When I try to open this file in Preview on my local Mac, Preview crashes. I tried using gnuplot to plot this large data set and the output file was just as large.

I have thought about going through the large data set file and averaging together points that are very close together (within some epsilon). Is this a solid approach? Is there any way of telling the plot routine (gnuplot or the plotter Justin made with JFreeChart) to do this for me? What other techniques could I try to reduce the amount of data that is being plotted?

From iraicu at cs.iit.edu Fri Apr 13 11:52:53 2012
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Fri, 13 Apr 2012 11:52:53 -0500
Subject: [Swift-devel] Plotting
In-Reply-To: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov>
References: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov>
Message-ID: <4F8859E5.4080609@cs.iit.edu>

You could try to flatten the EPS file into a high-resolution JPG. A flat image of thousands by thousands of pixels should give you a very high quality plot that will look good on a computer screen, in a PPT presentation, or in a document.

Ioan

On 4/13/2012 11:37 AM, Jonathan Monette wrote: > Hello, > I am trying to grab plots from the logs for the SciColSim application I have been working on. The problem is, when I work through the post log-processing plotting steps the output file has a lot of data points making the eps file quite large(~300M).
When trying to open this file under preview on my local mac, preview crashes. I tried using gnu plot to plot this large data set and the output file was just as large. > > I have thought about going through the large data set file and averaging together points that are very close together(within some epsilon). Is this a solid approach? Is there any way of telling the plot routine(gnu plot or the plotter Justin made with JFreeChart) to do this for me? What other techniques could I try to shorten the amount of data that is being plotted? > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From wozniak at mcs.anl.gov Fri Apr 13 11:53:03 2012 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Fri, 13 Apr 2012 11:53:03 -0500 (CDT) Subject: [Swift-devel] Plotting In-Reply-To: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov> References: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov> Message-ID: On Fri, 13 Apr 2012, Jonathan Monette wrote: > I have thought about going through the large data set file and averaging > together points that are very close together(within some epsilon). Is > this a solid approach? 
Is there any way of telling the plot routine(gnu > plot or the plotter Justin made with JFreeChart) to do this for me? > What other techniques could I try to shorten the amount of data that is > being plotted? The plotter does not currently support this. I think this would be a nice, short, reusable script. I would take the two-column XY file and a window size. In each window, average X and Y. This should be fine for worker load plots and things like that. -- Justin M Wozniak From jonmon at mcs.anl.gov Fri Apr 13 11:56:56 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Fri, 13 Apr 2012 11:56:56 -0500 Subject: [Swift-devel] Plotting In-Reply-To: <4F8859E5.4080609@cs.iit.edu> References: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov> <4F8859E5.4080609@cs.iit.edu> Message-ID: <781DC3C1-2EBF-4AA5-B134-CD0324FAECEF@mcs.anl.gov> Thanks. I will look into trying this out. On Apr 13, 2012, at 11:52 AM, Ioan Raicu wrote: > You could try to flatten the EPS file into a high resolution JPG. A flat image of 1000s x 1000s pixels should give you a very high quality plot that will look good on a computer screen, PPT presentation, or in a document. > > Ioan > > On 4/13/2012 11:37 AM, Jonathan Monette wrote: >> Hello, >> I am trying to grab plots from the logs for the SciColSim application I have been working on. The problem is, when I work through the post log-processing plotting steps the output file has a lot of data points making the eps file quite large(~300M). When trying to open this file under preview on my local mac, preview crashes. I tried using gnu plot to plot this large data set and the output file was just as large. >> >> I have thought about going through the large data set file and averaging together points that are very close together(within some epsilon). Is this a solid approach? Is there any way of telling the plot routine(gnu plot or the plotter Justin made with JFreeChart) to do this for me? 
What other techniques could I try to shorten the amount of data that is being plotted? >> _______________________________________________ >> Swift-devel mailing list >> Swift-devel at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel > > -- > ================================================================= > Ioan Raicu, Ph.D. > Assistant Professor, Illinois Institute of Technology (IIT) > Guest Research Faculty, Argonne National Laboratory (ANL) > ================================================================= > Data-Intensive Distributed Systems Laboratory, CS/IIT > Distributed Systems Laboratory, MCS/ANL > ================================================================= > Cel: 1-847-722-0876 > Office: 1-312-567-5704 > Email: iraicu at cs.iit.edu > Web: http://www.cs.iit.edu/~iraicu/ > Web: http://datasys.cs.iit.edu/ > ================================================================= > ================================================================= > > From iraicu at cs.iit.edu Fri Apr 13 11:57:47 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Fri, 13 Apr 2012 11:57:47 -0500 Subject: [Swift-devel] Plotting In-Reply-To: References: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov> Message-ID: <4F885B0B.6040809@cs.iit.edu> I also had scripts, and sometimes even Java programs that would take some large log(s) and summarize it, to make it easier to plot. Sometimes, plotting things was really a multi-stage process, but the plots many times came out looking pretty good in a relatively automated fashion. But it took lots of tinkering with scripts, programs, and the plotting software to make it happen, and much of it was specific to the logs I was looking at (in my case, they were Falkon logs). 
Ioan On 4/13/2012 11:53 AM, Justin M Wozniak wrote: > On Fri, 13 Apr 2012, Jonathan Monette wrote: > >> I have thought about going through the large data set file and averaging >> together points that are very close together(within some epsilon). Is >> this a solid approach? Is there any way of telling the plot routine(gnu >> plot or the plotter Justin made with JFreeChart) to do this for me? >> What other techniques could I try to shorten the amount of data that is >> being plotted? > The plotter does not currently support this. > > I think this would be a nice, short, reusable script. I would take the > two-column XY file and a window size. In each window, average X and Y. > This should be fine for worker load plots and things like that. > -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From jonmon at mcs.anl.gov Fri Apr 13 11:58:04 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Fri, 13 Apr 2012 11:58:04 -0500 Subject: [Swift-devel] Plotting In-Reply-To: References: <6229D20A-9563-433A-A86E-D61940A77B49@mcs.anl.gov> Message-ID: That is similar to my current approach so I will use the window size. And the worker load plot is the plot I generated. I also have generating this plot scripted so I will update the instructions on how to plot with more information. 
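
A minimal sketch of the windowed averaging discussed in this thread: read a two-column XY file, then replace each window of points with the mean X and mean Y. The function names are hypothetical, and it assumes that "window size" means a fixed number of consecutive points rather than a fixed interval along X, which the thread does not specify.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]


def parse_xy(lines: Iterable[str]) -> List[Point]:
    """Parse whitespace-separated two-column XY text.

    Blank lines and lines starting with '#' are skipped; any columns
    beyond the first two are ignored.
    """
    points = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        x, y = line.split()[:2]
        points.append((float(x), float(y)))
    return points


def window_average(points: List[Point], window: int) -> List[Point]:
    """Replace each run of `window` consecutive points with their mean X and mean Y.

    The last window may hold fewer points; it is averaged over what it has.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    averaged = []
    for start in range(0, len(points), window):
        chunk = points[start:start + window]
        mean_x = sum(x for x, _ in chunk) / len(chunk)
        mean_y = sum(y for _, y in chunk) / len(chunk)
        averaged.append((mean_x, mean_y))
    return averaged
```

With a window of 100, the number of points handed to the plotter drops by roughly a factor of 100. Averaging smooths short spikes, so if peaks matter in a worker load plot, taking the per-window maximum of Y instead would be a reasonable variation.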
On Apr 13, 2012, at 11:53 AM, Justin M Wozniak wrote: > On Fri, 13 Apr 2012, Jonathan Monette wrote: > >> I have thought about going through the large data set file and averaging together points that are very close together(within some epsilon). Is this a solid approach? Is there any way of telling the plot routine(gnu plot or the plotter Justin made with JFreeChart) to do this for me? What other techniques could I try to shorten the amount of data that is being plotted? > > The plotter does not currently support this. > > I think this would be a nice, short, reusable script. I would take the two-column XY file and a window size. In each window, average X and Y. This should be fine for worker load plots and things like that. > > -- > Justin M Wozniak From iraicu at cs.iit.edu Sat Apr 14 07:12:25 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sat, 14 Apr 2012 07:12:25 -0500 Subject: [Swift-devel] Call for Participation: ACM HPDC 2012 Message-ID: <4F8969A9.5080106@cs.iit.edu> (Please accept our apologies if you receive this message multiple times) **** CALL FOR PARTICIPATION **** *************************************************************** *** ** EARLY REGISTRATION DEADLINE: May 25, 2012 (CET) ** *** *************************************************************** The 21st International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'12) Delft University of Technology, Delft, the Netherlands June 18-22, 2012 http://www.hpdc.org/2012 The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) is the premier annual conference on the design, the implementation, the evaluation, and the use of parallel and distributed systems for high-end computing. HPDC'12 will take place in Delft, the Netherlands, a historical, picturesque city that is less than one hour away from Amsterdam-Schiphol airport. 
The conference will be held on June 20-22 (Wednesday to Friday 1 PM), with affiliated workshops taking place on June 18-19 (Monday and Tuesday). **** MAIN CONFERENCE FEATURES **** - High-quality single-track paper sessions - Two keynote presentations - Achievement Award talk (new in HPDC!) - Poster session plus conference reception - Seven workshops - Visit to Museum de Prinsenhof in Delft - Conference dinner in the historical place De Prinsenkelder in Delft **** CONFERENCE PROGRAM **** The program of the conference will be posted by mid-april on the conference website. **** KEYNOTE SPEAKERS (titles and abstracts will be posted online) **** Ricardo Bianchini, Rutgers University, USA Mihai Budiu, Microsoft Research, USA **** ACHIEVEMENT AWARD TALK (title and abstract will be posted online) **** Ian Foster, University of Chicago and Argonne National Laboratory, USA **** CALL FOR POSTERS **** HPDC'12 offers conference attendees the opportunity to participate in the poster session on Wednesday afternoon. For details on how to submit a poster, please consult the conference website. **** HPDC 2012 GENERAL CHAIR **** Dick Epema, Delft University of Technology, Delft, the Netherlands **** HPDC 2012 PROGRAM CO-CHAIRS **** Thilo Kielmann, Vrije Universiteit, Amsterdam, the Netherlands Matei Ripeanu, The University of British Columbia, Vancouver, Canada **** HPDC 2012 WORKSHOPS CHAIR **** Alexandru Iosup, Delft University of Technology, Delft, the Netherlands **** HPDC 2012 POSTERS CHAIR **** Ana Varbanescu, Delft University of Technology, Delft, the Netherlands **** EARLY REGISTRATION DEADLINE **** May 25, 2012 (CET) **** VENUE **** The HPDC'12 conference will be held on the campus of Delft University of Technology, which was founded in 1842 by King William II and which is the oldest and largest technical university in the Netherlands. It is well established as one of the leading technical universities in the world. 
Delft is a small, historical town dating back to the 13th century. Delft has many old buildings and small canals, and it has a lively atmosphere. The city offers a large variety of hotels and restaurants. Many other places of interest (e.g., Amsterdam and The Hague) are within one hour distance of traveling. Traveling to Delft is easy. Delft is close to Amsterdam Schiphol Airport (60 km, 45 min by train), which has direct connections from all major airports in the world. Delft also has excellent train connections to the rest of Europe. From iraicu at cs.iit.edu Sat Apr 14 07:12:50 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sat, 14 Apr 2012 07:12:50 -0500 Subject: [Swift-devel] Call for Posters: The 21st Int. ACM Symp.
on High-Performance Parallel and Distributed Computing (HPDC'12) Message-ID: <4F8969C2.4090009@cs.iit.edu> **** CALL FOR POSTERS **** The 21st International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'12) Delft University of Technology, Delft, the Netherlands June 18-22, 2012 http://www.hpdc.org/2012 The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) is the premier annual conference on the design, the implementation, the evaluation, and the use of parallel and distributed systems for high-end computing. HPDC'12 will take place in Delft, the Netherlands, a historical, picturesque city that is less than one hour away from Amsterdam-Schiphol airport. The conference will be held on June 20-22 (Wednesday to Friday), with affiliated workshops taking place on June 18-19 (Monday and Tuesday). HPDC'12 will feature a poster session that will provide the right environment for lively and informal discussions on various high performance parallel and distributed computing topics. We invite all potential authors to submit their contribution for this poster session in the form of a two-page PDF abstract (we recommend using the ACM Proceedings style, and fonts not smaller than 10 point). Posters may be accompanied by practical demonstrations. Abstracts must be submitted by sending an email to: hpdc-2012-posters at gmail.com before May 15th 2012, 23:59 CET. Participating posters will be selected based on the following criteria: * Submissions must describe new, interesting ideas on any HPDC topics of interest. * Submissions can present work in progress, but we strongly encourage the authors to include preliminary experimental results, if available. * Student submissions meeting the above criteria will be given preference. Please provide the following information in your PDF file: * Poster title. * Author names, affiliations, and email addresses. * Note which authors, if any, are students. 
* Indicate if you plan to set up a demo with your poster (the authors and organizers need to agree that the requirements for the demo to function can be met at the site of the poster exhibition). Authors will be notified of acceptance or rejection via e-mail by May 20th, 2012. No reviews will be provided. Posters will be published online on the conference Web site. Each poster will also have an A0 panel in a poster exhibition area, which will also include posters of the HPDC accepted papers. The poster session will be held on Wednesday, June 20, in the late afternoon, and it will start with a poster advertising session, during which the author(s) of each poster will give a very short presentation (2 slides, 1-2 minutes) of their poster. Following these presentations, the poster exhibition will be opened for visiting and, we hope, for fruitful discussions. Therefore, we kindly request at least one author of each poster to be present throughout the entire session. For any questions about the submission, selection, and presentation of the accepted posters, please contact the Poster Session Chair - Ana Lucia Varbanescu, Delft University of Technology, The Netherlands. -- ================================================================= Ioan Raicu, Ph.D. 

From iraicu at cs.iit.edu Sat Apr 14 07:42:32 2012
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sat, 14 Apr 2012 07:42:32 -0500
Subject: [Swift-devel] Call for Participation: Cloud Futures 2012, Berkeley, CA (May 7-8)
In-Reply-To:
References:
Message-ID: <4F8970B8.3030306@cs.iit.edu>

Cloud Futures: Hot Topics in Research and Education
Berkeley, CA | May 7-8, 2012
http://research.microsoft.com/cloudfutures2012/

The Cloud Futures Workshop series brings together thought leaders from academia, industry, and government to discuss the role of cloud computing across a variety of research and educational areas, including computer science, engineering, Earth sciences, healthcare, humanities, interactive games, life sciences, and social sciences. Presentations, posters and discussions will highlight how new techniques, software platforms, and methods of research and teaching in the cloud may solve distinct challenges arising in those diverse areas. This year's Workshop is being hosted in conjunction with UC Berkeley.

Conference Co-Chairs: Michael Franklin, UC Berkeley and Tony Hey, Microsoft Research

Register today!

Program:

Day 1 Monday 05/07/2012
09:00 - 10:00 Keynote.
Science In the Cloud, Joseph Hellerstein, Manager Big Science, Google
10:00 - 10:30 Break
10:30 - 12:00 Plenary Session
* 10:30 - 11:00 Advancing Declarative Query for Data-Intensive Science in the Cloud, Bill Howe, University of Washington
* 11:00 - 11:30 Programming Paradigms for Technical Computing on Clouds and Supercomputers, Geoffrey Fox, Indiana University, Dennis Gannon, Microsoft
* 11:30 - 12:00 Cloud Computing for Fundamental Spatial Operations on Polygonal GIS Data, Sushil Prasad, Dinesh Agarwal, Satish Puri, Xi He, Georgia State University
12:00 - 02:00 Lunch Posters
02:00 - 3:30 Session 1a Education
* 02:00 - 2:30 InstantLab 2.0 - A Platform for Operating System Experiments on Public Cloud Infrastructure, Andreas Polze, Christian Neuhaus, Rehab Alnemr, Lysann Kessler and Frank Schlegel, University of Potsdam
* 02:30 - 3:00 Case Study on Cloud Computing Infusion at a Leading Tertiary Institution in Singapore, Choong Wu Gary Lim, Nanyang Polytechnic
* 03:00 - 3:30 Teaching Web-scale Data Management using Microsoft Azure: POSTECH Experiences, Seung-won Hwang, POSTECH
02:00 - 3:30 Session 1b Life Sciences
* 02:00 - 2:30 A-Brain: Using the Cloud to Understand the Impact of Genetic Variability on the Brain, Alexandru Costan, Radu Tudoran, Benoit Da Mota, Gabriel Antoniu and Bertrand Thirion, INRIA Rennes and Saclay
* 02:30 - 3:00 Very Large Scale Operon Predictions via Comparative Genomics, Ehsan Tabari, ZhengChang Su, UNC Charlotte
* 03:00 - 3:30 Fast Exploration of the QSAR Model Space with e-Science Central and Windows Azure, Jacek Cala, Hugo Hiden, Simon Woodman, Paul Watson, Newcastle University
03:30 - 04:00 Break
04:00 - 05:30 Session 1c Interactive Services
* 04:00 - 4:30 3D Remote Collaboration Framework for Virtual Cultural Heritage, Yasuhide Okamoto, Gregorij Kurillo, Ruzena Bajcsy, University of California, Berkeley, Takeshi Oishi, Katsushi Ikeuchi, University of Tokyo
* 04:30 - 5:00 Interactive 3D Services over Windows Azure, Lukas Kencl, Jiri Danihelka, Czech Technical University, Prague
* 05:00 - 5:30 Microsoft Azure and the Kinect Join the World of Telemedicine to Save Lives, Janet Bailey, Aaron Rothberg, University of Arkansas, Bradley Jensen, Microsoft
04:00 - 05:30 Session 1d Environmental Applications
* 04:00 - 4:30 Cloud-based Exploration of Complex Ecosystems for Science, Education and Entertainment, Ilmi Yoon, Sangyuk Yoon, Gary Ng, Hunvil Rodrigues, Sonal Mahajan, San Francisco State University, Neo Martinez, Pacific Ecoinformatics Lab
* 04:30 - 5:00 Cloud Computing as a Cyber-Infrastructure for Mass Customization and Collaboration, Kwa-Sur Tam, Virginia Tech
* 05:00 - 5:30 Green Prefab: Civil Engineering Hub In Ms Windows Azure, Furio Barzon, Collaboratorio, Italy
06:00 - 09:00 Dinner

Day 2 Tuesday 05/08/2012
09:00 - 10:00 Keynote, Yousef Khalidi, Distinguished Engineer, Microsoft Corporation, Large Scale Cloud Computing: Opportunities and Challenges
10:00 - 10:30 Break
10:30 - 12:00 Plenary Session
* 10:30 - 11:00 Vision Paper: Towards an Understanding of the Limits of Map-Reduce Computation, Anish Das Sarma, Google Research, Semih Salihogluz, Jeffrey D. Ullman, Stanford University, Foto Afrati, National Technical University Athens
* 11:00 - 11:30 Twister4Azure: Parallel Data Analytics on Azure, Judy Qiu, Thilina Gunarathne, Indiana University
* 11:30 - 12:00 CumuloNimbo: Parallel-Distributed Transactional Processing, Ricardo Jimenez-Peris, Marta Patiño-Martínez, Iván Brondino, Universidad Politecnica de Madrid, José Pereira, Rui Oliveira, Ricardo Vilaça, University Minho, Bettina Kemme, Yousuf Ahmad, McGill Univ.
12:00 - 02:00 Lunch Posters
02:00 - 3:30 Session 2a Social and Mobile Services
* 02:00 - 2:30 An Efficient Meet-Up Mechanism by Mashing-up Social and Mobile Clouds, Li-Chun Wang, Chia-Yu Lin, Yu-Jia Chen, Yu-Chee Tseng, National Chiao Tung University
* 02:30 - 3:00 Scalable, Secure Analysis of Social Sciences Data on the Azure Platform, Yogesh Simmhan, Litao Den, Alok Kumbhare, Mark Redekopp and Viktor Prasanna, University of Southern California
* 03:00 - 3:30 Remote Software Service for Mobile Clients leveraging Cloud Computing, Chunming Hu, Beihang University
02:00 - 3:30 Session 2b Computational Models and Applications
* 02:00 - 2:30 Enabling cloud interoperability with COMPSs, Daniele Lezzi, Fabrizio Marozzo, Francesc Lordan, Roger Rafanell, Rosa Badia, Barcelona Supercomputer Center, Domenico Talia, University of Calabria
* 02:30 - 3:00 McCloud: Monte Carlo Service in Windows Azure, Rafael Nasser, Karin Breitman, Rubens Sampaio, Americo Cunha, Helio Vieira, PUC-Rio
* 03:00 - 3:30 Experiences using Windows Azure to Calibrate Watershed Models, Marty Humphrey, Norm Beekwilder, University of Virginia, Jon Goodall, Mehmet Ercan, University of South Carolina
03:30 - 04:00 Break
04:00 - 05:30 Panel TBD
05:30 - Close

--
=================================================================
Ioan Raicu, Ph.D.
From iraicu at cs.iit.edu Sat Apr 14 07:55:42 2012 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sat, 14 Apr 2012 07:55:42 -0500 Subject: [Swift-devel] CFP: 13th IEEE/ACM Int. Conf. on Grid Computing (GRID) 2012 Message-ID: <4F8973CE.9020905@cs.iit.edu> Call for papers *Grid 2012: 13th IEEE/ACM International Conference on Grid Computing* Beijing, China September 20-23, 2012 http://grid2012.meepo.org Co-located with ChinaGrid'12 Grid computing enables the sharing of distributed computing and data resources such as processing, network bandwidth and storage capacity to create a cohesive resource environment for executing distributed applications. The Grid conference series is an annual international meeting that brings together a community of researchers, developers, practitioners, and users involved with Grid technology. The objective of the meeting is to serve as both the premier venue for presenting foremost research results in the area and as a forum for introducing and exploring new concepts. In 2012, the Grid conference will come to China for the first time and will be held in Beijing, co-located with ChinaGrid'12.
Grid 2012 will have a focus on important and immediate issues that are significantly influencing grid computing. Scope Grid 2012 topics of interest include, but are not limited to: * Architecture * Middleware and toolkits * Resource management, scheduling, and runtime environments * Performance modeling and evaluation * Programming models, tools and environments * Metadata, ontologies, and provenance * Cloud computing * Virtualization and grid computing * Scientific workflow * Storage systems and data management * Data-intensive computing and processing * QoS and SLA Negotiation * Applications and experiences in science, engineering, business and society Paper Submission Authors are invited to submit original papers (not published or currently under review for any other conference or journal). Submitted manuscripts should not exceed 8 letter size (8.5 x 11) pages including figures, tables and references using the IEEE format for conference proceedings. Authors should submit the manuscript in PDF format via https://www.easychair.org/conferences/?conf=grid12 All submitted papers will be reviewed by program committee members and selected based on their originality, correctness, technical strength, significance, quality of presentation, and interest and relevance to the conference attendees. Accepted papers will be published in the IEEE categorized conference proceedings and will be made available online through the IEEE Xplore and the CS Digital Library. Go to paper submission page... Important Dates Papers Submission Due: 15 March 2012 Extended to 15 April 2012. 
Notification of Acceptance: 15 May 2012
Camera Ready Papers Due: 15 June 2012

Committees

Organising Committee
* *General Co-Chairs:*
  o Dieter Kranzlmueller, Ludwig-Maximilians-Universität, Germany
  o Weimin Zheng, Tsinghua University, China
* *Programme Co-Chairs:*
  o Rajkumar Buyya, University of Melbourne, Australia
  o Hai Jin, Huazhong University of Science and Technology, China
* *Local Organization Chair:* Yongwei Wu, Tsinghua University, China
* *Finance Chair:* Kang Chen, Tsinghua University, China

Programme Committee (To be confirmed)
* *Programme Co-Chairs:*
  o Rajkumar Buyya, University of Melbourne, Australia
  o Hai Jin, Huazhong University of Science and Technology, China
* *Workshop & Poster Chair:* Jinlei Jiang, Tsinghua University, China
* *Vice Chair -- Clouds and Virtualisation:* Roger Barga, Microsoft Research
* *Vice Chair -- Distributed Production Cyberinfrastructure and Middleware:* Andrew Grimshaw, Univ. of Virginia, US
* *Vice Chair -- e-Research and Applications:* Daniel S. Katz, Univ. of Chicago & Argonne National Laboratory, US
* *Vice Chair -- Tools & Services, Resource Management & Runtime Environments:* Ramin Yahyapour, Dortmund
* *Vice Chair -- Distributed Data-Intensive Science and Systems:* Erwin Laure, KTH, Sweden
* *Publishing Chair:* Ran Zheng, Huazhong University of Science and Technology, China
* *Publicity Chairs:*
  o Gilles Fedak, INRIA/LIP, France
  o Ioan Raicu, Illinois Institute of Technology and Argonne National Laboratory, USA
  o Xuanhua Shi, Huazhong University of Science and Technology, China
* *Program Committee:*
  o David Abramson, Monash University, Australia
  o Gabrielle Allen, Louisiana State University, USA
  o Andreas Aschenbrenner, Austrian Academy of Sciences
  o David Bader, Georgia Institute of Technology, USA
  o Rosa Badia, UPC, Spain
  o Henri Bal, Vrije Universiteit, Netherlands
  o Chaitanya Baru, San Diego Supercomputer Center, US
  o Eloisa Bentivegna, Max Planck Institute for Gravitational Physics, Germany
  o Ignacio Blanquer, Universidad Politécnica de Valencia, Spain
  o Jinlei Jiang, Tsinghua University, China
  o Neil Chue Hong, EPCC, UK
  o Marco Danelutto, Università di Pisa, Italy
  o Eva Deelman, ISI, USC, US
  o Frederic Desprez, INRIA-LIP, France
  o Jim Dowling, SICS, Sweden
  o Jaliya Ekanayake, Microsoft Research, US
  o Erik Elmroth, Umeå University, Sweden
  o Vangelis Floros, GRNET, Greece
  o Ian Foster, Univ. of Chicago, US
  o Patrick Fuhrmann, DESY, DE
  o Kang Chen, Tsinghua University, China
  o Rob Gillen, Oak Ridge National Laboratory, US
  o Marty Humphrey, University of Virginia, US
  o Jens Jensen, STFC, UK
  o Kate Keahey, Argonne National Laboratory, US
  o Thilo Kielmann, Vrije Universiteit, The Netherlands
  o Bastian Koller, HLRS, Germany
  o Tevfik Kosar, University at Buffalo, US
  o Nicolas Kourtellis, University of South Florida, USA
  o Patricia Kovatch, University of Tennessee, USA
  o Dieter Kranzlmüller, Ludwig-Maximilians-Universität München, Germany
  o Peter Kunszt, SystemsX, Switzerland
  o Miron Livny, Univ. of Wisconsin, US
  o Hideo Matsuda, University of Osaka, Japan
  o Satoshi Matsuoka, Tokyo Institute of Technology, Japan
  o Jim Myers, Rensselaer Polytechnic Institute, US
  o Steven Newhouse, EGI, NL
  o Manish Parashar, Rutgers, USA
  o Judy Qiu, Indiana University, US
  o Ioan Raicu, Illinois Institute of Technology and Argonne National Laboratory, USA
  o Alistair Rendell, Australian National University, Australia
  o Karolina Sarnowska-Upton, Univ. of Virginia, US
  o Heiko Schuldt, Basel University, Switzerland
  o Richard Sinnott, University of Melbourne, Australia
  o Alan Sill, Texas-Tech, US
  o Alex Sim, LBL, US
  o Mark Stillwell, INRIA-Université de Lyon-LIP, France
  o Alan Sussman, University of Maryland, USA
  o Osamu Tatebe, Tsukuba University, Japan
  o Domenico Talia, Università della Calabria, Italy
  o Douglas Thain, University of Notre Dame, US
  o David Wallom, Oxford University, UK

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From wilde at mcs.anl.gov Sat Apr 14 11:11:08 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Sat, 14 Apr 2012 11:11:08 -0500 (CDT) Subject: [Swift-devel] Fwd: [Swift-commit] Cog update In-Reply-To: <20120414084025.5ACF38D00D2F@bridled.ci.uchicago.edu> Message-ID: <1780391778.141400.1334419868106.JavaMail.root@zimbra.anl.gov> David, I see 6 empty Swift-commit CoG update emails between 03:40 and 04:00 this morning. Something wrong with the update reporter? - Mike ----- Forwarded Message ----- From: swift at ci.uchicago.edu To: swift-commit at ci.uchicago.edu Sent: Saturday, April 14, 2012 3:40:25 AM Subject: [Swift-commit] Cog update _______________________________________________ Swift-commit mailing list Swift-commit at ci.uchicago.edu https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-commit -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From davidk at ci.uchicago.edu Sat Apr 14 20:58:03 2012 From: davidk at ci.uchicago.edu (David Kelly) Date: Sat, 14 Apr 2012 20:58:03 -0500 (CDT) Subject: [Swift-devel] Fwd: [Swift-commit] Cog update In-Reply-To: <1780391778.141400.1334419868106.JavaMail.root@zimbra.anl.gov> Message-ID: <1083172201.142174.1334455083717.JavaMail.root@zimbra-mb2.anl.gov> Yes, this happens when the sourceforge SVN repository goes down. I'll take a look at it. ----- Original Message ----- > From: "Michael Wilde" > To: "Swift Devel" > Sent: Saturday, April 14, 2012 11:11:08 AM > Subject: [Swift-devel] Fwd: [Swift-commit] Cog update > David, I see 6 empty Swift-commit CoG update emails between 03:40 and > 04:00 this morning. Something wrong with the update reporter? 
> > - Mike > > ----- Forwarded Message ----- > From: swift at ci.uchicago.edu > To: swift-commit at ci.uchicago.edu > Sent: Saturday, April 14, 2012 3:40:25 AM > Subject: [Swift-commit] Cog update > > > _______________________________________________ > Swift-commit mailing list > Swift-commit at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-commit > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > > _______________________________________________ > Swift-devel mailing list > Swift-devel at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-devel From wilde at mcs.anl.gov Fri Apr 20 10:09:15 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Fri, 20 Apr 2012 10:09:15 -0500 (CDT) Subject: [Swift-devel] Update In-Reply-To: <8F3672B0-E727-4B63-9249-FF6AD132DE5D@mcs.anl.gov> Message-ID: <1529867699.149838.1334934555441.JavaMail.root@zimbra.anl.gov> ----- Original Message ----- > From: "Jonathan Monette" > To: "Michael Wilde" > Sent: Monday, April 16, 2012 10:41:20 AM > Subject: Update > Mike, > I have an update from work I did over the weekend: > > 1) I have implemented the USER_HOME fix in my copy of trunk and > tested. I will need this for using Beagle. Should I commit this fix to > trunk and 0.93 so we can update the Beagle module? Yes, please update trunk. I dont know how updates to 0.93 are being handled. How do we make these available to users? How many such changes do we have already? > 2) I added a line to the swift bash script to echo a warning message > if SWIFT_HOME is set manually and what it is set too. I know this > problem has bit you in the butt before and it seemed to have bit > Gustav over the weekend. Is this a solution that you feel would solve > the SWIFT_HOME set wrong problem? Sounds reasonable to say "SWIFT_HOME set to NNN". 
If we agree that this is only used in exceptional cases, we can say "WARNING: SWIFT_HOME overridden to NNN. This may affect correct behavior." or something like that.

We should also remove the currently annoying message about the sites file being overridden. That is not helping anything and is just adding noise.

Please discuss (1) and (2) on Swift devel and at today's meeting.

> 3) I am finishing up small scale test files that uses the compute
> nodes on Raven and Hera to send to Dave. Should be done within the
> hour.

I see Dave's report on these. Let's discuss the benchmark.

> 4) I have an initial script that averages a group of data lines in the
> XY column data I have from a full scale SciColSim run(this was
> Justin's suggestion). I am going to apply this to my data and replot
> and send the plot. I will then commit this util script to trunk for
> the plotting tools.

Still waiting to see this, I think?

> I will send an update at the end of the day of what was accomplished
> and what issues are blocking me.

?

Q: Can we use the benchmark to get a full SciColSim run done?

Also: what's the state of testing reproducibility for SciCol with the Boost RNG?

I discussed this with Justin; they want reproducibility for testing. Can we generate a vector of random numbers up front wherever reproducibility depends on task ordering? (I can discuss what this means...)
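One way to read the "vector of random numbers up front" idea is sketched below in Python. This is an assumption about what is meant, not the SciColSim code, which uses the Boost RNG from C++. The sketch draws one value per task from a single seeded generator before any task runs, then hands each task its value by task index. Results then do not depend on the order in which tasks happen to execute. The names `make_seed_vector` and `run_task` are hypothetical, and `run_task` stands in for a real simulation step.

```python
import random
from typing import List


def make_seed_vector(master_seed: int, n_tasks: int) -> List[int]:
    """Draw one 32-bit seed per task, in task-index order, from a single seeded generator."""
    rng = random.Random(master_seed)
    return [rng.getrandbits(32) for _ in range(n_tasks)]


def run_task(task_id: int, seeds: List[int]) -> float:
    """Hypothetical task body: uses only its own pre-assigned seed.

    Because the task never touches a shared generator, its result is the
    same no matter when, or on which worker, it runs.
    """
    task_rng = random.Random(seeds[task_id])
    return task_rng.random()
```

The contrast is with tasks that all pull from one shared stream as they start: there, task 5 gets a different number depending on whether it is scheduled before or after task 3. With the pre-generated vector, only the master seed and the task count need to be recorded to reproduce a run.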
Thanks, - Mike -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From jonmon at mcs.anl.gov Fri Apr 20 10:18:16 2012 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Fri, 20 Apr 2012 10:18:16 -0500 Subject: [Swift-devel] Update In-Reply-To: <1529867699.149838.1334934555441.JavaMail.root@zimbra.anl.gov> References: <1529867699.149838.1334934555441.JavaMail.root@zimbra.anl.gov> Message-ID: <1D84B996-DFF6-4316-AC69-42DB1BA5B813@mcs.anl.gov> On Apr 20, 2012, at 10:09, Michael Wilde wrote: > > > ----- Original Message ----- >> From: "Jonathan Monette" >> To: "Michael Wilde" >> Sent: Monday, April 16, 2012 10:41:20 AM >> Subject: Update >> Mike, >> I have an update from work I did over the weekend: >> >> 1) I have implemented the USER_HOME fix in my copy of trunk and >> tested. I will need this for using Beagle. Should I commit this fix to >> trunk and 0.93 so we can update the Beagle module? > > Yes, please update trunk. I dont know how updates to 0.93 are being handled. How do we make these available to users? How many such changes do we have already? I think David already made the changes to 0.93 and trunk. I have not tested them yet and not sure if he updated the release on Beagle. > >> 2) I added a line to the swift bash script to echo a warning message >> if SWIFT_HOME is set manually and what it is set too. I know this >> problem has bit you in the butt before and it seemed to have bit >> Gustav over the weekend. Is this a solution that you feel would solve >> the SWIFT_HOME set wrong problem? > > Sounds reasonable to say "SWIFT_HOME set to NNN". If we agree that this is only used in exceptional cases, can say "WARNING: SWIFT_HOME overidden to NNN. This may affect correct behavior." or something like that. Ok. I can change that message. Any other suggestions on what it should say? > > We should also remove the currently annoying message about sites file overridden. 
That is not helping anything and just adding noise. I think that is coming from inside Swift and not the bash script itself. > > Please discuss (1) and (2) on Swift devel and at today's meeting. Sure. And what time is the meeting? At 1? > >> 3) I am finishing up small scale test files that uses the compute >> nodes on Raven and Hera to send to Dave. Should be done within the >> hour. > > I see Dave's report on these. Lets discuss the bencmark. > >> 4) I have an initial script that averages a group of data lines in the >> XY column data I have from a full scale SciColSim run(this was >> Justin's suggestion). I am going to apply this to my data and replot >> and send the plot. I will then commit this util script to trunk for >> the plotting tools. > > Still waiting to see this, I think? > >> I will send an update at the end of the day of what was accomplished >> and what issues are blocking me. > > ? > > Q: Can we use the benchmark to get a full scicol sim run done? Depending on how long Dave can get the machine dedicated for him, yes I believe so. If he can only get an hour I think it would be highly unlikely. But if he can get several hours I think yes. I am going to start running tests for the higher TIs on Raven to get a feel for the time requirements. > > Also: whats the state of testing reproducibility for SciCol with the BOOST RNG? > > I discussed this with Justin; they want reproducibility for testing. Can we generate a vector of RN's up front wherever reproducibility depends on task ordering? (I can discuss what this means...) I have some changes. Just need testing. 
>
> Thanks,
>
> - Mike
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>

From wilde at mcs.anl.gov Fri Apr 20 10:22:44 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 20 Apr 2012 10:22:44 -0500 (CDT)
Subject: [Swift-devel] Update
In-Reply-To: <1D84B996-DFF6-4316-AC69-42DB1BA5B813@mcs.anl.gov>
Message-ID: <1226045624.149853.1334935364752.JavaMail.root@zimbra.anl.gov>

> > We should also remove the currently annoying message about sites file overridden.
> > That is not helping anything and just adding noise.
> I think that is coming from inside Swift and not the bash script itself.

We should remove it regardless of where it's coming from, and revisit the original intent of that warning.

- Mike

From hategan at mcs.anl.gov Fri Apr 20 12:54:00 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 20 Apr 2012 10:54:00 -0700
Subject: [Swift-devel] Update
In-Reply-To: <1226045624.149853.1334935364752.JavaMail.root@zimbra.anl.gov>
References: <1226045624.149853.1334935364752.JavaMail.root@zimbra.anl.gov>
Message-ID: <1334944440.15329.4.camel@blabla>

On Fri, 2012-04-20 at 10:22 -0500, Michael Wilde wrote:
> > > We should also remove the currently annoying message about sites file overridden.
> > > That is not helping anything and just adding noise.
> > I think that is coming from inside Swift and not the bash script itself.
>
> We should remove it regardless of where it's coming from, and revisit the original intent of that warning.

It's similar to the SWIFT_HOME issue. However:
- overriding the sites file requires an explicit argument on the command
  line, so it's hard to think that the user is not aware of that
- setting SWIFT_HOME and leaving it there does not require explicit user
  action

In light of the two points above, I think that a warning message makes more sense in the SWIFT_HOME case than in the sites file case.
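
A minimal sketch of the kind of SWIFT_HOME check discussed above, assuming it would run near the top of the swift launcher script, before the script works out its own SWIFT_HOME. The function name warn_swift_home is hypothetical, and the message is the wording proposed earlier in this thread, not necessarily what any released Swift prints.

```shell
#!/bin/sh
# Hypothetical helper: warn on stderr when SWIFT_HOME was already set in the
# caller's environment, since a stale value can silently select another install.
warn_swift_home() {
    if [ -n "${SWIFT_HOME:-}" ]; then
        echo "WARNING: SWIFT_HOME overridden to ${SWIFT_HOME}. This may affect correct behavior." >&2
    fi
}

# Stays silent when SWIFT_HOME is unset or empty.
warn_swift_home
```

Writing the warning to stderr keeps it out of any output a caller might be capturing from stdout.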
Mihael

From iraicu at cs.iit.edu Fri Apr 20 13:33:47 2012
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Fri, 20 Apr 2012 13:33:47 -0500
Subject: [Swift-devel] CFP: IEEE/ACM GRID 2012 -- Deadline extension to 04-25-12
Message-ID: <4F91AC0B.9090806@cs.iit.edu>

Call for papers

*Grid 2012: 13th IEEE/ACM International Conference on Grid Computing*
Beijing, China
September 20-23, 2012
http://grid2012.meepo.org
Co-located with ChinaGrid'12

Grid computing enables the sharing of distributed computing and data resources such as processing, network bandwidth and storage capacity to create a cohesive resource environment for executing distributed applications. The Grid conference series is an annual international meeting that brings together a community of researchers, developers, practitioners, and users involved with Grid technology. The objective of the meeting is to serve as both the premier venue for presenting foremost research results in the area and as a forum for introducing and exploring new concepts.

In 2012, the Grid conference will come to China for the first time and will be held in Beijing, co-located with ChinaGrid'12. Grid 2012 will have a focus on important and immediate issues that are significantly influencing grid computing.

Scope

Grid 2012 topics of interest include, but are not limited to:

* Architecture
* Middleware and toolkits
* Resource management, scheduling, and runtime environments
* Performance modeling and evaluation
* Programming models, tools and environments
* Metadata, ontologies, and provenance
* Cloud computing
* Virtualization and grid computing
* Scientific workflow
* Storage systems and data management
* Data-intensive computing and processing
* QoS and SLA Negotiation
* Applications and experiences in science, engineering, business and society

Paper Submission

Authors are invited to submit original papers (not published or currently under review for any other conference or journal). Submitted manuscripts should not exceed 8 letter size (8.5 x 11) pages including figures, tables and references using the IEEE format for conference proceedings. Authors should submit the manuscript in PDF format via https://www.easychair.org/conferences/?conf=grid12

All submitted papers will be reviewed by program committee members and selected based on their originality, correctness, technical strength, significance, quality of presentation, and interest and relevance to the conference attendees. Accepted papers will be published in the IEEE categorized conference proceedings and will be made available online through the IEEE Xplore and the CS Digital Library.

Important Dates

Paper Submission Due: 15 April 2012 (extended to 25 April 2012)
Notification of Acceptance: 15 May 2012
Camera Ready Papers Due: 15 June 2012

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================

From svemalayan at yahoo.com Mon Apr 23 12:43:02 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Mon, 23 Apr 2012 10:43:02 -0700 (PDT)
Subject: [Swift-devel] Unix command from Swift run-time
Message-ID: <1335202982.81128.YahooMailNeo@web120001.mail.ne1.yahoo.com>

Hi All,

I have a question.
I need to do the following to integrate MosaStore with extended attributes + Swift + Application.

1) Execute a Unix command to find the location of a file from my swift-program.

2) Then use the output of this command as an input to the swift-app (remote) procedure so that it will be scheduled where the file is located (see below).

    app read_file (file input, string machine){
        readperf @input "machine=" + machine;
    }

One way is to get the location of a file in another swift-app procedure as below. But this will be executed locally/remotely through a worker.

    app (file outputFile) getIP (string filename){

    }

Given that the Swift run-time has access to Mosa on the head-node, is there any other way to do this task with low cost (maybe by directly executing this command without going through a worker)? Can I use external mappers / a variable declared as external type / some other way?

If you have an example program in the repository please point me to that.

Thank you
Emalayan.

From wilde at mcs.anl.gov Mon Apr 23 12:48:36 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 23 Apr 2012 12:48:36 -0500 (CDT)
Subject: [Swift-devel] Unix command from Swift run-time
In-Reply-To: <1335202982.81128.YahooMailNeo@web120001.mail.ne1.yahoo.com>
Message-ID: <388286024.152691.1335203316880.JavaMail.root@zimbra.anl.gov>

Emalayan,

If the head node can indeed access Mosa, and you want the getIP app to execute only on the head node, then define the site "localhost" in the same way you would if running on a single local workstation, and only list getIP in your tc file as being installed on site "localhost".

If I understand your need correctly, that should make Swift only run getIP on the head node.
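
As a rough illustration of the tc-file suggestion above, the sketch below appends a single getIP entry for site "localhost", using the same six-column line format as the tc entries posted elsewhere in this thread. The file name tc.data and the path /path/to/getIP.sh are placeholders, not real locations.

```shell
#!/bin/sh
# Placeholders (assumptions): adjust both to your own setup.
tc_file="${TC_FILE:-tc.data}"
getip_path="${GETIP_PATH:-/path/to/getIP.sh}"

# Columns follow the tc lines shown in this thread:
#   site  app-name  executable-path  null  null  null
# getIP is listed for "localhost" only, which is what the suggestion
# above relies on to keep it on the head node.
printf '%s %s %s null null null\n' localhost getIP "$getip_path" >> "$tc_file"
```

The remote app (readperf in the question) would then be listed only under the compute site, so the two apps never share a site.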
- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From svemalayan at yahoo.com Mon Apr 23 13:00:30 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Mon, 23 Apr 2012 11:00:30 -0700 (PDT)
Subject: [Swift-devel] Unix command from Swift run-time
In-Reply-To: <388286024.152691.1335203316880.JavaMail.root@zimbra.anl.gov>
References: <1335202982.81128.YahooMailNeo@web120001.mail.ne1.yahoo.com> <388286024.152691.1335203316880.JavaMail.root@zimbra.anl.gov>
Message-ID: <1335204030.80783.YahooMailNeo@web120006.mail.ne1.yahoo.com>

Hi Mike,

"If the head node can indeed access Mosa"

We made some modifications in Mosa for our experiments and now only metadata access can be done from the head-node in BG/P.

Thank you for your suggestion; it makes sense. I will try that and get back to you if I have questions.

Thank you
Emalayan

From svemalayan at yahoo.com Tue Apr 24 12:16:24 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Tue, 24 Apr 2012 10:16:24 -0700 (PDT)
Subject: [Swift-devel] Assigning stdout to a string variable in Swift
Message-ID: <1335287784.81635.YahooMailNeo@web120001.mail.ne1.yahoo.com>

Hi All,

Is it possible to assign stdout to a string variable in Swift?

I need to get the IP of a machine by executing a shell script and then assign it to a string variable. Please see the sample code pasted below (here the string variable is written to a file to verify that the program works correctly). But the program didn't make any progress and swift was repeatedly printing the message "Initializing site shared directory:1" on the console.

Any ideas? Is this swift code correct, or does swift not support this feature? If swift does not support this feature, is there a way to use the stdout without writing to a file?

Thank you
Emalayan

1) Source code

    type messagefile;

    app (string machine) get_machine (string fileName) {
        getIP fileName stdout = machine;
    }

    app (messagefile t) printIP (string m) {
        echo m stdout=@filename(t);
    }

    string IP = get_machine("myFile")
    messagefile outfile ;
    outfile = printIP(IP);

2) Tc file:

    localhost cat /bin/cat null null null
    localhost echo /bin/echo null null null
    localhost getIP /home/emalayan/Workspace/Mosa_March_12/tools/synthaticBenchmark/swift-scripts/pipeline-loc-aware/wrappers/getIP.sh null null null

3) Site file:

    /home/emalayan/Mosa_March_12/tools/synthaticBenchmark/swift-scripts/pipeline-loc-aware/swift.work
    0

4) stdout:

    Swift 0.93 swift-r5483 cog-r3339

    RunID: 20120424-0950-tkdyv925
    Progress:  time: Tue, 24 Apr 2012 09:50:45 -0700
    Progress:  time: Tue, 24 Apr 2012 09:51:15 -0700  Initializing site shared directory:1
    Progress:  time: Tue, 24 Apr 2012 09:51:45 -0700  Initializing site shared directory:1
    Progress:  time: Tue, 24 Apr 2012 09:52:15 -0700  Initializing site shared directory:1
    Progress:  time: Tue, 24 Apr 2012 09:52:45 -0700  Initializing site shared directory:1
    Progress:  time: Tue, 24 Apr 2012 09:53:15 -0700  Initializing site shared directory:1

From wozniak at mcs.anl.gov Tue Apr 24 12:19:58 2012
From: wozniak at mcs.anl.gov (Justin M Wozniak)
Date: Tue, 24 Apr 2012 12:19:58 -0500
Subject: [Swift-devel] Assigning stdout to a string variable in Swift
In-Reply-To: <1335287784.81635.YahooMailNeo@web120001.mail.ne1.yahoo.com>
References: <1335287784.81635.YahooMailNeo@web120001.mail.ne1.yahoo.com>
Message-ID: <4F96E0BE.60404@mcs.anl.gov>

Hi Emalayan

The only meaningful output from an app function is in files. I suggest you put the output in a file and read it with readData().

Justin

--
Justin M Wozniak

From svemalayan at yahoo.com Tue Apr 24 13:23:05 2012
From: svemalayan at yahoo.com (Emalayan Vairavanathan)
Date: Tue, 24 Apr 2012 11:23:05 -0700 (PDT)
Subject: [Swift-devel] Assigning stdout to a string variable in Swift
In-Reply-To: <4F96E0BE.60404@mcs.anl.gov>
References: <1335287784.81635.YahooMailNeo@web120001.mail.ne1.yahoo.com> <4F96E0BE.60404@mcs.anl.gov>
Message-ID: <1335291785.88451.YahooMailNeo@web120005.mail.ne1.yahoo.com>

Thank you Justin.

One more question.
I wrote the following program to get the machine location using readData(). But the program fails saying "foo.txt (No such file or directory)". Do you see any obvious bugs in the code below? Could you please help me to get this working?

    type messagefile;

    app (messagefile t) get_machine_via_file(string fileName) {
        getIP fileName stdout=@filename(t);
    }

    app (messagefile t) print (string m) {
        echo m stdout=@filename(t);
    }

    messagefile machine_location ;
    messagefile output ;

    machine_location = get_machine_via_file("MyFile");
    string s = readData(@filename(machine_location));
    output= print(s);

From jonmon at mcs.anl.gov Tue Apr 24 16:27:29 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Tue, 24 Apr 2012 16:27:29 -0500
Subject: [Swift-devel] Assigning stdout to a string variable in Swift
In-Reply-To: <1335291785.88451.YahooMailNeo@web120005.mail.ne1.yahoo.com>
References: <1335287784.81635.YahooMailNeo@web120001.mail.ne1.yahoo.com> <4F96E0BE.60404@mcs.anl.gov> <1335291785.88451.YahooMailNeo@web120005.mail.ne1.yahoo.com>
Message-ID:

So, readData will probably run right away since you are calling it with the name of the file and not the mapped file itself. Swift knows the name of the file to be mapped, so readData will run almost instantly. If you call it on the mapped file (machine_location) it should wait until that variable has been assigned (after the app call).

From wilde at mcs.anl.gov Wed Apr 25 08:22:57 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 25 Apr 2012 08:22:57 -0500 (CDT)
Subject: [Swift-devel] Provider staging problem (690) persists with single-core workers
Message-ID: <1410151390.15.1335360177354.JavaMail.root@zimbra.anl.gov>

Mihael, I just want to confirm that you've seen David's latest update on bug 690:

    Comment #47 From David Kelly 2012-04-24 10:13:03

    Yesterday, I tried a larger set of data on OSG using 1 worker per node and
    saw timeouts. I'll try again later to get a better set of worker logs (need
    to give them all a unique filename so they don't get overwritten when condor
    sends them back). The file below has the coaster service log, swift log, and
    1 worker log.

    http://www.ci.uchicago.edu/~davidk/logs/logs-2012-04-23.tar.gz

I.e., the problem still occurs with workersPerNode=1, on OSG. Does the log above provide what you need to debug this?
- Mike -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Wed Apr 25 12:31:35 2012 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 25 Apr 2012 10:31:35 -0700 Subject: [Swift-devel] Provider staging problem (690) persists with single-core workers In-Reply-To: <1410151390.15.1335360177354.JavaMail.root@zimbra.anl.gov> References: <1410151390.15.1335360177354.JavaMail.root@zimbra.anl.gov> Message-ID: <1335375095.17084.0.camel@blabla> I saw that. It's new code, so some problems need to be ironed out. I'm working on that and the large memory consumption issue. On Wed, 2012-04-25 at 08:22 -0500, Michael Wilde wrote: > Mihael, I just want to confirm that you've seen David's latest update on bug 690: > > Comment #47 From David Kelly 2012-04-24 10:13:03 (-) [reply] > > Yesterday, I tried a larger set of data on OSG using 1 worker per node and saw > timeouts. I'll try again later to get a better set of worker logs (need to give > them all a unique filename so they don't get overwritten when condor sends them > back). The file below has the coaster service log, swift log, and 1 worker log. > > http://www.ci.uchicago.edu/~davidk/logs/logs-2012-04-23.tar.gz > > I.e., the problem still occurs with workersPerNode=1, on OSG. Does the log above provide what you need to debug this? > > - Mike > From wilde at mcs.anl.gov Thu Apr 26 20:37:48 2012 From: wilde at mcs.anl.gov (Michael Wilde) Date: Thu, 26 Apr 2012 20:37:48 -0500 (CDT) Subject: [Swift-devel] How to increase Swift-coaster task rate Message-ID: <1443360180.3250.1335490668378.JavaMail.root@zimbra.anl.gov> Mihael, David, Jon, and I are working on Cray benchmarks for a paper for the Cray Users Group. In tests so far, we are being limited by job submission rates of about 80 tasks/sec. We'd like very much to drive that up closer to 200/sec if at all possible for the benchmarks we're trying to run. 
The current tests are doing sleep 0 jobs with no file transfer to about 2400 cores on a Cray benchmark system. The workdir is set to /dev/shm. The throttles are almost all set way up (Jon can post the specific config and values).

One thing we have not yet done is try to get the log traffic way down; that's next up to try.

We'll revert to testing against 480 cores on raven for now. That should still be enough to push the upper limit of Swift, Karajan and coasters.

Can you give us a set of things to check (set, turn off, etc) to try to get closer to 200 tasks/sec? Do we need to set to /dev/shm in addition to work dir?

This latest run was, I think, with provider staging and pin coaster files.

Thanks,

- Mike

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From hategan at mcs.anl.gov Thu Apr 26 20:47:38 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 26 Apr 2012 18:47:38 -0700
Subject: [Swift-devel] How to increase Swift-coaster task rate
In-Reply-To: <1443360180.3250.1335490668378.JavaMail.root@zimbra.anl.gov>
References: <1443360180.3250.1335490668378.JavaMail.root@zimbra.anl.gov>
Message-ID: <1335491258.20721.2.camel@blabla>

Can I see the log?

I don't think that there is a set of things that I can easily point out, but I can try to see what the problems might be.

What's the submit host (cpu/cores/mem, etc)?

Separate service or auto/local?

Mihael

From ketancmaheshwari at gmail.com Thu Apr 26 21:07:45 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Thu, 26 Apr 2012 22:07:45 -0400
Subject: [Swift-devel] How to increase Swift-coaster task rate
In-Reply-To: <1443360180.3250.1335490668378.JavaMail.root@zimbra.anl.gov>
References: <1443360180.3250.1335490668378.JavaMail.root@zimbra.anl.gov>
Message-ID:

One thing that could be checked is the JVM. Cray's default java appears to be IBM. If the benchmark is not already running on IBM java, it could be tested to see if better performance is achieved.

--
Ketan

From wilde at mcs.anl.gov Thu Apr 26 21:44:54 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 26 Apr 2012 21:44:54 -0500 (CDT)
Subject: [Swift-devel] How to increase Swift-coaster task rate
In-Reply-To: <1335491258.20721.2.camel@blabla>
Message-ID: <829376912.3271.1335494694756.JavaMail.root@zimbra.anl.gov>

We need to re-run the tests on a machine we can get to. The current tests were run for us by a Cray engineer, and we haven't tarred the logs back yet.

Submit host machine was a 6-core Cray login host, probably AMD at 2.5GHz or so, probably 32GB RAM.

Coaster service was local in Swift jvm. Can try a separate service.

Java was (I think) Sun 1.6.0_21, IBM Java is there. Not sure if Java was 64 bit.

Jon will re-run clean tests on the Raven Cray where we have full access.

Advice on dialing the logging way down would be valuable. Outside of that we're trying to avoid all ancillary touches of shared filesystems.

Thanks,

- Mike
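As a rough check on the rates discussed in this thread (about 80 tasks/sec observed, 200 tasks/sec desired, 2400 or 480 cores), the sketch below computes how long each task must run for a given dispatch rate to keep every core busy. It is plain arithmetic (cores divided by rate), not a model of Swift or coaster internals, and the function name is made up for illustration.

```python
def min_task_seconds(cores: int, tasks_per_sec: float) -> float:
    """Shortest task duration at which the given dispatch rate keeps all cores busy.

    In steady state, busy cores = dispatch rate * task duration,
    so duration = cores / rate.
    """
    if tasks_per_sec <= 0:
        raise ValueError("tasks_per_sec must be positive")
    return cores / tasks_per_sec


if __name__ == "__main__":
    # Core counts and rates are the ones mentioned in the thread.
    for cores, rate in [(2400, 80), (2400, 200), (480, 80), (480, 200)]:
        seconds = min_task_seconds(cores, rate)
        print(f"{cores} cores at {rate} tasks/sec: tasks must last >= {seconds:.1f} s")
```

With sleep 0 jobs the tasks finish almost immediately, so the benchmark measures the dispatch rate rather than the core count, which is consistent with 480 cores still being enough to find the upper limit.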
--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From jonmon at mcs.anl.gov Thu Apr 26 22:32:24 2012
From: jonmon at mcs.anl.gov (Jonathan Monette)
Date: Thu, 26 Apr 2012 22:32:24 -0500
Subject: [Swift-devel] How to increase Swift-coaster task rate
In-Reply-To: <829376912.3271.1335494694756.JavaMail.root@zimbra.anl.gov>
References: <829376912.3271.1335494694756.JavaMail.root@zimbra.anl.gov>
Message-ID:

Yes. I will get a log file representing the run for you, Mihael. I believe java was actually IBM but I could be wrong. I do not think we asked the Cray engineer to load sun java. It would be nice if swift logged that info in the log file.

From hategan at mcs.anl.gov Fri Apr 27 13:33:33 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 27 Apr 2012 11:33:33 -0700
Subject: [Swift-devel] How to increase Swift-coaster task rate
In-Reply-To: <829376912.3271.1335494694756.JavaMail.root@zimbra.anl.gov>
References: <829376912.3271.1335494694756.JavaMail.root@zimbra.anl.gov>
Message-ID: <1335551613.30465.2.camel@blabla>

On Thu, 2012-04-26 at 21:44 -0500, Michael Wilde wrote:
> We need to re-run the tests on a machine we can get to. The current tests were run for us by a Cray engineer, and we haven't tarred the logs back yet.
>
> Submit host machine was a 6-core Cray login host, probably AMD at 2.5GHz or so, probably 32GB RAM.

Looks beefy enough. It should give you a better rate than 80j/s.

> Coaster service was local in Swift jvm. Can try a separate service.

That's going to be the fastest, so I'd keep it that way.

> Java was (I think) Sun 1.6.0_21, IBM Java is there. Not sure if Java was 64 bit.
>
> Jon will re-run clean tests on the Raven Cray where we have full access.
>
> Advice on dialing the logging way down would be valuable. Outside of that we're trying to avoid all ancillary touches of shared filesystems.

The obvious: disable worker logging, make sure that the amount of logging is minimal, don't enable provenance logging, etc. But I'm assuming that's already the case, that's why I want logs.

Mihael

From davidk at ci.uchicago.edu Sat Apr 28 12:09:17 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Sat, 28 Apr 2012 12:09:17 -0500 (CDT)
Subject: [Swift-devel] Condor with coasters question
In-Reply-To: <46900181.16129.1335632336707.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <269652826.16180.1335632957236.JavaMail.root@zimbra-mb2.anl.gov>

Hello,

I am trying to get Swift working well on a machine that uses condor. It has 480 available slots. I am using a swift script that will run 1000 tasks.

sites.xml:
/home/davidk/test/benchmark-release/run012
480
480
1
1
1000
10000

cf:
wrapperlog.always.transfer=true
sitedir.keep=false
execution.retries=0
lazy.errors=false
status.mode=provider
use.provider.staging=false
provider.staging.pin.swiftfiles=true
foreach.max.threads=1000

What I am seeing is that only ~70 tasks are active at once. When I look at condor_q, I see there are ~70 jobs that I have submitted, no more, none idle. Any ideas where this limit is coming from?

I thought I would get around this by setting nodeGranularity to 50. But when I do this, what seems to happen is that there are 50 machines allocated per 1 worker.pl, which would make sense for an MPI job, but is not what I want here. (The condor submit script sets machine_count to 50, but only queues 1)

I can get around this now by using the plain condor provider, but ideally would like to use coasters.

Thanks,
David

From hategan at mcs.anl.gov Sat Apr 28 13:55:47 2012
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Sat, 28 Apr 2012 11:55:47 -0700
Subject: [Swift-devel] Condor with coasters question
In-Reply-To: <269652826.16180.1335632957236.JavaMail.root@zimbra-mb2.anl.gov>
References: <269652826.16180.1335632957236.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <1335639347.30035.2.camel@blabla>

On Sat, 2012-04-28 at 12:09 -0500, David Kelly wrote:
> What I am seeing is that only ~70 tasks are active at once. When I look at condor_q, I see there are ~70 jobs that I have submitted, no more, none idle. Any ideas where this limit is coming from?

Most of those jobs should be multi-node jobs (i.e. more than one worker per job).

> I thought I would get around this by setting nodeGranularity to 50. But when I do this, what seems to happen is that there are 50 machines allocated per 1 worker.pl, which would make sense for an MPI job, but is not what I want here.

Not really. It's reasonable and desirable to have multiple machines allocated per 1 job (though that should start an instance of worker.pl on each machine).

From wilde at mcs.anl.gov Sat Apr 28 15:13:15 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 28 Apr 2012 15:13:15 -0500
Subject: [Swift-devel] Condor with coasters question
In-Reply-To:
References: <46900181.16129.1335632336707.JavaMail.root@zimbra-mb2.anl.gov> <269652826.16180.1335632957236.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID:

I meant to cc this to swift-devel so am resending it.

I think multi-node jobs on Condor should work in principle but in practice may need to be tested and debugged.

I think we should first see if we can fill the UC3 cluster with maxnode=1 slots=500.

One possible reason that only 70 jobs were issued is that your prior test, David, looks like it was using default values for the times involved, and possibly Swift "packed" the pending requests into the 70 job slots you saw. Hence my suggestion to try the config below.

- Mike

On Sat, Apr 28, 2012 at 1:28 PM, Michael Wilde wrote:
> David, can you try a test that specifies:
>
> Maxtime 3600
> Maxwalltime 00:00:10 (or as needed for your app)
> High and lowoverallocation 100
>
> I would think each coaster ( x 480 ) should get a separate submit file
> with count 1, just as would be done for PBS.
>
> - Mike

From davidk at ci.uchicago.edu Sat Apr 28 16:54:46 2012
From: davidk at ci.uchicago.edu (David Kelly)
Date: Sat, 28 Apr 2012 16:54:46 -0500 (CDT)
Subject: [Swift-devel] Condor with coasters question
In-Reply-To:
Message-ID: <88446355.16505.1335650086371.JavaMail.root@zimbra-mb2.anl.gov>

I adjusted the parameters a bit and tried again with this configuration:

_WORK_
1000
1000
3600
00:05:00
100
100
1
1
1000
10000

The maximum number of active jobs maxed out at 101 with this.

Thanks,
David

From wilde at mcs.anl.gov Sat Apr 28 17:22:58 2012
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 28 Apr 2012 17:22:58 -0500 (CDT)
Subject: [Swift-devel] Condor with coasters question
In-Reply-To: <88446355.16505.1335650086371.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID: <89507462.5391.1335651778988.JavaMail.root@zimbra.anl.gov>

Try reducing maxtime or increasing maxwalltime.

The Swift script is running 500 cat jobs, right? Each worker has 60 mins wall time. Each app is sized at 5 mins, so 12 app calls can fit per coaster slot. By the time that coasters started 100 slots, it had allocated all the slots it needed to run the work your script has queued up.

If you make your jobs run longer (eg use catsnsleep for say 30 secs), and you give maxwalltime of 10 mins (600), and say maxtime of 900 secs, then the coaster scheduler will think it needs a separate coaster for each one, which is what you want to see.

The same applies when you shift to the real DSSAT code, where the real runtime is 150 secs. Do the division to see what time you need to specify to get the max number of coasters started. If you give too large a maxtime, coasters will think it's best to fill those slots out to their max time rather than launch more coasters.

- Mike
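The division Mike describes can be sketched as below. This is a deliberately simplified model, assumed here for illustration: tasks per slot is maxtime divided by maxwalltime, rounded down, and the number of slots wanted is the task count divided by that. The real coaster block allocator also applies overallocation and other settings, so treat the results as estimates, not predictions. The function names are hypothetical.

```python
import math


def tasks_per_slot(maxtime_s: int, maxwalltime_s: int) -> int:
    """How many back-to-back app calls of maxwalltime_s fit in one slot of maxtime_s."""
    if maxwalltime_s <= 0 or maxtime_s < maxwalltime_s:
        raise ValueError("need 0 < maxwalltime_s <= maxtime_s")
    return maxtime_s // maxwalltime_s


def slots_needed(n_tasks: int, maxtime_s: int, maxwalltime_s: int, slot_limit: int) -> int:
    """Estimated slots requested under the simplified model, capped at slot_limit."""
    wanted = math.ceil(n_tasks / tasks_per_slot(maxtime_s, maxwalltime_s))
    return min(wanted, slot_limit)


if __name__ == "__main__":
    # maxtime 3600 s and maxwalltime 5 min: 12 app calls fit per slot.
    print(tasks_per_slot(3600, 300))
    # 1000 tasks under those settings need far fewer slots than tasks.
    print(slots_needed(1000, 3600, 300, slot_limit=1000))
    # Suggested settings (maxtime 900 s, maxwalltime 600 s): one call per slot,
    # so the estimate is limited only by the slots available (480 on this cluster).
    print(slots_needed(1000, 900, 600, slot_limit=480))
```

Under this model, any maxtime below twice the maxwalltime yields one app call per slot, which is the condition for the scheduler to want one coaster per task.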

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From ketancmaheshwari at gmail.com Mon Apr 30 13:39:54 2012
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Mon, 30 Apr 2012 14:39:54 -0400
Subject: [Swift-devel] Streams
Message-ID:

Hi,

I am working on a DOE powergrid related project here at Cornell. An aim of the project is to compute power grid state estimation and react in time-critical fashion.

The application, at a very high level, is a distributed producer/consumer system where multiple producers produce data streams consumed by multiple consumers in a publish-subscribe model of data flow. The producers (phasor measurement units) produce streams continuously and consumers (State Estimators) can subscribe to the producers. There can be multiple consumers consuming from a single producer for performance and consistency purposes.

Can Swift support this model of computation? In particular, I am wondering how to go about the following aspects with Swift:

1. Describe application which could run in an 'infinite' loop.
2. Mappers to streams. I think these streams should be some kind of named buffers. A memory to memory stream model is what I vaguely view this as.

The streams are binary encoded in big-endian format and could be parsed (by consumers) as id'd tuples, each containing 5-6 fixed-width fields of timestamp, voltage, current, delta etc. data.

There are other requirements of the application and plenty of low level nitty gritty but I think Swift could handle all of them. I am just unsure of the above two at the moment.

We are in discussion with WSU collaborators to deliver some of the 'real' parts of the application. However, in the meantime we do have toy components to test and play with.

Any input is greatly appreciated.

Regards,
--
Ketan

From benc at hawaga.org.uk Mon Apr 30 14:16:49 2012
From: benc at hawaga.org.uk (Ben Clifford)
Date: Tue, 1 May 2012 03:16:49 +0800
Subject: [Swift-devel] Streams
In-Reply-To:
References:
Message-ID: <75896CBC-FF67-4FB0-A6F1-986D3B2EA0D5@hawaga.org.uk>

There's been some discussion about streaming data in the past.

At a high level, you could have an array that never closes, and over time keeps filling up with more elements as they are delivered. Then you could run a regular foreach over that array, and processing would happen as elements arrive.

At a lower level, that's probably really difficult to make work well with the present implementation - I think you would end up filling up memory pretty fast, because there's no mechanism to forget fully processed data...

--
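As a rough illustration of the memory concern, a consumer can process a stream record by record and keep nothing once a record has been handled. The sketch below is Python rather than SwiftScript, and the record layout (a 4-byte unsigned id followed by four big-endian doubles for timestamp, voltage, current and delta) is an assumption made for the example; the thread only says the tuples are big-endian with 5-6 fixed-width fields. All names are hypothetical.

```python
import io
import struct
from typing import BinaryIO, Iterator, NamedTuple


class Sample(NamedTuple):
    source_id: int
    timestamp: float
    voltage: float
    current: float
    delta: float


# ">" selects big-endian with standard sizes and no padding: 4 + 4 * 8 = 36 bytes.
RECORD = struct.Struct(">Idddd")


def read_samples(stream: BinaryIO) -> Iterator[Sample]:
    """Yield one Sample per fixed-width record; stop at EOF or a trailing partial record.

    Assumes a buffered stream whose read(n) returns fewer than n bytes only at EOF.
    """
    while True:
        chunk = stream.read(RECORD.size)
        if len(chunk) < RECORD.size:
            return
        yield Sample(*RECORD.unpack(chunk))


def max_voltage(stream: BinaryIO) -> float:
    """Consume the stream keeping only a running maximum, not the records themselves."""
    best = float("-inf")
    for sample in read_samples(stream):
        best = max(best, sample.voltage)
    return best


if __name__ == "__main__":
    # Three synthetic records standing in for a producer's output.
    raw = b"".join(RECORD.pack(7, float(t), 120.0 + t, 5.0, 0.1) for t in range(3))
    print(max_voltage(io.BytesIO(raw)))
```

Because `read_samples` is a generator, each record becomes garbage as soon as the loop moves on, which is the "forget fully processed data" behaviour that an ever-growing array lacks.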