From jlee734 at gmail.com Sat Mar 9 17:48:17 2013
From: jlee734 at gmail.com (Jay Lee)
Date: Sat, 9 Mar 2013 18:48:17 -0500
Subject: [Swift-user] Variable Declaration
Message-ID:

Hello,

I just started with Swift today, so excuse my lack of knowledge. I have the following code:

type messagefile;
type string;

app (messagefile t) parse(messagefile n) {
    string v = @regexp("abcdefghi", "c(def)g","monkey");
    echo @extractint(n) stdout=@filename(t);
}

app (messagefile t) greeting() {
    echo "Hello, world!" stdout=@filename(t);
}

messagefile outfile <"hello.txt">;
messagefile input <"compile.txt">;

outfile = parse(input);

I get an error:

Could not compile SwiftScript source: line 6:10: expecting a semicolon, found '='

I found that there are mappers that can be used to declare variables (namely files), but are these required?

From wilde at mcs.anl.gov Sat Mar 9 17:55:10 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Sat, 9 Mar 2013 17:55:10 -0600 (CST)
Subject: [Swift-user] Variable Declaration
In-Reply-To:
Message-ID: <144646098.1232621.1362873310054.JavaMail.root@mcs.anl.gov>

Jay,

An app() function's body is restricted to a single command-line template and nothing else. So instead of:

app (messagefile t) parse(messagefile n) {
    string v = @regexp("abcdefghi", "c(def)g","monkey");
    echo @extractint(n) stdout=@filename(t);
}

...you need to create a separate ("compound") function to do things like string v=(). You can embed arbitrary expressions in the app() body, but it can have only one semicolon-terminated command. That's the likely cause of the syntax error.

Also note that your statement "string v = ..." creates a value that, as far as I can see, is not used anywhere, so you may want to re-examine that logic.

- Mike

From ketancmaheshwari at gmail.com Sat Mar 9 18:50:42 2013
From: ketancmaheshwari at gmail.com (Ketan Maheshwari)
Date: Sat, 9 Mar 2013 18:50:42 -0600
Subject: [Swift-user] Variable Declaration
In-Reply-To: <144646098.1232621.1362873310054.JavaMail.root@mcs.anl.gov>
References: <144646098.1232621.1362873310054.JavaMail.root@mcs.anl.gov>
Message-ID:

Jay,

In addition to what Mike said, please note that string is available as a built-in type in Swift, so the "type string;" statement is not required.

--
Ketan

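Editor's sketch (not part of the original thread): a minimal illustration of the restructuring Mike and Ketan describe, under stated assumptions. The string handling moves into a compound function, and the app() body keeps a single command. The names echo_msg and parse_wrapper are hypothetical, the sketch assumes the @-prefixed built-ins used in the original post, and it assumes compile.txt contains a single integer.

```
type messagefile;    // no "type string;" needed: string is built in

// app function: the body is exactly one command-line template
app (messagefile t) echo_msg(string s) {
    echo s stdout=@filename(t);
}

// compound function: declarations and expressions are allowed here
(messagefile t) parse_wrapper(messagefile n) {
    string v = @regexp("abcdefghi", "c(def)g", "monkey");
    int k = @extractint(n);              // integer read from the input file
    t = echo_msg(@strcat(v, " ", k));    // v is now actually used
}

messagefile outfile <"hello.txt">;
messagefile input <"compile.txt">;

outfile = parse_wrapper(input);
```

The mapped declarations at the bottom are unchanged from the original post; only the function that mixes a declaration with a command has been split in two.
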
From lpesce at uchicago.edu Wed Mar 20 13:12:42 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 20 Mar 2013 13:12:42 -0500
Subject: [Swift-user] Question about arrays of struct data
Message-ID: <74DF262F-B5CC-44D8-8021-FFEF9261B57C@uchicago.edu>

I need to create a struct that will become more complex with time. Let's say this is what it looks like now:

type Sample{
    string name;
    string dir;
}

Then I need an array of such structs:

Sample [] mySamples;

However, I would like to create the struct on the fly in the Swift script itself. For arrays I would typically do:

string FileID [] = ["C440.TCGA-BR-7196-10A-01D-2053-08.2","C440.TCGA-CG-4477-10A-01D-1158-08.5"];

How can I do that for a struct? (I don't want to use readData, but I can if there is no other choice.)

Thanks,

Lorenzo

From lpesce at uchicago.edu Wed Mar 20 13:21:29 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 20 Mar 2013 13:21:29 -0500
Subject: [Swift-user] Question about arrays of struct data
In-Reply-To: <74DF262F-B5CC-44D8-8021-FFEF9261B57C@uchicago.edu>
References: <74DF262F-B5CC-44D8-8021-FFEF9261B57C@uchicago.edu>
Message-ID:

I changed my mind, readData works just fine ;-)

From wilde at mcs.anl.gov Wed Mar 20 13:32:04 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 20 Mar 2013 13:32:04 -0500 (CDT)
Subject: [Swift-user] Question about arrays of struct data
In-Reply-To:
Message-ID: <478308849.1710420.1363804324587.JavaMail.root@mcs.anl.gov>

> I changed my mind, readData works just fine ;-)

OK.

If the arrays are of modest size, you can make them constant array constructors and then set the structure with a foreach and assignment statements.

Swift doesn't have structure initialization constructors, but perhaps it should, e.g. a la Perl.

- Mike

From lpesce at uchicago.edu Wed Mar 20 13:53:06 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 20 Mar 2013 13:53:06 -0500
Subject: [Swift-user] Question about arrays of struct data
In-Reply-To: <478308849.1710420.1363804324587.JavaMail.root@mcs.anl.gov>
References: <478308849.1710420.1363804324587.JavaMail.root@mcs.anl.gov>
Message-ID: <00A95F70-3657-4C26-9561-7790CE70CD0F@uchicago.edu>

Thanks Mike,

The reason why I eventually decided that the standard approach was better is that it is a lot easier to make sense of the structured files for readData than to try to hack together a large structure with a lot of files (hundreds or more). In fact it is so much better that I regret not having done it before :-)

From lpesce at uchicago.edu Wed Mar 20 15:27:05 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 20 Mar 2013 15:27:05 -0500
Subject: [Swift-user] What system calls do the mappers use?
Message-ID: <1FA7D975-ADED-4C5C-A6E6-519872DFF5AC@uchicago.edu>

Hi --

I am working with mappers that might be repeated thousands of times in each workflow run. Lustre doesn't like that type of search when it is based on approaches similar to "ls"; on the other hand, "find" works fine.

I could conceivably find a workaround, but I would rather not have to do it.

Lorenzo

From wilde at mcs.anl.gov Wed Mar 20 15:37:53 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 20 Mar 2013 15:37:53 -0500 (CDT)
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To: <1FA7D975-ADED-4C5C-A6E6-519872DFF5AC@uchicago.edu>
Message-ID: <778840368.1784444.1363811873110.JavaMail.root@mcs.anl.gov>

The best approach, if you want fine-grained control over how the mapper operates (other than writing your own Java mapper), is to do the mapping in an app(), return an array of mapped strings via readData(), and then map the strings to the file array using array_mapper (from an array of strings) or fixed_array_mapper (from one string). You can also use an ext_mapper to get the degree of control you want.

The difference between traversing a directory structure with find vs. ls is that a simple "find ." with no other filters just reads directories, while most ls calls and find filters need to do a stat() on each file's inode to look at its metadata. On a heavily loaded shared file server like GPFS or Lustre, these metadata operations are what typically cause significant overhead and slowdown, and they may be slowed further by locks held by apps doing metadata updates.

- Mike

From wilde at mcs.anl.gov Wed Mar 20 15:41:03 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 20 Mar 2013 15:41:03 -0500 (CDT)
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To: <1FA7D975-ADED-4C5C-A6E6-519872DFF5AC@uchicago.edu>
Message-ID: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov>

Also, to answer your question more directly: "I don't know". You can try to answer this by writing some very simple Swift scripts that do the kinds of built-in mappings you are looking at, and use strace() with suitable filtering and grepping to see what Swift (via Java) is doing to implement the mapping.

Mihael may be able to point you to the Java classes that do the mapping to distill this process further.

- Mike

From lpesce at uchicago.edu Wed Mar 20 15:43:19 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 20 Mar 2013 15:43:19 -0500
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov>
References: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov>
Message-ID:

Can one make hashes of arrays, or arrays of arrays of different sizes, in Swift? E.g., an array of an array type of variable size?

From hategan at mcs.anl.gov Wed Mar 20 15:55:34 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 20 Mar 2013 13:55:34 -0700
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov>
References: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov>
Message-ID: <1363812934.28485.4.camel@echo>

On Wed, 2013-03-20 at 15:41 -0500, Michael Wilde wrote:
> Also, to answer your question more directly: "I don't know". You can
> try to answer this by writing some very simple Swift scripts that do
> the kinds of built-in mappings you are looking at, and use strace()
> with suitable filtering and grepping to see what Swift (via Java) is
> doing to implement the mapping.
>
> Mihael may be able to point you to the Java classes that do the
> mapping to distill this process further.

If it's input data, I'm afraid it makes calls to cog to list files, and that involves building an object with all kinds of file information. If it's output data, it should not do that.

It might be worth timing this to see the actual difference, and then it might be worth "fixing" the problem if it exists.

Mihael

From hategan at mcs.anl.gov Wed Mar 20 15:58:22 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 20 Mar 2013 13:58:22 -0700
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To:
References: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov>
Message-ID: <1363813102.28485.5.camel@echo>

On Wed, 2013-03-20 at 15:43 -0500, Lorenzo Pesce wrote:
> Can one make hashes of arrays, or arrays of arrays of different sizes,
> in Swift? E.g., an array of an array type of variable size?

Yes. Arrays are dynamic and sparse (internally they are stored as hashtables).

Mihael

From wilde at mcs.anl.gov Wed Mar 20 16:04:53 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Wed, 20 Mar 2013 16:04:53 -0500 (CDT)
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To:
Message-ID: <535533672.1800767.1363813493840.JavaMail.root@mcs.anl.gov>

Lorenzo,

All Swift arrays are variable in size: you don't declare the array size in the declaration.
Further, they can be sparse (because the implementation is in fact a hashtable).

Swift has code that supports user-level hashes by declaring arrays with string instead of integer keys. I thought this made it into the User Guide, but I see now that it did not. It's possible (even likely) that this is because the code is not in trunk yet.

Can anyone on the devel team reply with the status of associative arrays?

Thanks,

- Mike

From hategan at mcs.anl.gov Wed Mar 20 16:52:17 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 20 Mar 2013 14:52:17 -0700
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To: <535533672.1800767.1363813493840.JavaMail.root@mcs.anl.gov>
References: <535533672.1800767.1363813493840.JavaMail.root@mcs.anl.gov>
Message-ID: <1363816337.14167.1.camel@echo>

They are in trunk. They should also be in 0.94. You declare them as:

valueType[keyType] arrayName;

For example:

int[string] a;
a["one"] = 1;

Mihael

From lpesce at uchicago.edu Wed Mar 20 17:01:19 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 20 Mar 2013 17:01:19 -0500
Subject: [Swift-user] What system calls do the mappers use?
In-Reply-To: <1363813102.28485.5.camel@echo>
References: <445553494.1786893.1363812063491.JavaMail.root@mcs.anl.gov> <1363813102.28485.5.camel@echo>
Message-ID:

On Mar 20, 2013, at 3:58 PM, Mihael Hategan wrote:

> On Wed, 2013-03-20 at 15:43 -0500, Lorenzo Pesce wrote:
>> Can one make hashes of arrays, or arrays of arrays of different sizes, in Swift?
>> E.g., an array of an array type of variable size?
>
> Yes. Arrays are dynamic and sparse (internally they are stored as
> hashtables).

Good. I think that I will create a file with all the file names and then read it in with readData, avoiding the mapper altogether. This also has the advantage of leaving some room to exploit data locality.

From lpesce at uchicago.edu Fri Mar 22 14:00:26 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Fri, 22 Mar 2013 14:00:26 -0500
Subject: [Swift-user] Question about arrays and ordering of execution
Message-ID:

I apologize if I asked this question already or if it is obvious; sometimes I mix all these problems together...

I start with a number of files associated with a specific Sample, do something with all of them, then merge them and move on to do more things (which also involves splitting them and remerging them).

My rough Swift code for it would be the following.

string [] FILES = readdata.... ;

string {} OUTFILES = ; # don't exist yet

for file in FILES {

(OUTFILE{file}...) do_something (file);
}

(next_file) = do_somethingmore(OUTFILE);

My question is: can I use an array/hash of files as input to a function, and will this be sufficient to tell Swift to wait until *all* the OUTFILE elements have been run?

If not, which is the next simplest approach (most portable, less prone to fall to pieces, less dependent on filesystems and so on)?

Thanks a lot as usual,

Lorenzo

From hategan at mcs.anl.gov Fri Mar 22 14:16:40 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Fri, 22 Mar 2013 12:16:40 -0700
Subject: [Swift-user] Question about arrays and ordering of execution
In-Reply-To:
References:
Message-ID: <1363979800.31616.1.camel@echo>

On Fri, 2013-03-22 at 14:00 -0500, Lorenzo Pesce wrote:
> I apologize if I asked this question already or if it is obvious;
> sometimes I mix all these problems together...
>
> I start with a number of files associated with a specific Sample, do
> something with all of them, then merge them and move on to do more
> things (which also involves splitting them and remerging them).
>
> My rough Swift code for it would be the following.
>
> string [] FILES = readdata.... ;
>
> string {} OUTFILES = ; # don't exist yet
>
> for file in FILES {
>
> (OUTFILE{file}...) do_something (file);
> }
>
> (next_file) = do_somethingmore(OUTFILE);
>
> My question is: can I use an array/hash of files as input to a
> function, and will this be sufficient to tell Swift to wait until
> *all* the OUTFILE elements have been run?

Yes. Swift will wait for all the input parameters to an app (and that includes waiting until it knows exactly how many items are in an array, i.e., when it knows that no more items will ever be added).

Mihael

From wilde at mcs.anl.gov Fri Mar 22 14:17:36 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Fri, 22 Mar 2013 14:17:36 -0500 (CDT)
Subject: [Swift-user] Question about arrays and ordering of execution
In-Reply-To:
Message-ID: <1302314160.494686.1363979856221.JavaMail.root@mcs.anl.gov>

Lorenzo,

> My question is: can I use an array/hash of files as input to a
> function, and will this be sufficient to tell Swift to wait until
> *all* the OUTFILE elements have been run?

By "merge them" do you mean "merge them all into one file"?

If do_somethingmore is the merge function, and it's an app(), it will wait for OUTFILE to be closed before it runs.

There are very few situations in which you need to tell Swift to wait; that should happen transparently. If you refer to a data element (i.e., a scalar, array element, array, etc.) with an app or a primitive, Swift will wait for that element to have a value.

If OUTFILE is very large, however, you may need to use some special techniques to pass all the file names to an app. (I think that involves using writeData() to write the file names to another file and passing that file.) I'm not sure that's well tested or sufficiently documented yet, but let's deal with that after you get the basic structure of the script worked out.

Another thing to watch out for is whether string-indexed arrays are handled correctly by all mappers. Would it be useful for you to work through the logic of a simple case first? I'd suggest writing up the example above to handle 10 files first. I think you can find this logic in the MODIS example in the 2011 Swift Language paper.

- Mike

From lpesce at uchicago.edu Fri Mar 22 14:20:22 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Fri, 22 Mar 2013 14:20:22 -0500
Subject: [Swift-user] Question about arrays and ordering of execution
In-Reply-To: <1302314160.494686.1363979856221.JavaMail.root@mcs.anl.gov>
References: <1302314160.494686.1363979856221.JavaMail.root@mcs.anl.gov>
Message-ID: <31CF5665-92B0-451F-86D7-FD65464BDD89@uchicago.edu>

Thank you all for the reply. I am going to try it right now.
:-)

It is rarely more than 100 files (the overall operation might be repeated 10,000 times).

On Mar 22, 2013, at 2:17 PM, Michael Wilde wrote:

>
> Lorenzo,
>
>> I start with having a number of files associated with a specific
>> Sample, do something with all of them, then merge them and move on
>> to do more things (which also involved splitting them and remerging
>> them).
>>
>> My rough swift code for it would be the following.
>>
>> string [] FILES = readdata.... ;
>>
>> string {} OUTFILES = ; # don't exist yet
>>
>> for file in FILES {
>>
>> (OUTFILE{file}...) do_something (file);
>> }
>>
>> (next_file) = do_somethingmore(OUTFILE);
>>
>>
>> My question is: can I use an array/hash of files as input to a
>> function and will this be sufficient to tell swift to wait till
>> *all* the OUTFILE elements have been run?
>
> By "merge them" do you mean "merge them all into one file"?
>
> If do_somethingmore is the merge function, and it's an app(), it will wait for OUTFILE to be closed before it runs.
>
> There are very few situations in which you need to tell Swift to wait; that should happen transparently. If you refer to a data element (i.e., a scalar, array element, array, etc.) in an app or a primitive, Swift will wait for that element to have a value.
>
> If OUTFILE is very large, however, you may need to use some special techniques to pass all the file names to an app. (I think that involves using writeData() to write the file names to another file and passing that file.) I'm not sure that's well tested or sufficiently documented yet, but let's deal with that after you get the basic structure of the script worked out.
>
> Another thing to watch out for is whether string-indexed arrays are handled correctly by all mappers. Would it be useful for you to work through the logic of a simple case first? I'd suggest writing up the example above to handle 10 files first. I think you can find this logic in the MODIS example in the 2011 Swift Language paper.
>
> - Mike
>
>> If not, which is the next simplest approach (most portable, less
>> prone to fall to pieces, less dependent on filesystems and so on).
>>
>> Thanks a lot as usual,
>>
>> Lorenzo
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>>

From lpesce at uchicago.edu Thu Mar 28 09:15:27 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Thu, 28 Mar 2013 09:15:27 -0500
Subject: [Swift-user] Question about arrays and ordering of execution
In-Reply-To: <1302314160.494686.1363979856221.JavaMail.root@mcs.anl.gov>
References: <1302314160.494686.1363979856221.JavaMail.root@mcs.anl.gov>
Message-ID: <47EB5B9A-E375-4FDE-99F5-1D4D1184853A@uchicago.edu>

Thanks Mike. It looks like it is working. Now we need to scale it up (and then add some more twists...). Medium-scale tests are planned for today (up to 50-100 nodes).

If this works we have only problem type II and then we are ready to plan the seminar. :-)

From iraicu at cs.iit.edu Fri Mar 29 23:19:55 2013
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Fri, 29 Mar 2013 23:19:55 -0500
Subject: [Swift-user] CALL FOR PARTICIPATION: IEEE/ACM CCGrid 2013
Message-ID: <515667EB.90406@cs.iit.edu>

**** CALL FOR PARTICIPATION ****

***********************************************************
***
*** EARLY REGISTRATION DEADLINE: April 22, 2013 ***
***
***********************************************************

and

**** CALL FOR POSTERS ****

The 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid 2013)

Delft University of Technology, Delft, the Netherlands
May 13-16, 2013

http://www.pds.ewi.tudelft.nl/ccgrid2013

CCGrid is a series of very successful conferences, sponsored by the IEEE Computer Society Technical Committee on Scalable Computing (TCSC) and the ACM, with the overarching goal of bringing together international researchers, developers, and users to provide an international forum to present leading research activities and results on a broad range of topics related to clusters, grids and clouds and their applications.

**** VENUE ****

The CCGrid 2013 conference will be held on the campus of Delft University of Technology, which was founded in 1842 by King William II and which is the oldest and largest technical university in the Netherlands. It is well established as one of the leading technical universities in the world.

Delft is a small, historical town dating back to the 13th century. Delft has many old buildings and small canals, and it has a lively atmosphere. The city offers a large variety of hotels and restaurants. Many other places of interest (e.g., Amsterdam and The Hague) are within one hour's travel.

Traveling to Delft is easy.
Delft is close to Amsterdam Schiphol Airport (60 km, 45 min by train), which has direct connections from all major airports in the world. Delft also has excellent train connections to the rest of Europe.

**** HIGHLIGHTS OF THE CONFERENCE PROGRAM ****

- A keynote by the winner of the IEEE Award for Excellence in Scalable Computing
  Speaker: Marc Snir, Argonne National Laboratory and University of Illinois at Urbana-Champaign, USA
  Title: Programming Models for High-Performance Computing
- Two additional keynote speakers:
  * Speaker: Simon Portegies Zwart, Leiden University, the Netherlands
    Title: The Astronomical Multipurpose Software Environment and the Ecology of Star Clusters
  * Speaker: Daniel A. Reed, University of Iowa, USA
    Title: Clusters, Grids and Clouds: A Look from Both Sides
- Four workshops and three tutorials on Monday, May 13
- 14 technical paper sessions
- A poster presentation and a poster session plus reception
- A panel on Cloud Computing
- A conference dinner on Wednesday, May 15

**** CALL FOR POSTERS ****

CCGrid 2013 offers conference attendees the opportunity to participate in the poster session on Tuesday afternoon. For details on how to submit a poster, please consult the conference website (look for web-published posters). The submission deadline is April 15, 2013.

**** GENERAL CHAIR ****
Dick Epema, Delft University of Technology, the Netherlands

**** PROGRAM CHAIR ****
Thomas Fahringer, University of Innsbruck, Austria

**** PROGRAM VICE-CHAIRS ****
Rosa Badia, Barcelona Supercomputing Center, Spain
Henri Bal, Vrije Universiteit, the Netherlands
Marios Dikaiakos, University of Cyprus, Cyprus
Kirk Cameron, Virginia Tech, USA
Daniel Katz, University of Chicago & Argonne Nat Lab, USA
Kate Keahey, Argonne National Laboratory, USA
Martin Schulz, Lawrence Livermore National Laboratory, USA
Douglas Thain, University of Notre Dame, USA
Cheng-Zhong Xu, Shenzhen Inst.
of Advanced Technology, China

**** DOCTORAL SYMPOSIUM CO-CHAIRS ****
Yogesh Simmhan, University of Southern California, USA
Ana Varbanescu, Delft University of Technology, the Netherlands

**** SCALE CHALLENGE CO-CHAIRS ****
Alexandru Iosup, Delft University of Technology, the Netherlands
Douglas Thain, University of Notre Dame, USA

**** POSTERS CHAIR ****
Rob van Nieuwpoort, Netherlands eScience Center, the Netherlands

**** WORKSHOPS CO-CHAIRS ****
Shantenu Jha, Rutgers and Louisiana State University, USA
Ioan Raicu, Illinois Institute of Technology, USA

**** TUTORIALS CHAIR ****
Radu Prodan, University of Innsbruck, Austria

**** SUBMISSIONS AND PROCEEDINGS CHAIR ****
Pavan Balaji, Argonne National Laboratory, USA

**** FINANCE AND REGISTRATION CHAIR ****
Alexandru Iosup, Delft University of Technology, the Netherlands

**** PUBLICITY CO-CHAIRS ****
Nazareno Andrade, Universidade Federal de Campina Grande, Brazil
Gabriel Antoniu, INRIA, France
Bahman Javadi, University of Western Sydney, Australia
Ioan Raicu, Illinois Institute of Technology and Argonne National Laboratory, USA
Kin Choong Yow, Shenzhen Inst. of Advanced Technology, China

**** CYBER CHAIR ****
Stephen van der Laan, Delft University of Technology, the Netherlands

**** LOCAL ARRANGEMENTS ****
Esther van Rooijen, Delft University of Technology, the Netherlands

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Editor: IEEE TCC, Springer JoCCASA
Chair: IEEE/ACM MTAGS, ACM ScienceCloud, IEEE/ACM DataCloud
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
LinkedIn: http://www.linkedin.com/in/ioanraicu
Google: http://scholar.google.com/citations?user=jE73HYAAAAAJ
=================================================================
=================================================================

From iraicu at cs.iit.edu Fri Mar 29 23:55:08 2013
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Fri, 29 Mar 2013 23:55:08 -0500
Subject: [Swift-user] Call for Posters: ACM HPDC 2013
Message-ID: <5156702C.8070105@cs.iit.edu>

**** CALL FOR POSTERS ****

The 22nd International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'13)

New York City, USA - June 17-21, 2013

http://www.hpdc.org/2013

The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) is the premier annual conference on the design, the implementation, the evaluation, and the use of parallel and distributed systems for high-end computing. HPDC'13 will take place in the heart of iconic New York City from June 17-21. The conference will be held on June 19-21 (Wednesday to Friday), with affiliated workshops taking place on June 17-18 (Monday and Tuesday).

HPDC'13 will feature a poster session that will provide the right environment for lively and informal discussions on various high performance parallel and distributed computing topics.
We *invite all potential authors* to submit their contribution to this poster session in the form of a two-page PDF abstract (we recommend using the ACM Proceedings style, and fonts not smaller than 10 point). Posters may be accompanied by practical demonstrations.

Participating posters will be selected based on the following criteria:

- Submissions must describe new, interesting ideas on any HPDC topics of interest
- Submissions can present work in progress, but we strongly encourage the authors to include preliminary experimental results, if available
- Student submissions meeting the above criteria will be given preference

Please provide the following information in your PDF file:

- Poster title
- Author names, affiliations, and email addresses
- Note which authors, if any, are students
- Indicate if you plan to set up a demo with your poster (the authors and organizers need to agree that the requirements for the demo to function can be met at the site of the poster exhibition)

Abstracts must be submitted through EasyChair (https://www.easychair.org/conferences/?conf=hpdc13posters) *before May 15 2013, 23:59 EDT*. Authors will be notified of acceptance or rejection via e-mail by May 20, 2013. No reviews will be provided.

Posters will be published online on the conference website. Each poster will have an A0 panel in the poster exhibition area, which will also include posters of the HPDC accepted papers.

The *poster session* will be held on Wednesday, June 19, in the late afternoon, and it will start with a poster advertising session during which the author(s) of each poster will give a very short presentation (2 slides, 1-2 minutes) of their poster. Following these presentations, the poster exhibition will be opened for visiting and, we hope, for fruitful discussions. Therefore, we kindly request at least one author of each poster to be present throughout the entire session.
For any questions about the submission, selection, and presentation of the accepted posters, please contact the Posters Chair -- Ivan Rodero, Rutgers University.

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Editor: IEEE TCC, Springer JoCCASA
Chair: IEEE/ACM MTAGS, ACM ScienceCloud, IEEE/ACM DataCloud
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
LinkedIn: http://www.linkedin.com/in/ioanraicu
Google: http://scholar.google.com/citations?user=jE73HYAAAAAJ
=================================================================
=================================================================