From jamalphd at gmail.com Mon Jun 2 18:20:46 2008
From: jamalphd at gmail.com (J A)
Date: Mon, 2 Jun 2008 19:20:46 -0400
Subject: [Swift-user] Performance of Swift
Message-ID:

Hi All:

Based on my reading, the performance of executing a Swift workflow depends on the parallelism that the workflow has.

If I have a workflow that contains several processors where each processor (procedure) depends on the previous one (the output of processor "A" is the input for processor "B", and so on), how does the performance of using Swift in this case compare to other systems that execute workflows where there isn't any parallelism in the workflow?

--
Thanks,
Jamal

From hategan at mcs.anl.gov Mon Jun 2 18:40:34 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 02 Jun 2008 18:40:34 -0500
Subject: [Swift-user] Performance of Swift
In-Reply-To:
References:
Message-ID: <1212450034.22473.5.camel@localhost>

The Swift engine itself should be about equally fast (or equally slow, depending on perspective) whether you have N jobs that are sequential or N jobs that are parallel. However, there may be scheduling constraints (such as "don't run more than 2 parallel jobs on this site right now") that may interfere with that.

You can measure a large portion of the Swift overhead by using something like -dryrun on the command line. Some of it is constant overhead (i.e. the JVM + engine startup), and some of it is nearly linear in the number of jobs (whether parallel or sequential).

In terms of comparisons with other systems, I am not aware of any such recent comparisons. Others may know different.

Mihael

On Mon, 2008-06-02 at 19:20 -0400, J A wrote:
> Based on my reading, the performance of executing a Swift workflow
> depends on the parallelism that the workflow has.
>
> If I have a workflow that contains several processors where each
> processor (procedure) depends on the previous one, how does the
> performance of using Swift in this case compare to other systems that
> execute workflows where there isn't any parallelism in the workflow?
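A quick way to see that constant-plus-linear overhead for yourself, following Mihael's mention of -dryrun (the workflow file name below is just a placeholder, and option spelling can differ between Swift releases):

    # Exercises parsing and the engine without submitting real jobs,
    # so the elapsed time approximates Swift's own overhead for this script.
    time swift -dryrun myworkflow.swift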
From wilde at mcs.anl.gov Mon Jun 2 18:47:44 2008
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Mon, 02 Jun 2008 18:47:44 -0500
Subject: [Swift-user] Performance of Swift
In-Reply-To:
References:
Message-ID: <484486A0.5050508@mcs.anl.gov>

We don't have any recent data that I know of comparing Swift performance to other workflow systems.

When executing a serial workflow, performance will depend a lot on what job and data provider you are using. What kind of application are you considering, and what kind of latency between jobs are you looking for? What execution environment are you interested in? (GRAM, PBS, local, or something different?) What's the profile of your jobs in terms of input data size, job duration, and output data size? All of these will affect the performance of a serial job pipeline of the kind you describe.

If your procedures behave in a streaming manner (in that they start writing output while still reading input) then perhaps you want to run them as a UNIX pipeline instead of separate procedures under Swift.

- Mike

On 6/2/08 6:20 PM, J A wrote:
> If I have a workflow that contains several processors where each
> processor (procedure) depends on the previous one, how does the
> performance of using Swift in this case compare to other systems that
> execute workflows where there isn't any parallelism in the workflow?

--
Michael Wilde
Computation Institute
University of Chicago and Argonne National Laboratory

From iraicu at cs.uchicago.edu Tue Jun 3 11:05:57 2008
From: iraicu at cs.uchicago.edu (Ioan Raicu)
Date: Tue, 03 Jun 2008 11:05:57 -0500
Subject: [Swift-user] Performance of Swift
In-Reply-To:
References:
Message-ID: <48456BE5.3080307@cs.uchicago.edu>

Hi,
There are several papers out there from our group that show different aspects of Swift's performance. Here are a few:

* http://people.cs.uchicago.edu/~iraicu/publications/2008_NOVA08_book-chapter_Swift.pdf
  o Figure 9: shows the memory footprint per job (aka tasks, or nodes in the DAG graph)
    + a memory footprint of about 3.2KB per node
  o Figure 18: shows a large-scale application
    + 20K tasks on 200 CPUs with average task lengths of 200 seconds is a comfortable range for Swift and Falkon
    + we have more recent results, not published yet, with 16K tasks on 2048 CPUs and an average task length of 87 seconds, which worked well
* http://people.cs.uchicago.edu/~iraicu/publications/2007_SWF07_Swift.pdf
  o Figure 6: shows the speedup achieved with different task lengths
    + the conclusion is that using multi-level scheduling with the Falkon provider, even tasks in the range of seconds can achieve good speedup
  o Figure 7: shows the throughput in tasks/sec achieved by Swift
    + Swift achieves 50+ tasks/sec throughput using Falkon
    + the paragraph right after this figure mentions that Swift running directly with GRAM2 and PBS can achieve about 2 jobs/sec; the implication is that jobs typically take 15~60 seconds to start up, which reflects the cost of scheduling, scheduling cycles, and the local resource manager's (LRM) time to set up the remote nodes; there are also limitations on how many jobs can be submitted at a time, as each queued job might consume resources on the LRM, or there might be policies in place that limit the number of jobs that can be queued; this means that aggressive throttling must take place, which in practice reduces the sustained rate at which Swift can submit/execute jobs to a single site to even lower than 2 jobs/sec

So, to answer your question, the performance of Swift (and any other workflow system) will depend heavily on how efficiently you can dispatch jobs/tasks to remote resources, how long the jobs/tasks are, how data-intensive the application is, and how much data movement must happen before and after each job runs. If you have a fast enough file system, and your application execution times are small, you can expect anywhere from 1 to 50 jobs/sec from Swift, depending on what technologies you use to interface between Swift and the remote resources (e.g. GRAM, PBS, Condor, Falkon, etc).
Cheers,
Ioan

J A wrote:
> If I have a workflow that contains several processors where each
> processor (procedure) depends on the previous one, how does the
> performance of using Swift in this case compare to other systems that
> execute workflows where there isn't any parallelism in the workflow?

--
===================================================
Ioan Raicu
Ph.D. Candidate
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: iraicu at cs.uchicago.edu
Web: http://www.cs.uchicago.edu/~iraicu
http://dev.globus.org/wiki/Incubator/Falkon
http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
===================================================

From lixi at uchicago.edu Tue Jun 3 14:24:11 2008
From: lixi at uchicago.edu (lixi at uchicago.edu)
Date: Tue, 3 Jun 2008 14:24:11 -0500 (CDT)
Subject: [Swift-user] Failed to link input file
Message-ID: <20080603142411.BAW94307@m4500-03.uchicago.edu>

Hi,

Recently, I have encountered the following error many times:
...
Host: UCSDT2
Directory: workflowtest-20080603-0934-cctuq211/jobs/2/node-2sl4okti
stderr.txt:
stdout.txt:
----
Caused by:
UCSDT2 Failed to link input file
_concurrent/intermediatefile-272352a4-9803-4509-8f19-fcddb7de230b-
...

In fact, it has also happened before on sites other than "UCSDT2". Can anyone help me determine what this error actually means? Is it something to do with my workflow itself, or with the remote sites? If I want to avoid it, what should I check for when selecting sites?

I will appreciate any suggestions.

Thanks,
Xi

From hategan at mcs.anl.gov Tue Jun 3 14:40:31 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Jun 2008 14:40:31 -0500
Subject: [Swift-user] Failed to link input file
In-Reply-To: <20080603142411.BAW94307@m4500-03.uchicago.edu>
References: <20080603142411.BAW94307@m4500-03.uchicago.edu>
Message-ID: <1212522031.14982.13.camel@localhost>

On Tue, 2008-06-03 at 14:24 -0500, lixi at uchicago.edu wrote:
> Caused by:
> UCSDT2 Failed to link input file
> _concurrent/intermediatefile-272352a4-9803-4509-8f19-fcddb7de230b-

What's in the wrapper log for that job?

From lixi at uchicago.edu Tue Jun 3 14:52:47 2008
From: lixi at uchicago.edu (lixi at uchicago.edu)
Date: Tue, 3 Jun 2008 14:52:47 -0500 (CDT)
Subject: [Swift-user] Failed to link input file
Message-ID: <20080603145247.BAX00142@m4500-03.uchicago.edu>

I couldn't find the wrapper log for that node.
The log file and wrapper logs are on CI:
/home/lixi/newswift/latest/score/1000/workflowtest-20080603-1243-i380lqr6.log
/home/lixi/newswift/latest/score/1000/workflowtest-20080603-1243-i380lqr6.d

Thanks,
Xi

---- Original message ----
>Date: Tue, 03 Jun 2008 14:40:31 -0500
>From: Mihael Hategan
>Subject: Re: [Swift-user] Failed to link input file
>To: lixi at uchicago.edu
>Cc: swift-user
>
>What's in the wrapper log for that job?

From hategan at mcs.anl.gov Tue Jun 3 14:58:19 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Tue, 03 Jun 2008 14:58:19 -0500
Subject: [Swift-user] Failed to link input file
In-Reply-To: <20080603145247.BAX00142@m4500-03.uchicago.edu>
References: <20080603145247.BAX00142@m4500-03.uchicago.edu>
Message-ID: <1212523099.16304.0.camel@localhost>

On Tue, 2008-06-03 at 14:52 -0500, lixi at uchicago.edu wrote:
> I couldn't find the wrapper log for that node.

That very likely means that the shared filesystem is not working properly on that node.

From jamalphd at gmail.com Tue Jun 3 15:41:44 2008
From: jamalphd at gmail.com (J A)
Date: Tue, 3 Jun 2008 16:41:44 -0400
Subject: [Swift-user] Performance of Swift
In-Reply-To: <48456BE5.3080307@cs.uchicago.edu>
References: <48456BE5.3080307@cs.uchicago.edu>
Message-ID:

Thank you all for your replies.

I have a workflow that I developed using C code. I am thinking of using Swift to execute the workflow, so my thinking is that I first need to change the code into a Swift script.

More info about my workflow. The workflow consists of several major tasks:

Task 1: create 1000 unique strings, where each string is 1000 bytes.
Task 2: merge the strings, where every 2 strings (A, B) exchange a segment at a certain point and produce 1 string (C) with the same length (1000 bytes). Then C replaces B.
Task 3: duplicate the list of strings, so we now have 2000 strings.
Task 4: randomly choose 1000 strings from the current 2000 strings.
Task 5: repeat Tasks 2, 3, and 4 N times (N is given), where the list of strings used in each iteration is the output of Task 4 from the previous one.

Do you think changing the whole program into Swift script is necessary, or just certain sections?
Can I just use wrappers around certain tasks and use SwiftScript to call these tasks? Will the performance be the same?

Any suggestions will be really appreciated.

Thanks,
Jamal

On 6/3/08, Ioan Raicu wrote:
> So, to answer your question, the performance of Swift (and any other
> workflow system) will depend heavily on how efficiently you can dispatch
> jobs/tasks to remote resources, how long the jobs/tasks are, how
> data-intensive the application is, and how much data movement must
> happen before and after each job runs.
From jamalphd at gmail.com Sat Jun 7 05:18:29 2008
From: jamalphd at gmail.com (J A)
Date: Sat, 7 Jun 2008 06:18:29 -0400
Subject: [Swift-user] Performance of Swift
In-Reply-To:
References: <48456BE5.3080307@cs.uchicago.edu>
Message-ID:

> Thank you all for your replies.
>
> I have a workflow that I developed using C code. I am thinking of using
> Swift to execute the workflow, so my thinking is that I first need to
> change the code into a Swift script.
>
> Do you think changing the whole program into Swift script is necessary,
> or just certain sections? Can I just use wrappers around certain tasks
> and use SwiftScript to call these tasks?
>
> Will the performance be the same?
From benc at hawaga.org.uk Sat Jun 7 07:03:21 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Sat, 7 Jun 2008 12:03:21 +0000 (GMT)
Subject: [Swift-user] Performance of Swift
In-Reply-To:
References: <48456BE5.3080307@cs.uchicago.edu>
Message-ID:

> > Do you think changing the whole program into Swift script is necessary,
> > or just certain sections? Can I just use wrappers around certain tasks
> > and use SwiftScript to call these tasks?

Almost definitely do not convert that whole program into SwiftScript - Swift is not intended to efficiently execute "short" operations like string operations. It would deal better with plugging together larger pieces of your application (eg pieces that take minutes to run), with those pieces implemented in (in your case) perhaps C.

To get decent benefit, though, I think you will need to figure out which pieces can run in parallel - breaking your app into eg 4 pieces and then only running them in sequence won't give much/any performance improvement.

Your program looks almost, but not quite, like a genetic algorithm implementation; and there is a lot on the web about parallelising those.

--
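As an illustration of the coarse-grained wrapping Ben describes, here is a minimal SwiftScript sketch. The procedure names (create_strings, evolve_generation), the C executables behind them, and the mapper parameters are hypothetical stand-ins, not code from this thread, and the exact mapper syntax may differ between Swift releases.

    type strfile;

    // Wraps a C program (registered in tc.data) that produces the
    // initial population of 1000 strings (Task 1).
    app (strfile out) create_strings() {
        create_strings @filename(out);
    }

    // One generation = merge + duplicate + random selection (Tasks 2-4),
    // bundled into a single C program so each Swift job does a
    // substantial amount of work.
    app (strfile out) evolve_generation(strfile in) {
        evolve_generation @filename(in) @filename(out);
    }

    int N = 10;  // Task 5's N, chosen arbitrarily here

    strfile population[] <simple_mapper; prefix="gen", suffix=".dat">;

    population[0] = create_strings();

    // Each iteration depends on the previous one, so Swift runs the
    // generations strictly in order; speedup would have to come from
    // parallelising the work inside a generation.
    foreach i in [1:N] {
        population[i] = evolve_generation(population[i-1]);
    }

As the comments note, this chain is still sequential end to end, which matches the earlier point in the thread: Swift itself adds roughly constant-plus-linear overhead, and the real gain comes from exposing parallelism inside or across generations.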
From jamalphd at gmail.com Tue Jun 10 08:10:43 2008
From: jamalphd at gmail.com (J A)
Date: Tue, 10 Jun 2008 09:10:43 -0400
Subject: [Swift-user] Performance of Swift
In-Reply-To:
References: <48456BE5.3080307@cs.uchicago.edu>
Message-ID:

Thanks ...

On 6/7/08, Ben Clifford wrote:
> To get decent benefit, though, I think you will need to figure out which
> pieces can run in parallel - breaking your app into eg 4 pieces and then
> only running them in sequence won't give much/any performance improvement.

From mikekubal at yahoo.com Wed Jun 11 13:48:36 2008
From: mikekubal at yahoo.com (Mike Kubal)
Date: Wed, 11 Jun 2008 11:48:36 -0700 (PDT)
Subject: [Swift-user] suggestion for program flow control
Message-ID: <370852.53715.qm@web52312.mail.re2.yahoo.com>

In a swift script I have a loop that iterates over potentially thousands of files and performs a function on each. After the loop ends another swift function is called to parse the results.
Since I do not want to pass a result file for each of the thousand processed files in the loop, I have added an additional localhost script that checks to see whether X result files have been produced and then generates a text file that is passed into the swift function that parses the results.

It wasn't a big deal to add the extra script, but it may be desirable for the swift program to not move to a function call outside of the loop until the loop is finished.

Mike

From hategan at mcs.anl.gov Wed Jun 11 14:16:51 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 11 Jun 2008 14:16:51 -0500
Subject: [Swift-user] suggestion for program flow control
In-Reply-To: <370852.53715.qm@web52312.mail.re2.yahoo.com>
References: <370852.53715.qm@web52312.mail.re2.yahoo.com>
Message-ID: <1213211811.7071.10.camel@localhost>

On Wed, 2008-06-11 at 11:48 -0700, Mike Kubal wrote:
> It wasn't a big deal to add the extra script, but it may be desirable
> for the swift program to not move to a function call outside of the
> loop until the loop is finished.

One of the design issues with Swift was that it is also desirable for dependent loops to be pipelined. Which seems to be the opposite of what you want.

Why is it that you don't want to pass a result file for each of the thousand processed files?

From benc at hawaga.org.uk Wed Jun 11 16:50:34 2008
From: benc at hawaga.org.uk (Ben Clifford)
Date: Wed, 11 Jun 2008 21:50:34 +0000 (GMT)
Subject: [Swift-user] suggestion for program flow control
In-Reply-To: <370852.53715.qm@web52312.mail.re2.yahoo.com>
References: <370852.53715.qm@web52312.mail.re2.yahoo.com>
Message-ID:

> Since I do not want to pass a result file for each of the thousand
> processed files in the loop, I have added an additional localhost script
> that checks to see whether X result files have been produced and then
> generates a text file that is passed into the swift function that
> parses the results.

From a swift-purist perspective, you shouldn't be saying "I do not want to pass files around" without substantial justification (eg.
evidence that it > hurts performance - which it presumably does?). More interesting is to see > how stuff works entirely file-based and figure out what is going wrong > with that approach. > > What you post sounds like there's some foldy reducy? > stuff that has been talked > about before - its probably interesting to talk about that a bit more - eg > imagine if you could write what you are doing in SwiftScript and point out > what is going wrong at the moment. > From mikekubal at yahoo.com Wed Jun 11 16:54:35 2008 From: mikekubal at yahoo.com (Mike Kubal) Date: Wed, 11 Jun 2008 14:54:35 -0700 (PDT) Subject: [Swift-user] suggestion for program flow control In-Reply-To: <1213211811.7071.10.camel@localhost> Message-ID: <380944.70847.qm@web52304.mail.re2.yahoo.com> --- On Wed, 6/11/08, Mihael Hategan wrote: > From: Mihael Hategan > Subject: Re: [Swift-user] suggestion for program flow control > To: mikekubal at yahoo.com > Cc: swift-user at ci.uchicago.edu > Date: Wednesday, June 11, 2008, 2:16 PM > On Wed, 2008-06-11 at 11:48 -0700, Mike Kubal wrote: > > > It wasn't a big deal to add the extra script, but > it may be desirable for the swift program to not move to a > function call outside of the loop until the loop is > finished. > > One of the design issues with Swift was that it is also > desirable for > dependent loops to be pipelined. Which seems to be the > opposite of what > you want. > > Why is it that you don't want to pass a result file for > each of the > thousand processed files? laziness, and also I'm not sure what that should look like in the function definition? Is there a way I can make the argument list for the function dynamic since the number of result files will vary based on the selected database to process? I can definitely see the benefit of having separate pipelines for non-dependent parts within the same script, but perhaps there is a way to chain dependent functions that is not dependent on files produced by previous functions? Like I said it wasn't a big deal to add the extra script to pause and count files, just different behavior than I expected from the loop code. > > > > > Mike > > > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user From benc at hawaga.org.uk Wed Jun 11 16:54:52 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 11 Jun 2008 21:54:52 +0000 (GMT) Subject: [Swift-user] suggestion for program flow control In-Reply-To: <1213221185.11775.0.camel@localhost> References: <370852.53715.qm@web52312.mail.re2.yahoo.com> <1213221185.11775.0.camel@localhost> Message-ID: > > What you post sounds like there's some foldy > > reducy? same as. -- From hategan at mcs.anl.gov Wed Jun 11 17:09:16 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 11 Jun 2008 17:09:16 -0500 Subject: [Swift-user] suggestion for program flow control In-Reply-To: <380944.70847.qm@web52304.mail.re2.yahoo.com> References: <380944.70847.qm@web52304.mail.re2.yahoo.com> Message-ID: <1213222156.11996.6.camel@localhost> > > > > Why is it that you don't want to pass a result file for > > each of the > > thousand processed files? > > laziness, Fair enough an argument for me. > > and also I'm not sure what that should look like in the function definition? 
> > Is there a way I can make the argument list for the function dynamic since the number of result files will vary based on the selected database to process? If I understand this correctly, then you can pass the whole array. You'd get each file name on the command line. There's a question of whether you'd cross the command line length limitation. We should have array slices in Swift maybe? > > I can definitely see the benefit of having separate pipelines for non-dependent parts within the same script, but perhaps there is a way to chain dependent functions that is not dependent on files produced by previous functions? Only that it is, but Swift has no idea about it. The dependency is handled/created by your script. > > Like I said it wasn't a big deal to add the extra script to pause and count files, just different behavior than I expected from the loop code. Right. If you look at it through the prism of an intuition constructed by doing C and Java and the likes, it doesn't look right. > > > > > > > > > Mike > > > > > > > > > > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > http://mail.ci.uchicago.edu/mailman/listinfo/swift-user > > > From hategan at mcs.anl.gov Wed Jun 11 17:09:54 2008 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Wed, 11 Jun 2008 17:09:54 -0500 Subject: [Swift-user] suggestion for program flow control In-Reply-To: References: <370852.53715.qm@web52312.mail.re2.yahoo.com> <1213221185.11775.0.camel@localhost> Message-ID: <1213222194.11996.8.camel@localhost> On Wed, 2008-06-11 at 21:54 +0000, Ben Clifford wrote: > > > What you post sounds like there's some foldy > > > > reducy? > > same as. > How so? From benc at hawaga.org.uk Wed Jun 11 17:13:36 2008 From: benc at hawaga.org.uk (Ben Clifford) Date: Wed, 11 Jun 2008 22:13:36 +0000 (GMT) Subject: [Swift-user] suggestion for program flow control In-Reply-To: <1213222194.11996.8.camel@localhost> References: <370852.53715.qm@web52312.mail.re2.yahoo.com> <1213221185.11775.0.camel@localhost> <1213222194.11996.8.camel@localhost> Message-ID: On Wed, 11 Jun 2008, Mihael Hategan wrote: > On Wed, 2008-06-11 at 21:54 +0000, Ben Clifford wrote: > > > > What you post sounds like there's some foldy > > > > > > reducy? > > > > same as. > > > > How so? repeated application of X -> x -> X -- From wilde at mcs.anl.gov Wed Jun 11 17:51:32 2008 From: wilde at mcs.anl.gov (Michael Wilde) Date: Wed, 11 Jun 2008 17:51:32 -0500 Subject: [Swift-user] suggestion for program flow control In-Reply-To: References: <370852.53715.qm@web52312.mail.re2.yahoo.com> Message-ID: <485056F4.3050904@mcs.anl.gov> My understanding was that MikeK was not trying to do this for performance, but was simply unsure of how to pass a large set of files from one function to the next. When he said "I do not want to pass a result file for each of the thousand processed files in the loop" I think he was talking about a coding issue, not a performance issue. I did not yet look at his latest Swift code, but suggested he forward it to the list to ask for advice on how best to express the data flow. I suspect that one of the existing mappers and correct use of a dataset type can solve his problem and eliminate the need to for localhost "file waiting" script. 
Unless he's tripping into the issue of having a non-predetermined number of files in the dataset. It's not clear to me whether, when he does that, new performance issues won't arise. But let's at least first look at how best to express the problem.

Also, I think the discussion of fold(y) and reduce(y) concepts is likely very cryptic to non-functional programmers.

- MikeW

On 6/11/08 4:50 PM, Ben Clifford wrote:
> What you post sounds like there's some foldy stuff that has been talked
> about before - its probably interesting to talk about that a bit more - eg
> imagine if you could write what you are doing in SwiftScript and point out
> what is going wrong at the moment.

From hategan at mcs.anl.gov Wed Jun 11 18:44:10 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 11 Jun 2008 18:44:10 -0500
Subject: [Swift-user] suggestion for program flow control
In-Reply-To:
References: <370852.53715.qm@web52312.mail.re2.yahoo.com> <1213221185.11775.0.camel@localhost> <1213222194.11996.8.camel@localhost>
Message-ID: <1213227850.12786.1.camel@localhost>

On Wed, 2008-06-11 at 22:13 +0000, Ben Clifford wrote:
> repeated application of X -> x -> X

Foldy is less general than reducy here. The former assumes one step at a time.

From hategan at mcs.anl.gov Wed Jun 11 18:47:29 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 11 Jun 2008 18:47:29 -0500
Subject: [Swift-user] suggestion for program flow control
In-Reply-To: <485056F4.3050904@mcs.anl.gov>
References: <370852.53715.qm@web52312.mail.re2.yahoo.com> <485056F4.3050904@mcs.anl.gov>
Message-ID: <1213228049.12786.4.camel@localhost>

On Wed, 2008-06-11 at 17:51 -0500, Michael Wilde wrote:
> Also, I think the discussion of fold(y) and reduce(y) concepts is likely
> very cryptic to non-functional programmers.

Is that meant to be a gentle slap on the wrist?
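For the pattern Mike describes, a sketch of the all-file-based approach the thread is pointing at: the foreach fills an array of mapped result files, and the downstream procedure takes the whole array, so Swift's data flow handles the "wait until the loop is done" part without a counting script. The procedure names, the executables behind them, and the mapper parameters are illustrative assumptions rather than code from the thread, and syntax details vary with Swift version.

    type datafile;
    type resultfile;
    type summaryfile;

    // Wraps the per-file processing executable (hypothetical name).
    app (resultfile out) process_one(datafile in) {
        process_one @filename(in) @filename(out);
    }

    // Receives every result file on its command line; Swift will not
    // start this until all elements of the array exist, which replaces
    // the "count the files" helper script.
    app (summaryfile out) parse_results(resultfile results[]) {
        parse_results @filename(out) @filenames(results);
    }

    datafile inputs[] <filesys_mapper; pattern="*.dat">;
    resultfile results[] <simple_mapper; prefix="result", suffix=".out">;

    foreach f, i in inputs {
        results[i] = process_one(f);
    }

    summaryfile summary <"summary.txt">;
    summary = parse_results(results);

This is essentially what Mihael means by passing the whole array; the command-line length limit he mentions is the main practical caveat when there are thousands of result files.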
From fedorov at cs.wm.edu Thu Jun 12 20:19:11 2008
From: fedorov at cs.wm.edu (Andriy Fedorov)
Date: Thu, 12 Jun 2008 21:19:11 -0400
Subject: [Swift-user] Running first.swift remotely on NCSA
Message-ID: <82f536810806121819g546ea332w187039d75b39be02@mail.gmail.com>

Hello,

I am a beginner user of TeraGrid/Swift, with very little experience. I am trying to run first.swift on NCSA Mercury.

I updated etc/sites.xml with (what I think) the latest correct information from teragrid.org: /home/ac/fedorov/scratch

I updated etc/tc.data with the line saying where to find echo on Mercury:

Mercury echo /bin/echo INSTALLED INTEL32::LINUX null

I ran grid-proxy-init and myproxy-init. I installed userkey.pem and usercert.pem in ~/.globus, and I installed the package with the root certificates from http://security.teragrid.org/docs/teragrid-certs.tar.gz in ~/.globus/certificates. $GLOBUS_PATH is set, and swift is in the $PATH.

I am able to log on to Mercury with gsissh, and I am able to execute globus-url-copy, both without being asked for a password. I did run 'globusrun -a -r grid-hg.ncsa.teragrid.org' and the authentication test was successful.

Now is my question, finally: why then, when I run 'swift first.swift', do I see this (below) forever? What did I miss? How do I find out what the problem is? It doesn't seem normal that 'echo' + scp takes minutes. I didn't have the patience to wait until it finished; I'll leave it overnight to see if it finishes tomorrow.

[fedorov at ri vdsk] swift first.swift
Swift 0.5 swift-r1783 cog-r1962

RunID: 20080612-2101-yvp36l3c
Progress:
echo started
Progress: Executing:1
Progress: Executing:1
Progress: Executing:1
Progress: Executing:1
Progress: Executing:1
Progress: Executing:1

Thanks in advance for your kind attention.

Andrey Fedorov

--
Center for Real-Time Computing
College of William and Mary
http://www.cs.wm.edu/~fedorov

From hategan at mcs.anl.gov Thu Jun 12 20:26:28 2008
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Thu, 12 Jun 2008 20:26:28 -0500
Subject: [Swift-user] Running first.swift remotely on NCSA
In-Reply-To: <82f536810806121819g546ea332w187039d75b39be02@mail.gmail.com>
References: <82f536810806121819g546ea332w187039d75b39be02@mail.gmail.com>
Message-ID: <1213320388.1281.1.camel@localhost>

On Thu, 2008-06-12 at 21:19 -0400, Andriy Fedorov wrote:
> I am trying to run first.swift on NCSA Mercury. I updated
> etc/sites.xml with (what I think) the latest correct information from
> teragrid.org:
>
> url="grid-hg.ncsa.teragrid.org/jobmanager" major="2" />

That doesn't look right. You need a specific job manager, such as "fork" or "pbs". I'd recommend trying "fork" for simple testing.

> Progress: Executing:1
> Progress: Executing:1

This may happen if the callback address for your submit host is unknown to the GRAM service or if you're behind a firewall or NAT. If you're not, try setting $GLOBUS_HOSTNAME with your DNS address or IP.

Mihael
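The sites.xml fragment quoted above lost its XML tags in the archive; for orientation, a site entry of this era generally had the following shape. The pool handle, the GridFTP host name, and the exact attribute names here are assumptions for illustration, so check the sample sites.xml shipped with your Swift release rather than copying this verbatim.

    <pool handle="NCSA_MERCURY">
      <!-- GridFTP endpoint used for staging data to and from the site (host name assumed) -->
      <gridftp url="gsiftp://gridftp-hg.ncsa.teragrid.org" major="2" minor="4"/>
      <!-- A specific GRAM job manager: -fork for quick tests, -pbs for the batch queue -->
      <jobmanager universe="vanilla" url="grid-hg.ncsa.teragrid.org/jobmanager-fork" major="2"/>
      <workdirectory>/home/ac/fedorov/scratch</workdirectory>
    </pool>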
From fedorov at cs.wm.edu Fri Jun 13 08:27:58 2008
From: fedorov at cs.wm.edu (Andriy Fedorov)
Date: Fri, 13 Jun 2008 09:27:58 -0400
Subject: [Swift-user] Running first.swift remotely on NCSA
In-Reply-To: <1213320388.1281.1.camel@localhost>
References: <82f536810806121819g546ea332w187039d75b39be02@mail.gmail.com> <1213320388.1281.1.camel@localhost>
Message-ID: <82f536810806130627k4de1cc1dta4860206a80de8a1@mail.gmail.com>

Michael,

Thank you for the reply. Unfortunately, your suggestions didn't help.

> > url="grid-hg.ncsa.teragrid.org/jobmanager" major="2" />
>
> That doesn't look right. You need a specific job manager, such as "fork"
> or "pbs". I'd recommend trying "fork" for simple testing.

I got "grid-hg.ncsa.teragrid.org/jobmanager" from here: http://www.teragrid.org/userinfo/jobs/gram.php

I think it is just an alias for "fork". I substituted "jobmanager" with "jobmanager-fork", but the behavior is the same. I also tried to use "jobmanager-pbs", and I could see my job in the queue, but I get the same result, "Progress: Executing:1", on the client host.

> This may happen if the callback address for your submit host is unknown
> to the GRAM service or if you're behind a firewall or NAT. If you're
> not, try setting $GLOBUS_HOSTNAME with your DNS address or IP.

No, I have a valid IP. I do have $GLOBUS_HOSTNAME set now, but this doesn't help.

I did some more looking around, and I found directories named "first--