From lpesce at uchicago.edu Mon Aug 1 13:37:54 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Mon, 1 Aug 2011 13:37:54 -0500 Subject: [Swift-user] Questions about configuration files Message-ID: Hi to all -- Ketan kindly provided us with a nice training and I was trying to implement something useful on Beagle. I have two probably not too smart questions about the configuration variables. status.mode=provider In the case of a machine with a shared filesystem, whether this is files or provider should make no difference, right? use.provider.staging=false I can't find this option in the user guide (I am using the single page trunk so that I can search it). Thanks a lot! Lorenzo From lpesce at uchicago.edu Mon Aug 1 13:52:52 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Mon, 1 Aug 2011 13:52:52 -0500 Subject: [Swift-user] meaning of foreach.max.threads Message-ID: foreach.max.threads. When I read the explanation I left somewhat perplexed. Do you mean threads in the sense of shared memory parallel processes here (then 1024 is very large number to me) or something else? Do you really mean saving memory, or I am not understand it? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Mon Aug 1 13:55:34 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 1 Aug 2011 13:55:34 -0500 (CDT) Subject: [Swift-user] Questions about configuration files In-Reply-To: Message-ID: <2027341255.178185.1312224934311.JavaMail.root@zimbra.anl.gov> Hi Lorenzo, For executing on Beagle, leave use.provider.staging set to false. Its described in the more complete properties documentation, in the comments in the etc/swift.properties file in the Swift distribution etc/ directory. Provider staging does data transfer to compute nodes using the Coaster protocol. For Beagle execution you want to leave that off, as Beagle nodes dont have much of a local filesystem to stage to (they just have /dev/shm, ie RAM disk). - Mike ----- Original Message ----- > From: "Lorenzo Pesce" > To: swift-user at ci.uchicago.edu > Sent: Monday, August 1, 2011 1:37:54 PM > Subject: [Swift-user] Questions about configuration files > Hi to all -- > Ketan kindly provided us with a nice training and I was trying to > implement something useful on Beagle. > I have two probably not too smart questions about the configuration > variables. > > status.mode=provider > > In the case of a machine with a shared filesystem, whether this is > files or provider should make no difference, right? > > use.provider.staging=false > > I can't find this option in the user guide (I am using the single page > trunk so that I can search it). > > Thanks a lot! > > Lorenzo > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From hategan at mcs.anl.gov Mon Aug 1 13:57:37 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 01 Aug 2011 11:57:37 -0700 Subject: [Swift-user] meaning of foreach.max.threads In-Reply-To: References: Message-ID: <1312225057.30749.2.camel@blabla> On Mon, 2011-08-01 at 13:52 -0500, Lorenzo Pesce wrote: > foreach.max.threads. When I read the explanation I left somewhat > perplexed. 
> > > Do you mean threads in the sense of shared memory parallel processes > here (then 1024 is very large number to me) or something else? They are the lightweight threads that the swift system uses. We had some instances of users trying to run a few million of them in parallel and that would eat up most memory. Since there was no way that 5 million CPUs could be in use at one time, we put in that limit in place. Made things a bit better. From ketancmaheshwari at gmail.com Mon Aug 1 13:58:34 2011 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 1 Aug 2011 13:58:34 -0500 Subject: [Swift-user] meaning of foreach.max.threads In-Reply-To: References: Message-ID: Lorenzo, foreach.max.threads is kind of a throttle parameter that will control the number of threads spawned when a foreach loop is encountered. For example: foreach i in [0:10000]{ do_something(); } can potentially lead to 10,000 concurrent threads being spawned simultaneously which can lead to a jam in the system. The foreach.max.threads param will limit it to the number specified. I normally set it to less than 1000 or even less if I have nested foreach loops. Hope that will help. Ketan On Mon, Aug 1, 2011 at 1:52 PM, Lorenzo Pesce wrote: > foreach.max.threads. When I read the explanation I left somewhat > perplexed. > > Do you mean threads in the sense of shared memory parallel processes here > (then 1024 is very large number to me) or something else? > Do you really mean saving memory, or I am not understand it? > > Thanks! > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From wilde at mcs.anl.gov Mon Aug 1 14:01:10 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Mon, 1 Aug 2011 14:01:10 -0500 (CDT) Subject: [Swift-user] meaning of foreach.max.threads In-Reply-To: Message-ID: <1296213176.178229.1312225270544.JavaMail.root@zimbra.anl.gov> ----- Original Message ----- > From: "Lorenzo Pesce" > To: swift-user at ci.uchicago.edu > Sent: Monday, August 1, 2011 1:52:52 PM > Subject: [Swift-user] meaning of foreach.max.threads > foreach.max.threads. When I read the explanation I left somewhat > perplexed. > > > Do you mean threads in the sense of shared memory parallel processes > here (then 1024 is very large number to me) or something else? This number refers to Karajan threads - much lighter weight than Java threads. Karajan multiplexes its concurrent operations into a much smaller number of real Java threads. Its best to think of this property as just setting the maximum number of concurrent iterations that any foreach loop will be allowed to execute in parallel. > Do you really mean saving memory, or I am not understand it? Yes - this throttle is there to put a controllable bound on the amount of memory Swift uses in processing concurrent foreach loops. The higher the throttle, the more loop bodies Swift will run in parallel, and the objects required for those executions consume memory. So if a foreach loop had 1M elements to process, the throttle would only allow 1K of these to run at any given time. 
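For example, to lower that cap you can set the property in the file you pass to swift with -config (or in etc/swift.properties); the value below is only an illustration, not a recommendation:

  # limit each foreach loop to 1000 concurrent iterations
  foreach.max.threads=1000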
- Mike From hategan at mcs.anl.gov Mon Aug 1 14:08:20 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Mon, 01 Aug 2011 12:08:20 -0700 Subject: [Swift-user] meaning of foreach.max.threads In-Reply-To: <1312225057.30749.2.camel@blabla> References: <1312225057.30749.2.camel@blabla> Message-ID: <1312225700.31060.0.camel@blabla> On Mon, 2011-08-01 at 11:57 -0700, Mihael Hategan wrote: > They are the lightweight threads that the swift system uses. We had some > instances of users trying to run a few million of them in parallel Correction: that number was either 500,000 or 1M. From lpesce at uchicago.edu Mon Aug 1 14:10:37 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Mon, 1 Aug 2011 14:10:37 -0500 Subject: [Swift-user] jobsPerNode Message-ID: If I look at this line: 24 I assume I have to interpret this as: 1) If I want to pack as many jobs as possible (i.e., if I don't have memory issues, which unfortunately Matlab is excellent at creating) 2) If my jobs are single-threaded I wanted to make sure that this was not a setting that would be used for combining both threads and repetitions. What would I do if I want to repeat much larger chunks, e.g., I want to make a parameter swift in NAMD where each sim will be run on 20 nodes concurrently? If you have a pointer for where I can find the answer, I will go there. Lorenzo From benc at hawaga.org.uk Mon Aug 1 14:37:33 2011 From: benc at hawaga.org.uk (Ben Clifford) Date: Mon, 1 Aug 2011 21:37:33 +0200 Subject: [Swift-user] Questions about configuration files In-Reply-To: References: Message-ID: <6FC8FB21-9CA2-401F-B9E9-7F951D96F8E1@hawaga.org.uk> On Aug 1, 2011, at 8:37 PM, Lorenzo Pesce wrote: > Hi to all -- > Ketan kindly provided us with a nice training and I was trying to implement something useful on Beagle. > I have two probably not too smart questions about the configuration variables. > > status.mode=provider > > In the case of a machine with a shared filesystem, whether this is files or provider should make no difference, right? The files mode is slower but works on more sites. If provider works for you, then use provider. In some cases, the execution provider couldn't correctly indicate whether a job worked or not, which is why the files mode exists. Ben From lpesce at uchicago.edu Tue Aug 9 11:54:36 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Tue, 9 Aug 2011 11:54:36 -0500 Subject: [Swift-user] Shared memory Message-ID: I use a Cray XE6 with 24 core per node. While some of the tasks I will work on will be very suited to Swift, one of my issues is memory. Specifically, some applications will require a small part of "independent" memory and a lot of common memory (e.g., a large data matrix). Usually, I would handle this with openMP or a global o co-array, depending upon its size. Does swift have any mechanism to deal with this from independent programs or would I need to create an original code that deals with it? It seems unlikely, but I thought I might ask anyway. Where can I find details about how Swift deals with the structure of a machine (e.g., cpu-binding, memory binding, NUMA node exploitation)? Thanks a lot! 
Lorenzo From wozniak at mcs.anl.gov Tue Aug 9 12:32:17 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Tue, 9 Aug 2011 12:32:17 -0500 (CDT) Subject: [Swift-user] jobsPerNode In-Reply-To: References: Message-ID: On Mon, 1 Aug 2011, Lorenzo Pesce wrote: > If I look at this line: > > 24 > > I assume I have to interpret this as: > 1) If I want to pack as many jobs as possible (i.e., if I don't have memory issues, which unfortunately Matlab is excellent at creating) > 2) If my jobs are single-threaded > > I wanted to make sure that this was not a setting that would be used for combining both threads and repetitions. > > What would I do if I want to repeat much larger chunks, e.g., I want to > make a parameter swift in NAMD where each sim will be run on 20 nodes > concurrently? > > If you have a pointer for where I can find the answer, I will go there. The easiest way to get multi-node jobs going on Beagle right now is to just use the local provider and launch your application with aprun. In this mode, you can use all of the PBS/aprun features. To do this, you put the call to swift in your qsub file. Justin -- Justin M Wozniak From wozniak at mcs.anl.gov Tue Aug 9 12:35:32 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Tue, 9 Aug 2011 12:35:32 -0500 (CDT) Subject: [Swift-user] Shared memory In-Reply-To: References: Message-ID: On Tue, 9 Aug 2011, Lorenzo Pesce wrote: > I use a Cray XE6 with 24 core per node. > While some of the tasks I will work on will be very suited to Swift, one of my issues is memory. Specifically, some applications will require a small part of "independent" memory and a lot of common memory (e.g., a large data matrix). Usually, I would handle this with openMP or a global o co-array, depending upon its size. > > Does swift have any mechanism to deal with this from independent programs or would I need to create an original code that deals with it? It seems unlikely, but I thought I might ask anyway. > > Where can I find details about how Swift deals with the structure of a machine (e.g., cpu-binding, memory binding, NUMA node exploitation)? > > Thanks a lot! > > Lorenzo Swift does not currently deal with these issues. The Coasters worker spawns multiple single-process jobs on each compute node in a straight-forward way. Justin -- Justin M Wozniak From lpesce at uchicago.edu Tue Aug 9 12:49:12 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Tue, 9 Aug 2011 12:49:12 -0500 Subject: [Swift-user] Shared memory In-Reply-To: References: Message-ID: <5586B242-710E-4FA6-96A4-005388898DD6@uchicago.edu> Thanks a lot for your replies Justin. They are very useful. When Swift does the packing, does it have to wait until all the jobs packed in a node are finished, or can it resubmit when a core is freed up? Thanks again! Lorenzo On Aug 9, 2011, at 12:35 PM, Justin M Wozniak wrote: > On Tue, 9 Aug 2011, Lorenzo Pesce wrote: > >> I use a Cray XE6 with 24 core per node. >> While some of the tasks I will work on will be very suited to Swift, one of my issues is memory. Specifically, some applications will require a small part of "independent" memory and a lot of common memory (e.g., a large data matrix). Usually, I would handle this with openMP or a global o co-array, depending upon its size. >> >> Does swift have any mechanism to deal with this from independent programs or would I need to create an original code that deals with it? It seems unlikely, but I thought I might ask anyway. 
>> >> Where can I find details about how Swift deals with the structure of a machine (e.g., cpu-binding, memory binding, NUMA node exploitation)? >> >> Thanks a lot! >> >> Lorenzo > > Swift does not currently deal with these issues. The Coasters worker spawns multiple single-process jobs on each compute node in a straight-forward way. > > Justin > > -- > Justin M Wozniak From lpesce at uchicago.edu Tue Aug 9 14:50:38 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Tue, 9 Aug 2011 14:50:38 -0500 Subject: [Swift-user] about http://www.ci.uchicago.edu/swift/cookbook/cookbook-asciidoc.html#_beagle Message-ID: <3AA9C53B-38AD-4C2E-8712-8D39236E8F09@uchicago.edu> Looks nice and easy. I would change a couple of things: 1) In the presentation about Swift for Beagle users, you work all the time from Lustre. Maybe you want to do the same here. 2) Woking from Sandbox can be tricky because programs behave differently on Sandbox than they do on the compute nodes. Thanks! Lorenzo From hategan at mcs.anl.gov Tue Aug 9 16:31:45 2011 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Tue, 09 Aug 2011 14:31:45 -0700 Subject: [Swift-user] about http://www.ci.uchicago.edu/swift/cookbook/cookbook-asciidoc.html#_beagle In-Reply-To: <3AA9C53B-38AD-4C2E-8712-8D39236E8F09@uchicago.edu> References: <3AA9C53B-38AD-4C2E-8712-8D39236E8F09@uchicago.edu> Message-ID: <1312925505.4795.3.camel@blabla> On Tue, 2011-08-09 at 14:50 -0500, Lorenzo Pesce wrote: > Looks nice and easy. > I would change a couple of things: > 1) In the presentation about Swift for Beagle users, you work all the time from Lustre. Maybe you want to do the same here. > 2) Woking from Sandbox can be tricky because programs behave differently on Sandbox than they do on the compute nodes. There's a mode in swift that allows you to work from sandbox to sandbox without Lustre. It should be transparent to the user, though I don't think we did any performance comparison in this case. From lpesce at uchicago.edu Tue Aug 30 09:22:51 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Tue, 30 Aug 2011 09:22:51 -0500 Subject: [Swift-user] Do you have any resource for learning about SwiftR? Message-ID: <6185AB2A-788D-46ED-9FE2-5AC79E4CE29E@uchicago.edu> Hi - I want to run relatively small sized simulations (say at most 50 cores or so, probably mostly one or two) but many many times over. The simulations will be coded in R. Thanks a lot! Lorenzo From wilde at mcs.anl.gov Tue Aug 30 11:05:33 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 30 Aug 2011 11:05:33 -0500 (CDT) Subject: [Swift-user] Do you have any resource for learning about SwiftR? In-Reply-To: <283375835.259628.1314717726557.JavaMail.root@zimbra.anl.gov> Message-ID: <596328981.259897.1314720333244.JavaMail.root@zimbra.anl.gov> Lorenzo, The SwiftR documentation is currently at: http://www.ci.uchicago.edu/wiki/bin/view/SWFT/SwiftR which also provides a quick start guide at: http://www.ci.uchicago.edu/wiki/bin/view/SWFT/SwiftRQuickstart Further examples and some performance measurements are at: http://people.cs.uchicago.edu/~tga/swiftR/ And more examples are available with ?SwiftR help once you load the package: > source("http://people.cs.uchicago.edu/~tga/swiftR/getSwift.R") I just built an R-2.13.1 release on Beagle with plain gcc, which I think *should* be runnable in parallel on worker nodes. (Not yet tested though). This R should be capable of running SwiftR. Im hoping that Tim cam verify this soon. 
We'll likely need an additional SwiftR server name and config for Beagle and other Cray systems. We'll try to consolidate the SwiftR documentation in a user guide on the Swift in the future. Tim, can you do a quick check of the documentation to make sure its still correct and that it points to the latest SwiftR package? Thanks, - Mike ----- Original Message ----- > From: "Lorenzo Pesce" > To: swift-user at ci.uchicago.edu > Sent: Tuesday, August 30, 2011 9:22:51 AM > Subject: [Swift-user] Do you have any resource for learning about SwiftR? > Hi - > > I want to run relatively small sized simulations (say at most 50 cores > or so, probably mostly one or two) but many many times over. The > simulations will be coded in R. > > Thanks a lot! > > Lorenzo > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From lpesce at uchicago.edu Tue Aug 30 14:59:57 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Tue, 30 Aug 2011 14:59:57 -0500 Subject: [Swift-user] Is there a way to set env variables in swift Message-ID: Specifically something like export TMP=/lustre/beagle/`whoami`/tmp Thanks Lorenzo From wozniak at mcs.anl.gov Tue Aug 30 16:17:54 2011 From: wozniak at mcs.anl.gov (Justin M Wozniak) Date: Tue, 30 Aug 2011 16:17:54 -0500 (CDT) Subject: [Swift-user] Is there a way to set env variables in swift In-Reply-To: References: Message-ID: Hi Lorenzo Just use the env namespace: http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_env_namespace Justin On Tue, 30 Aug 2011, Lorenzo Pesce wrote: > Specifically something like > > export TMP=/lustre/beagle/`whoami`/tmp > > Thanks > > Lorenzo > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > -- Justin M Wozniak From wilde at mcs.anl.gov Tue Aug 30 16:22:26 2011 From: wilde at mcs.anl.gov (Michael Wilde) Date: Tue, 30 Aug 2011 16:22:26 -0500 (CDT) Subject: [Swift-user] Is there a way to set env variables in swift In-Reply-To: Message-ID: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov> Lorenzo, You can set env vars for an application in the tc.data file. Use the "env namespace profile" as described at: http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_env_namespace Like this: beagle myapp /home/joe/bin/myapp INSTALLED INTEL32::LINUX ENV::TMP="/lustre/beagle/joe/tmp" If you need "joe" substituted into the "TMP" env var in the tc file dynamically (as with your example below), you need to do that with your own wrapper script on the client host (before you run the swift command). Alternatively, if you are running locally on Beagle, there may also be an option you can specify in your sites file that will get passed down to the PBS qsub command, to pass the environment vars of the submitting job down to the submitted job, in which case you dont need the ENV::TMP profile in your tc.data file. David or Justin may be able to clarify whats possible there. (And this should go into the new 0.93 Site Configuration Guide). 
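Going back to the wrapper-script route above, here is a rough sketch of what I mean; the template file name and the @TMP@ placeholder are just illustrative conventions, not anything Swift requires:

  #!/bin/bash
  # Build a per-user tc file from a template, then run swift.
  # tc.template is assumed to contain ENV::TMP="@TMP@" on the app line.
  USER_TMP=/lustre/beagle/$(whoami)/tmp
  mkdir -p "$USER_TMP"
  sed "s|@TMP@|$USER_TMP|g" tc.template > tc.data
  swift -config cf -sites.file sites.xml -tc.file tc.data script.swift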
- Mike ----- Original Message ----- > From: "Lorenzo Pesce" > To: swift-user at ci.uchicago.edu > Sent: Tuesday, August 30, 2011 2:59:57 PM > Subject: [Swift-user] Is there a way to set env variables in swift > Specifically something like > > export TMP=/lustre/beagle/`whoami`/tmp > > Thanks > > Lorenzo > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user -- Michael Wilde Computation Institute, University of Chicago Mathematics and Computer Science Division Argonne National Laboratory From iraicu at cs.iit.edu Wed Aug 31 00:34:27 2011 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Wed, 31 Aug 2011 00:34:27 -0500 Subject: [Swift-user] CFP: 2011 Chicago Colloquium on Digital Humanities and Computer Science Message-ID: <4E5DC7E3.5080201@cs.iit.edu> Call for Papers 2011 Chicago Colloquium on Digital Humanities and Computer Science November 19-21, 2011 Loyola University Chicago -- Chicago, Illinois, USA Submission Deadline: September 15, 2011 http://chicagocolloquium.org The Chicago Colloquium on Digital Humanities and Computer Science (DHCS) brings together researchers and scholars in the humanities and computer science to examine the current state of digital humanities as a field of intellectual inquiry and to identify and explore new directions and perspectives for future research. Here is a brief look at the three most recent conferences in the DHCS series, which celebrates its sixth year running in 2011. * DHCS 2008 (University of Chicago) focused on "Making Sense" -- an exploration of how meaning is created and apprehended at the transition from the digital to the analog. * DHCS 2009 (IIT) focused on computational methods in digital humanities, including computational stylistics, text analytics, and visualization. * DHCS 2010 (Northwestern) focused on "Working with Digital Data: Collaborate, Curate, Analyze, Annotate." With broad agency support for and continued cross-disciplinary interest in "digging into data" as well as cyberinfrastructure and collaboration, this year's DHCS will continue to focus on these and related topics of interest to the community, with a formal colloquium theme to be unveiled as the program is finalized. We invite submissions from scholars, researchers, practitioners (independent scholars and industry), librarians, technologists, and students, on all topics that intersect current theory and practice in the humanities and computer science. This year's DHCS is sponsored by Loyola University Chicago, The University of Chicago, Northwestern University, and the Illinois Institute of Technology. Location and Venue Description Loyola University Chicago Water Tower Campus 820 N. Michigan Avenue Chicago, IL 60640 The conference will be held at Loyola University Chicago at its Water Tower Campus. Located near the Magnificent Mile and the historic Water Tower, the venue offers convenient access to excellent hotels and restaurants, not to mention ample opportunities for sightseeing and shopping. The time frame for the conference coincides with the annual unveiling of the holiday lights and delightful walks on the Magnificent Mile--the last chance before Chicago's winter arrives in full force. Keynote Speakers The list of keynote speakers is still being determined and will be posted as the conference program is nearing completion. Co-Chairs George K. Thiruvathukal, Computer Science, Loyola University Chicago, http://www.thiruvathukal.com Steven E. 
Jones, English, Loyola University Chicago, http://stevenejones.org/ Program Committee * Shlomo Argamon, Computer Science, Illinois Institute of Technology, http://www.iit.edu/csl/cs/faculty/argamon_shlomo.shtml * Arno Bosse, Comparative Literature, University of Chicago * Helma Dik, Classics, University of Chicago, http://classics.uchicago.edu/faculty/dik * Doug Downey, Computer Science, Northwestern University, http://www.cs.northwestern.edu/~ddowney/ * William L. Honig, Computer Science, Loyola University Chicago, http://people.cs.luc.edu/whonig * Konstantin L?ufer, Computer Science, Loyola University Chicago, http://laufer.cs.luc.edu * Peter Leonard, Humanities Research Computing, University of Chicago, http://home.uchicago.edu/psleonar/ * Catherine Mardikes, University Library, University of Chicago * Mark Olsen, ARTFL Project, University of Chicago, http://artfl-project.uchicago.edu/ * Ioan Raicu, Computer Science, Illinois Institute of Technology, http://www.cs.iit.edu/~iraicu/ * Claire Stewart, University Library, Northwestern University, http://www.library.northwestern.edu/directory/claire-stewart Journal of the Chicago DHCS Colloquium Select papers and posters accepted at DHCS are published in the /Journal of the Chicago Colloquium on Digital Humanities and Computer Science (JDHCS)/. Please visit http://jdhcs.uchicago.edu to view the full text of presentations from these colloquia. Preliminary Colloquium Schedule The formal DHCS colloquium program runs Saturday November 19 (afternoon), Sunday, November 20 (all day), and Monday, November 21 (ending mid-afternoon) and will consist of four, 1-1/2 hour paper panels and two, two-hour poster sessions as well as three keynotes. Pre-conference birds of a feather and tutorials will occur on Saturday, November 19, in the afternoon. Generous time has been set aside for questions and follow-up discussions after each panel and in the schedule breaks. There are no plans for parallel sessions. For further details, please see the conference website. Registration Fee Attendance for DHCS 2011 is free. All conference participants, however, will be required to register in advance. Details to follow as the conference program is finalized. Submission Format We welcome submissions that are either extended abstracts or full papers (8-page maximum, please) in PDF format. We welcome submissions for: * Paper presentations (15 and 30 minute presentations) * Posters * Software demonstrations * Performances * Pre-conference tutorials/workshops/seminars, and * Pre-conference "birds of a feather" meetings This year, we are using the EasyChair software to handle all submissions. http://www.easychair.org/conferences/?conf=dhcs2011 The instructions are simple: 1. Register yourself (you will add co-authors later) 2. Confirm the registration e-mail. 3. Make sure you go back to the main link and sign in. 4. Create a "New Submission". Fill in all appropriate sections. 5. Don't forget to Upload Paper at the end of the form. Submissions will only be accepted at the EasyChair URL above. Should you run into problems, please contact George K. Thiruvathukal at gkt+dhcs at cs.luc.edu . (The +dhcs is optional but will help to prioritize your e-mail.) Graduate Student Travel Fund A limited number of bursaries are available to assist graduate students who are presenting at the colloquium with their travel and accommodation expenses. More information about the application process will be available shortly at the Chicago Colloquium web site. 
Important Dates Deadline for Submissions: September 15 Notification of Acceptance: October 1 Full Program Announcement: October 15 Registration: October 1-November 15 (on-site will also be possible) Colloquium: Sunday, November 20 -- Monday, November 21, 2011 Contact Info Please email gkt+dhcs at cs.luc.edu Conference Hash Tag: #dhcs2011 -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= -------------- next part -------------- An HTML attachment was scrubbed... URL: From iraicu at cs.iit.edu Wed Aug 31 02:37:08 2011 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Wed, 31 Aug 2011 02:37:08 -0500 Subject: [Swift-user] CFP: 4th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) 2011 -- co-located with IEEE/ACM Supercomputing 2011 Message-ID: <4E5DE4A4.4010807@cs.iit.edu> 4th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) 2011 *http://datasys.cs.iit.edu/events/MTAGS11/index.html * *Co-located with * *Supercomputing/SC 2011* * Seattle Washington -- November 14th, 2011* News * *Keynote Speaker: *Professor David Abramson from Monash University, Australia * *Special Issue on Data Intensive Computing in the Clouds in the Springer Journal of Grid Computing* * *The Second International Workshop on Data Intensive Computing in the Clouds (DataCloud-SC11) 2011, co-located at Supercomputing/SC 2011, November 14th, 2011 * Overview The 4th workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) will provide the scientific community a dedicated forum for presenting new research, development, and deployment efforts of large-scale many-task computing (MTC) applications on large scale clusters, Grids, Supercomputers, and Cloud Computing infrastructure. MTC, the theme of the workshop encompasses loosely coupled applications, which are generally composed of many tasks (both independent and dependent tasks) to achieve some larger application goal. This workshop will cover challenges that can hamper efficiency and utilization in running applications on large-scale systems, such as local resource manager scalability and granularity, efficient utilization of raw hardware, parallel file system contention and scalability, data management, I/O management, reliability at scale, and application scalability. We welcome paper submissions on all topics related to MTC on large scale systems. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library (pending approval). The workshop will be co-located with the IEEE/ACM Supercomputing 2011 Conference in Seattle Washington on November 14th, 2011. For more information, please see http://datasys.cs.iit.edu/events/MTAGS11/ . For more information on past workshops, please see MTAGS10 , MTAGS09 , and MTAGS08 . 
We also ran a Special Issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS) which has appeared in June 2011; the proceedings can be found online at http://www.computer.org/portal/web/csdl/abs/trans/td/2011/06/ttd201106toc.htm. We, the workshop organizers, also published two papers that are highly relevant to this workshop. One paper is titled "Toward Loosely Coupled Programming on Petascale Systems ", and was published in SC08 ; the second paper is titled "Many-Task Computing for Grids and Supercomputers ", which was published in MTAGS08 . Topics We invite the submission of original work that is related to the topics below. The papers can be either short (5 pages) position papers, or long (10 pages) research papers. Topics of interest include (in the context of Many-Task Computing): * Compute Resource Management o Scheduling o Job execution frameworks o Local resource manager extensions o Performance evaluation of resource managers in use on large scale systems o Dynamic resource provisioning o Techniques to manage many-core resources and/or GPUs o Challenges and opportunities in running many-task workloads on HPC systems o Challenges and opportunities in running many-task workloads on Cloud Computing infrastructure * Storage architectures and implementations o Distributed file systems o Parallel file systems o Distributed meta-data management o Content distribution systems for large data o Data caching frameworks and techniques o Data management within and across data centers o Data-aware scheduling o Data-intensive computing applications o Eventual-consistency storage usage and management * Programming models and tools o Map-reduce and its generalizations o Many-task computing middleware and applications o Parallel programming frameworks o Ensemble MPI techniques and frameworks o Service-oriented science applications * Large-Scale Workflow Systems o Workflow system performance and scalability analysis o Scalability of workflow systems o Workflow infrastructure and e-Science middleware o Programming Paradigms and Models * Large-Scale Many-Task Applications o High-throughput computing (HTC) applications o Data-intensive applications o Quasi-supercomputing applications, deployments, and experiences o Performance Evaluation * Performance evaluation o Real systems o Simulations o Reliability of large systems Important Dates * Abstract submission: September 2, 2011 * Paper submission: September 9, 2011 * Acceptance notification: October 7, 2011 * Final papers due: October 28, 2011 Paper Submission Authors are invited to submit papers with unpublished, original work of not more than 10 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages, as per ACM 8.5 x 11 manuscript guidelines (http://www.acm.org/publications/instructions_for_proceedings_volumes); document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates. We are also seeking position papers of no more than 5 pages in length. A 250 word abstract (PDF format) must be submitted online at https://cmt.research.microsoft.com/MTAGS2011/ before the deadline of September 2nd, 2011 at 11:59PM PST; the final 5/10 page papers in PDF format will be due on September 9th, 2011 at 11:59PM PST. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library (pending approval). Notifications of the paper decisions will be sent out by October 7th, 2011. 
Selected excellent work may be eligible for additional post-conference publication as journal articles or book chapters, such as the previous Special Issue on Many-Task Computing in the IEEE Transactions on Parallel and Distributed Systems (TPDS) which has appeared in June 2011. Submission implies the willingness of at least one of the authors to register and present the paper. For more information, please http://datasys.cs.iit.edu/events/MTAGS11/ , or send email to mtags11-chairs at datasys.cs.iit.edu . Organization *General Chairs (mtags11-chairs at datasys.cs.iit.edu )* * Ioan Raicu, Illinois Institute of Technology & Argonne National Laboratory, USA * Ian Foster, University of Chicago & Argonne National Laboratory, USA * Yong Zhao, University of Electronic Science and Technology of China, China *Steering Committee* * David Abramson, Monash University, Australia * Jack Dongara, University of Tennessee, USA * Geoffrey Fox, Indiana University, USA * Manish Parashar, Rutgers University, USA * Marc Snir, University of Illinois at Urbana Champaign, USA * Xian-He Sun, Illinois Institute of Technology, USA * Weimin Zheng, Tsinghua University, China *Program Committee* * Roger Barga, Microsoft Research, USA * Mihai Budiu, Microsoft Research, USA * Rajkumar Buyya, University of Melbourne, Australia * Catalin Dumitrescu, Fermi National Labs, USA * Alexandru Iosup, Delft University of Technology, Netherlands * Florin Isaila, Universidad Carlos III de Madrid, Spain * Kamil Iskra, Argonne National Laboratory, USA * Hui Jin, Illinois Institute of Technology, USA * Daniel S. Katz, University of Chicago, USA * Tevfik Kosar, Louisiana State University, USA * Zhiling Lan, Illinois Institute of Technology, USA * Reagan Moore, University of North Carolina, Chappel Hill, USA * Jose Moreira, IBM Research, USA * Marlon Pierce, Indiana University, USA * Judy Qiu, Indiana University, USA * Lavanya Ramakrishnan, Lawrence Berkeley National Laboratory, USA * Matei Ripeanu, University of British Columbia, Canada * Alain Roy, University of Wisconsin, Madison, USA * Edward Walker, Texas Advanced Computing Center, USA * Mike Wilde, University of Chicago & Argonne National Laboratory, USA * Matthew Woitaszek, The University Corporation for Atmospheric Research, USA * Ken Yocum, University of California at San Diego, USA -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From iraicu at cs.iit.edu Wed Aug 31 02:48:14 2011 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Wed, 31 Aug 2011 02:48:14 -0500 Subject: [Swift-user] CFP: The Second International Workshop on Data Intensive Computing in the Clouds (DataCloud-SC11) 2011 -- co-located with IEEE/ACM Supercomputing 2011 Message-ID: <4E5DE73E.1070800@cs.iit.edu> The Second International Workshop on Data Intensive Computing in the Clouds (DataCloud-SC11) 2011 http://datasys.cs.iit.edu/events/DataCloud-SC11/ *Co-located with * *Supercomputing/SC 2011* * Seattle Washington -- November 14th, 2011* News * *Special Issue on Data Intensive Computing in the Clouds in the Springer Journal of Grid Computing* * *4th Workshop on Many-Task Computing on Grids and Supercomputers (MTAGS) 2011, co-located at Supercomputing/SC 2011, November 14th, 2011 * Overview Applications and experiments in all areas of science are becoming increasingly complex and more demanding in terms of their computational and data requirements. Some applications generate data volumes reaching hundreds of terabytes and even petabytes. As scientific applications become more data intensive, the management of data resources and dataflow between the storage and compute resources is becoming the main bottleneck. Analyzing, visualizing, and disseminating these large data sets has become a major challenge and data intensive computing is now considered as the "fourth paradigm" in scientific discovery after theoretical, experimental, and computational science. The second international workshop on Data-intensive Computing in the Clouds (DataCloud-SC11) will provide the scientific community a dedicated forum for discussing new research, development, and deployment efforts in running data-intensive computing workloads on Cloud Computing infrastructures. The DataCloud-SC11 workshop will focus on the use of cloud-based technologies to meet the new data intensive scientific challenges that are not well served by the current supercomputers, grids or compute-intensive clouds. We believe the workshop will be an excellent place to help the community define the current state, determine future goals, and present architectures and services for future clouds supporting data intensive computing. For more information about the workshop, please see http://datasys.cs.iit.edu/events/DataCloud-SC11/. To see the 1st workshop's program agenda, and accepted papers and presentations, please see http://www.cse.buffalo.edu/faculty/tkosar/datacloud2011/. We are also running a Special Issue on Data Intensive Computing in the Clouds in the Springer Journal of Grid Computing with a paper submission deadline of August 16th 2011, which will appear in print in June 2012. 
Topics * Data-intensive cloud computing applications, characteristics, challenges * Case studies of data intensive computing in the clouds * Performance evaluation of data clouds, data grids, and data centers * Energy-efficient data cloud design and management * Data placement, scheduling, and interoperability in the clouds * Accountability, QoS, and SLAs * Data privacy and protection in a public cloud environment * Distributed file systems for clouds * Data streaming and parallelization * New programming models for data-intensive cloud computing * Scalability issues in clouds * Social computing and massively social gaming * 3D Internet and implications * Future research challenges in data-intensive cloud computing Important Dates * Abstract submission: September 2, 2011 * Paper submission: September 9, 2011 * Acceptance notification: October 7, 2011 * Final papers due: October 28, 2011 Paper Submission Authors are invited to submit papers with unpublished, original work of not more than 10 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages, as per ACM 8.5 x 11 manuscript guidelines (http://www.acm.org/publications/instructions_for_proceedings_volumes); document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates. We are also seeking position papers of no more than 5 pages in length. A 250 word abstract (PDF format) must be submitted online at https://cmt.research.microsoft.com/DataCloud_SC11/ before the deadline of September 2nd, 2011 at 11:59PM PST; the final 5/10 page papers in PDF format will be due on September 9th, 2011 at 11:59PM PST. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library (pending approval). Notifications of the paper decisions will be sent out by October 7th, 2011. Selected excellent work may be eligible for additional post-conference publication as journal articles. We are currently running a Special Issue on Data Intensive Computing in the Clouds in the Springer Journal of Grid Computing . Submission implies the willingness of at least one of the authors to register and present the paper. For more information, please see http://datasys.cs.iit.edu/events/DataCloud-SC11/ or send email to datacloud-sc11-chairs at datasys.cs.iit.edu . Organization *General Chairs (datacloud-sc11-chairs at datasys.cs.iit.edu )* * Ioan Raicu, Illinois Institute of Technology & Argonne National Laboratory, USA * Tevfik Kosar, University at Buffalo, USA * Roger Barga, Microsoft Research, USA *Steering Committee* * Ian Foster, University of Chicago & Argonne National Laboratory, USA * Geoffrey Fox, Indiana University, USA * James Hamilton, Amazon, USA * Manish Parashar, Rutgers University, USA * Dan Reed, Microsoft Research, USA * Rich Wolski, University of California at Santa Barbara, USA * Rong Chang, IBM, USA *Program Committee* * David Abramson, Monash University, Australia * Abhishek Chandra, University of Minnesota, USA * Yong Chen, Texas Tech University, USA * Terence Critchlow, Pacific Northwest National Laboratory, USA * Murat Demirbas, SUNY Buffalo, USA * Jaliya Ekanayake, Microsoft Research, USA * Rob Gillen, Oak Ridge National Laboratory, USA * Maria Indrawan, Monash University, Australia * Alexandru Iosup, Delft University of Technology, Netherlands * Hui Jin, Illinois Institute of Technology, USA * Dan S. 
Katz, University of Chicago, USA * Gregor von Laszewski, Indiana University, USA * Erwin Laure, CERN, Switzerland * Reagan Moore, University of North Carolina at Chapel Hill, USA * Jim Myers, Rensselaer Polytechnic Institute, USA * Judy Qiu, Indiana University, USA * Lavanya Ramakrishnan, Lawrence Berkeley National Laboratory, USA * Florian Schintke, Zuse Institute Berlin, Germany * Borja Sotomayor, University of Chicago, USA * Ian Taylor, Cardiff University, UK * Bernard Traversat, Oracle Corporation, USA -- ================================================================= Ioan Raicu, Ph.D. Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= -------------- next part -------------- An HTML attachment was scrubbed... URL: From iraicu at cs.iit.edu Wed Aug 31 03:42:30 2011 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Wed, 31 Aug 2011 03:42:30 -0500 Subject: [Swift-user] CiteSearcher: a Google Scholar front-end for iOS and Android mobile devices Message-ID: <4E5DF3F6.7060003@cs.iit.edu> CiteSearcher v1.2search%20results.png Google Scholar on your iPod, iPhone, iPad, and Android based mobile devices *http://datasys.cs.iit.edu/projects/CiteSearcher/* CiteSearcher is a Google Scholar front-end for iOS and Android mobile devices. With it, you can easily search Google Scholar for an author's work, his/her Hirsch index (H-index, http://en.wikipedia.org/wiki/H-index), and G-Index (http://en.wikipedia.org/wiki/G-index). For a detailed list of features and screenshots, see http://datasys.cs.iit.edu/projects/CiteSearcher/details.html. For the free downloads, see IOS (http://itunes.apple.com/us/app/citesearcher/id453186643?mt=8) or Android (https://market.android.com/details?id=datasys.iit). We plan to maintain this software as long as there is demand from the community, and improve it with new features and by supporting additional mobile devices. The lead developer of these applications is Kevin Brandstatter from the DataSys Laboratory at Illinois Institute of Technology. If you would like to signup to the CiteSearcher user mailing list in order to find out information about future releases of CiteSearcher, please see http://datasys.cs.iit.edu/mailman/listinfo/citesearcher-user. For any comments or feedback, please write to citesearcher-devel at datasys.cs.iit.edu. Bugs can be reported to http://datasys.cs.iit.edu/projects/CiteSearcher/bugReport.php. -- ================================================================= Ioan Raicu, Ph.D. 
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: search results.png Type: image/png Size: 49607 bytes Desc: not available URL: From lpesce at uchicago.edu Wed Aug 31 10:25:41 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Wed, 31 Aug 2011 10:25:41 -0500 Subject: [Swift-user] About packing jobs on a supercomputer Message-ID: <5AE0F887-9923-4232-A085-C821C748271B@uchicago.edu> Hi All -- I have two (more) questions: 1) Listening to Justin and Mike I inferred that you could help me figure out how to pack jobs "as much as possible", when memory is a limiting factor, but not entirely predictable (for various reasons Matlab seems to be sloppy in taking care of its memory usage and I don't seem to be able to accurately predict how large a job will be). The project is made of may thousands of simulations of different sizes. The ones of the same size and the same type behave similarly. The only reliable predictor of maximum packing is the value before I get the out of memory (OOM) message. :-) 2) When I log in and check the cluster status, usually there is some backfill space available (my calculations comprise many many small calculations in addition to larger ones). Do you have any crafty ways to find out what type of jobs will run immediately as backfill and which one will have to wait? (BTW, this would make our boss happy too because we will increase utilization at no real cost). To give you an idea every campaign churns a few hundred thousands core hours and the current plans involved many campaigns. Thanks! Lorenzo From lpesce at uchicago.edu Wed Aug 31 10:57:43 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Wed, 31 Aug 2011 10:57:43 -0500 Subject: [Swift-user] Is there a way to set env variables in swift In-Reply-To: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov> References: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov> Message-ID: <2611EE9F-52A4-45C4-AED8-75E9B97A8D76@uchicago.edu> Sometimes one had to ask stupid questions, so I will go ahead and do it. 
This is the script // file to run the Margic square example from the matlab web site type file; string MCRPath = "/soft/mcr/v714"; app (file outdata) magicsq (string mcr, int size) { runMagicSquare mcr size stdout=@outdata; } int matsize[] = [0:23]; file MagicSquares[] ; foreach s, i in matsize { MagicSquares[i] = magicsq (MCRPath, s); } this is tc # sitename transformation path pbs echo /bin/echo pbs cat /bin/cat pbs ls /bin/ls pbs grep /bin/grep pbs sort /bin/sort pbs paste /bin/paste pbs cp /bin/cp pbs touch /bin/touch pbs wc /usr/bin/wc # custom entries pbs convert /home/ketan/ImageMagick-install/bin/convert pbs script1 /lustre/beagle/ketan/beagle-swift-training-examples/script1.sh pbs script2 /lustre/beagle/ketan/beagle-swift-training-examples/script2.sh pbs annotate /lustre/beagle/ketan/beagle-swift-training-examples/annotate.sh pbs runMagicSquare /home/lpesce/matlab/bin/run_magicsquare.sh INSTALLED AMD64 ENV::TMP="/lustre/beagle/lpesce/tmp" this is the call: swift -config cf -sites.file sites.xml -tc.file tc script.swift This is what I get: .... 011-08-31 15:49:25,816+0000 INFO vdl:execute The application "runMagicSquare" is not available in your tc.data catalog Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException 2011-08-31 15:49:25,825+0000 DEBUG magicsq PROCEDURE_END line=6 .... What am I doing stupidly here? Thanks! On Aug 30, 2011, at 4:22 PM, Michael Wilde wrote: > Lorenzo, > > You can set env vars for an application in the tc.data file. Use the "env namespace profile" as described at: > > http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_env_namespace > > Like this: > > beagle myapp /home/joe/bin/myapp INSTALLED INTEL32::LINUX ENV::TMP="/lustre/beagle/joe/tmp" > > If you need "joe" substituted into the "TMP" env var in the tc file dynamically (as with your example below), you need to do that with your own wrapper script on the client host (before you run the swift command). > > Alternatively, if you are running locally on Beagle, there may also be an option you can specify in your sites file that will get passed down to the PBS qsub command, to pass the environment vars of the submitting job down to the submitted job, in which case you dont need the ENV::TMP profile in your tc.data file. David or Justin may be able to clarify whats possible there. (And this should go into the new 0.93 Site Configuration Guide). > > - Mike > > ----- Original Message ----- >> From: "Lorenzo Pesce" >> To: swift-user at ci.uchicago.edu >> Sent: Tuesday, August 30, 2011 2:59:57 PM >> Subject: [Swift-user] Is there a way to set env variables in swift >> Specifically something like >> >> export TMP=/lustre/beagle/`whoami`/tmp >> >> Thanks >> >> Lorenzo >> _______________________________________________ >> Swift-user mailing list >> Swift-user at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > -- > Michael Wilde > Computation Institute, University of Chicago > Mathematics and Computer Science Division > Argonne National Laboratory > From jonmon at mcs.anl.gov Wed Aug 31 10:59:57 2011 From: jonmon at mcs.anl.gov (Jonathan Monette) Date: Wed, 31 Aug 2011 10:59:57 -0500 Subject: [Swift-user] Is there a way to set env variables in swift In-Reply-To: <2611EE9F-52A4-45C4-AED8-75E9B97A8D76@uchicago.edu> References: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov> <2611EE9F-52A4-45C4-AED8-75E9B97A8D76@uchicago.edu> Message-ID: <2121BEDD-4B80-430F-83DA-F0F131E707AC@mcs.anl.gov> What is your sites file? 
Is there a pool handle in the sites file names pbs? On Aug 31, 2011, at 10:57 AM, Lorenzo Pesce wrote: > Sometimes one had to ask stupid questions, so I will go ahead and do it. > > This is the script > > // file to run the Margic square example from the matlab web site > type file; > > string MCRPath = "/soft/mcr/v714"; > > app (file outdata) magicsq (string mcr, int size) > { > runMagicSquare mcr size stdout=@outdata; > } > > int matsize[] = [0:23]; > > file MagicSquares[] ; > > foreach s, i in matsize { > MagicSquares[i] = magicsq (MCRPath, s); > } > > > this is tc > > # sitename transformation path > pbs echo /bin/echo > pbs cat /bin/cat > pbs ls /bin/ls > pbs grep /bin/grep > pbs sort /bin/sort > pbs paste /bin/paste > pbs cp /bin/cp > pbs touch /bin/touch > pbs wc /usr/bin/wc > > # custom entries > pbs convert /home/ketan/ImageMagick-install/bin/convert > pbs script1 /lustre/beagle/ketan/beagle-swift-training-examples/script1.sh > pbs script2 /lustre/beagle/ketan/beagle-swift-training-examples/script2.sh > pbs annotate /lustre/beagle/ketan/beagle-swift-training-examples/annotate.sh > pbs runMagicSquare /home/lpesce/matlab/bin/run_magicsquare.sh INSTALLED AMD64 ENV::TMP="/lustre/beagle/lpesce/tmp" > > > this is the call: > > swift -config cf -sites.file sites.xml -tc.file tc script.swift > > This is what I get: > > .... > 011-08-31 15:49:25,816+0000 INFO vdl:execute The application "runMagicSquare" is not available in your tc.data catalog > Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException > 2011-08-31 15:49:25,825+0000 DEBUG magicsq PROCEDURE_END line=6 > .... > > What am I doing stupidly here? > > Thanks! > > > > > > On Aug 30, 2011, at 4:22 PM, Michael Wilde wrote: > >> Lorenzo, >> >> You can set env vars for an application in the tc.data file. Use the "env namespace profile" as described at: >> >> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_env_namespace >> >> Like this: >> >> beagle myapp /home/joe/bin/myapp INSTALLED INTEL32::LINUX ENV::TMP="/lustre/beagle/joe/tmp" >> >> If you need "joe" substituted into the "TMP" env var in the tc file dynamically (as with your example below), you need to do that with your own wrapper script on the client host (before you run the swift command). >> >> Alternatively, if you are running locally on Beagle, there may also be an option you can specify in your sites file that will get passed down to the PBS qsub command, to pass the environment vars of the submitting job down to the submitted job, in which case you dont need the ENV::TMP profile in your tc.data file. David or Justin may be able to clarify whats possible there. (And this should go into the new 0.93 Site Configuration Guide). 
>> >> - Mike >> >> ----- Original Message ----- >>> From: "Lorenzo Pesce" >>> To: swift-user at ci.uchicago.edu >>> Sent: Tuesday, August 30, 2011 2:59:57 PM >>> Subject: [Swift-user] Is there a way to set env variables in swift >>> Specifically something like >>> >>> export TMP=/lustre/beagle/`whoami`/tmp >>> >>> Thanks >>> >>> Lorenzo >>> _______________________________________________ >>> Swift-user mailing list >>> Swift-user at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user >> >> -- >> Michael Wilde >> Computation Institute, University of Chicago >> Mathematics and Computer Science Division >> Argonne National Laboratory >> > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user From lpesce at uchicago.edu Wed Aug 31 11:47:42 2011 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Wed, 31 Aug 2011 11:47:42 -0500 Subject: [Swift-user] Is there a way to set env variables in swift In-Reply-To: <2121BEDD-4B80-430F-83DA-F0F131E707AC@mcs.anl.gov> References: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov> <2611EE9F-52A4-45C4-AED8-75E9B97A8D76@uchicago.edu> <2121BEDD-4B80-430F-83DA-F0F131E707AC@mcs.anl.gov> Message-ID: Sorry for the question... it turns out that I did not put ::LINUX after AMD64. I could not figure it out from the error messages or the guides, but with some trial and error that appears to be the issue. BTW, I can't find a description of the structure of tc.files. Don't worry, I am actually reading the manual and tutorial right now... (BTW, the new tutorial is so much better than the last one I tried to read, thanks!) (of course having to read the tutorial is going to turn off nearly 100% of my users, but luckily for them I will most likely take care of their scripts). On Aug 31, 2011, at 10:59 AM, Jonathan Monette wrote: > What is your sites file? Is there a pool handle in the sites file names pbs? > > On Aug 31, 2011, at 10:57 AM, Lorenzo Pesce wrote: > >> Sometimes one had to ask stupid questions, so I will go ahead and do it. >> >> This is the script >> >> // file to run the Margic square example from the matlab web site >> type file; >> >> string MCRPath = "/soft/mcr/v714"; >> >> app (file outdata) magicsq (string mcr, int size) >> { >> runMagicSquare mcr size stdout=@outdata; >> } >> >> int matsize[] = [0:23]; >> >> file MagicSquares[] ; >> >> foreach s, i in matsize { >> MagicSquares[i] = magicsq (MCRPath, s); >> } >> >> >> this is tc >> >> # sitename transformation path >> pbs echo /bin/echo >> pbs cat /bin/cat >> pbs ls /bin/ls >> pbs grep /bin/grep >> pbs sort /bin/sort >> pbs paste /bin/paste >> pbs cp /bin/cp >> pbs touch /bin/touch >> pbs wc /usr/bin/wc >> >> # custom entries >> pbs convert /home/ketan/ImageMagick-install/bin/convert >> pbs script1 /lustre/beagle/ketan/beagle-swift-training-examples/script1.sh >> pbs script2 /lustre/beagle/ketan/beagle-swift-training-examples/script2.sh >> pbs annotate /lustre/beagle/ketan/beagle-swift-training-examples/annotate.sh >> pbs runMagicSquare /home/lpesce/matlab/bin/run_magicsquare.sh INSTALLED AMD64 ENV::TMP="/lustre/beagle/lpesce/tmp" >> >> >> this is the call: >> >> swift -config cf -sites.file sites.xml -tc.file tc script.swift >> >> This is what I get: >> >> .... 
>> 011-08-31 15:49:25,816+0000 INFO vdl:execute The application "runMagicSquare" is not available in your tc.data catalog
>> Caused by: org.globus.cog.karajan.scheduler.NoSuchResourceException
>> 2011-08-31 15:49:25,825+0000 DEBUG magicsq PROCEDURE_END line=6
>> ....
>>
>> What am I doing stupidly here?
>>
>> Thanks!
>>
>>
>>
>>
>>
>> On Aug 30, 2011, at 4:22 PM, Michael Wilde wrote:
>>
>>> Lorenzo,
>>>
>>> You can set env vars for an application in the tc.data file. Use the "env namespace profile" as described at:
>>>
>>> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_env_namespace
>>>
>>> Like this:
>>>
>>> beagle myapp /home/joe/bin/myapp INSTALLED INTEL32::LINUX ENV::TMP="/lustre/beagle/joe/tmp"
>>>
>>> If you need "joe" substituted into the "TMP" env var in the tc file dynamically (as with your example below), you need to do that with your own wrapper script on the client host (before you run the swift command).
>>>
>>> Alternatively, if you are running locally on Beagle, there may also be an option you can specify in your sites file that will get passed down to the PBS qsub command, to pass the environment vars of the submitting job down to the submitted job, in which case you dont need the ENV::TMP profile in your tc.data file. David or Justin may be able to clarify whats possible there. (And this should go into the new 0.93 Site Configuration Guide).
>>>
>>> - Mike
>>>
>>> ----- Original Message -----
>>>> From: "Lorenzo Pesce" 
>>>> To: swift-user at ci.uchicago.edu
>>>> Sent: Tuesday, August 30, 2011 2:59:57 PM
>>>> Subject: [Swift-user] Is there a way to set env variables in swift
>>>> Specifically something like
>>>>
>>>> export TMP=/lustre/beagle/`whoami`/tmp
>>>>
>>>> Thanks
>>>>
>>>> Lorenzo
>>>> _______________________________________________
>>>> Swift-user mailing list
>>>> Swift-user at ci.uchicago.edu
>>>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>>>
>>> --
>>> Michael Wilde
>>> Computation Institute, University of Chicago
>>> Mathematics and Computer Science Division
>>> Argonne National Laboratory
>>>
>>
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>

From lpesce at uchicago.edu Wed Aug 31 11:55:48 2011
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Wed, 31 Aug 2011 11:55:48 -0500
Subject: [Swift-user] Is there a way to set env variables in swift
In-Reply-To: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov>
References: <1598462977.261497.1314739346693.JavaMail.root@zimbra.anl.gov>
Message-ID: <70D1068A-FA60-462F-B912-C78D34238BE8@uchicago.edu>

Mike,

It works perfectly. Thanks.

In this way I can integrate the swift and bash sections because they use exactly the same matlab files, scripts and MCR libraries. :-)

Thanks,

Lorenzo

On Aug 30, 2011, at 4:22 PM, Michael Wilde wrote:

> Lorenzo,
>
> You can set env vars for an application in the tc.data file. Use the "env namespace profile" as described at:
>
> http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_env_namespace
>
> Like this:
>
> beagle myapp /home/joe/bin/myapp INSTALLED INTEL32::LINUX ENV::TMP="/lustre/beagle/joe/tmp"
>
> If you need "joe" substituted into the "TMP" env var in the tc file dynamically (as with your example below), you need to do that with your own wrapper script on the client host (before you run the swift command).
>
> Alternatively, if you are running locally on Beagle, there may also be an option you can specify in your sites file that will get passed down to the PBS qsub command, to pass the environment vars of the submitting job down to the submitted job, in which case you dont need the ENV::TMP profile in your tc.data file. David or Justin may be able to clarify whats possible there. (And this should go into the new 0.93 Site Configuration Guide).
>
> - Mike
>
> ----- Original Message -----
>> From: "Lorenzo Pesce" 
>> To: swift-user at ci.uchicago.edu
>> Sent: Tuesday, August 30, 2011 2:59:57 PM
>> Subject: [Swift-user] Is there a way to set env variables in swift
>> Specifically something like
>>
>> export TMP=/lustre/beagle/`whoami`/tmp
>>
>> Thanks
>>
>> Lorenzo
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>
> --
> Michael Wilde
> Computation Institute, University of Chicago
> Mathematics and Computer Science Division
> Argonne National Laboratory
>

From hategan at mcs.anl.gov Wed Aug 31 13:55:56 2011
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Wed, 31 Aug 2011 11:55:56 -0700
Subject: [Swift-user] About packing jobs on a supercomputer
In-Reply-To: <5AE0F887-9923-4232-A085-C821C748271B@uchicago.edu>
References: <5AE0F887-9923-4232-A085-C821C748271B@uchicago.edu>
Message-ID: <1314816956.14552.2.camel@blabla>

On Wed, 2011-08-31 at 10:25 -0500, Lorenzo Pesce wrote:
> 2) When I log in and check the cluster status, usually there is some
> backfill space available (my calculations comprise many many small
> calculations in addition to larger ones). Do you have any crafty ways
> to find out what type of jobs will run immediately as backfill and
> which one will have to wait? (BTW, this would make our boss happy too
> because we will increase utilization at no real cost).

If you give it enough slots, coasters should submit jobs (blocks) of varied sizes. That might do the trick. If you know specifics about the backfill holes, let us know and we can probably figure out some better settings.
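
For concreteness, a coaster pool definition of the sort Mihael is describing might look roughly like the sketch below. This is only a sketch, not a verified Beagle configuration: the pool handle "pbs" is chosen to match the sitename used in the tc file earlier in the thread, the numeric values and the work directory path are placeholder assumptions, and the exact profile key names (slots, maxNodes, nodeGranularity, jobsPerNode, maxTime) should be checked against the site configuration documentation for the Swift release in use. The slots setting is the knob that corresponds to "enough slots": it bounds how many coaster blocks can be submitted at once, so a larger value gives coasters room to try blocks of several different sizes and let the smaller ones slip into backfill.

<config>
  <pool handle="pbs">
    <!-- sketch only: values are placeholders; verify key names against the Swift docs for your version -->
    <execution provider="coaster" jobmanager="local:pbs"/>
    <!-- maximum number of coaster blocks kept submitted at a time -->
    <profile namespace="globus" key="slots">20</profile>
    <!-- block shape limits: 1 to 8 nodes per block, whole nodes only -->
    <profile namespace="globus" key="maxNodes">8</profile>
    <profile namespace="globus" key="nodeGranularity">1</profile>
    <!-- app invocations per node and maximum block walltime in seconds -->
    <profile namespace="globus" key="jobsPerNode">24</profile>
    <profile namespace="globus" key="maxTime">3600</profile>
    <filesystem provider="local"/>
    <workdirectory>/lustre/beagle/username/swiftwork</workdirectory>
  </pool>
</config>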
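
And to close the loop on the tc question from earlier in the thread: combining Lorenzo's fix (adding ::LINUX after AMD64) with Mike's env-profile example, the working entry would presumably be a single line of the following form. The path and TMP location are simply the ones quoted above and would differ per user; per Lorenzo's follow-up, the incomplete platform field (AMD64 without ::LINUX) appears to be what the "not available in your tc.data catalog" error was actually complaining about.

pbs   runMagicSquare   /home/lpesce/matlab/bin/run_magicsquare.sh   INSTALLED   AMD64::LINUX   ENV::TMP="/lustre/beagle/lpesce/tmp"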