From lpesce at uchicago.edu Thu Jan 3 15:48:31 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Thu, 3 Jan 2013 15:48:31 -0600
Subject: [Swift-user] Cray XE6:: Swift appears to wait till nodes are empty before submitting new jobs.
Message-ID:

I am making some small tests of sequential jobs, and it seems that once step 1 is finished, step 2 doesn't start until enough step 1s have completed.
I assumed that Swift would be able to send jobs to a node before completion.
Does it have to do with submission settings?

Lorenzo

From davidk at ci.uchicago.edu Fri Jan 4 12:05:46 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 4 Jan 2013 12:05:46 -0600 (CST)
Subject: [Swift-user] Cray XE6:: Swift appears to wait till nodes are empty before submitting new jobs.
In-Reply-To:
Message-ID: <1596597553.8339.1357322746318.JavaMail.root@zimbra-mb2.anl.gov>

Lorenzo,

Could you please show an example of how you are trying to do this? Are you using iterate?

From lpesce at uchicago.edu Fri Jan 4 12:21:36 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Fri, 4 Jan 2013 12:21:36 -0600
Subject: [Swift-user] Cray XE6:: Swift appears to wait till nodes are empty before submitting new jobs.
In-Reply-To: <1596597553.8339.1357322746318.JavaMail.root@zimbra-mb2.anl.gov>
References: <1596597553.8339.1357322746318.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID:

David,

Thanks a lot for your reply.

It is not an iterative call.

I talked with Mike about it and I think that he solved the problem (I still need to test that):

172800
47:50:00

prevented the second app from being sent to the same coaster because there was not enough time given the maxwalltime setting.

I just realized how different the app times actually are. Can you point me to where I can figure out how to instruct Swift that different apps have different maxwalltimes and can run a different number of jobs per node?
I might have asked this question already, I know... I am slowly making progress through the coding of all the apps I have to write.

Thanks a lot,

Lorenzo

From davidk at ci.uchicago.edu Fri Jan 4 13:32:43 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 4 Jan 2013 13:32:43 -0600 (CST)
Subject: [Swift-user] Cray XE6:: Swift appears to wait till nodes are empty before submitting new jobs.
In-Reply-To:
Message-ID: <1762303723.8901.1357327963878.JavaMail.root@zimbra-mb2.anl.gov>

Lorenzo,

I think one way you can do this is by setting walltimes in tc.data.

beagle shortjob /bin/shortjob null null GLOBUS::maxwalltime="00:05:00"
beagle longjob /bin/longjob null null GLOBUS::maxwalltime="47:50:00"

For multiple jobsPerNode values, you could define two pool entries in sites.xml. Each entry could have a different value for jobsPerNode. Then modify your tc.data to point to the appropriate entry.

David
From lpesce at uchicago.edu Fri Jan 4 14:13:39 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Fri, 4 Jan 2013 14:13:39 -0600
Subject: [Swift-user] Cray XE6:: Swift appears to wait till nodes are empty before submitting new jobs.
In-Reply-To: <1762303723.8901.1357327963878.JavaMail.root@zimbra-mb2.anl.gov>
References: <1762303723.8901.1357327963878.JavaMail.root@zimbra-mb2.anl.gov>
Message-ID:

My calls are a bit different. Should I change tc.data to something like this?

# custom entries
pbs GATKIntRecalWrapper /lustre/beagle/lpesce/Jason/SwiftRun3/gatkIntRecalWrapper.sh INSTALLED AMD64::LINUX ENV::TMP="/dev/shm/GATK_post2013-01-03_11:09:08" GLOBUS::maxwalltime="00:05:00"
pbs GATKBQRecalWrapper /lustre/beagle/lpesce/Jason/SwiftRun3/gatkBQRecalWrapper.sh INSTALLED AMD64::LINUX ENV::TMP="/dev/shm/GATK_post2013-01-03_11:09:08" GLOBUS::maxwalltime="01:05:00"
pbs PicardMarkDuplWrapper /lustre/beagle/lpesce/Jason/SwiftRun3/picardMarkDuplWrapper.sh INSTALLED AMD64::LINUX ENV::TMP="/dev/shm/GATK_post2013-01-03_11:09:08" GLOBUS::maxwalltime="00:15:00"

On Jan 4, 2013, at 1:32 PM, David Kelly wrote:

> Lorenzo,
>
> I think one way you can do this is by setting walltimes in tc.data.
>
> beagle shortjob /bin/shortjob null null GLOBUS::maxwalltime="00:05:00"
> beagle longjob /bin/longjob null null GLOBUS::maxwalltime="47:50:00"
>
> For multiple jobsPerNode values, you could define two pool entries in sites.xml. Each entry could have a different value for jobsPerNode.
> Then modify your tc.data to point to the appropriate entry.

I will look this up in the guide to understand it better, but the pointer you gave me should be enough. Thanks.

Thanks a lot David,

Lorenzo
From davidk at ci.uchicago.edu Fri Jan 4 19:09:30 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Fri, 4 Jan 2013 19:09:30 -0600 (CST)
Subject: [Swift-user] Cray XE6:: Swift appears to wait till nodes are empty before submitting new jobs.
In-Reply-To:
Message-ID: <2116556137.10824.1357348170451.JavaMail.root@zimbra-mb2.anl.gov>

I think something like this should do the trick:

# custom entries
pbs GATKIntRecalWrapper /lustre/beagle/lpesce/Jason/SwiftRun3/gatkIntRecalWrapper.sh INSTALLED AMD64::LINUX ENV::TMP="/dev/shm/GATK_post2013-01-03_11:09:08";GLOBUS::maxwalltime="00:05:00"
pbs GATKBQRecalWrapper /lustre/beagle/lpesce/Jason/SwiftRun3/gatkBQRecalWrapper.sh INSTALLED AMD64::LINUX ENV::TMP="/dev/shm/GATK_post2013-01-03_11:09:08";GLOBUS::maxwalltime="01:05:00"
pbs PicardMarkDuplWrapper /lustre/beagle/lpesce/Jason/SwiftRun3/picardMarkDuplWrapper.sh INSTALLED AMD64::LINUX ENV::TMP="/dev/shm/GATK_post2013-01-03_11:09:08";GLOBUS::maxwalltime="00:15:00"
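The walltime problem discussed in this thread can be checked with a little arithmetic. The sketch below is only a simplified model, not Swift's actual scheduler: it assumes that 172800 is the coaster block's time limit in seconds, that a task is sent to a block only when the block's remaining time covers the task's declared maxwalltime, and that a first task has already used 15 minutes (a made-up figure). The function names are hypothetical.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS walltime string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def fits_in_block(block_seconds: int, elapsed_seconds: int, maxwalltime: str) -> bool:
    """Assumed rule: a task fits only if the block's remaining time
    is at least the task's declared maxwalltime."""
    remaining = block_seconds - elapsed_seconds
    return hms_to_seconds(maxwalltime) <= remaining


BLOCK_SECONDS = 172800  # value quoted in the thread; assumed to be the block limit (48 h)
ELAPSED = 15 * 60       # hypothetical: a first 15-minute task has already finished

for name, walltime in [
    ("single site-wide setting", "47:50:00"),
    ("GATKIntRecalWrapper", "00:05:00"),
    ("GATKBQRecalWrapper", "01:05:00"),
    ("PicardMarkDuplWrapper", "00:15:00"),
]:
    print(name, walltime, fits_in_block(BLOCK_SECONDS, ELAPSED, walltime))
```

Under these assumptions a 47:50:00 task (172200 seconds) no longer fits once more than ten minutes of the block have been used, while the per-app walltimes from the tc.data entries above still fit comfortably.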
From iraicu at cs.iit.edu Sat Jan 5 09:11:58 2013
From: iraicu at cs.iit.edu (Ioan Raicu)
Date: Sat, 05 Jan 2013 09:11:58 -0600
Subject: [Swift-user] CFP: Scientific Cloud Computing (ScienceCloud) -- co-located with ACM HPDC 2013
Message-ID: <50E842BE.20307@cs.iit.edu>

-------------------------------------------------------------------------------
*** Call for Papers ***
4th Workshop on Scientific Cloud Computing (ScienceCloud) 2013
Co-located with ACM HPDC 2013, New York City, NY, USA -- June 17th, 2013
http://datasys.cs.iit.edu/events/ScienceCloud2013/
-------------------------------------------------------------------------------

Computational and Data-Driven Sciences have become the third and fourth pillars of scientific discovery in addition to experimental and theoretical sciences. Scientific Computing has already begun to change how science is done, enabling scientific breakthroughs through new kinds of experiments that would have been impossible only a decade ago. Today's "Big Data" science is generating datasets that are increasing exponentially in both complexity and volume, making their analysis, archival, and sharing one of the grand challenges of the 21st century. The support for data intensive computing is critical to advance modern science as storage systems have exposed a widening gap between their capacity and their bandwidth by more than 10-fold over the last decade. There is a growing need for advanced techniques to manipulate, visualize and interpret large datasets. Scientific Computing is the key to solving "grand challenges"
in many domains and providing breakthroughs in new knowledge, and it comes in many shapes and forms: high-performance computing (HPC) which is heavily focused on compute- intensive applications; high-throughput computing (HTC) which focuses on using many computing resources over long periods of time to accomplish its computational tasks; many-task computing (MTC) which aims to bridge the gap between HPC and HTC by focusing on using many resources over short periods of time; and data-intensive computing which is heavily focused on data distribution, data-parallel execution, and harnessing data locality by scheduling of computations close to the data. The 4th workshop on Scientific Cloud Computing (ScienceCloud) will provide the scientific community a dedicated forum for discussing new research, development, and deployment efforts in running these kinds of scientific computing workloads on Cloud Computing infrastructures. The ScienceCloud workshop will focus on the use of cloud-based technologies to meet new compute-intensive and data- intensive scientific challenges that are not well served by the current supercomputers, grids and HPC clusters. The workshop will aim to address questions such as: What architectural changes to the current cloud frameworks (hardware, operating systems, networking and/or programming models) are needed to support science? Dynamic information derived from remote instruments and coupled simulation, and sensor ensembles that stream data for real-time analysis are important emerging techniques in scientific and cyber-physical engineering systems. How can cloud technologies enable and adapt to these new scientific approaches dealing with dynamism? How are scientists using clouds? Are there scientific HPC/HTC/MTC workloads that are suitable candidates to take advantage of emerging cloud computing resources with high efficiency? Commercial public clouds provide easy access to cloud infrastructure for scientists. 
What are the gaps in commercial cloud offerings and how can they be adapted for running existing and novel eScience applications? What benefits exist by adopting the cloud model, over clusters, grids, or supercomputers? What factors are limiting cloud use or would make clouds more usable/efficient?

This workshop encourages interaction and cross-pollination between those developing applications, algorithms, software, hardware and networking, emphasizing scientific computing for such cloud platforms. We believe the workshop will be an excellent place to help the community define the current state, determine future goals, and define architectures and services for future science clouds.

TOPICS
-------------------------------------------------------------------------------
We invite the submission of original work that is related to the topics below. The papers can be either short (4 pages) position papers, or long (8 pages) research papers. Topics of interest include (in the context of Cloud Computing):

- Scientific application case studies on Cloud infrastructure
- Performance evaluation of Cloud environments and technologies
- Fault tolerance and reliability in cloud systems
- Data-intensive workloads and tools on Clouds
- Use of programming models such as Map-Reduce and its implementations
- Storage cloud architectures
- I/O and Data management in the Cloud
- Workflow and resource management in the Cloud
- Use of cloud technologies (e.g., NoSQL databases) for scientific applications
- Data streaming and dynamic applications on Clouds
- Dynamic resource provisioning
- Many-Task Computing in the Cloud
- Application of cloud concepts in HPC environments or vice versa
- High performance parallel file systems in virtual environments
- Virtualized high performance I/O network interconnects
- Virtualization
- Distributed Operating Systems
- Many-core computing and accelerators (e.g.
GPUs, MIC) in the Cloud - Cloud security IMPORTANT DATES ------------------------------------------------------------------------------- - Paper submission: February 11th, 2013 (11:59PM PST) - Acceptance notification: March 18th, 2013 - Final papers due: April 15th, 2013 PAPER SUBMISSION ------------------------------------------------------------------------------- Authors are invited to submit papers with unpublished, original work of not more than 8 pages of double column text using single spaced 10 point size on 8.5 x 11 inch pages (including all text, figures, and references), as per ACM 8.5 x 11 manuscript guidelines (document templates can be found at http://www.acm.org/sigs/publications/proceedings-templates). A 250 word abstract and the final paper in PDF format must be submitted online at https://cmt.research.microsoft.com/ScienceCloud2013/ before the deadline of February 11th, 2013 at 11:59PM PST. Papers will be peer-reviewed, and accepted papers will be published in the workshop proceedings as part of the ACM digital library. Notifications of the paper decisions will be sent out by March 18th, 2013. Selected excellent work will be invited to submit extended versions of the workshop paper to a special issue journal. Submission implies the willingness of at least one of the authors to register and present the paper. 
GENERAL CHAIRS
-------------------------------------------------------------------------------
- Ioan Raicu, Illinois Institute of Technology & Argonne National Lab., USA
- Yogesh Simmhan, University of Southern California, USA

PROGRAM COMMITTEE CHAIRS
-------------------------------------------------------------------------------
- Kyle Chard, University of Chicago, USA
- Gabriel Antoniu, INRIA, France
- Lavanya Ramakrishnan, Lawrence Berkeley National Lab, USA

STEERING COMMITTEE
-------------------------------------------------------------------------------
- Ian Foster, University of Chicago & Argonne National Laboratory, USA
- Pete Beckman, University of Chicago & Argonne National Laboratory, USA
- Carole Goble, University of Manchester, UK
- Dennis Gannon, Microsoft Research, USA
- Robert Grossman, University of Chicago, USA
- Kate Keahey, University of Chicago & Argonne National Laboratory, USA
- Ed Lazowska, University of Washington & Computing Community Consortium, USA
- David O'Hallaron, Carnegie Mellon University & Intel Labs, USA
- Jack Dongarra, University of Tennessee, USA
- Geoffrey Fox, Indiana University, USA

PROGRAM COMMITTEE
-------------------------------------------------------------------------------
- Samer Al-Kiswany (University of British Columbia)
- Roger Barga (Microsoft Research)
- Roy Campbell (University of Illinois at Urbana Champaign)
- Charlie Catlett (Argonne National Laboratory)
- Simon Caton (KIT)
- David Chiu (Washington State University)
- Jack Dongarra (University of Tennessee)
- Ake Edlund (Royal Institute of Technology)
- Chathura Herath (Indiana University)
- Neil Chue Hong (University of Edinburgh)
- Adriana Iamnitchi (University of South Florida)
- Shantenu Jha (Louisiana State University)
- Hui Jin (Illinois Institute of Technology)
- Carl Kesselman (University of Southern California)
- Thilo Kielmann (Vrije University)
- Gregor von Laszewski (Indiana University)
- Shiyong Lu (Wayne State University)
- Wei Lu (Microsoft
Research)
- André Luckow (Louisiana State University)
- David Martin (Argonne National Laboratory)
- Gabriel Mateescu (Virginia Tech)
- Paolo Missier (University of Manchester)
- Ruben Montero (Universidad Complutense de Madrid)
- Reagan Moore (University of North Carolina)
- Jose Moreira (IBM Research)
- Christine Morin (INRIA Rennes)
- Pasquale Pagano (ISTI)
- Beth Plale (Indiana University)
- Omer Rana (Cardiff University)
- Matei Ripeanu (University of British Columbia)
- Josh Simons (VMWare)
- Douglas Thain (University of Notre Dame)
- Johan Tordsson (Umeå University)
- Vasudeva Varma (IIIT-Hyderabad)
- Zhifeng Yun (Louisiana State University)
- Yong Zhao (University of Electronic Science and Technology of China)

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
=================================================================
=================================================================

From marialemos72 at gmail.com Sat Jan 5 12:59:59 2013
From: marialemos72 at gmail.com (Maria Lemos)
Date: Sat, 5 Jan 2013 18:59:59 +0000
Subject: [Swift-user] Accepted Workshops in the CISTI'2013 - 8th Iberian Conference on IST
Message-ID: <20130105185858.959867CC08D@mailrelay.anl.gov>

***************************************************************************************************
CISTI'2013 Accepted Workshops
8th Iberian Conference on Information Systems and Technologies
Lisbon, Portugal, June 19 - 23, 2013
http://www.aisti.eu/cisti2013/index.php?option=com_content&view=article&id=64&Itemid=68&lang=en *************************************************************************************************** List of accepted workshops in the CISTI'2013 - 8th Iberian Conference on Information Systems and Technologies: > IAwDQ'2013 - Fourth Ibero-American Workshop on Data Quality > SGaMePlay'2013 - Third Iberian Workshop on Serious Games and Meaningful Play > WISA'2013 - Fifth Workshop on Intelligent Systems and Applications > WISIS'2013 - Third Workshop on Information Systems for Interactive Spaces > WSEQP'2013 - First Workshop in Software Engineering and Quality Process Best regards, Maria Lemos AISTI / CISTI'2013 http://www.aisti.eu/cisti2013 From iraicu at cs.iit.edu Sun Jan 6 07:54:14 2013 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Sun, 06 Jan 2013 07:54:14 -0600 Subject: [Swift-user] CFP: ACM HPDC 2013 -- Abstracts due January 14th Message-ID: <50E98206.5040308@cs.iit.edu> **** CALL FOR PAPERS **** The 22nd International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC'13) New York City, USA - June 17-21, 2013 http://www.hpdc.org/2013 The ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC) is the premier annual conference for presenting the latest research on the design, implementation, evaluation, and the use of parallel and distributed systems for high-end computing. In 2013, the 22nd HPDC and affiliated workshops will take place in the heart of iconic New York City from June 17-21. 
**** IMPORTANT DATES **** Abstracts: 14 January 2013 Papers: 21 January 2013 (no extensions) Reviews released to authors: 6 March 2013 Author rebuttals due: 10 March 2013 Author notifications: 17 March 2013 **** HPDC'13 GENERAL CO-CHAIRS **** Manish Parashar, Rutgers University Jon Weissman, University of Minnesota **** HPDC'13 PROGRAM CO-CHAIRS **** Dick Epema, Delft University of Technology Renato Figueiredo, University of Florida **** HPDC'13 WORKSHOPS CHAIR **** Abhishek Chandra, University of Minnesota **** HPDC'13 LOCAL ARRANGEMENTS CHAIR **** Daniele Scarpazza, DEShaw Research **** HPDC'13 SPONSORSHIP CHAIR **** Dean Hildebrand, IBM Almaden **** HPDC'13 PUBLICITY CO-CHAIRS **** Alexandru Iosup, Delft University of Technology, the Netherlands Ioan Raicu, Illinois Institute of Technology, USA Kenjiro Taura, University of Tokyo, Japan Bruno Schulze, National Laboratory for Scientific Computing, Brazil **** SCOPE AND TOPICS **** Submissions are welcomed on high-performance parallel and distributed computing topics including but not limited to: clusters, clouds, grids, data-intensive computing, massively multicore, and global-scale computing systems. New scholarly research showing empirical and reproducible results in architectures, systems, and networks is strongly encouraged, as are experience reports of operational deployments that can provide insights for future research on HPDC applications and systems. All papers will be evaluated for their originality, technical depth and correctness, potential impact, relevance to the conference, and quality of presentation. Research papers must clearly demonstrate research contributions and novelty, while experience reports must clearly describe lessons learned and demonstrate impact. 
In the context of high-performance parallel and distributed computing, the topics of interest include, but are not limited to:

* Systems, networks, and architectures for high-end computing
* Massively multicore systems
* Resource virtualization
* Programming languages and environments
* I/O, storage systems, and data management
* Resource management and scheduling, including energy-aware techniques
* Performance modeling and analysis
* Fault tolerance, reliability, and availability
* Data-intensive computing
* Applications of parallel and distributed computing

**** PAPER SUBMISSION GUIDELINES ****

Authors are invited to submit technical papers of at most 12 pages in PDF format, including figures and references. Papers should be formatted in the ACM Proceedings Style and submitted via the conference web site. No changes to the margins, spacing, or font sizes as specified by the style file are allowed. Accepted papers will appear in the conference proceedings, and will be incorporated into the ACM Digital Library. A limited number of papers will be accepted as posters.

Papers must be self-contained and provide the technical substance required for the program committee to evaluate their contributions. Papers should thoughtfully address all related work, particularly work presented at previous HPDC events. Submitted papers must be original work that has not appeared in and is not under consideration for another conference or a journal. See the ACM Prior Publication Policy for more details.
**** IMPORTANT DATES **** Abstracts Due: 14 January 2013 Papers Due: 21 January 2013 (no extensions) **** Program Committee **** David Abramson, Monash University, Australia Kento Aida, National Institute of Informatics, Japan Gabriel Antoniu INRIA, France Henri Bal, Vrije Universiteit, the Netherlands Adam Barker, University of St Andrews, UK Michela Becchi, University of Missouri - Columbia, USA John Bent, EMC, USA Ali Butt, Virginia Tech, USA Kirk Cameron, Virginia Tech, USA Franck Cappello, INRIA, France and University of Illinois at Urbana-Champaign, USA Henri Casanova, University of Hawaii, USA Abhishek Chandra, University of Minnesota, USA Andrew Chien, University of Chicago and Argonne National Laboratory, USA Paolo Costa, Imperial College London, UK Peter Dinda, Northwestern University, USA Gilles Fedak, INRIA, France Ian Foster, University of Chicago and Argonne National Laboratory, USA Clemens Grelck, University of Amsterdam, the Netherlands Dean Hildebrand, IBM Research, USA Fabrice Huet, INRIA-University of Nice, France Adriana Iamnitchi, University of South Florida, USA Alexandru Iosup, Delft University of Technology, the Netherlands Kate Keahey, Argonne National Laboratory, USA Thilo Kielmann, Vrije Universiteit, the Netherlands Charles Kilian, Purdue University, USA Zhiling Lan, Illinois Institute of Technology, USA John Lange, University of Pittsburgh, USA Barney Maccabe, Oak Ridge National Laboratory, USA Carlos Maltzahn, University of California, Santa Cruz, USA Naoya Maruyama, RIKEN Advanced Institute for Computational Science, Japan Satoshi Matsuoka, Tokyo Institute of Technology, Japan Manish Parashar, Rutgers University, USA Judy Qiu, Indiana University, USA Ioan Raicu, Illinois Institute of Technology, USA Philip Rhodes, University of Mississippi, USA Matei Ripeanu, University of British Columbia, Canada Prasenjit Sarkar, IBM Research, USA Daniele Scarpazza, D.E. 
Shaw Research, USA Karsten Schwan, Georgia Institute of Technology, USA Martin Swany, Indiana University, USA Michela Taufer, University of Delaware, USA Kenjiro Taura, University of Tokyo, Japan Douglas Thain, University of Notre Dame, USA Cristian Ungureanu, NEC Research, USA Ana Varbanescu, Delft University of Technology, the Netherlands Chuliang Weng, Shanghai Jiao Tong University, China Jon Weissman, University of Minnesota, USA Yongwei Wu, Tsinghua University, China Dongyan Xu, Purdue University, USA Ming Zhao, Florida International University, USA **** Steering Committee **** Henri Bal, Vrije Universiteit, the Netherlands Andrew A. Chien, University of Chicago and Argonne National Laboratory, USA Peter Dinda, Northwestern University, USA Dick Epema, Delft University of Technology, the Netherlands Ian Foster, University of Chicago and Argonne National Laboratory, USA Salim Hariri, University of Arizona, USA Thilo Kielmann, Vrije Universiteit, the Netherlands Arthur "Barney" Maccabe, Oak Ridge National Laboratory, USA Manish Parashar, Rutgers University, USA Matei Ripeanu, University of British Columbia, Canada Karsten Schwan, Georgia Tech, USA Doug Thain, University of Notre Dame, USA Jon Weissman, University of Minnesota (Chair), USA -- ================================================================= Ioan Raicu, Ph.D. 
Assistant Professor, Illinois Institute of Technology (IIT) Guest Research Faculty, Argonne National Laboratory (ANL) ================================================================= Data-Intensive Distributed Systems Laboratory, CS/IIT Distributed Systems Laboratory, MCS/ANL ================================================================= Cel: 1-847-722-0876 Office: 1-312-567-5704 Email: iraicu at cs.iit.edu Web: http://www.cs.iit.edu/~iraicu/ Web: http://datasys.cs.iit.edu/ ================================================================= ================================================================= From lpesce at uchicago.edu Fri Jan 11 08:59:53 2013 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Fri, 11 Jan 2013 08:59:53 -0600 Subject: [Swift-user] Large Job not starting Message-ID: <7F179AF0-A876-4CEB-99EF-837FC9CBDF8C@uchicago.edu> Hi, We are working on a project which involves about 3 million tasks. We have run through 1,5 million tasks and we were resuming the job. I have been seeing this for a while: Progress: time: Thu, 10 Jan 2013 20:39:20 +0000 Selecting site:63831 Submitted:7171 Finished in previous run:1486037 ... Progress: time: Fri, 11 Jan 2013 14:50:21 +0000 Selecting site:63831 Submitted:7171 Finished in previous run:1486037 from the ps command: lpesce 28172 28102 19 Jan10 pts/4 04:20:32 java -Xmx12072M -XX:+HeapDumpOnOutOfMemoryError -Djava.endorsed.dirs=/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/endorsed -DUID=1978 -DGLOBUS_HOSTNAME=login5.beagle.ci.uchicago.edu -DCOG_INSTALL_PATH=/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/.. -Dswift.home=/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/.. 
-Duser.home=/lustre/beagle/lpesce -Djava.security.egd=file:///dev/urandom -XX:+UseParallelGC -XX:ParallelGCThreads=2 -classpath /home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../etc:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../libexec:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/ant.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/antlr-2.7.5.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/castor-0.9.6.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/coaster-bootstrap.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-abstraction-common-2.4.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-grapheditor-0.47.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-jglobus-1.7.0.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-karajan-0.36-dev.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-coaster-0.3.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-dcache-0.1.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-gt2-2.4.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-local-2.2.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-localscheduler-0.4.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-ssh-2.4.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-provider-webdav-2.1.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-resources-1.0.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-swift-svn.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cog-util-0.92.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../l
ib/commons-httpclient.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/commons-logging-1.1.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cryptix32.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cryptix-asn1.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/cryptix.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/j2ssh-common-0.2.2.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/j2ssh-core-0.2.2-patch-b.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jakarta-regexp-1.2.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jakarta-slide-webdavlib-2.0.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jaxrpc.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jce-jdk13-131.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jgss.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jline-0.9.94.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jsr173_1.0_api.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/jug-lgpl-2.0.0.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/junit.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/log4j-1.2.16.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/puretls.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/resolver.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift-svn/bin/../lib/stringtemplate.jar:/home/davidk/swift-trunk/cog/modules/swift/dist/swift lpesce at login5:/lustre/beagle/GCNet/RG/Oreo/o080522_BS1> ps v 28172 PID TTY STAT TIME MAJFL TRS DRS RSS %MEM COMMAND 28172 pts/4 Sl+ 260:32 84 2 12868101 11612816 70.2 java -Xmx12072M -XX:+HeapDumpOnOutOfMemoryError Job seems to be using zero cpu at this time. 
It has no jobs in the queue

lpesce at login5:/lustre/beagle/GCNet/RG/Oreo/o080522_BS1> qstat -u lpesce

sdb:
                                                                         Req'd  Req'd   Elap
Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
1919801.sdb          lpesce   advanced B0109-030545-00    6144    --  --     -- 117:4 R 45:54
1919802.sdb          lpesce   advanced B0109-030545-00    2868    --  --     -- 117:4 R 45:53
1919806.sdb          lpesce   advanced B0109-080540-00   27222    --  --     -- 117:3 R 45:49
1919807.sdb          lpesce   advanced B0109-080540-00    6609    --  --     -- 117:3 R 45:49
1919808.sdb          lpesce   advanced B0109-080540-00    3328    --  --     -- 117:3 R 45:48

(Unrelated jobs, which have been running for more than a day)

Suggestions?

Thanks,

Lorenzo

From lpesce at uchicago.edu Mon Jan 14 14:15:37 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Mon, 14 Jan 2013 14:15:37 -0600
Subject: [Swift-user] Question about walltime
References: <566615CB-D2AE-45D7-AA3B-07B1EE240DBC@uchicago.edu>
Message-ID:

I have three entries like this in my tc file:

# custom entries
pbs GATKIntRecalWrapper ${GATKIntRecalWrapper} INSTALLED AMD64::LINUX ENV::TMP="$GATK_TMP";GLOBUS::maxwalltime="15:00:00"
pbs GATKBQRecalWrapper ${GATKBQRecalWrapper} INSTALLED AMD64::LINUX ENV::TMP="$GATK_TMP";GLOBUS::maxwalltime="62:00:00"
pbs PicardMarkDuplWrapper ${PicardMarkDuplWrapper} INSTALLED AMD64::LINUX ENV::TMP="$PICARD_TMP";GLOBUS::maxwalltime="08:00:00"

What do I do with my sites file?
1) Can I skip the maxwalltime altogether?
2) If not, do I need to put a value larger than all of them, smaller than all of them, or doesn't it matter?

My impression is that it is used first and nothing is sent if the time isn't smaller, but I am not sure.

Thanks,

Lorenzo
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hategan at mcs.anl.gov Mon Jan 14 14:21:27 2013
From: hategan at mcs.anl.gov (Mihael Hategan)
Date: Mon, 14 Jan 2013 12:21:27 -0800
Subject: [Swift-user] Question about walltime
In-Reply-To:
References: <566615CB-D2AE-45D7-AA3B-07B1EE240DBC@uchicago.edu>
Message-ID: <1358194887.31198.3.camel@blabla>

The walltime in sites.xml is used as a default if the app walltime is not set in tc.data. So for the apps you list below, it won't make a difference.

Step by step, the walltime is set like this:
1. Look in tc.data, and set it if present.
2. If not in tc.data, look in sites.xml and set it if present.
3. If in neither tc.data nor sites.xml, use "00:10:00" (i.e. 10 minutes).

Mihael

On Mon, 2013-01-14 at 14:15 -0600, Lorenzo Pesce wrote:
> I have three entries like this in my tc file:
>
> # custom entries
> pbs GATKIntRecalWrapper ${GATKIntRecalWrapper} INSTALLED
> AMD64::LINUX ENV::TMP="$GATK_TMP";GLOBUS::maxwalltime="15:00:00"
> pbs GATKBQRecalWrapper ${GATKBQRecalWrapper} INSTALLED
> AMD64::LINUX ENV::TMP="$GATK_TMP";GLOBUS::maxwalltime="62:00:00"
> pbs PicardMarkDuplWrapper ${PicardMarkDuplWrapper} INSTALLED
> AMD64::LINUX ENV::TMP="$PICARD_TMP";GLOBUS::maxwalltime="08:00:00"
>
>
> What do I do with my sites file?
> 1) Can I skip the maxwalltime alltogether?
> 2) If not, do I need to put a value larger than all of them, smaller
> than all of them or it doesn't matter?
>
> My impression is that it is used first and nothing is sent if the time
> isn't smaller, but I am not sure.
>
> Thanks,
>
> Lorenzo
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From boue at oddjob.uchicago.edu Tue Jan 15 11:18:29 2013
From: boue at oddjob.uchicago.edu (Gwenaël Boué)
Date: Tue, 15 Jan 2013 11:18:29 -0600
Subject: [Swift-user] coaster-service.conf for orion.uchicago.edu
Message-ID:

Good morning,

I am a new user of swift.
I would like to run my swift scripts on orion, which has the queueing system "torque", 8 nodes and 8 CPUs per node. I am having difficulty writing the coaster-service.conf and the sites.xml. Can anyone help me?

Gwenael
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From davidk at ci.uchicago.edu Tue Jan 15 13:00:57 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Tue, 15 Jan 2013 13:00:57 -0600 (CST)
Subject: [Swift-user] coaster-service.conf for orion.uchicago.edu
In-Reply-To:
Message-ID: <1033387787.3653.1358276457812.JavaMail.root@zimbra-mb2.anl.gov>

Hello Gwenael,

Here is a sites.xml file that should help you get started.

8 8 fast 5.99 10000 /tmp

You'll probably need to replace the values for queue and work directory here. I don't think you will need a coaster-service.conf for this setup. I am not familiar with this machine, so please let me know what issues you run into.

Thanks!
David

----- Original Message -----
> From: "Gwenaël Boué"
> To: swift-user at ci.uchicago.edu
> Sent: Tuesday, January 15, 2013 11:18:29 AM
> Subject: [Swift-user] coaster-service.conf for orion.uchicago.edu
> Good morning,
>
>
> I am a new user of swift. I would like to run my swift scripts on
> orion which has the queueing system "torque", 8 nodes and 8 CPUs per
> node.
> I have difficulties to write the coaster-service.conf and the
> sites.xml. Can anyone help me ?
>
>
> Gwenael
>
>
>
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From nbest at ci.uchicago.edu Thu Jan 17 11:04:21 2013
From: nbest at ci.uchicago.edu (Neil Best)
Date: Thu, 17 Jan 2013 11:04:21 -0600
Subject: [Swift-user] would rm be a valid app?
Message-ID:

I wanted to use Swift to clean up files that are no longer needed.
It seemed natural to define rm as an app and then script this:

type file;

string oldGrbFiles[] = readData( "data/oldGrbFiles.txt");

file grb[];

app rm (file f) {
    rm @f;
}

foreach g in grb {
    rm( g);
}

Swift runs happily and tasks complete, but the number of files in the subfolder that I am trying to clean out does not change. I suspect that what is being rm-ed are the Swift-ified copies of the mapped input files. I simply want a distributed "make clean". Is this feasible?

From wilde at mcs.anl.gov Thu Jan 17 12:34:18 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 17 Jan 2013 12:34:18 -0600 (CST)
Subject: [Swift-user] would rm be a valid app?
In-Reply-To:
Message-ID: <619034740.4530.1358447658807.JavaMail.root@zimbra.anl.gov>

Philosophically, it's better not to use rm explicitly in a swift script, as this issue should ideally be transparent to the application-level logic of the script.

If instead of mapping the file, you let Swift determine the file name (i.e., by implicitly using the "concurrent" mapper), then by default Swift will remove intermediate files as soon as they are no longer needed within the script's execution.

A technical problem with using rm explicitly is that in the default data management mode, the app is operating on a *copy* of the input data. rm will only have the desired effect if the file being rm'ed is under CDM "DIRECT" mode; then the actual mapped file will be deleted. If that's done, then the script writer is responsible for being sure that the file won't be needed again later in the workflow.

- Mike

----- Original Message -----
> From: "Neil Best"
> To: swift-user at ci.uchicago.edu
> Sent: Thursday, January 17, 2013 11:04:21 AM
> Subject: [Swift-user] would rm be a valid app?
> I wanted to use Swift to clean up files that are no longer needed.
It
> seemed natural to define rm as an app then script this:
>
> type file;
>
> string oldGrbFiles[] = readData( "data/oldGrbFiles.txt");
>
> file grb[];
>
> app rm (file f) {
> rm @f;
> }
>
> foreach g in grb {
> rm( g);
> }
>
>
> Swift runs happily and tasks complete but the number of files in the
> subfolder that I am trying to clean out does not change. I suspect
> that what is being rm-ed are the Swift-ified copies of the mapped
> input files. I simply want a distributed "make clean". Is this
> feasible?
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

--
Michael Wilde
Computation Institute, University of Chicago
Mathematics and Computer Science Division
Argonne National Laboratory

From boue at oddjob.uchicago.edu Thu Jan 17 17:10:17 2013
From: boue at oddjob.uchicago.edu (Gwenaël Boué)
Date: Thu, 17 Jan 2013 17:10:17 -0600
Subject: [Swift-user] integer division
Message-ID:

Hi,

I wrote this small script:
--------------------------------------------------------------
int n = 7 %/ 5; // expect n = 1
float x = 1.0*@tofloat(n); // expect x = 1.0
string msg = @strcat("x = ", x);
trace(msg);
--------------------------------------------------------------
I expected to see "x = 1.0", but I got "x = 1.4". Is this normal?

Gwenael
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From davidk at ci.uchicago.edu Thu Jan 17 23:07:59 2013
From: davidk at ci.uchicago.edu (David Kelly)
Date: Thu, 17 Jan 2013 23:07:59 -0600 (CST)
Subject: [Swift-user] integer division
In-Reply-To:
Message-ID: <726205375.57.1358485679257.JavaMail.root@zimbra-mb2.anl.gov>

Hi Gwenael,

I see the same behavior in 0.93, but it looks like it has been fixed in the 0.94 release candidate. The latest release candidate is http://www.ci.uchicago.edu/swift/packages/swift-0.94RC3.tar.gz. Hope this helps.
Thanks,
David

----- Original Message -----
> From: "Gwenaël Boué"
> To: swift-user at ci.uchicago.edu
> Sent: Thursday, January 17, 2013 5:10:17 PM
> Subject: [Swift-user] integer division
> Hi,
>
>
> I did this small script:
> --------------------------------------------------------------
>
> int n = 7 %/ 5; // expect n = 1
> float x = 1.0*@tofloat(n); // expect x = 1.0
> string msg = @strcat("x = ", x);
> trace(msg);
> --------------------------------------------------------------
> I expected to see "x = 1.0", but I got "x = 1.4". Is it normal ?
>
>
> Gwenael
>
>
>
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From lpesce at uchicago.edu Mon Jan 21 11:25:34 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Mon, 21 Jan 2013 11:25:34 -0600
Subject: [Swift-user] Question about programs, wrappers and exceptions
Message-ID: <97FF74A6-FCD1-4D84-B0B8-0F77A733FDEB@uchicago.edu>

Hi --

Mike helped me a bit with figuring things out about wrappers and exceptions. I haven't implemented those yet, but I will get to it.

My current issue has to do with returning values from programs contained in wrappers and, to a degree, with how to transparently manage exceptions and return values within swift. I am happy to look into the guide if you can tell me where to look for it.

I would also like to understand in more detail the mechanics of coasters. It is not entirely clear to me how sometimes they seem to relinquish control of nodes even when not all the calculations are completed. It seems like a new batch of jobs is submitted as a new calculation even when it is not strictly necessary (or doesn't seem to be).

I am going to try to include all of this in the swift on Beagle presentation.
Lorenzo

From marialemos72 at gmail.com Tue Jan 22 13:22:18 2013
From: marialemos72 at gmail.com (Maria Lemos)
Date: Tue, 22 Jan 2013 19:22:18 +0000
Subject: [Swift-user] Workshops in the CISTI'2013 - 8th Iberian Conference on IST
Message-ID: <20130122192112.80D547CC0B1@mailrelay.anl.gov>

***************************************************************************************************
Workshop in the CISTI'2013
8th Iberian Conference on Information Systems and Technologies
Lisbon, Portugal, June 19 - 23, 2013
http://www.aisti.eu/cisti2013/index.php?option=com_content&view=article&id=82&Itemid=67&lang=en
***************************************************************************************************

Complete list of workshops accepted in the CISTI'2013 - 8th Iberian Conference on Information Systems and Technologies:

> IAwDQ 2013 - Fourth Ibero-American Workshop on Data Quality
> SGaMePlay 2013 - Third Iberian Workshop on Serious Games and Meaningful Play
> TICAMES 2013 - First Workshop on Information and Communication Technology in Higher Education: Learning Mathematics
> WIA 2013 - Primero Workshop en Innovación Abierta
> WISA 2013 - Fifth Workshop on Intelligent Systems and Applications
> WISIS 2013 - Third Workshop on Information Systems for Interactive Spaces
> WSEQP 2013 - First Workshop in Software Engineering and Quality Process

Best regards,

Maria Lemos
AISTI / CISTI'2013
http://www.aisti.eu/cisti2013

From wilde at mcs.anl.gov Tue Jan 22 16:04:32 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Tue, 22 Jan 2013 16:04:32 -0600 (CST)
Subject: [Swift-user] Question about programs, wrappers and exceptions
In-Reply-To: <97FF74A6-FCD1-4D84-B0B8-0F77A733FDEB@uchicago.edu>
Message-ID: <135169695.883725.1358892272940.JavaMail.root@mcs.anl.gov>

> From: "Lorenzo Pesce"
>
> My current issue has to deal with returning return values from
> programs, contained into wrappers and to a degree how to
> transparently manage exceptions and return values within swift.

Programs (i.e., app() functions) can only return files.

If you want to return values (e.g., a set of scalars) from an app() back into the calling Swift script, your app should place them into a file, and you can then use readData() or readData2() to read these values into Swift scalars, arrays, or structures.

The best way to do this is to define a wrapping function written in Swift to do the readData() and make it transparent to higher-level callers.

Regarding errors and exceptions: Swift does not have an exception facility. An app() is considered by the Swift runtime to have failed when it returns a non-zero exit code. When it does so, Swift will apply the retry and recovery logic specified by the properties execution.retries and lazy.errors. Swift will also consider the app to have failed if the files that it expects the app to create (per its parameters and their mappings) do not exist.

> I would also to understand in more details about the mechanics of
> coasters. It is not entirely clear to me how sometimes they seem to
> relinquish control of nodes even when not all the calculations are
> completed. It seems like a new batch of jobs is submitted as a new
> calculation even when it is not strictly necessary (or doesn't seem
> to be).

A coaster worker will exit when no ready app() jobs submitted by the Swift script will fit into the remaining wall time of the worker. Could that be happening in the cases you have seen?

- Mike

From lpesce at uchicago.edu Tue Jan 22 20:08:07 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Tue, 22 Jan 2013 20:08:07 -0600
Subject: [Swift-user] Question about programs, wrappers and exceptions
In-Reply-To: <135169695.883725.1358892272940.JavaMail.root@mcs.anl.gov>
References: <135169695.883725.1358892272940.JavaMail.root@mcs.anl.gov>
Message-ID: <7326C1B5-8B3C-4813-B4CE-B4920CBE44F6@uchicago.edu>

Thanks a lot for your reply.
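
The rules in Mike's message above (an app() can only hand back files, and it counts as failed when it exits non-zero or an expected output file is missing) can be sketched as a small shell wrapper. This is only an illustration under assumptions: the function name run_wrapped and the temporary-file convention are hypothetical and not part of Swift, and the wrapped command stands for whatever program the app() would normally call.

```shell
# run_wrapped RESULTS_FILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND, captures its stdout in RESULTS_FILE, and returns non-zero
# if COMMAND fails or prints nothing.
run_wrapped() {
    results_file=$1
    shift
    tmp_file="$results_file.tmp"
    rc=0

    # Run the real program and remember its exit code instead of aborting.
    "$@" > "$tmp_file" || rc=$?

    if [ "$rc" -ne 0 ]; then
        echo "wrapped command failed with exit code $rc" >&2
        rm -f "$tmp_file"
        return "$rc"
    fi

    # Only create the results file once there is something in it, so a
    # silent failure shows up as a missing output file.
    if [ ! -s "$tmp_file" ]; then
        echo "wrapped command produced no results" >&2
        rm -f "$tmp_file"
        return 1
    fi

    mv "$tmp_file" "$results_file"
}
```

A wrapper script built on this would call something like run_wrapped "$1" my_program "$2" (my_program is a placeholder) and exit with the function's return code; the results file could then be read on the Swift side with readData().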

On Jan 22, 2013, at 4:04 PM, Michael Wilde wrote:

>
>> From: "Lorenzo Pesce"
>>
>> My current issue has to deal with returning return values from
>> programs, contained into wrappers and to a degree how to
>> transparently manage exceptions and return values within swift.
>
> Programs (i.e., app() functions) can only return files.
>
> If you want to return values (eg a set of scalars) from an app() back into the calling Swift script, your app should place them into a file, and you can then use readData() or readData2() to read these values into Swift scalars, arrays, or structures.
>
> The best way to do this is to define a wrapping function written in Swift to do the readData() and make it transparent to higher-level callers.

I see, so if I need to perform a step that produces an unknown number of files with unknown patterns (they depend upon the content of the file and we don't know what they are till we are done processing it), I put that in a file and then use readData:
http://www.ci.uchicago.edu/swift/guides/trunk/userguide/userguide.html#_readdata

Then I can use the output to create a mapper and send out the new files for the following step. I assume that swift can take into account the implied dependencies created by the readData step. Right? If so, it makes perfect sense, thanks.

> Regarding errors and exceptions: Swift does not have an exception facility. An app() is considered by the Swift runtime to have failed when it returns a non-zero exit code. When it does so, Swift considers the app() call to have failed and will then apply the retry and recovery logic specific by the properties execution.retries and lazy.errors. Swift will also consider the app to have failed if the files that it expects the app to create (per its parameters and their mappings) do not exist.

That is good enough, I can hack the wrappers to make them return the exit code I want depending upon how it ran.
I can trap the signals inside of the apps as you told me previously and that should take care of it. I think.

>
>> I would also to understand in more details about the mechanics of
>> coasters. It is not entirely clear to me how sometimes they seem to
>> relinquish control of nodes even when not all the calculations are
>> completed. It seems like a new batch of jobs is submitted as a new
>> calculation even when it is not strictly necessary (or doesn't seem
>> to be).
>
> A coaster worker will exit when no ready app() jobs submitted by the Swift script will fit into the remaining wall time of the worker. Could that be happening in the cases of this you have seen?

That is what I thought, but the coasters were scheduled to run for 10 days and they were exiting well before that time (and being resubmitted). We noticed because some other submissions slipped in before the new ones. I haven't gone carefully over those logs, so it is possible that I am missing something here.

Thanks again,

Lorenzo

>
> - Mike

From lpesce at uchicago.edu Thu Jan 24 09:43:24 2013
From: lpesce at uchicago.edu (Lorenzo Pesce)
Date: Thu, 24 Jan 2013 09:43:24 -0600
Subject: [Swift-user] How can one force ordering into swift operations
Message-ID: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu>

I have a simple problem:
step 1: I run an app that splits a file into a group of files and we don't know how many there are.
step 2: I want to map those files using a mapper after the fact

The problem is that the mapper doesn't know that it can't run till step 1 is done, because it has no input files. How can I tell the mapper (and, as a consequence, what follows it, since those files will not be there) that it has to wait for step 1 to be finished?
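
One way to make the file list explicit, which replies later in this thread also point toward, is to have step 1 write a manifest of the files it produced, so that the mapping step depends on a single known output. A minimal shell sketch, assuming a hypothetical function name split_with_manifest and a chunk_ prefix chosen only for illustration:

```shell
# split_with_manifest INPUT LINES_PER_CHUNK OUT_DIR MANIFEST   (hypothetical helper)
# Splits INPUT into pieces of LINES_PER_CHUNK lines under OUT_DIR and writes
# the path of every piece, one per line, to MANIFEST.
split_with_manifest() {
    input=$1
    lines_per_chunk=$2
    out_dir=$3
    manifest=$4

    mkdir -p "$out_dir" || return 1

    # split names the pieces chunk_aa, chunk_ab, ... inside out_dir.
    split -l "$lines_per_chunk" "$input" "$out_dir/chunk_" || return 1

    # Write the manifest last and sorted, so its order is reproducible and
    # its existence signals that all pieces are in place.
    find "$out_dir" -name 'chunk_*' -type f | sort > "$manifest"
}
```

The manifest holds one path per line, which is the kind of single output file that the step after the split can be made to wait on before any mapping happens.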
Thanks, Lorenzo From hockyg at gmail.com Thu Jan 24 10:08:01 2013 From: hockyg at gmail.com (Glen Hocky) Date: Thu, 24 Jan 2013 11:08:01 -0500 Subject: [Swift-user] How can one force ordering into swift operations In-Reply-To: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu> References: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu> Message-ID: Lorenzo, This may not work for your purposes, but a simple solution similar to what I do, is to actually do step 1 in the wrapper before the mapping is done. This guarantees that all files are in place. Best, Glen On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce wrote: > I have a simple problem: > step 1: I run an app that splits a file in a group of files and we don't > know how many they are. > step2: I want to map those files using a mapper after the fact > > Problem is that the mapper doesn't know that it can't run till step 1 is > done because it has no input files. How can I tell the mapper (and what > follows it by consequence since those files will not be there) that it has > to wait for step 1 to b finished? > > Thanks, > > Lorenzo > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lpesce at uchicago.edu Thu Jan 24 10:13:41 2013 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Thu, 24 Jan 2013 10:13:41 -0600 Subject: [Swift-user] How can one force ordering into swift operations In-Reply-To: References: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu> Message-ID: So that I return an array of files of unknown size (don't know how many files they will be) to the calling swift script? On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote: > Lorenzo, > This may not work for your purposes, but a simple solution similar to what I do, is to actually do step 1 in the wrapper before the mapping is done. 
This guarantees that all files are in place.
>
> Best,
> Glen
>
>
> On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce wrote:
> I have a simple problem:
> step 1: I run an app that splits a file in a group of files and we don't know how many they are.
> step2: I want to map those files using a mapper after the fact
>
> Problem is that the mapper doesn't know that it can't run till step 1 is done because it has no input files. How can I tell the mapper (and what follows it by consequence since those files will not be there) that it has to wait for step 1 to b finished?
>
> Thanks,
>
> Lorenzo
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dsk at ci.uchicago.edu Thu Jan 24 10:20:52 2013
From: dsk at ci.uchicago.edu (Daniel S. Katz)
Date: Thu, 24 Jan 2013 11:20:52 -0500
Subject: [Swift-user] How can one force ordering into swift operations
In-Reply-To:
References: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu>
Message-ID: <02576391-A8AA-4327-8214-2011F442F66B@ci.uchicago.edu>

You could just add an artificial dependency. Make step 1 output a file "flag" when it is done.

Make step 2 dependent on the file "flag".

Dan

On Jan 24, 2013, at 11:13 AM, Lorenzo Pesce wrote:

> So that I return an array of files of unknown size (don't know how many files they will be) to the calling swift script?
>
>
> On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote:
>
>> Lorenzo,
>> This may not work for your purposes, but a simple solution similar to what I do, is to actually do step 1 in the wrapper before the mapping is done. This guarantees that all files are in place.
>>
>> Best,
>> Glen
>>
>>
>> On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce wrote:
>> I have a simple problem:
>> step 1: I run an app that splits a file in a group of files and we don't know how many they are.
>> step2: I want to map those files using a mapper after the fact
>>
>> Problem is that the mapper doesn't know that it can't run till step 1 is done because it has no input files. How can I tell the mapper (and what follows it by consequence since those files will not be there) that it has to wait for step 1 to b finished?
>>
>> Thanks,
>>
>> Lorenzo
>> _______________________________________________
>> Swift-user mailing list
>> Swift-user at ci.uchicago.edu
>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user
>>
>
> _______________________________________________
> Swift-user mailing list
> Swift-user at ci.uchicago.edu
> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

--
Daniel S. Katz
University of Chicago
(773) 834-7186 (voice)
(773) 834-6818 (fax)
d.katz at ieee.org or dsk at ci.uchicago.edu
http://www.ci.uchicago.edu/~dsk/

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hockyg at gmail.com Thu Jan 24 10:41:35 2013
From: hockyg at gmail.com (Glen Hocky)
Date: Thu, 24 Jan 2013 11:41:35 -0500
Subject: [Swift-user] How can one force ordering into swift operations
In-Reply-To:
References: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu>
Message-ID:

Yes, for example in your case, a mapper might look like this (using the "ext" mapper type):

...
split -l NLINES INPUTFILE
# split makes a bunch of files like xaa, xab, xac ...
splitfiles=$(find . -name 'x*')
count=0
for file in $splitfiles; do
    # the ext mapper reads one "[index] path" pair per line
    echo "[$count] $file"
    count=$(($count+1))
done
...

On Thu, Jan 24, 2013 at 11:13 AM, Lorenzo Pesce wrote:

> So that I return an array of files of unknown size (don't know how many
> files they will be) to the calling swift script?
>
>
> On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote:
>
> Lorenzo,
> This may not work for your purposes, but a simple solution similar to what
> I do, is to actually do step 1 in the wrapper before the mapping is done.
> This guarantees that all files are in place.
> > Best, > Glen > > > On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce wrote: > >> I have a simple problem: >> step 1: I run an app that splits a file in a group of files and we don't >> know how many they are. >> step2: I want to map those files using a mapper after the fact >> >> Problem is that the mapper doesn't know that it can't run till step 1 is >> done because it has no input files. How can I tell the mapper (and what >> follows it by consequence since those files will not be there) that it has >> to wait for step 1 to b finished? >> >> Thanks, >> >> Lorenzo >> _______________________________________________ >> Swift-user mailing list >> Swift-user at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hategan at mcs.anl.gov Thu Jan 24 10:42:30 2013 From: hategan at mcs.anl.gov (Mihael Hategan) Date: Thu, 24 Jan 2013 08:42:30 -0800 Subject: [Swift-user] How can one force ordering into swift operations In-Reply-To: <02576391-A8AA-4327-8214-2011F442F66B@ci.uchicago.edu> References: <2AF9770C-5796-4D66-9D4C-94A72D2CE6BD@uchicago.edu> <02576391-A8AA-4327-8214-2011F442F66B@ci.uchicago.edu> Message-ID: <1359045750.23814.1.camel@echo> There is the "external" type just for that. It indicates some type of data that is not managed by swift but that nonetheless exists. Mihael On Thu, 2013-01-24 at 11:20 -0500, Daniel S. Katz wrote: > you could just add an artificial dependency. Make step out output > file "flag" when it is done. > > > Make step 2 dependent on file "flag" > > > Dan > > > > On Jan 24, 2013, at 11:13 AM, Lorenzo Pesce > wrote: > > > So that I return an array of files of unknown size (don't know how > > many files they will be) to the calling swift script? 
> > > > > > On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote: > > > > > Lorenzo, > > > This may not work for your purposes, but a simple solution similar > > > to what I do, is to actually do step 1 in the wrapper before the > > > mapping is done. This guarantees that all files are in place. > > > > > > > > > Best, > > > Glen > > > > > > > > > On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce > > > wrote: > > > I have a simple problem: > > > step 1: I run an app that splits a file in a group of > > > files and we don't know how many they are. > > > step2: I want to map those files using a mapper after the > > > fact > > > > > > Problem is that the mapper doesn't know that it can't run > > > till step 1 is done because it has no input files. How can > > > I tell the mapper (and what follows it by consequence > > > since those files will not be there) that it has to wait > > > for step 1 to b finished? > > > > > > Thanks, > > > > > > Lorenzo > > > _______________________________________________ > > > Swift-user mailing list > > > Swift-user at ci.uchicago.edu > > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > > > > > > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > -- > Daniel S. 
Katz > University of Chicago > (773) 834-7186 (voice) > (773) 834-6818 (fax) > d.katz at ieee.org or dsk at ci.uchicago.edu > http://www.ci.uchicago.edu/~dsk/ > > > > > > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From wilde at mcs.anl.gov Thu Jan 24 10:42:55 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 24 Jan 2013 10:42:55 -0600 (CST)
Subject: [Swift-user] How can one force ordering into swift operations
In-Reply-To: <02576391-A8AA-4327-8214-2011F442F66B@ci.uchicago.edu>
Message-ID: <1985062751.1386710.1359045775756.JavaMail.root@mcs.anl.gov>

A few brief additional tips to help you make progress with this:

- your split app can create and return a single file containing a list of file names

- use readData to read that list into an array; then use one of the array mappers to map the list of files.

Separately: the "flag" Dan suggests can also be done using a variable of type "external", which allows you to do explicit synchronization. It's only honored as a return or an input of an app() function.

- Mike

----- Original Message ----- > From: "Daniel S. Katz" > To: "Lorenzo Pesce" > Cc: "Glen Hocky" , "Swift User Discussion List" > Sent: Thursday, January 24, 2013 10:20:52 AM > Subject: Re: [Swift-user] How can one force ordering into swift operations > > > you could just add an artificial dependency. Make step out output > file "flag" when it is done. > > > Make step 2 dependent on file "flag" > > > Dan > > > > > > On Jan 24, 2013, at 11:13 AM, Lorenzo Pesce < lpesce at uchicago.edu > > wrote: > > > > So that I return an array of files of unknown size (don't know how > many files they will be) to the calling swift script?
> > > > > > On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote: > > > > Lorenzo, > This may not work for your purposes, but a simple solution similar to > what I do, is to actually do step 1 in the wrapper before the > mapping is done. This guarantees that all files are in place. > > > Best, > Glen > > > > On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce < lpesce at uchicago.edu > > wrote: > > > I have a simple problem: > step 1: I run an app that splits a file in a group of files and we > don't know how many they are. > step2: I want to map those files using a mapper after the fact > > Problem is that the mapper doesn't know that it can't run till step 1 > is done because it has no input files. How can I tell the mapper > (and what follows it by consequence since those files will not be > there) that it has to wait for step 1 to b finished? > > Thanks, > > Lorenzo > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > > > -- > > > > > > Daniel S. 
Katz > University of Chicago > (773) 834-7186 (voice) > (773) 834-6818 (fax) > d.katz at ieee.org or dsk at ci.uchicago.edu > http://www.ci.uchicago.edu/~dsk/ > > > > > > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user

From wilde at mcs.anl.gov Thu Jan 24 11:59:43 2013
From: wilde at mcs.anl.gov (Michael Wilde)
Date: Thu, 24 Jan 2013 11:59:43 -0600 (CST)
Subject: [Swift-user] How can one force ordering into swift operations
In-Reply-To: <1985062751.1386710.1359045775756.JavaMail.root@mcs.anl.gov>
Message-ID: <906728816.1403172.1359050383715.JavaMail.root@mcs.anl.gov>

Here's a split-and-process example:

$ cat SplitAndProcess.swift

type file;

app (file flist) split (file i)
{
  sh "-c" @strcat("split -l 50 ", @filename(i), " /tmp/segment ; /bin/ls -1 /tmp/segment??") stdout=@filename(flist);
}

app (file counts) wc (file i)
{
  sh "-c" @strcat("wc ", @filename(i)) stdout=@filename(counts);
}

file infile<"infile">;

string segnames[] = readData(split(infile));

foreach s,i in segnames {
  file segment ;
  string counts = readData(wc(segment));
  tracef("segment %i is file %s, counts=%s\n", i, s, counts );
}

$ wc -l infile

460 infile

$ swift -config cf -tc.file tc -sites.file local.xml SplitAndProcess.swift

Warning: Procedure split is deprecated, at 15
Warning: Procedure wc is deprecated, at 19
Swift trunk swift-r6151 cog-r3552 (cog modified locally)

RunID: 20130124-1152-gg21bvq8
Progress: time: Thu, 24 Jan 2013 11:52:06 -0600
Progress: time: Thu, 24 Jan 2013 11:52:08 -0600 Active:9 Checking status:1 Finished successfully:1
segment 0 is file /tmp/segmentaa, counts= 50 267 2584 tmp/segmentaa
segment 5 is file /tmp/segmentaf, counts= 50 597 7284 tmp/segmentaf
segment 1 is file /tmp/segmentab, counts= 50 350 4196 tmp/segmentab
segment 8 is file /tmp/segmentai, counts= 50 579 7082 tmp/segmentai
segment 2 is file /tmp/segmentac, counts= 50 452 4949 tmp/segmentac
segment 9 is file /tmp/segmentaj, counts= 10 71 835 tmp/segmentaj
segment 4 is file /tmp/segmentae, counts= 50 490 6093 tmp/segmentae
segment 3 is file /tmp/segmentad, counts= 50 589 7026 tmp/segmentad
segment 7 is file /tmp/segmentah, counts= 50 498 6047 tmp/segmentah
segment 6 is file /tmp/segmentag, counts= 50 591 7046 tmp/segmentag
Final status: Thu, 24 Jan 2013 11:52:08 -0600 Finished successfully:11

Note that the script forces the split segments to be written to /tmp; otherwise they would be written to the job directory in which the split() app runs. This is not "location independent" but works fine when you run split on a local host. You can use $PWD instead of /tmp by passing it into swift eg -cwd=$PWD and adjusting the script accordingly.

- Mike

----- Original Message ----- > From: "Michael Wilde" > To: "Daniel S. Katz" > Cc: "Glen Hocky" , "Swift User Discussion List" > Sent: Thursday, January 24, 2013 10:42:55 AM > Subject: Re: [Swift-user] How can one force ordering into swift operations > > > A few brief additional tips to help you make progress with this: > > - your split app can create and return a single file containing a > list of file names > > - use readData to read that list into an array; then use one of the > array mappers to map the list of files. > > Separately: the "flag" Dan suggests can also be done using a variable > of type "external" which allows you to do explicit synchronization. > Its only honored as a return or an input of an app() function. > > - Mike > > ----- Original Message ----- > > From: "Daniel S. Katz" > > To: "Lorenzo Pesce" > > Cc: "Glen Hocky" , "Swift User Discussion List" > > > > Sent: Thursday, January 24, 2013 10:20:52 AM > > Subject: Re: [Swift-user] How can one force ordering into swift > > operations > > > > > > you could just add an artificial dependency. Make step out output > > file "flag" when it is done.
> > > > > > Make step 2 dependent on file "flag" > > > > > > Dan > > > > > > > > > > > > On Jan 24, 2013, at 11:13 AM, Lorenzo Pesce < lpesce at uchicago.edu > > > wrote: > > > > > > > > So that I return an array of files of unknown size (don't know how > > many files they will be) to the calling swift script? > > > > > > > > > > > > On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote: > > > > > > > > Lorenzo, > > This may not work for your purposes, but a simple solution similar > > to > > what I do, is to actually do step 1 in the wrapper before the > > mapping is done. This guarantees that all files are in place. > > > > > > Best, > > Glen > > > > > > > > On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce < > > lpesce at uchicago.edu > > > wrote: > > > > > > I have a simple problem: > > step 1: I run an app that splits a file in a group of files and we > > don't know how many they are. > > step2: I want to map those files using a mapper after the fact > > > > Problem is that the mapper doesn't know that it can't run till step > > 1 > > is done because it has no input files. How can I tell the mapper > > (and what follows it by consequence since those files will not be > > there) that it has to wait for step 1 to b finished? > > > > Thanks, > > > > Lorenzo > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > > > > > > > > -- > > > > > > > > > > > > Daniel S. 
Katz > > University of Chicago > > (773) 834-7186 (voice) > > (773) 834-6818 (fax) > > d.katz at ieee.org or dsk at ci.uchicago.edu > > http://www.ci.uchicago.edu/~dsk/ > > > > > > > > > > > > _______________________________________________ > > Swift-user mailing list > > Swift-user at ci.uchicago.edu > > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user > From lpesce at uchicago.edu Thu Jan 24 13:09:40 2013 From: lpesce at uchicago.edu (Lorenzo Pesce) Date: Thu, 24 Jan 2013 13:09:40 -0600 Subject: [Swift-user] How can one force ordering into swift operations In-Reply-To: <906728816.1403172.1359050383715.JavaMail.root@mcs.anl.gov> References: <906728816.1403172.1359050383715.JavaMail.root@mcs.anl.gov> Message-ID: Thanks a lot Mike & all. Let me try this and get back to you. The devil is in the details and in the flood of genome datafiles that will come out of this... 
So far my main problem is that there are 10 possible approaches, most of which don't scale to 50,000 files, 12,000 tasks and 30 TB of data :-(

On Jan 24, 2013, at 11:59 AM, Michael Wilde wrote:

>
> Here's a split-and-process example:
>
> $ cat SplitAndProcess.swift
>
> type file;
>
> app (file flist) split (file i)
> {
>   sh "-c" @strcat("split -l 50 ", @filename(i), " /tmp/segment ; /bin/ls -1 /tmp/segment??") stdout=@filename(flist);
> }
>
> app (file counts) wc (file i)
> {
>   sh "-c" @strcat("wc ", @filename(i)) stdout=@filename(counts);
> }
>
> file infile<"infile">;
>
> string segnames[] = readData(split(infile));
>
> foreach s,i in segnames {
>   file segment ;
>   string counts = readData(wc(segment));
>   tracef("segment %i is file %s, counts=%s\n", i, s, counts );
> }
>
> $ wc -l infile
>
> 460 infile
>
> $ swift -config cf -tc.file tc -sites.file local.xml SplitAndProcess.swift
>
> Warning: Procedure split is deprecated, at 15
> Warning: Procedure wc is deprecated, at 19
> Swift trunk swift-r6151 cog-r3552 (cog modified locally)
>
> RunID: 20130124-1152-gg21bvq8
> Progress: time: Thu, 24 Jan 2013 11:52:06 -0600
> Progress: time: Thu, 24 Jan 2013 11:52:08 -0600 Active:9 Checking status:1 Finished successfully:1
> segment 0 is file /tmp/segmentaa, counts= 50 267 2584 tmp/segmentaa
> segment 5 is file /tmp/segmentaf, counts= 50 597 7284 tmp/segmentaf
> segment 1 is file /tmp/segmentab, counts= 50 350 4196 tmp/segmentab
> segment 8 is file /tmp/segmentai, counts= 50 579 7082 tmp/segmentai
> segment 2 is file /tmp/segmentac, counts= 50 452 4949 tmp/segmentac
> segment 9 is file /tmp/segmentaj, counts= 10 71 835 tmp/segmentaj
> segment 4 is file /tmp/segmentae, counts= 50 490 6093 tmp/segmentae
> segment 3 is file /tmp/segmentad, counts= 50 589 7026 tmp/segmentad
> segment 7 is file /tmp/segmentah, counts= 50 498 6047 tmp/segmentah
> segment 6 is file /tmp/segmentag, counts= 50 591 7046 tmp/segmentag
> Final status: Thu, 24 Jan 2013 11:52:08 -0600 Finished successfully:11
>
> Note that the script forces the split segments to be written to /tmp; otherwise they would be written to the job directory in which the split() app runs. This is not "location independent" but works fine when you run split on a local host. You can use $PWD instead of /tmp by passing it into swift eg -cwd=$PWD and adjusting the script accordingly.
>
> - Mike
>
> ----- Original Message -----
>> From: "Michael Wilde"
>> To: "Daniel S. Katz"
>> Cc: "Glen Hocky" , "Swift User Discussion List"
>> Sent: Thursday, January 24, 2013 10:42:55 AM
>> Subject: Re: [Swift-user] How can one force ordering into swift operations
>>
>>
>> A few brief additional tips to help you make progress with this:
>>
>> - your split app can create and return a single file containing a
>> list of file names
>>
>> - use readData to read that list into an array; then use one of the
>> array mappers to map the list of files.
>>
>> Separately: the "flag" Dan suggests can also be done using a variable
>> of type "external" which allows you to do explicit synchronization.
>> Its only honored as a return or an input of an app() function.
>>
>> - Mike
>>
>> ----- Original Message -----
>>> From: "Daniel S. Katz"
>>> To: "Lorenzo Pesce"
>>> Cc: "Glen Hocky" , "Swift User Discussion List"
>>>
>>> Sent: Thursday, January 24, 2013 10:20:52 AM
>>> Subject: Re: [Swift-user] How can one force ordering into swift
>>> operations
>>>
>>>
>>> you could just add an artificial dependency. Make step out output
>>> file "flag" when it is done.
>>>
>>>
>>> Make step 2 dependent on file "flag"
>>>
>>>
>>> Dan
>>>
>>>
>>>
>>>
>>>
>>> On Jan 24, 2013, at 11:13 AM, Lorenzo Pesce < lpesce at uchicago.edu >
>>> wrote:
>>>
>>>
>>>
>>> So that I return an array of files of unknown size (don't know how
>>> many files they will be) to the calling swift script?
>>> >>> >>> >>> >>> >>> On Jan 24, 2013, at 10:08 AM, Glen Hocky wrote: >>> >>> >>> >>> Lorenzo, >>> This may not work for your purposes, but a simple solution similar >>> to >>> what I do, is to actually do step 1 in the wrapper before the >>> mapping is done. This guarantees that all files are in place. >>> >>> >>> Best, >>> Glen >>> >>> >>> >>> On Thu, Jan 24, 2013 at 10:43 AM, Lorenzo Pesce < >>> lpesce at uchicago.edu >>>> wrote: >>> >>> >>> I have a simple problem: >>> step 1: I run an app that splits a file in a group of files and we >>> don't know how many they are. >>> step2: I want to map those files using a mapper after the fact >>> >>> Problem is that the mapper doesn't know that it can't run till step >>> 1 >>> is done because it has no input files. How can I tell the mapper >>> (and what follows it by consequence since those files will not be >>> there) that it has to wait for step 1 to b finished? >>> >>> Thanks, >>> >>> Lorenzo >>> _______________________________________________ >>> Swift-user mailing list >>> Swift-user at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user >>> >>> >>> _______________________________________________ >>> Swift-user mailing list >>> Swift-user at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user >>> >>> >>> >>> -- >>> >>> >>> >>> >>> >>> Daniel S. 
Katz >>> University of Chicago >>> (773) 834-7186 (voice) >>> (773) 834-6818 (fax) >>> d.katz at ieee.org or dsk at ci.uchicago.edu >>> http://www.ci.uchicago.edu/~dsk/ >>> >>> >>> >>> >>> >>> _______________________________________________ >>> Swift-user mailing list >>> Swift-user at ci.uchicago.edu >>> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user >> _______________________________________________ >> Swift-user mailing list >> Swift-user at ci.uchicago.edu >> https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user >> > _______________________________________________ > Swift-user mailing list > Swift-user at ci.uchicago.edu > https://lists.ci.uchicago.edu/cgi-bin/mailman/listinfo/swift-user From iraicu at cs.iit.edu Mon Jan 28 07:51:33 2013 From: iraicu at cs.iit.edu (Ioan Raicu) Date: Mon, 28 Jan 2013 07:51:33 -0600 Subject: [Swift-user] CFP: IEEE Transactions on Cloud Computing (TCC) Message-ID: <51068265.1030101@cs.iit.edu> Dear Colleagues: The newly established IEEE Transactions on Cloud Computing (TCC) is seeking original and innovative research papers in all areas related to Cloud computing. For details, please see the attached call for papers. Best regards Ioan Raicu Associate Editor IEEE Transactions on Cloud Computing (TCC) ------------------------------------------------------------------------------ IEEE Transactions on Cloud Computing (TCC) ------------------------------------------------------------------------------ Call for Papers IEEE Transactions on Cloud Computing will publish peer-reviewed articles that provide innovative research ideas and applications results in all areas relating to cloud computing. Topics relating to novel theory, algorithms, performance analyses and applications of techniques relating to all areas of cloud computing will be considered for the transactions. 
The transactions will consider submissions specifically in the areas of cloud security, tradeoffs between privacy and utility of cloud, cloud standards, the architecture of cloud computing, cloud development tools, cloud software, cloud backup and recovery, cloud interoperability, cloud applications management, cloud data analytics, cloud communications protocols, mobile cloud, liability issues for data loss on clouds, data integration on clouds, big data on clouds, cloud education, cloud skill sets, cloud energy consumption, cloud applications in commerce, education and industry. This title will also consider submissions on Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), and Business Process as a Service (BPaaS).

For details of the submission process, please consult the relevant Web pages at http://www.computer.org/portal/web/tcc

================
Editorial Board:
================

Editor-in-Chief (EiC):
----------------------
Rajkumar Buyya
Director, Cloud Computing and Distributed Systems (CLOUDS) Lab
The University of Melbourne, Australia
CEO, Manjrasoft Pvt Ltd, Melbourne, Australia
Web: http://www.cloudbus.org/~raj

Associate Editors:
------------------
Adam Wierman, CalTech (California Institute of Technology), USA
Albert Zomaya, University of Sydney, Australia
Andrew Martin, Oxford University, UK
Beng Chin OOI, National University of Singapore, Singapore
Beniamino Di Martino, Second University of Naples, Italy
Bharadwaj Veeravalli, National University of Singapore, Singapore
Carlos Varela, Rensselaer Polytechnic Institute, USA
Cesar A. De Rose, PUCRS, Brazil
Chandra Krintz, University of California, Santa Barbara, USA
Cheng-Zhong Xu, Wayne State University, USA
Cho-Li Wang, The University of Hong Kong, China
Chunming Rong, University of Stavanger, Norway
David Bernstein, Cloud Strategy Partners LLC, USA
David De Roure, Oxford University, UK
David Lie, University of Toronto, Canada
Dejan Milojicic, HP Labs, USA
Dick Epema, Delft University of Technology, The Netherlands
Gagan Agrawal, Ohio State University, USA
Geoffrey Fox, Indiana University, USA
Hai Jin, Huazhong University of Science & Technology, China
Hui Lei, IBM T. J. Watson Research Center, USA
Ignacio Martín Llorente, Universidad Complutense de Madrid, Spain
Ioan Raicu, Illinois Institute of Technology, Chicago, USA
Irena Bojanova, University of Maryland, USA
Ivan Stojmenovic, University of Ottawa, Canada
Ivona Brandic, Vienna University of Technology, Austria
D. Janakiram, Indian Institute of Technology (IIT) Madras, India
Jose Fortes, University of Florida, USA
Junwei Cao, Tsinghua University, China
Kai Hwang, University of Southern California, USA
Laurent Lefèvre, INRIA, France
Manish Parashar, Rutgers University, USA
Masum Z. Hasan, CISCO, USA
Murat Demirbas, SUNY Buffalo, USA
Omer Rana, Cardiff University, UK
Pavan Balaji, Argonne National Laboratory, USA
Phillip B. Gibbons, Intel Labs Pittsburgh, USA
Pierangela Samarati, Università degli Studi di Milano, Italy
Qianhui Althea, HP Labs, Singapore
Ramesh Sitaraman, University of Massachusetts, USA
Rao Kotagiri, University of Melbourne, Australia
Ricardo Bianchini, Rutgers University, USA
Roy Campbell, University of Illinois, Urbana-Champaign, USA
Ruby B. Lee, Princeton University, USA
Ruppa Thulasiram, University of Manitoba, Canada
Sanjeev Aggarwal, Indian Institute of Technology (IIT) Kanpur, India
Shivnath Babu, Duke University, USA
Shubhashis Sengupta, Accenture, India
Siani Pearson, HP Labs, UK
Sorav Bansal, Indian Institute of Technology (IIT) Delhi, India
Thamarai Selvi, Anna University, India
Thomas Fahringer, University of Innsbruck, Austria
Umesh Bellur, Indian Institute of Technology (IIT) Mumbai, India
Vijaya Varadharajan, Macquarie University, Australia
Vojislav Misic, Ryerson University, Canada
Xiaofang Zhou, University of Queensland, Australia
Yong Cui, Tsinghua University, China
------------------------------------------------------------------------------

--
=================================================================
Ioan Raicu, Ph.D.
Assistant Professor, Illinois Institute of Technology (IIT)
Guest Research Faculty, Argonne National Laboratory (ANL)
=================================================================
Data-Intensive Distributed Systems Laboratory, CS/IIT
Distributed Systems Laboratory, MCS/ANL
=================================================================
Editor: IEEE TCC, Springer JoCCASA
Chair: IEEE/ACM MTAGS, ACM ScienceCloud, IEEE/ACM DataCloud
=================================================================
Cel: 1-847-722-0876
Office: 1-312-567-5704
Email: iraicu at cs.iit.edu
Web: http://www.cs.iit.edu/~iraicu/
Web: http://datasys.cs.iit.edu/
LinkedIn: http://www.linkedin.com/in/ioanraicu
Google: http://scholar.google.com/citations?user=jE73HYAAAAAJ
=================================================================
=================================================================